Duansg opened a new pull request, #3897:
URL: https://github.com/apache/hertzbeat/pull/3897

   ## What's changed?
   
   Close #3871
   
   ### Background
   
   When a one-time task or a private collector is used and a single metric 
collection returns multiple data entries, the Collector calls 
`responseSyncJobData` after all metrics are collected and bundles the results 
into a single List. `ArrowUtil.serializeMetricsData` then serializes this List 
into an Arrow stream containing multiple `VectorSchemaRoot` objects. When the 
Manager receives the data, it invokes `ArrowUtil.deserializeMetricsData`. In 
unpatched versions, this throws an `Unexpected end of input` exception, as 
shown below:
   
   > Page waiting timeout
   
   <img width="1505" height="638" alt="error_arrow" 
src="https://github.com/user-attachments/assets/b994ae70-2c7e-453c-9720-0406aeb72f06"
 />
   
   > Console error
   
   ```
   Caused by: java.io.IOException: Unexpected end of input. Missing schema.
        at org.apache.arrow.vector.ipc.ArrowStreamReader.readSchema(ArrowStreamReader.java:207)
        at org.apache.arrow.vector.ipc.ArrowReader.initialize(ArrowReader.java:178)
        at org.apache.arrow.vector.ipc.ArrowReader.ensureInitialized(ArrowReader.java:171)
        at org.apache.arrow.vector.ipc.ArrowReader.getVectorSchemaRoot(ArrowReader.java:68)
        at org.apache.hertzbeat.common.util.ArrowUtil.deserializeMultipleRoots(ArrowUtil.java:98)
        ... 40 common frames omitted
   ```
   
   > Mock error
   ```
    @look org.apache.hertzbeat.common.util.ArrowUtilTest#testSerializeAndDeserializeMetricsData
   ```
   
   ### Reasons
   `ArrowStreamReader` reads ahead during deserialization. When several 
`Arrow` streams are read back to back from the same input, this read-ahead 
shifts the read position past the start of the next stream, which results in 
the `Unexpected end of input` exception.
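
The general effect of a reader consuming more bytes than it logically needs can be illustrated with plain `java.io`. This is only an analogy under stated assumptions: `ReadAheadDemo` and its methods are hypothetical names, and `BufferedInputStream` stands in for any reader that buffers ahead; it does not reproduce the internals of `ArrowStreamReader`.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; an analogy for read-ahead, not Arrow code.
final class ReadAheadDemo {

    // Reads `wanted` bytes through a buffering wrapper and reports how many
    // bytes are still available in the shared underlying source afterwards.
    static int remainingAfterBufferedRead(byte[] data, int wanted) {
        ByteArrayInputStream source = new ByteArrayInputStream(data);
        BufferedInputStream reader = new BufferedInputStream(source, 8192);
        try {
            for (int i = 0; i < wanted; i++) {
                reader.read(); // the first call fills the wrapper's internal buffer
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.available();
    }

    // Reads `wanted` bytes directly from the source, without any buffering.
    static int remainingAfterDirectRead(byte[] data, int wanted) {
        ByteArrayInputStream source = new ByteArrayInputStream(data);
        for (int i = 0; i < wanted; i++) {
            source.read();
        }
        return source.available();
    }

    public static void main(String[] args) {
        byte[] data = new byte[16];
        System.out.println("direct:   " + remainingAfterDirectRead(data, 1));
        System.out.println("buffered: " + remainingAfterBufferedRead(data, 1));
    }
}
```

A second reader created on the same source after the buffered read would find no bytes left, even though only one byte was logically consumed. That is the kind of positional offset described above.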
   
   ### Modification details
   1. Reworked the serialization/deserialization logic of `ArrowUtil` to 
prefix each Arrow stream with its length, so that each stream can be read back 
exactly.
   2. Added the test case `ArrowUtilTest`, which reproduces the issue and 
passes with this change.
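
The length-prefix idea from item 1 can be sketched without any Arrow dependency. The following is a minimal illustration, not the code in this PR: `LengthPrefixedFraming`, `frame`, and `unframe` are hypothetical names, the 4-byte big-endian prefix is an assumed format, and each `byte[]` payload stands in for one independently serialized Arrow stream.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the actual ArrowUtil implementation.
final class LengthPrefixedFraming {

    // Concatenates the payloads, each preceded by a 4-byte big-endian length.
    static byte[] frame(List<byte[]> payloads) {
        int total = 0;
        for (byte[] payload : payloads) {
            total += Integer.BYTES + payload.length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(total);
        for (byte[] payload : payloads) {
            buffer.putInt(payload.length);
            buffer.put(payload);
        }
        return buffer.array();
    }

    // Splits framed bytes back into payloads, reading exactly `length` bytes per frame.
    static List<byte[]> unframe(byte[] framed) {
        ByteBuffer buffer = ByteBuffer.wrap(framed);
        List<byte[]> payloads = new ArrayList<>();
        while (buffer.hasRemaining()) {
            int length = buffer.getInt();
            if (length < 0 || length > buffer.remaining()) {
                throw new IllegalArgumentException("Corrupt frame length: " + length);
            }
            byte[] payload = new byte[length];
            buffer.get(payload);
            payloads.add(payload);
        }
        return payloads;
    }

    public static void main(String[] args) {
        List<byte[]> input = List.of(
                "cpu".getBytes(StandardCharsets.UTF_8),
                "memory".getBytes(StandardCharsets.UTF_8));
        for (byte[] payload : unframe(frame(input))) {
            System.out.println(new String(payload, StandardCharsets.UTF_8));
        }
    }
}
```

Because each payload is sliced out by its declared length before it is parsed, a reader that buffers ahead can only ever see the bytes of its own frame. In a setup like the one described here, each recovered slice would presumably be handed to its own stream reader.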
   
   ## Checklist
   
   - [x]  I have read the [Contributing 
Guide](https://hertzbeat.apache.org/docs/community/code_style_and_quality_guide)
   - [ ]  I have written the necessary doc or comment.
   - [x]  I have added the necessary unit tests and all cases have passed.
   
   ## Add or update API
   
   - [ ] I have added the necessary [e2e 
tests](https://github.com/apache/hertzbeat/tree/master/e2e) and all cases have 
passed.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

