yongkyunlee opened a new issue, #8212:
URL: https://github.com/apache/arrow-rs/issues/8212

   **Describe the bug**
   
   When decoding Avro data with nullable fields, the decoder panics during 
flush() if any intermediate records fail to decode. This occurs even though the 
decoder correctly returns an error for the malformed record and successfully 
processes subsequent valid records.
   
   **To Reproduce**
   
   ```rust
   use arrow_avro::reader::*;
   
   // Create a nullable Int32 decoder with NullSecond ordering
   let avro_type = AvroDataType::new(
       Codec::Int32,
       Default::default(),
       Some(Nullability::NullSecond),
   );
   let mut decoder = Decoder::try_new(&avro_type).unwrap();
   
   // Row 1: Valid null value (branch = 1 for NullSecond)
   let row1 = vec![0x02]; // zigzag varint encoding of 1
   
   // Row 2: Invalid non-null - branch indicates non-null but missing int32 payload
   let row2_malformed = vec![0x00]; // varint encoding of 0, but no following int32
   
   // Row 3: Valid non-null value
   let row3 = vec![0x00, 0x54]; // branch 0 + zigzag varint encoding of 42
   
   // Process rows
   decoder.decode(&mut AvroCursor::new(&row1)).unwrap(); // Success: null
   assert!(decoder.decode(&mut AvroCursor::new(&row2_malformed)).is_err()); // Error: incomplete
   decoder.decode(&mut AvroCursor::new(&row3)).unwrap(); // Success: 42
   
   // This panics with the buggy code due to bitmap/values mismatch
   let array = decoder.flush(None).unwrap(); // PANIC!
   ```
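
The byte values in the reproduction follow Avro's binary encoding, in which ints and union branch indexes are written as zigzag varints. The following is a minimal standalone sketch of that decoding, useful for checking the row bytes by hand. The function name `read_zigzag_varint` is hypothetical and is not part of `arrow_avro`.

```rust
/// Decode one zigzag varint from the front of `buf`.
/// Returns the decoded value and the number of bytes consumed,
/// or `None` if the input ends before the varint is complete.
/// (Hypothetical helper for illustration; not an arrow_avro API.)
fn read_zigzag_varint(buf: &[u8]) -> Option<(i64, usize)> {
    let mut raw: u64 = 0;
    // A 64-bit varint occupies at most 10 bytes.
    for (i, &byte) in buf.iter().enumerate().take(10) {
        // Each byte carries 7 payload bits, least significant group first.
        raw |= u64::from(byte & 0x7F) << (7 * i);
        if byte & 0x80 == 0 {
            // High bit clear: this was the last byte. Undo the zigzag mapping.
            let value = ((raw >> 1) as i64) ^ -((raw & 1) as i64);
            return Some((value, i + 1));
        }
    }
    None
}
```

Under this encoding, `0x02` decodes to 1 (the null branch for `NullSecond`), `0x00` decodes to 0 (the value branch), and `0x54` decodes to 42. An empty slice yields `None`, which is the situation row 2 hits when it looks for its int32 payload.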
   
   **Expected behavior**
   
   The decoder should maintain internal consistency even when individual record 
decodes fail. After processing the three rows above:
   - flush() should succeed and return an array with 2 elements: [null, 42]
   - Row2's decode error should not corrupt the decoder's state
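
One way to get the expected behavior is to parse the whole row before touching any buffers, so that a failed decode leaves the validity bitmap and the values buffer at the same length. The sketch below illustrates that idea with a toy column built on the standard library only. It is not the `arrow_avro` implementation, and `NullableInts`, `read_zigzag`, and `replay_issue_rows` are hypothetical names.

```rust
/// Toy nullable-int column. Invariant: `validity.len() == values.len()`.
/// (Illustrative only; not how arrow_avro stores its state.)
#[derive(Default)]
struct NullableInts {
    validity: Vec<bool>,
    values: Vec<i32>,
}

/// Decode one zigzag varint; `None` means the input was truncated.
fn read_zigzag(buf: &[u8]) -> Option<(i64, usize)> {
    let mut raw: u64 = 0;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        raw |= u64::from(byte & 0x7F) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((((raw >> 1) as i64) ^ -((raw & 1) as i64), i + 1));
        }
    }
    None
}

impl NullableInts {
    /// Decode one row of an ["int", "null"] union (branch 0 = int, 1 = null).
    fn decode(&mut self, row: &[u8]) -> Result<(), &'static str> {
        let (branch, used) = read_zigzag(row).ok_or("truncated union branch")?;
        // Parse into a local first: an early return here leaves `self` untouched.
        let slot = match branch {
            0 => {
                let (v, _) = read_zigzag(&row[used..]).ok_or("truncated int payload")?;
                if v < i64::from(i32::MIN) || v > i64::from(i32::MAX) {
                    return Err("int out of range");
                }
                Some(v as i32)
            }
            1 => None,
            _ => return Err("invalid union branch"),
        };
        // Commit step: both buffers grow together, or not at all.
        self.validity.push(slot.is_some());
        self.values.push(slot.unwrap_or(0));
        Ok(())
    }

    /// Drain the column into a vector of optional values.
    fn flush(&mut self) -> Vec<Option<i32>> {
        assert_eq!(self.validity.len(), self.values.len());
        let validity = std::mem::take(&mut self.validity);
        let values = std::mem::take(&mut self.values);
        validity.into_iter().zip(values).map(|(ok, v)| ok.then_some(v)).collect()
    }
}

/// Replays the three rows from the reproduction against the toy column.
fn replay_issue_rows() -> (bool, Vec<Option<i32>>) {
    let mut col = NullableInts::default();
    col.decode(&[0x02]).unwrap(); // row 1: null
    let row2_failed = col.decode(&[0x00]).is_err(); // row 2: missing payload
    col.decode(&[0x00, 0x54]).unwrap(); // row 3: 42
    (row2_failed, col.flush())
}
```

An alternative with the same effect would be to record the buffer lengths before decoding and truncate back to them on error. Either way, the property that matters is that a failed row contributes nothing to either buffer.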
   
   **Additional context**
   
   This bug affects production systems processing Avro data with nullable 
fields. When malformed or truncated Avro data is encountered, instead of 
gracefully handling the error and continuing with valid records, the decoder 
panics during flush due to internal state corruption.

