thorfour opened a new issue, #10213:
URL: https://github.com/apache/arrow-rs/issues/10213

   ### Describe the bug
   
   When the IPC writer encodes a record batch whose schema contains a `Dict(Dict(...,...))` encoded column, the `StreamReader` cannot decode it. It returns a 'Buffer count mismatched with metadata' error.
   
   ```bash
   running 1 test
   test tests::dict_of_dict_ipc_error ... FAILED
   
   failures:
   
   ---- tests::dict_of_dict_ipc_error stdout ----
   Error: IpcError("Buffer count mismatched with metadata")
   
   
   failures:
       tests::dict_of_dict_ipc_error
   
   test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
   ```
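
For readers unfamiliar with the shape of such a column, here is a minimal plain-Rust sketch of what `Dict(Dict(Utf8))` means logically: the outer keys index into the inner dictionary, whose keys in turn index into the strings. The `Dict` struct, `decode`, `sample`, and the sample data are all hypothetical and exist only for illustration; this is not the arrow-rs `DictionaryArray` API and says nothing about how the IPC writer lays out buffers.

```rust
/// Plain-Rust model of a dictionary-encoded column: `keys[i]` indexes into `values`.
/// Hypothetical type for illustration only; not the arrow-rs `DictionaryArray`.
struct Dict<V> {
    keys: Vec<u32>,
    values: V,
}

/// Resolve a two-level `Dict(Dict(Utf8))` column to its logical strings.
fn decode(outer: &Dict<Dict<Vec<String>>>) -> Vec<String> {
    outer
        .keys
        .iter()
        .map(|&k| {
            // The outer key selects a slot in the inner dictionary's key array...
            let inner_key = outer.values.keys[k as usize];
            // ...and that inner key selects the actual string.
            outer.values.values[inner_key as usize].clone()
        })
        .collect()
}

/// Build a small sample column (made-up data).
fn sample() -> Dict<Dict<Vec<String>>> {
    let inner = Dict {
        keys: vec![1, 0],
        values: vec!["a".to_string(), "b".to_string()],
    };
    Dict {
        keys: vec![0, 1, 0],
        values: inner,
    }
}
```

Calling `decode(&sample())` follows both levels of indirection for each row, which is the logical content a successful IPC round trip would be expected to preserve.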
   
   ### To Reproduce
   
   ```rust
   #[test]
   fn dict_of_dict_ipc_error() -> std::result::Result<(), ArrowError> {
       use std::sync::Arc;

       use arrow_array::RecordBatch;
       use arrow_ipc::reader::StreamReader;
       use arrow_ipc::writer::StreamWriter;
       use arrow_schema::{ArrowError, DataType, Field, Schema};

       // Write the batch to an in-memory IPC stream, then read the first batch back.
       fn ipc_roundtrip(batch: &RecordBatch) -> std::result::Result<RecordBatch, ArrowError> {
           let mut buf = Vec::new();
           {
               let mut writer = StreamWriter::try_new(&mut buf, &batch.schema())?;
               writer.write(batch)?;
               writer.finish()?;
           }
           StreamReader::try_new(buf.as_slice(), None)?
               .next()
               .expect("one batch")
       }

       let single = DataType::Dictionary(Box::new(DataType::UInt32), Box::new(DataType::Utf8));
       let dod = DataType::Dictionary(Box::new(DataType::UInt32), Box::new(single.clone()));
       // `dict_of_dict()` is a helper that is not shown in this report.
       let original = dict_of_dict();
       let declared = Arc::new(Schema::new(vec![Field::new("f", dod, true)]));
       let batch =
           RecordBatch::try_new(Arc::clone(&declared), vec![Arc::clone(&original)]).unwrap();

       // Reproduces the bug: dict-of-dict cannot round-trip through Arrow IPC.
       ipc_roundtrip(&batch)?;

       Ok(())
   }
   ```
   
   ### Expected behavior
   
   I would expect the column to be encoded and decoded correctly, or at the very least an error stating that this schema is not supported by IPC.
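
As a rough illustration of the second option, the sketch below rejects a nested dictionary type up front with a descriptive message. It uses a toy `Ty` enum and a hypothetical `check_no_nested_dict` function rather than arrow-rs's `DataType`, so it only shows the shape of such a check; it is an assumption about what a clearer failure could look like, not a description of how arrow-rs behaves or should be changed.

```rust
/// Toy stand-in for a column type; hypothetical, not arrow-rs's `DataType`.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    UInt32,
    Utf8,
    Dictionary(Box<Ty>, Box<Ty>),
}

/// Return an error if a dictionary's value type is itself a dictionary.
fn check_no_nested_dict(ty: &Ty) -> Result<(), String> {
    match ty {
        Ty::Dictionary(_, value) => match value.as_ref() {
            // A dictionary whose values are dictionary-encoded is the case from this report.
            Ty::Dictionary(_, _) => Err(format!(
                "dictionary-of-dictionary type {:?} is not supported",
                ty
            )),
            // Any other value type is checked recursively (a no-op for this toy enum).
            other => check_no_nested_dict(other),
        },
        _ => Ok(()),
    }
}
```

With this model, a plain `Dictionary(UInt32, Utf8)` passes, while the `Dictionary(UInt32, Dictionary(UInt32, Utf8))` type from the reproduction is rejected with a message naming the offending type.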
   
   
   ### Additional context
   
   _No response_

