timsaucer opened a new pull request, #7437:
URL: https://github.com/apache/arrow-rs/pull/7437

   # Which issue does this PR close?
   
   None, but I can open one if necessary.
   
   # Rationale for this change
    
   The ordering of metadata is not consistent because it is stored in a
   HashMap. In unit tests it can be useful to verify an output against a known
   hash of its serialized values, but with metadata present the serialized
   bytes, and therefore the hash, can differ from run to run.
   
   # What changes are included in this PR?
   
   Orders the metadata HashMap keys when encoding, so the encoded output no longer depends on the HashMap's iteration order.
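
A minimal sketch of the idea, using only the standard library. It is not the code in this PR: `sorted_entries` and `encode_metadata` are hypothetical names, and the length-prefixed byte layout is an illustrative stand-in for the real IPC encoding. It only shows that sorting the keys before writing makes the output independent of HashMap iteration order.

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns the metadata entries sorted by key, so the
/// result does not depend on the HashMap's randomized iteration order.
fn sorted_entries(metadata: &HashMap<String, String>) -> Vec<(&String, &String)> {
    let mut entries: Vec<_> = metadata.iter().collect();
    entries.sort_unstable_by(|a, b| a.0.cmp(b.0));
    entries
}

/// Hypothetical stand-in for an encoder: writes each key and value as a
/// little-endian u32 length followed by the UTF-8 bytes, in sorted key order.
fn encode_metadata(metadata: &HashMap<String, String>) -> Vec<u8> {
    let mut out = Vec::new();
    for (k, v) in sorted_entries(metadata) {
        out.extend_from_slice(&(k.len() as u32).to_le_bytes());
        out.extend_from_slice(k.as_bytes());
        out.extend_from_slice(&(v.len() as u32).to_le_bytes());
        out.extend_from_slice(v.as_bytes());
    }
    out
}

fn main() {
    // Two maps with the same contents, built in different insertion orders.
    let a: HashMap<String, String> = [("b", "2"), ("a", "1"), ("c", "3")]
        .into_iter()
        .map(|(k, v)| (k.to_owned(), v.to_owned()))
        .collect();
    let b: HashMap<String, String> = [("c", "3"), ("a", "1"), ("b", "2")]
        .into_iter()
        .map(|(k, v)| (k.to_owned(), v.to_owned()))
        .collect();

    // Because keys are sorted before writing, both encodings are identical.
    assert_eq!(encode_metadata(&a), encode_metadata(&b));

    let keys: Vec<&str> = sorted_entries(&a)
        .into_iter()
        .map(|(k, _)| k.as_str())
        .collect();
    println!("{keys:?}");
}
```

Sorting a temporary `Vec` of references leaves the stored type unchanged, which is consistent with there being no user-facing changes. Switching the metadata to an ordered map would be another option, but that would change the public type.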
   
   # Are there any user-facing changes?
   
   No.
   
   # Example
   
   If you run this example multiple times, you will see that the encoding
   changes from run to run, because the HashMap iterator order is
   non-deterministic.
   
   ```rust
   use std::{hash::Hasher, sync::Arc};
   
   use arrow::{array::RecordBatch, datatypes::Schema};
   
   fn main() {
       let schema = Arc::new(
           Schema::empty().with_metadata(
               [
                   ("a".to_owned(), "1".to_owned()), //
                   ("b".to_owned(), "2".to_owned()), //
                   ("c".to_owned(), "3".to_owned()), //
                   ("d".to_owned(), "4".to_owned()), //
                   ("e".to_owned(), "5".to_owned()), //
               ]
               .into_iter()
               .collect(),
           ),
       );
       let batch = RecordBatch::new_empty(schema.clone());
   
       dbg!(&batch.schema().metadata().keys());
   
       let mut bytes = Vec::new();
       let mut w = arrow::ipc::writer::StreamWriter::try_new(&mut bytes, &schema).unwrap();
       w.write(&batch).unwrap();
       w.finish().unwrap();
   
       let mut h = std::hash::DefaultHasher::new();
       h.write(&bytes);
       let h = h.finish();
   
       eprintln!("{} bytes -- h = {h:x}", bytes.len());
   }
   ```

