JakeDern commented on code in PR #8001: URL: https://github.com/apache/arrow-rs/pull/8001#discussion_r2263730490
########## arrow-ipc/src/writer.rs: ########## @@ -760,34 +812,108 @@ impl DictionaryTracker { /// * If the tracker has not been configured to error on replacement or this dictionary /// has never been seen before, return `Ok(true)` to indicate that the dictionary was just /// inserted. - pub fn insert(&mut self, dict_id: i64, column: &ArrayRef) -> Result<bool, ArrowError> { - let dict_data = column.to_data(); - let dict_values = &dict_data.child_data()[0]; - - // If a dictionary with this id was already emitted, check if it was the same. - if let Some(last) = self.written.get(&dict_id) { - if ArrayData::ptr_eq(&last.child_data()[0], dict_values) { - // Same dictionary values => no need to emit it again - return Ok(false); + pub fn insert( Review Comment: Do you think it could make sense to configure this on the tracker via some `with_dictionary_handling(&mut self, handling: DictionaryHandling)` function? We could default it to `DictionaryHandling::Resend` in the existing constructor. I'm not sure how much sense it actually makes to be able to change this on different calls to the same tracker. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org