JakeDern commented on code in PR #8001:
URL: https://github.com/apache/arrow-rs/pull/8001#discussion_r2263730490


##########
arrow-ipc/src/writer.rs:
##########
@@ -760,34 +812,108 @@ impl DictionaryTracker {
     /// * If the tracker has not been configured to error on replacement or 
this dictionary
     ///   has never been seen before, return `Ok(true)` to indicate that the 
dictionary was just
     ///   inserted.
-    pub fn insert(&mut self, dict_id: i64, column: &ArrayRef) -> Result<bool, 
ArrowError> {
-        let dict_data = column.to_data();
-        let dict_values = &dict_data.child_data()[0];
-
-        // If a dictionary with this id was already emitted, check if it was 
the same.
-        if let Some(last) = self.written.get(&dict_id) {
-            if ArrayData::ptr_eq(&last.child_data()[0], dict_values) {
-                // Same dictionary values => no need to emit it again
-                return Ok(false);
+    pub fn insert(

Review Comment:
   Do you think it could make sense to configure this on the tracker via some 
`with_dictionary_handling(&mut self, handling: DictionaryHandling)` function? 
We could default it to `DictionaryHandling::Resend` in the existing 
constructor. 
   
   I'm not sure how much sense it actually makes to be able to change this on 
different calls to the same tracker.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to