albertlockett opened a new pull request, #8095:
URL: https://github.com/apache/arrow-rs/pull/8095

   # Which issue does this PR close?
   
   - Closes #8012 
   
   # Rationale for this change
   
   In https://github.com/apache/arrow-rs/pull/8005 we loosened the restriction 
that the Arrow data type of a given column must be exactly the same in every 
batch, by adding compatibility between dictionary and native arrays. 
   
   At the time, there was a [worthwhile 
suggestion](https://github.com/apache/arrow-rs/pull/8005#pullrequestreview-3058034840)
 that we extend this compatibility definition to include arrays that contain 
the same type of value (e.g. between String, StringView and LargeString). This 
PR adds this change.
   
   # What changes are included in this PR?
   
   This PR now has the Parquet ArrowWriter consider the following Arrow data 
types compatible:
   - String, StringView, LargeString
   - Binary, BinaryView, LargeBinary
   
   It also improves the logic for detecting whether dictionary values are 
compatible. Before, we only had compatibility between a Dictionary and a Native 
array, but now we also consider two Dictionary types compatible if they have 
compatible keys.
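   
   As a rough illustration of the rules above, here is a minimal sketch of such 
a compatibility check. It is not the code in this PR: the `DataType` enum is a 
simplified stand-in that only mirrors a few variant names from Arrow's 
`DataType`, and `value_type`, `is_string_like`, `is_binary_like` and 
`types_compatible` are hypothetical helper names. The sketch also assumes that 
a dictionary is compared by the type of the values it encodes, which is one 
interpretation of the description rather than something it states.

```rust
// Simplified stand-in for a handful of Arrow `DataType` variants.
// This is NOT the real `arrow_schema::DataType`; it only mirrors variant names.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    LargeUtf8,
    Utf8View,
    Binary,
    LargeBinary,
    BinaryView,
    // Dictionary(key type, value type)
    Dictionary(Box<DataType>, Box<DataType>),
}

// Hypothetical helper: for a dictionary, return the type of its values;
// for any other type, return the type unchanged.
fn value_type(dt: &DataType) -> &DataType {
    match dt {
        DataType::Dictionary(_, value) => value.as_ref(),
        other => other,
    }
}

// String, LargeString and StringView all hold UTF-8 string values.
fn is_string_like(dt: &DataType) -> bool {
    matches!(dt, DataType::Utf8 | DataType::LargeUtf8 | DataType::Utf8View)
}

// Binary, LargeBinary and BinaryView all hold raw byte values.
fn is_binary_like(dt: &DataType) -> bool {
    matches!(
        dt,
        DataType::Binary | DataType::LargeBinary | DataType::BinaryView
    )
}

// Assumed rule: two types are compatible if they are identical, or if,
// after looking through one level of dictionary encoding, they hold the
// same kind of value (string-like or binary-like).
fn types_compatible(a: &DataType, b: &DataType) -> bool {
    if a == b {
        return true;
    }
    let (a, b) = (value_type(a), value_type(b));
    a == b
        || (is_string_like(a) && is_string_like(b))
        || (is_binary_like(a) && is_binary_like(b))
}
```

   Under these assumptions, `types_compatible(&DataType::Utf8, 
&DataType::Utf8View)` holds, as does a comparison between a dictionary with 
`Utf8` values and a plain `LargeUtf8` column, while `Utf8` and `Binary` remain 
incompatible.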
   
   # Are these changes tested?
   
   Yes, there are unit tests.
   
   # Are there any user-facing changes?
   
   No
   

