ch-sc commented on pull request #8715:
URL: https://github.com/apache/arrow/pull/8715#issuecomment-731255930


   Thank you all for the quick feedback!
   
   > Is the only intended change here to not compare the name of fields? Or do 
you have other ideas in mind for the future?
   
   Well, I guess what approach to take when comparing depends on the scenario. 
When calling `try_new()` to create a new record batch I didn't expect this 
behavior. But I can see the benefits of including more information in the 
comparison logic; e.g. nullable. 
   In the implementation I followed what I thought would be the expected 
behavior. Also, it aligned with the comment in the code that I cited in the 
ticket: _"list types can have different names, but we only need the data types 
to be the same"_
   
   > Is it easy to explain in what circumstance you have DataTypes that are the 
same except for the field name?
   
   Keeping track of such metadata adds complexity to the user especially in 
nested scenarios. I would assume in the majority of cases this information is 
not relevant when comparing. BTW this is not restricted to field names only, 
but also dictionary ids and dictionary ordering.
   
   I would second the suggestion from @alamb:
   
   > 2. Update the PartialCmp implementation for DataType to ignore Field names 
for all DataType == DataType comparisons
   
   ...or have 2 implementations for including metadata or not. I saw a similar 
approach in C++ for RecordBatch: 
   ```c
   bool RecordBatch::Equals(const RecordBatch& other, bool check_metadata)
   ```
   
   However, we should be clear to the users what is actually compared and what 
is not. If we end up in a scenario where some attributes play a role and some 
do not this is really hard to understand for anyone.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to