jorgecarleitao edited a comment on pull request #8715:
URL: https://github.com/apache/arrow/pull/8715#issuecomment-730555090


   Hey @ch-sc, thanks for your PR!
   
   @nevi-me, could you help here? I am a bit worried about introducing another 
comparison of datatypes, but I was unable to find anything in the specification 
stating that a DataType of a ListArray should have a field name.
   
   My main concern is that this change would allow the following: I receive a 
`RecordBatch` and want to verify that it is consistent. While doing so, I end 
up concluding that 
`batch.schema().field(0).data_type() != batch.column(0).data_type()`. IMO this 
goes against the whole idea of having a `RecordBatch` in the first place.
   
   OTOH, I also understand the motivation for this change: if the field name is 
irrelevant, then it should not be used in the comparison.
   
   My feeling is that needing a different comparison often hints that the 
`DataType` carries useless information that we should eliminate. If it is 
useless but needs to stay for some reason, then my suggestion is that we 
implement a custom `PartialEq` that ignores it, so that there is a single 
source of truth about whether two datatypes are equal.
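   
   To make the suggestion concrete, here is a rough sketch of what such a 
custom `PartialEq` could look like. The `Field` and `DataType` below are 
hypothetical, heavily simplified stand-ins and not the actual arrow 
definitions, and `list_of` is just an illustrative helper. Whether nullability 
of the child should still take part in the comparison is an assumption on my 
side.

```rust
// Hypothetical, simplified stand-ins for arrow's `Field` and `DataType`.
// The real types have many more variants and attributes.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

#[allow(dead_code)]
#[derive(Debug, Clone)]
enum DataType {
    Int32,
    Utf8,
    List(Box<Field>),
}

// Hand-written equality: the name of a list's child field is ignored,
// while its data type and nullability (an assumption) are still compared.
impl PartialEq for DataType {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            (DataType::Int32, DataType::Int32) => true,
            (DataType::Utf8, DataType::Utf8) => true,
            (DataType::List(a), DataType::List(b)) => {
                // Recurses through nested lists via this same impl.
                a.nullable == b.nullable && a.data_type == b.data_type
            }
            _ => false,
        }
    }
}

// Illustrative helper: a nullable list type whose child field is `name`.
#[allow(dead_code)]
fn list_of(name: &str, data_type: DataType) -> DataType {
    DataType::List(Box::new(Field {
        name: name.to_string(),
        data_type,
        nullable: true,
    }))
}
```

   With something along these lines, `list_of("item", DataType::Int32)` and 
`list_of("element", DataType::Int32)` compare equal through the ordinary `==`, 
so a check like `schema().field(0).data_type() == column(0).data_type()` would 
not need a second comparison function.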
   
   What are your thoughts, @nevi-me, @alamb and @ch-sc?
   



