jorgecarleitao edited a comment on pull request #8590:
URL: https://github.com/apache/arrow/pull/8590#issuecomment-721818365


   Could you describe where we need to compare buffers?
   
   I am asking because I have worked on this problem before, and in my last 
iteration I abandoned the idea of comparing buffers. A `Buffer` is purely a 
physical representation of data, without any logical interpretation. Two 
buffers from two different arrays can be equal while the arrays are still 
different (e.g. due to nullability or offset). The converse is also true: two 
arrays can be equal but have different buffers (e.g. if the child data is 
different, or if the datatype is different).
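
   As a rough illustration of the point about physical vs. logical equality, 
here is a toy model in which plain slices stand in for buffers. It does not use 
the arrow crate: `logical_view` and its parameters are hypothetical names made 
up for this sketch. The second case (different bytes under a null slot) is an 
extra example of equal arrays with different buffers.

```rust
// Toy model only: plain slices stand in for Arrow buffers.
// `logical_view` is a hypothetical helper, not part of the arrow crate.

/// Builds the logical content of an array: `len` slots starting at `offset`,
/// where a slot is `None` when its validity flag is false.
fn logical_view(values: &[i32], validity: &[bool], offset: usize, len: usize) -> Vec<Option<i32>> {
    (offset..offset + len)
        .map(|i| if validity[i] { Some(values[i]) } else { None })
        .collect()
}

fn main() {
    // Same physical buffer, different offsets: the logical arrays differ.
    let values = [1, 2, 3, 4];
    let all_valid = [true; 4];
    let a = logical_view(&values, &all_valid, 0, 2);
    let b = logical_view(&values, &all_valid, 2, 2);
    assert_ne!(a, b);

    // Different physical buffers, equal logical arrays:
    // the value stored under a null slot does not matter.
    let c = logical_view(&[1, 0, 3], &[true, false, true], 0, 3);
    let d = logical_view(&[1, 99, 3], &[true, false, true], 0, 3);
    assert_eq!(c, d);
}
```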
   
   My current hypothesis is that equality should be computed on `ArrayData`, 
which contains all the relevant data _and_ the logical interpretation of that 
data (`DataType`), making it possible to logically compare two arrays. I filed 
#8541 with that idea, which IMO will also help the parquet work. I think that 
@nevi-me also had some ideas in mind.
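
   To make the idea concrete, here is a minimal sketch of what a logical 
comparison could look like for a primitive array. `ToyArrayData`, `ToyType` and 
`logical_eq` are hypothetical stand-ins invented for this example. They are not 
the arrow crate's `ArrayData` or the code in #8541, and nested types and child 
data are ignored.

```rust
// Toy sketch: a hypothetical stand-in for ArrayData, not the arrow crate's type.
#[derive(Debug, PartialEq)]
enum ToyType {
    Int32,
    Date32,
}

struct ToyArrayData {
    data_type: ToyType,
    offset: usize,
    len: usize,
    validity: Vec<bool>, // one flag per physical slot
    values: Vec<i32>,    // physical values, including slots before `offset`
}

/// Logical equality: same type, same length, and slot-by-slot agreement
/// after applying each array's own offset. Null slots compare equal
/// regardless of the value stored underneath them.
fn logical_eq(a: &ToyArrayData, b: &ToyArrayData) -> bool {
    if a.data_type != b.data_type || a.len != b.len {
        return false;
    }
    (0..a.len).all(|i| {
        let (ia, ib) = (a.offset + i, b.offset + i);
        match (a.validity[ia], b.validity[ib]) {
            (false, false) => true,
            (true, true) => a.values[ia] == b.values[ib],
            _ => false,
        }
    })
}

fn main() {
    // Logical content [1, null], stored with an offset of 1.
    let with_offset = ToyArrayData {
        data_type: ToyType::Int32,
        offset: 1,
        len: 2,
        validity: vec![true, true, false],
        values: vec![7, 1, 2],
    };
    // Same logical content, different physical layout.
    let compact = ToyArrayData {
        data_type: ToyType::Int32,
        offset: 0,
        len: 2,
        validity: vec![true, false],
        values: vec![1, 5],
    };
    // Identical buffers to `compact`, but a different logical type.
    let as_date = ToyArrayData {
        data_type: ToyType::Date32,
        offset: 0,
        len: 2,
        validity: vec![true, false],
        values: vec![1, 5],
    };
    assert!(logical_eq(&with_offset, &compact));
    assert!(!logical_eq(&compact, &as_date));
}
```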


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

