Wes McKinney created ARROW-6359:
-----------------------------------

             Summary: [C++] Raw data equality in arrays vs. semantic value equality
                 Key: ARROW-6359
                 URL: https://issues.apache.org/jira/browse/ARROW-6359
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Wes McKinney


I have observed a conflict in requirements / expectations in our {{Equals}} 
functions. The initial implementations of these functions would compare the raw 
bytes found in non-null data slots, in addition to the validity bitmaps in each 
array and their respective children, taking slice offsets and so forth into 
account. 

Recently we have been adding type-specific value comparison semantics to these 
comparisons, notably propagating {{NaN != NaN}}. This has led to such issues as 
ARROW-6043. 
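
To make the distinction concrete, here is a minimal standalone sketch of how the two notions of equality diverge for doubles. It does not use any Arrow API; {{SemanticEqual}} and {{RawEqual}} are hypothetical names chosen only for illustration.

```cpp
#include <cstring>
#include <limits>

// Semantic comparison: follows IEEE 754 rules, so NaN never equals NaN
// and +0.0 equals -0.0. (Hypothetical helper, not an Arrow function.)
inline bool SemanticEqual(double a, double b) { return a == b; }

// Raw comparison: compares the underlying bytes of the two values, so two
// copies of the same NaN bit pattern are equal and +0.0 differs from -0.0.
// (Hypothetical helper, not an Arrow function.)
inline bool RawEqual(double a, double b) {
  return std::memcmp(&a, &b, sizeof(double)) == 0;
}
```

With these helpers, a NaN compared against a copy of itself is unequal semantically but equal byte-for-byte, while {{0.0}} and {{-0.0}} behave the opposite way.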

Rather than creating "one true way" to compare array contents, I would suggest 
introducing functions that perform slightly different comparisons:

* Raw data comparison, skipping masked null values
* Raw data comparison, comparing all buffer contents (up to the semantic 
"extent" of an array -- so ignoring the contents of padding, or excess buffer 
contents when dealing with slices)
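
A rough sketch of the two modes above, assuming a toy array type rather than the real Arrow classes. {{ToyDoubleArray}}, {{RawEqualsSkipNulls}} and {{RawEqualsAllSlots}} are hypothetical names, validity is modelled as one flag per slot instead of a packed bitmap, and treating validity as part of the compared contents in the second mode is my assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a sliced primitive array; not an Arrow class.
struct ToyDoubleArray {
  std::vector<double> values;  // backing value buffer (may exceed the slice)
  std::vector<bool> valid;     // one validity flag per backing slot
  std::size_t offset;          // first slot that belongs to the slice
  std::size_t length;          // number of slots in the slice
};

inline bool SameBytes(double a, double b) {
  return std::memcmp(&a, &b, sizeof(double)) == 0;
}

inline void CheckBounds(const ToyDoubleArray& a) {
  assert(a.values.size() == a.valid.size());
  assert(a.offset + a.length <= a.values.size());
}

// Shared loop: validity must match in every slot of the slice; value bytes
// are compared in non-null slots, and also in null slots when
// include_null_slots is true. Slots outside the slice are never read.
inline bool RawEqualsImpl(const ToyDoubleArray& a, const ToyDoubleArray& b,
                          bool include_null_slots) {
  CheckBounds(a);
  CheckBounds(b);
  if (a.length != b.length) return false;
  for (std::size_t i = 0; i < a.length; ++i) {
    const bool a_valid = a.valid[a.offset + i];
    const bool b_valid = b.valid[b.offset + i];
    if (a_valid != b_valid) return false;
    if ((a_valid || include_null_slots) &&
        !SameBytes(a.values[a.offset + i], b.values[b.offset + i])) {
      return false;
    }
  }
  return true;
}

// Mode 1: raw data comparison, skipping masked null values.
inline bool RawEqualsSkipNulls(const ToyDoubleArray& a, const ToyDoubleArray& b) {
  return RawEqualsImpl(a, b, /*include_null_slots=*/false);
}

// Mode 2: raw data comparison of all contents within the slice extent,
// including whatever bytes sit underneath null slots.
inline bool RawEqualsAllSlots(const ToyDoubleArray& a, const ToyDoubleArray& b) {
  return RawEqualsImpl(a, b, /*include_null_slots=*/true);
}
```

Two slices that agree on every non-null value but hold different garbage under a null slot would be equal under the first mode and unequal under the second; values outside the slice never affect either result.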

Thoughts? 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
