[ 
https://issues.apache.org/jira/browse/ARROW-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16915914#comment-16915914
 ] 

Benjamin Kietzman commented on ARROW-6359:
------------------------------------------

Comparing values under an unset bit in the null bitmask seems like an 
antipattern. Specifically, it sounds like it will lead to an expanding set 
of APIs to provide guarantees about what will be in a null slot.

Is there a use case for this other than {{NaN == NaN}} vs {{NaN != NaN}}?
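
To make the {{NaN}} case concrete, here is a minimal sketch in plain C++ with no Arrow dependency. The helper names {{SemanticEquals}} and {{RawEquals}} are hypothetical and are not part of the Arrow API; they only contrast IEEE 754 value comparison with bit-pattern comparison.

```cpp
#include <cmath>
#include <cstring>
#include <limits>

// Hypothetical helper: semantic (IEEE 754) comparison.
// NaN never compares equal to anything, including itself.
inline bool SemanticEquals(double a, double b) { return a == b; }

// Hypothetical helper: raw comparison.
// Two values are equal only when their bit patterns are identical.
inline bool RawEquals(double a, double b) {
  return std::memcmp(&a, &b, sizeof(double)) == 0;
}
```

With these definitions a NaN compares unequal to itself semantically but equal to a bitwise copy of itself in the raw sense. Signed zeros go the other way: {{0.0}} and {{-0.0}} are semantically equal but differ in their raw bytes.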

> [C++] Raw data equality in arrays vs. semantic value equality
> -------------------------------------------------------------
>
>                 Key: ARROW-6359
>                 URL: https://issues.apache.org/jira/browse/ARROW-6359
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>
> I have observed a conflict in requirements / expectations in our {{Equals}} 
> functions. The initial implementations of these functions would compare the 
> raw bytes found in non-null data slots, in addition to the validity bitmaps 
> in each array, and their respective children, taking slice offsets and so 
> forth into account. 
> Recently we have been adding type-specific value comparison semantics to 
> these comparisons, notably propagating {{NaN != NaN}}. This has led to such 
> issues as ARROW-6043. 
> Rather than creating "one true way" to compare array contents, I would 
> suggest introducing functions that perform slightly different comparisons:
> * Raw data comparison, skipping masked null values
> * Raw data comparison, comparing all buffer contents (up to the semantic 
> "extent" of an array -- so ignoring the contents of padding, or excess buffer 
> contents when dealing with slices)
> thoughts? 
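
The two comparison modes proposed in the description could be sketched roughly as follows. This is an illustration under stated assumptions, not the Arrow implementation: {{Int32Slice}}, {{EqualsSkippingNulls}} and {{EqualsAllSlots}} are hypothetical names, the validity bitmap is assumed to be least-significant-bit first, and the second mode is assumed to compare validity bits as well as values.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified stand-in for a sliced int32 array. This is NOT
// arrow::Array; it only models a values buffer, a validity bitmap (one bit
// per slot, least-significant bit first), a slice offset and a length.
struct Int32Slice {
  const std::vector<int32_t>* values;
  const std::vector<uint8_t>* validity;
  std::size_t offset;
  std::size_t length;

  bool IsValid(std::size_t i) const {
    const std::size_t bit = offset + i;
    return (((*validity)[bit / 8] >> (bit % 8)) & 1) != 0;
  }
  int32_t Value(std::size_t i) const { return (*values)[offset + i]; }
};

// Mode 1: compare validity, then compare values only in non-null slots.
// Whatever sits under a null slot is never read.
inline bool EqualsSkippingNulls(const Int32Slice& a, const Int32Slice& b) {
  if (a.length != b.length) return false;
  for (std::size_t i = 0; i < a.length; ++i) {
    if (a.IsValid(i) != b.IsValid(i)) return false;
    if (a.IsValid(i) && a.Value(i) != b.Value(i)) return false;
  }
  return true;
}

// Mode 2: compare validity and every value slot inside the slice extent
// [offset, offset + length), including the bytes under null slots.
// Buffer contents outside the extent are ignored.
inline bool EqualsAllSlots(const Int32Slice& a, const Int32Slice& b) {
  if (a.length != b.length) return false;
  for (std::size_t i = 0; i < a.length; ++i) {
    if (a.IsValid(i) != b.IsValid(i)) return false;
  }
  return std::memcmp(a.values->data() + a.offset,
                     b.values->data() + b.offset,
                     a.length * sizeof(int32_t)) == 0;
}
```

Two slices with the same validity pattern and the same non-null values are equal under the first mode regardless of what is stored in their null slots. The second mode additionally requires those null-slot bytes to match, while still ignoring anything outside the slice.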



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
