[ https://issues.apache.org/jira/browse/ARROW-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16915968#comment-16915968 ]
Francois Saint-Jacques commented on ARROW-6359:
-----------------------------------------------
We have to be careful about uninitialized memory, since reading from it is
undefined behavior.
> [C++] Raw data equality in arrays vs. semantic value equality
> -------------------------------------------------------------
>
> Key: ARROW-6359
> URL: https://issues.apache.org/jira/browse/ARROW-6359
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Wes McKinney
> Priority: Major
>
> I have observed a conflict in requirements / expectations in our {{Equals}}
> functions. The initial implementations of these functions would compare the
> raw bytes found in non-null data slots, in addition to the validity bitmaps
> in each array, and their respective children, taking slice offsets and so
> forth into account.
> Recently we have been adding type-specific value comparison semantics to
> these comparisons, notably propagating {{NaN != NaN}}. This has led to such
> issues as ARROW-6043.
> Rather than creating "one true way" to compare array contents, I would
> suggest introducing functions that perform slightly different comparisons:
> * Raw data comparison, skipping masked null values
> * Raw data comparison, comparing all buffer contents (up to the semantic
> "extent" of an array -- so ignoring the contents of padding, or excess buffer
> contents when dealing with slices)
> Thoughts?