[ https://issues.apache.org/jira/browse/ARROW-13364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17382453#comment-17382453 ]
Eduardo Ponce commented on ARROW-13364:
---------------------------------------
R uses *NA* to represent a missing value, which corresponds to a *NULL* entry
in Arrow (a slot whose validity bit is unset).
Coercing NaN to logical or integer type gives an NA of the appropriate type,
but coercion to character gives the string "NaN". NaN values are incomparable
so tests of equality or collation involving NaN will result in NA.
With respect to R's behavior for
{code:R}
> NaN > 5
[1] NA
{code}
it does not conform to IEEE 754, under which an ordered comparison involving
NaN evaluates to false. My speculation is that internally the result is *NaN*,
which becomes *NA* when coerced to a logical type.
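To make the two candidate semantics concrete, here is a minimal Python sketch using only the standard library. The helper names {{ieee_greater}} and {{na_greater}} are hypothetical, and {{None}} is used as a stand-in for NA/NULL; this is not Arrow's or R's actual implementation, only an illustration of the difference being discussed.

```python
import math
from typing import Optional


def ieee_greater(x: float, y: float) -> bool:
    """IEEE 754 style: an ordered comparison with NaN is simply False."""
    return x > y


def na_greater(x: Optional[float], y: Optional[float]) -> Optional[bool]:
    """R-like style (hypothetical helper): NaN or a missing input yields a
    missing result. None stands in for NA/NULL here."""
    if x is None or y is None:
        return None
    if math.isnan(x) or math.isnan(y):
        return None
    return x > y


print(ieee_greater(math.nan, 5.0))  # False, matches the current Arrow/numpy result
print(na_greater(math.nan, 5.0))    # None, analogous to R's NA
print(na_greater(7.0, 5.0))         # True, ordinary values are unaffected
```

The second variant needs a three-valued result (true, false, missing), which is why it would map to a null entry on the Arrow side rather than to a boolean value.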
> [C++] Should NaN comparison return false or NaN/NA?
> ---------------------------------------------------
>
> Key: ARROW-13364
> URL: https://issues.apache.org/jira/browse/ARROW-13364
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Jonathan Keane
> Priority: Major
>
> In working on ARROW-12964 we ran into some corner behaviors with {{NaN}} that
> don't match our (and R's) expectations. It appears that (any?) comparison
> with {{NaN}} results in false:
> {code:r}
> > Scalar$create(NaN) > 5
> Scalar
> false
> {code}
> though at least in R this would result in an NA value:
> {code:r}
> > NaN > 5
> [1] NA
> {code}
> The current behavior _does_ match numpy's behavior:
> {code:python}
> >>> np.nan > 5
> False
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)