[ 
https://issues.apache.org/jira/browse/CALCITE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17265522#comment-17265522
 ] 

Julian Hyde commented on CALCITE-4467:
--------------------------------------

A [discussion in 
CockroachDB|https://github.com/cockroachdb/cockroach/issues/18860] noted that 
Oracle behaves the same as PostgreSQL, and also noted problems with NaN values 
in unique indexes. I also wonder whether NaN values should be grouped together 
in GROUP BY, and whether, if we supported NaN, we would also need to support 
-INF, +INF, signed zero and denormal numbers.
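
To make the grouping question concrete, here is a minimal Java sketch of the 
two equality notions involved. It only shows how the JVM itself treats NaN and 
signed zero; it is not Calcite code, and the class and method names 
({{NanSemantics}}, {{ieeeEquals}}, {{totalOrderEquals}}, {{groupCounts}}) are 
hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: hypothetical helper, not part of Calcite. */
public class NanSemantics {
  /** IEEE 754 comparison, as performed by Java's primitive == operator. */
  static boolean ieeeEquals(double a, double b) {
    return a == b;
  }

  /** Total-order comparison: NaN equals itself, and -0.0 differs from 0.0. */
  static boolean totalOrderEquals(double a, double b) {
    return Double.compare(a, b) == 0;
  }

  /** Counts occurrences per key; keys are boxed, so Double.equals decides. */
  static Map<Double, Integer> groupCounts(double... values) {
    Map<Double, Integer> counts = new LinkedHashMap<>();
    for (double v : values) {
      counts.merge(v, 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    System.out.println(ieeeEquals(Double.NaN, Double.NaN));       // false
    System.out.println(totalOrderEquals(Double.NaN, Double.NaN)); // true
    System.out.println(ieeeEquals(0.0, -0.0));                    // true
    System.out.println(totalOrderEquals(0.0, -0.0));              // false
    // Grouping on boxed keys merges the NaNs but separates 0.0 from -0.0.
    System.out.println(groupCounts(Double.NaN, Double.NaN, 0.0, -0.0).size()); // 3
  }
}
```

A hash-based GROUP BY keyed on boxed values would therefore put all NaN rows 
in one group but split 0.0 and -0.0, whereas a comparison based on {{==}} 
would do the opposite. Whichever semantics we pick, the simplifier and the 
runtime need to agree on it.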

I also found a tale of [data corruption in SQL 
Server|https://www.red-gate.com/simple-talk/blogs/from-nan-to-infinity-and-beyond/]
 that sounds very pertinent for systems such as Hive that can load data via 
non-SQL paths.

{{RexSimplify}} is not the only place that simplification happens; it happens 
in a hundred places, many of them in {{RelOptRule}} instances. I bet the {{NOT 
IN}} rewrite is affected, for example.
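
As a sanity check on the rewrite described in the issue, the following sketch 
models {{x = x}} and its simplified form {{null or x is not null}} under 
three-valued logic, assuming IEEE 754 comparison semantics. It is a standalone 
model, not how {{RexSimplify}} is implemented; the class and method names are 
hypothetical, and a Java {{null}} stands in for SQL NULL / UNKNOWN.

```java
import java.util.Objects;

/** Illustration only: a hypothetical model, not Calcite's RexSimplify. */
public class SelfEqualitySimplification {
  /** SQL "x = x" with IEEE 754 comparison; returns null for UNKNOWN. */
  static Boolean selfEqualsIeee(Double x) {
    if (x == null) {
      return null; // NULL = NULL evaluates to UNKNOWN
    }
    double v = x; // unbox so that == compares values, not references
    return v == v; // false only when v is NaN
  }

  /** The simplified form "null OR x IS NOT NULL". */
  static Boolean simplified(Double x) {
    // UNKNOWN OR TRUE is TRUE; UNKNOWN OR FALSE is UNKNOWN.
    return x != null ? Boolean.TRUE : null;
  }

  /** Whether the original and simplified expressions agree for this input. */
  static boolean agree(Double x) {
    return Objects.equals(selfEqualsIeee(x), simplified(x));
  }

  public static void main(String[] args) {
    System.out.println(agree(1.0));        // true
    System.out.println(agree(null));       // true
    System.out.println(agree(Double.NaN)); // false
  }
}
```

The two forms agree for ordinary values and for NULL but diverge for NaN: the 
original yields FALSE while the simplified form yields TRUE. Under 
PostgreSQL-style semantics, where NaN equals itself, they would agree.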

> Incorrect simplification for 'NaN' value
> ----------------------------------------
>
>                 Key: CALCITE-4467
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4467
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{RexSimplify}} simplifies {{x = x}} to {{null or x is not null}} (similarly 
> <= and >=), and {{x != x}} to {{null and x is null}} (similarly < and >).
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L363
> This simplification is not valid in all cases. For instance, if the type of 
> x is floating-point, x could be 'NaN'. While some RDBMSs consider 
> 'NaN' = 'NaN' (e.g., Postgres), others consider 'NaN' != 'NaN', following 
> the IEEE 754 standard. For the latter, the rewrite above will produce 
> incorrect results.
> I think we should simply skip this simplification for floating-point types.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
