[ 
https://issues.apache.org/jira/browse/ORC-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106556#comment-17106556
 ] 

Gopal Vijayaraghavan commented on ORC-629:
------------------------------------------

Thanks Panos. I will wait for the tests; this change is probably easier to 
replicate in C++ for compatibility.

Here's the C++ version of the check.

{code}
      case PredicateDataType::FLOAT: {
        if (colStats.has_doublestatistics() &&
            colStats.doublestatistics().has_minimum() &&
            colStats.doublestatistics().has_maximum()) {
          const auto& stats = colStats.doublestatistics();
          result = evaluatePredicateRange(
            mOperator,
            literal2Double(mLiterals),
            stats.minimum(),
            stats.maximum(),
            colStats.hasnull());
        }
        break;
      }
{code}
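
The snippet above passes stats.minimum() and stats.maximum() straight into the range evaluation. A minimal sketch of one way a NaN guard could sit in front of such a check follows. The names RangeResult and mayContainEquals are hypothetical and are not part of the ORC C++ API, and this is not necessarily what the ORC-629 patch does; it only illustrates the idea that NaN bounds cannot be used to skip data.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical two-valued result for illustration only; the real ORC
// predicate evaluation uses a richer truth-value type.
enum class RangeResult { No, Maybe };

// Hypothetical helper: could a row group whose statistics are [min, max]
// contain a value equal to `literal`?
inline RangeResult mayContainEquals(double literal, double min, double max) {
  // Every ordered comparison against NaN is false, so NaN bounds (or a NaN
  // literal) carry no usable range information. Answer conservatively.
  if (std::isnan(min) || std::isnan(max) || std::isnan(literal)) {
    return RangeResult::Maybe;
  }
  assert(min <= max);  // well-formed statistics once NaN is ruled out
  // The literal lies outside the recorded range, so the group can be skipped.
  if (literal < min || literal > max) {
    return RangeResult::No;
  }
  return RangeResult::Maybe;
}
```

With min == max == NaN this returns Maybe for any literal, so the reader would still scan the rows instead of pruning them.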

> PPD: Floating point NaN is not transitive across comparisons
> ------------------------------------------------------------
>
>                 Key: ORC-629
>                 URL: https://issues.apache.org/jira/browse/ORC-629
>             Project: ORC
>          Issue Type: Bug
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>
> Range comparisons don't work right for columns which start with Double.NaN as 
> the first row (min == max == NaN). 
> 1 < NaN is false.
> 1 > NaN is false.
> {code}
> File Version: 0.12 with ORC_135
> Rows: 3
> Compression: ZLIB
> Compression size: 32768
> Type: struct<operation:int,originalTransaction:bigint,bucket:int,rowId:bigint,currentTransaction:bigint,row:struct<c:double>>
> Stripe Statistics:
>   Stripe 1:
>     Column 0: count: 3 hasNull: false
>     Column 1: count: 3 hasNull: false bytesOnDisk: 5 min: 0 max: 0 sum: 0
>     Column 2: count: 3 hasNull: false bytesOnDisk: 5 min: 1 max: 1 sum: 3
>     Column 3: count: 3 hasNull: false bytesOnDisk: 8 min: 536870912 max: 536870912 sum: 1610612736
>     Column 4: count: 3 hasNull: false bytesOnDisk: 7 min: 0 max: 2 sum: 3
>     Column 5: count: 3 hasNull: false bytesOnDisk: 5 min: 1 max: 1 sum: 3
>     Column 6: count: 3 hasNull: false
>     Column 7: count: 3 hasNull: false bytesOnDisk: 19 min: NaN max: NaN sum: NaN
> {code}
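
To illustrate how statistics like Column 7 above can end up as min == max == NaN, here is a small sketch. It assumes a naive tracker that seeds min and max from the first value and then updates them with < and >. The MinMax struct and naiveMinMax function are hypothetical and do not reproduce ORC's actual statistics code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical holder for the tracked bounds.
struct MinMax {
  double min;
  double max;
};

// Hypothetical naive tracker: seed from the first value, then update with
// plain < and > comparisons.
inline MinMax naiveMinMax(const std::vector<double>& values) {
  assert(!values.empty());
  MinMax s{values.front(), values.front()};
  for (double v : values) {
    // If s.min or s.max is NaN, both comparisons are false for every v,
    // so the NaN seed is never replaced.
    if (v < s.min) s.min = v;
    if (v > s.max) s.max = v;
  }
  return s;
}
```

Calling naiveMinMax on a column whose first row is NaN, for example {NaN, 1.0, 2.0}, leaves both bounds as NaN, which matches the "1 < NaN is false, 1 > NaN is false" observation in the description.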



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
