Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9358 )

Change subject: IMPALA-6538: Fix read path when Parquet min/max statistics 
contain NaN
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9358/2/be/src/exec/parquet-column-stats.cc
File be/src/exec/parquet-column-stats.cc:

http://gerrit.cloudera.org:8080/#/c/9358/2/be/src/exec/parquet-column-stats.cc@109
PS2, Line 109:       ChangeNaNToInf<float>(slot, stats_field);
I think we can just return false here if a value is NaN. That way the caller 
will know that it shouldn't use the stats (see comment of the ReadFromThrift()).


http://gerrit.cloudera.org:8080/#/c/9358/2/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
File testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test:

http://gerrit.cloudera.org:8080/#/c/9358/2/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@497
PS2, Line 497: insert into test_nan values (cast('NaN' as double)), (42);
Can you also add tests where NaN is inserted in the middle of two values? For 
those, "where not val >= 0" should be covered, too.



--
To view, visit http://gerrit.cloudera.org:8080/9358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If3897fc1426541239223670812f59e2bed32f455
Gerrit-Change-Number: 9358
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Mon, 19 Feb 2018 22:17:30 +0000
Gerrit-HasComments: Yes

Reply via email to