Dan Hecht has posted comments on this change.

Change subject: IMPALA-4363: Add Parquet timestamp validation
......................................................................


Patch Set 6:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4968/6/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 526:     parent_->parse_status_ = Status(*msg);
the claim is we are pulling this out to avoid status construction code, but 
didn't we already construct a status if we call this function (in 
LogOrReturnError())?


PS6, Line 598:  
nit: extraneous space


Line 599:   if (!src->IsValidDate()) {
UNLIKELY


http://gerrit.cloudera.org:8080/#/c/4968/6/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

PS6, Line 512: 
was this an incorrect comment?


PS6, Line 449: void* slot
shouldn't this be 'tuple'.  but the fact that this compiled probably means that 
this method isn't actually ever called or defined (i.e. the one in 
BaseScalarColumnReader), and so this declaration should just be deleted.


http://gerrit.cloudera.org:8080/#/c/4968/6/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

Line 156:     return date_.day_number() >= MIN_DAY_NUMBER && date_.day_number() 
<= MAX_DAY_NUMBER;
are we sure these are the only invalidate TimestampValue's that may have been 
copied in?  Are there any invalid time_ values?


http://gerrit.cloudera.org:8080/#/c/4968/6/testdata/bad_parquet_data/README
File testdata/bad_parquet_data/README:

PS6, Line 16: out-of-range-timestamp.parq
> The parquet file is no longer in this folder, so I will remove this comment
not sure what you mean -- aren't you adding a parquet file to test this? could 
you just make whatever change you are proposing.


http://gerrit.cloudera.org:8080/#/c/4968/6/testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-abort-on-error.test
File 
testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-abort-on-error.test:

PS6, Line 7: Parquet file '$NAM
why is the message repeated?


-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Laszlo Gaal <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-HasComments: Yes

Reply via email to