Dan Hecht has posted comments on this change. Change subject: IMPALA-4363: Add Parquet timestamp validation ......................................................................
Patch Set 6: (8 comments) http://gerrit.cloudera.org:8080/#/c/4968/6/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: Line 526: parent_->parse_status_ = Status(*msg); the claim is we are pulling this out to avoid status construction code, but didn't we already construct a status if we call this function (in LogOrReturnError())? PS6, Line 598: nit: extraneous space Line 599: if (!src->IsValidDate()) { UNLIKELY http://gerrit.cloudera.org:8080/#/c/4968/6/be/src/exec/parquet-column-readers.h File be/src/exec/parquet-column-readers.h: PS6, Line 512: was this an incorrect comment? PS6, Line 449: void* slot shouldn't this be 'tuple'. but the fact that this compiled probably means that this method isn't actually ever called or defined (i.e. the one in BaseScalarColumnReader), and so this declaration should just be deleted. http://gerrit.cloudera.org:8080/#/c/4968/6/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: Line 156: return date_.day_number() >= MIN_DAY_NUMBER && date_.day_number() <= MAX_DAY_NUMBER; are we sure these are the only invalidate TimestampValue's that may have been copied in? Are there any invalid time_ values? http://gerrit.cloudera.org:8080/#/c/4968/6/testdata/bad_parquet_data/README File testdata/bad_parquet_data/README: PS6, Line 16: out-of-range-timestamp.parq > The parquet file is no longer in this folder, so I will remove this comment not sure what you mean -- aren't you adding a parquet file to test this? could you just make whatever change you are proposing. http://gerrit.cloudera.org:8080/#/c/4968/6/testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-abort-on-error.test File testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-abort-on-error.test: PS6, Line 7: Parquet file '$NAM why is the message repeated? -- To view, visit http://gerrit.cloudera.org:8080/4968 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Laszlo Gaal <[email protected]> Gerrit-Reviewer: Taras Bobrovytsky <[email protected]> Gerrit-HasComments: Yes
