Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12168 )
Change subject: IMPALA-6503: Support reading complex types from ORC ...................................................................... Patch Set 17: (8 comments) I'm pretty happy with this, I have some fairly minor comments but very happy with the code and the functional coverage from existing tests. Do other reviewers plan to continue the review? Haven't seen activity for a bit. http://gerrit.cloudera.org:8080/#/c/12168/17//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12168/17//COMMIT_MSG@30 PS17, Line 30: Tests: Did you run exhaustive tests? I think there are some nested types tests that aren't run in core. I can run it on our infrastructure if that would help. http://gerrit.cloudera.org:8080/#/c/12168/12/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/12168/12/be/src/exec/hdfs-orc-scanner.cc@593 PS12, Line 593: DCHECK(parse_status_.ok()); > If we come here, the previous batch should have been drained. The relative I should have remembered that, I think I reviewed the original version of the code in the Parquet scanner. Lots of state to keep in my head. http://gerrit.cloudera.org:8080/#/c/12168/17/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/12168/17/be/src/exec/hdfs-orc-scanner.cc@379 PS17, Line 379: inline Doubt inline provides any measurable benefit it. I'd remove it, but ok to ignore this comment. http://gerrit.cloudera.org:8080/#/c/12168/17/be/src/exec/hdfs-orc-scanner.cc@402 PS17, Line 402: while (!pos_slots_.empty()) { I think the data flow would be clearer if pos_slots was a local and ResolveColumns() took it as an output argument. http://gerrit.cloudera.org:8080/#/c/12168/17/be/src/exec/orc-column-readers.cc File be/src/exec/orc-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/12168/17/be/src/exec/orc-column-readers.cc@30 PS17, Line 30: inline inline seems unnecessary http://gerrit.cloudera.org:8080/#/c/12168/17/be/src/exec/orc-metadata-utils.h File be/src/exec/orc-metadata-utils.h: http://gerrit.cloudera.org:8080/#/c/12168/17/be/src/exec/orc-metadata-utils.h@48 PS17, Line 48: const HdfsTableDescriptor& tbl_desc_; All the members are not assigned except for constructor, so you could add an additional const to the pointers, e.g. const orc::Type* const root_. http://gerrit.cloudera.org:8080/#/c/12168/17/be/src/exec/orc-metadata-utils.cc File be/src/exec/orc-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/12168/17/be/src/exec/orc-metadata-utils.cc@130 PS17, Line 130: || type.type == TYPE_INT || type.type == TYPE_BIGINT) Can you put braces on the multi-line if statements here and below. http://gerrit.cloudera.org:8080/#/c/12168/17/be/src/exec/orc-metadata-utils.cc@175 PS17, Line 175: It "Decimal value" instead of "It" -- To view, visit http://gerrit.cloudera.org:8080/12168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I244dc9d2b3e425393f90e45632cb8cdbea6cf790 Gerrit-Change-Number: 12168 Gerrit-PatchSet: 17 Gerrit-Owner: Quanlong Huang <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Mon, 25 Feb 2019 23:52:51 +0000 Gerrit-HasComments: Yes
