Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15395 )
Change subject: IMPALA-9042: Milestone 1: properly scan files that has full ACID schema ...................................................................... Patch Set 8: (7 comments) PS8 is a rebase. http://gerrit.cloudera.org:8080/#/c/15395/6/be/src/exec/orc-metadata-utils.h File be/src/exec/orc-metadata-utils.h: http://gerrit.cloudera.org:8080/#/c/15395/6/be/src/exec/orc-metadata-utils.h@51 PS6, Line 51: private: > These could be const I think, since they're only set in the constructor Done http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/orc-metadata-utils.cc File be/src/exec/orc-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/orc-metadata-utils.cc@231 PS3, Line 231: return Status(Substitute( > I think it would be good to validate that the schema is as expected, just s Turned it into a validator function that returns a Status, calling it from HdfsOrcScanner::Open(). http://gerrit.cloudera.org:8080/#/c/15395/6/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java: http://gerrit.cloudera.org:8080/#/c/15395/6/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@323 PS6, Line 323: table.getMetaStoreTable().getParameters()); > nit: indentation is a bit off Done http://gerrit.cloudera.org:8080/#/c/15395/6/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@329 PS6, Line 329: Preconditions.checkState(col.getName().equals("row__id")); > nit: multi-line if should use parentheses. Maybe this would be better to re Done, also converted the 'equals("row__id")' check to a Precondition. http://gerrit.cloudera.org:8080/#/c/15395/5/testdata/workloads/functional-query/queries/QueryTest/describe-path.test File testdata/workloads/functional-query/queries/QueryTest/describe-path.test: http://gerrit.cloudera.org:8080/#/c/15395/5/testdata/workloads/functional-query/queries/QueryTest/describe-path.test@143 PS5, Line 143: describe functional_orc_def.alltypes > I'm not sure if making this visible is a good idea. What does hive do? Yeah, I wasn't sure about this, but it was handy during development. Hive doesn't reveal the hidden fields in DESCRIBE, so I switched to hiding them too. However, when explicitly asked for it, currently I still allow describe on the row__id field, i.e.: DESCRIBE <table_name>.row__id; +---------------------+--------+---------+ | name | type | comment | +---------------------+--------+---------+ | operation | int | | | originaltransaction | bigint | | | bucket | int | | | rowid | bigint | | | currenttransaction | bigint | | +---------------------+--------+---------+ I think it's not wrong since we also allow the user to explicitly query the fields of the row__id column. Hive only allows DESCRIBE on tables. http://gerrit.cloudera.org:8080/#/c/15395/5/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: http://gerrit.cloudera.org:8080/#/c/15395/5/tests/query_test/test_scanners_fuzz.py@197 PS5, Line 197: > flake8: E501 line too long (91 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/15395/5/tests/query_test/test_scanners_fuzz.py@283 PS5, Line 283: > flake8: E261 at least two spaces before inline comment Done -- To view, visit http://gerrit.cloudera.org:8080/15395 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic2e2afec00c9a5cf87f1d61b5fe52b0085844bcb Gerrit-Change-Number: 15395 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Norbert Luksa <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Fri, 27 Mar 2020 16:29:32 +0000 Gerrit-HasComments: Yes
