[Impala-ASF-CR] IMPALA-9280: Fix parsing of timestamp with dash before TZH
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15130 ) Change subject: IMPALA-9280: Fix parsing of timestamp with dash before TZH .. Patch Set 2: (3 comments) Thanks for taking care of this! http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/datetime-iso-sql-format-parser.cc File be/src/runtime/datetime-iso-sql-format-parser.cc: http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/datetime-iso-sql-format-parser.cc@332 PS2, Line 332: && cur_pos_move_count > cur_tok_move_count && *(*current_pos) != '+' Counting the processed separator tokens doesn't seem the best approach for me here. Also I don't see why it is necessary to have more separators in the input than in the format to use the last minus sign as a part of TZH instead of taking it as a separator. Previously this if condition said "If a TZH token follows the separator sequence and the last separator in the sequence was a minus char then I'll use that minus char as the sign of the TZH". I'd extend this: "If a TZH token follows the separator sequence and the last separator in the sequence was a minus char and the TZH token doesn't start with a plus char then ..." Please make sure that in case the minus char is actually taken as sign of TZH instead of a separator then there is still at least one separator remaining in the input sequence to match the separator sequence in the token list. http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/timestamp-test.cc File be/src/runtime/timestamp-test.cc: http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/timestamp-test.cc@1071 PS2, Line 1071: TEST(TimestampTest, IsoSqlWithTZ) { Almost all of these formatting tests are in tests/query_test/test_cast_with_format.py I think we should keep them together. Please consider moving this coverage. http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/timestamp-test.cc@1078 PS2, Line 1078: HH:MI-TZH As far as I know we don't really support dateless timestamps even though Impala accepts them. Additionally I think we should remove the support of them from Impala at one point (AFAIK JDBC/ODBC connections throw parse error for dateless timestamps currently). So might be safer not to use them even in tests as once we drop the support we have to rewrite less code. -- To view, visit http://gerrit.cloudera.org:8080/15130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I24c636e75fd380f6ebd091bcb38bb60a274f0e00 Gerrit-Change-Number: 15130 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 09:00:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15051 ) Change subject: IMPALA-9226: Improve string allocations of the ORC scanner .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5546/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f Gerrit-Change-Number: 15051 Gerrit-PatchSet: 6 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Jan 2020 12:49:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. IMPALA-9279: Update the Kudu version to include VARCHAR support Before this change the preferred way of getting Kudu was to pull it in from the specified CDH build (even if USE_CDP_HIVE was set to true). Optionally by setting USE_CDH_KUDU to false, one could force Impala to use the native toolchain Kudu. But even then, the Kudu Java artifacts would be downloaded from CDH. Since Kudu VARCHAR support won't be backported to CDH, this behavior blocks the Impala side of the Kudu/Impala VARCHAR integration. With this change: 1. Using the native toolchain Kudu (including the Java artifacts) is the default behavior. From now on USE_CDH_KUDU will be set to false by default. Impala can be forced to fall back on using the CDH Kudu by explicitly setting USE_CDH_KUDU to true. 2. Kudu version is updated to include the VARCHAR support. Testing: Ran exhaustive tests with USE_CDH_KUDU=true and USE_CDH_KUDU=false. Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a --- M bin/bootstrap_toolchain.py M bin/impala-config.sh M impala-parent/pom.xml 3 files changed, 43 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/15134/2 -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5547/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Thu, 30 Jan 2020 13:27:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/15134/1/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/15134/1/bin/impala-config.sh@726 PS1, Line 726: export IMPALA_TOOLCHAIN_KUDU_MAVEN_REPOSITORY="file://${IMPALA_KUDU_JAVA_HOME}/repository" line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Thu, 30 Jan 2020 12:43:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15134 Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. IMPALA-9279: Update the Kudu version to include VARCHAR support Before this change the preferred way of getting Kudu was to pull it in from the specified CDH build (even if USE_CDP_HIVE was set to true). Optionally by setting USE_CDH_KUDU to false, one could force Impala to use the native toolchain Kudu. But even then, the Kudu Java artifacts would be downloaded from CDH. Since Kudu VARCHAR support won't be backported to CDH, this behavior blocks the Impala side of the Kudu/Impala VARCHAR integration. With this change: 1. Using the native toolchain Kudu (including the Java artifacts) is the default behavior. From now on USE_CDH_KUDU will be set to false by default. Impala can be forced to fall back on using the CDH Kudu by explicitly setting USE_CDH_KUDU to true. 2. Kudu version is updated to include the VARCHAR support. Testing: Ran exhaustive tests with USE_CDH_KUDU=true and USE_CDH_KUDU=false. Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a --- M bin/bootstrap_toolchain.py M bin/impala-config.sh M impala-parent/pom.xml 3 files changed, 42 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/15134/1 -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges
[Impala-ASF-CR] IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15087 ) Change subject: IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/15087/5/be/src/exec/parquet/parquet-column-stats.cc File be/src/exec/parquet/parquet-column-stats.cc: http://gerrit.cloudera.org:8080/#/c/15087/5/be/src/exec/parquet/parquet-column-stats.cc@51 PS5, Line 51: has_paired_stats This variable seems redundant as it is set to true in all cases of the switch. Using paired_stats_value's nullness seems enough to express whether the paired stat exists. http://gerrit.cloudera.org:8080/#/c/15087/5/be/src/exec/parquet/parquet-column-stats.cc@116 PS5, Line 116: has_paired_stats I think that this could be simplified a lot - if has_paired_stats is false, and the type is a TINYINT/SMALLINT, then we can simply return false without parsing the stats. The same applies to the case when the stat exists, but we couldn't parse it (around line 106) - we should immediately return and in the rest of the switch case we should assume that the paired stats exists and was read successfully. http://gerrit.cloudera.org:8080/#/c/15087/5/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test File testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test: http://gerrit.cloudera.org:8080/#/c/15087/5/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@691 PS5, Line 691: PARQUET_READ_STATISTICS PARQUET_READ_STATISTICS=1 is the default, so it is possibe to omit these lines. http://gerrit.cloudera.org:8080/#/c/15087/5/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@692 PS5, Line 692: < Please add a test with = and > to be sure that those work too. -- To view, visit http://gerrit.cloudera.org:8080/15087 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8bdaf4c4b2d0c6ea26d6e9bf013afca647e53a1 Gerrit-Change-Number: 15087 Gerrit-PatchSet: 5 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 30 Jan 2020 11:34:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner
Norbert Luksa has posted comments on this change. ( http://gerrit.cloudera.org:8080/15051 ) Change subject: IMPALA-9226: Improve string allocations of the ORC scanner .. Patch Set 6: (5 comments) Thank you all for taking a look, addressed your comments. http://gerrit.cloudera.org:8080/#/c/15051/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15051/3//COMMIT_MSG@31 PS3, Line 31: 39.06s > nit: probably 'a desktop PC', or 'a desktop PC with CPU X' Done http://gerrit.cloudera.org:8080/#/c/15051/2/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15051/2/be/src/exec/hdfs-orc-scanner.cc@613 PS2, Line 613: > why did you restore the original narrow try-block? I do not call the ORC library anymore in the UpdateInputBatch, therefore the change here is not necessary. http://gerrit.cloudera.org:8080/#/c/15051/3/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15051/3/be/src/exec/hdfs-orc-scanner.cc@625 PS3, Line 625: orc_root_batch_ = row_reader_->createRowBatch(row_batch->capacity()); > nit: this extra new line is not needed Done http://gerrit.cloudera.org:8080/#/c/15051/3/be/src/exec/hdfs-orc-scanner.cc@683 PS3, Line 683: > As we have discussed offline, this is no longer correct. Using the current Thank you, Csaba for the detailed comment. Added mempools for the dictionary and the non-dictionary cases, looks like it works now. http://gerrit.cloudera.org:8080/#/c/15051/3/be/src/exec/orc-column-readers.h File be/src/exec/orc-column-readers.h: http://gerrit.cloudera.org:8080/#/c/15051/3/be/src/exec/orc-column-readers.h@198 PS3, Line 198: HdfsOrcScanne > Looking at the ORC library, it should be possible, since it passes the dict Added mempools for the scanner, now we can reuse the blobs. -- To view, visit http://gerrit.cloudera.org:8080/15051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f Gerrit-Change-Number: 15051 Gerrit-PatchSet: 6 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Jan 2020 12:04:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner
Norbert Luksa has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/15051 ) Change subject: IMPALA-9226: Improve string allocations of the ORC scanner .. IMPALA-9226: Improve string allocations of the ORC scanner Currently the OrcColumnReader copies values from the orc::StringVectorBatch one-by-one. Since ORC 1.6, the blob which contains the pointed values is moved to the StringVectorBatch, so we can copy it. This commit beside the above improvement also enables the LazyEncoding option for the ORC reader. This way, for stripes with DICTIONARY_ENCODING[_V2], EncodedStringVectorBatch contains the data in a dictionaryBlob from which the data can be acquired with the given indices and lengths. Tests: * Run ORC scanner tests (query_tests/test_scanners.py::TestOrc) and tpch query tests. * Tested performance on tpch.lineitem table with scale=25, running queries that selects min of string columns. Some results: col_name | encoding | before | after | speedup = l_comment DIRECT 16.42s 14.38s 14% l_shipinstruct DICTIONARY 5.26s3.80s 32% l_commitdate DICTIONARY 5.46s5.19s 5% all string col BOTH 39.06s 32.18s 21% The queries were run on a desktop PC with MT_DOP and NUM_NODES set to 1. * Also run TPC-H queries on the TPC-H benchmark where some queries' runtime improved by around 10-15%. Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f --- M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/orc-column-readers.cc M be/src/exec/orc-column-readers.h 4 files changed, 137 insertions(+), 42 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/15051/6 -- To view, visit http://gerrit.cloudera.org:8080/15051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f Gerrit-Change-Number: 15051 Gerrit-PatchSet: 6 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner
Norbert Luksa has posted comments on this change. ( http://gerrit.cloudera.org:8080/15051 ) Change subject: IMPALA-9226: Improve string allocations of the ORC scanner .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/15051/3/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15051/3/be/src/exec/hdfs-orc-scanner.cc@683 PS3, Line 683: dst_batch->tuple_data_pool() > Thank you, Csaba for the detailed comment. As I see (and as you also suggested offline), we do not need to pass the mempool here, right? -- To view, visit http://gerrit.cloudera.org:8080/15051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f Gerrit-Change-Number: 15051 Gerrit-PatchSet: 3 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Jan 2020 12:31:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9149: part 2: Re-enable Ranger-related EE tests
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15088 ) Change subject: IMPALA-9149: part 2: Re-enable Ranger-related EE tests .. Patch Set 4: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/15088/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15088/4//COMMIT_MSG@47 PS4, Line 47: provide Sentry with a customized user-to-groups : mapping. fyi Sentry tests have their own mechanism of to add custom user->group mappings if needed, see: https://github.com/apache/impala/blob/79c5f87565467074697a7d98e01c9742f7228991/fe/src/test/resources/sentry-site.xml.py#L61 https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/testutil/TestSentryGroupMapper.java -- To view, visit http://gerrit.cloudera.org:8080/15088 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17420d7ff9beacd1b4d2ad72b68b8b54983e60cb Gerrit-Change-Number: 15088 Gerrit-PatchSet: 4 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 30 Jan 2020 12:53:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3343: WIP - Make impala-shell compatible with python 3.
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: WIP - Make impala-shell compatible with python 3. .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/15132/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15132/3//COMMIT_MSG@124 PS3, Line 124: No effort has been made yet to come up with a way to integrate testing of the > I think IMPALA-8508 could help here by shipping a version of python 3 in th It's also possible that python 3 is already installed on the workers that run our tests. There's a part of me that wonders, since we're providing a publicly pip-installable tarball on PyPI, is there even a need to keep generating the tarball? Could using the locally available tarball simply be replaced with "pip install impala-shell"? (Though I'm also sure there are ramifications to doing that that are not occurring to me.) -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 5 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 21:33:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege.
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/15123 ) Change subject: IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege. .. Patch Set 2: Code-Review+1 > Patch Set 2: > > Build Successful > > https://jenkins.impala.io/job/gerrit-code-review-checks/5540/ : Initial code > review checks passed. Use gerrit-verify-dryrun-external or > gerrit-verify-dryrun to run full precommit tests. Thanks Wenzhe for fixing this! The code looks good to me after our discussion offline. -- To view, visit http://gerrit.cloudera.org:8080/15123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6b657a7147a136198a9a97a679c9131ee814577 Gerrit-Change-Number: 15123 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 30 Jan 2020 21:08:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5904: Add full tsan option and fix several TSAN bugs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15116 ) Change subject: IMPALA-5904: Add full_tsan option and fix several TSAN bugs .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5554/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 21:11:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15053 ) Change subject: IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58f52a262ac7d0af259d5bcda72ada93a851d3b2 Gerrit-Change-Number: 15053 Gerrit-PatchSet: 8 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 21:47:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343: WIP - Make impala-shell compatible with python 3.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: WIP - Make impala-shell compatible with python 3. .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5557/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 4 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 22:06:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343: WIP - Make impala-shell compatible with python 3.
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: WIP - Make impala-shell compatible with python 3. .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/15132/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15132/3//COMMIT_MSG@124 PS3, Line 124: No effort has been made yet to come up with a way to integrate testing of the > It's also possible that python 3 is already installed on the workers that r Bad wording made that last question hard to parse. Sorry. Should be: "...is there even a need to keep generating the local build/tarball?" -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 5 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 22:06:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4224: part 2: refactor filter routing table to support sinks
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/15083 ) Change subject: IMPALA-4224: part 2: refactor filter routing table to support sinks .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15083 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc53ae0167af45e32093a024ff2d6e2c6466c876 Gerrit-Change-Number: 15083 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 31 Jan 2020 02:02:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343: WIP - Make impala-shell compatible with python 3.
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: WIP - Make impala-shell compatible with python 3. .. Patch Set 5: (6 comments) http://gerrit.cloudera.org:8080/#/c/15132/3/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/15132/3/shell/impala_client.py@669 PS3, Line 669: print('{0} {1}'.format(str(e), type(e)), file=sys.stderr) > Python 2.6, the default in RHEL 6, doesn't support formatting without posit Done http://gerrit.cloudera.org:8080/#/c/15132/3/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/15132/3/shell/impala_shell.py@342 PS3, Line 342: > flake8: E501 line too long (91 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/15132/3/shell/impala_shell.py@529 PS3, Line 529: > flake8: E226 missing whitespace around arithmetic operator Done http://gerrit.cloudera.org:8080/#/c/15132/3/shell/impala_shell.py@844 PS3, Line 844: > flake8: F841 local variable 'e' is assigned to but never used Done http://gerrit.cloudera.org:8080/#/c/15132/4/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/15132/4/shell/impala_shell.py@341 PS4, Line 341: > flake8: E129 visually indented line with same indent as next logical line Done http://gerrit.cloudera.org:8080/#/c/15132/3/tests/shell/test_shell_interactive.py File tests/shell/test_shell_interactive.py: http://gerrit.cloudera.org:8080/#/c/15132/3/tests/shell/test_shell_interactive.py@411 PS3, Line 411: assert history_entry in result.stderr, "'%s' not in '%s'" % (history_entry, > Why the change to stdout? Done. Admittedly, I got a little sloppy here. I missed a print_to_stderr call in one spot, noticed that the test failed as written, but that it passed with stdout. I meant to go back and track down what happened, but just lost track of it. -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 5 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 21:27:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8587: Show inherited privileges with Ranger show grant
Fang-Yu Rao has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15111 Change subject: IMPALA-8587: Show inherited privileges with Ranger show grant .. IMPALA-8587: Show inherited privileges with Ranger show grant Previously when executing a SHOW GRANT statement on a resource with Ranger authorization enabled, Impala would not show inherited privileges. For example, consider a user 'foo' with database-level privileges granted by: GRANT SELECT ON DATABASE db TO USER foo; If later on we would like to retrieve the table-level privileges associated with the user 'foo' by: SHOW GRANT USER foo ON TABLE db.table; We would not see any result before this change. After this change, the related privileges including the inherited privileges with regard to the specified resource will be shown. In our example described above, we will see the following result and therefore the result returned by SHOW GRANT statement is more informative than the case in which only the privileges on 'db'.'table' were shown. Notice that in the following returned result, we are also able to know the specified user's privileges on any other table under the database 'db'. +++--+---++-+-+---+--+---+ | principal_type | principal_name | database | table | column | uri | udf | privilege | grant_option | create_time | +++--+---++-+-+---+--+---+ | USER | foo| db | * | * | | | select| false| 1580174954746 | +++--+---++-+-+---+--+---+ Testing - Ran all FE tests - Ran all authorization E2E tests - Added E2E tests in test_ranger verifying functionality Change-Id: Ia4e679dc6fcf8d0b0e4e0fc2e9b335e2d8bc0899 --- M fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpaladAuthorizationManager.java M tests/authorization/test_ranger.py 2 files changed, 175 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/15111/3 -- To view, visit http://gerrit.cloudera.org:8080/15111 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia4e679dc6fcf8d0b0e4e0fc2e9b335e2d8bc0899 Gerrit-Change-Number: 15111 Gerrit-PatchSet: 3 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-3343: WIP - Make impala-shell compatible with python 3.
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: WIP - Make impala-shell compatible with python 3. .. Patch Set 5: > Patch Set 5: > > (1 comment) Bad wording made that last question hard to parse. Sorry. Should be: "...is there even a need to keep generating the local build/tarball?" -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 5 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 22:05:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6772: Enable test scanners fuzz for ORC
Hello Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15062 to look at the new patch set (#7). Change subject: IMPALA-6772: Enable test_scanners_fuzz for ORC .. IMPALA-6772: Enable test_scanners_fuzz for ORC Add test coverage for randomly corrupt ORC files by adding orc in tests of test_scanners_fuzz.py. Also add two additional queries for nested types. Tests: - Ran test_scanners_fuzz.py 780 rounds (took 43h). - Ran test_scanners_fuzz.py for orc/def/block 1081 rounds (took 24h). Change-Id: I3233e5d9f555029d954b5ddd5858ea194afc06bf --- M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M tests/query_test/test_scanners_fuzz.py 3 files changed, 80 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/15062/7 -- To view, visit http://gerrit.cloudera.org:8080/15062 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3233e5d9f555029d954b5ddd5858ea194afc06bf Gerrit-Change-Number: 15062 Gerrit-PatchSet: 7 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-3343: WIP - Make impala-shell compatible with python 3.
Hello Grant Henke, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15132 to look at the new patch set (#5). Change subject: IMPALA-3343: WIP - Make impala-shell compatible with python 3. .. IMPALA-3343: WIP - Make impala-shell compatible with python 3. This is a WIP patch for making the impala-shell code cross-compatible with python 2 and python 3. The goal is wind up with a version of the shell that will pass all python e2e tests repsective of the version of python used to launch the shell, under the assumption that the test framework itself will continue to run with python 2.7.x, irrespective of the shell version being tested. Notable changes for reviewers to consider: - With regard to validating the patch, my assumption is that simply passing the existing set of e2e shell tests is sufficient to confirm that the shell is functioning properly. No new tests were added. - Many of the simpler changes derive from the fact that a few built-in functions and/or types have either been removed or have else changed in python 3.x, E.g., xrange and basestring are both gone, dict.iteritems() has been removed, dict.items() behaves differently, the unicode() function and the method str.decode() have both been removed, etc. Also, catching exceptions using "Exception, e" no longer works, and (as most know), using print() as a function is required now. - A new pytest command line option was added in conftest.py to enable a user to specify a path to an alternate impala-shell executable to test. It's possible to use this to point to an instance of the impala-shell that was installed as a standalone python package in a separate virtualenv. The target virtualenv may be based on either python3 or python2. However, this has no effect on the version of python used to run the test framework, which remains tied to python 2.7.x for the foreseeable future. - $IMPALA_HOME/bin/set-pythonpath.sh was updated to properly use the thrift-11 gen-py files if USE_THRIFT11_GEN_PY is set to "true". This is required for testing a version of the impala-shell in a python3-based virtualenv. - The wording of the header changed a bit to include the python version used to run the shell. Starting Impala Shell on Python 3.7.5 with no authentication Opened TCP connection to localhost:21000 ... OR Starting Impala Shell on Python 2.7.12 with no authentication Opened TCP connection to localhost:21000 ... - By far, the biggest hassle has been juggling str versus unicode versus bytes data types. Python 2.x was fairly loose and inconsistent in how it dealt with strings. As a quick demo of what I mean: Python 2.7.12 (default, Nov 12 2018, 14:36:49) [GCC 5.4.0 20160609] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> d = 'like a duck' >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8') True ...and yet there are weird unexpected gotchas. >>> d.decode('utf-8') == d.encode('utf-8') True >>> d.encode('utf-8') == bytearray(d, 'utf-8') True >>> d.decode('utf-8') == bytearray(d, 'utf-8') # fails the eq property? False As a result of this, the way we handled strings in the impala-shell code had become equally loose and inconsistent -- mainly in the form of frequent and liberal use of str.encode() and str.decode() -- but things still just worked. In python3, there's a much clearer distinction between strings and bytes, and as such, much tighter type consistency is expected by standard libs like subprocess, re, sqlparse, prettytable, etc., which are used throughout the shell. Even simple calls that worked in python 2.x: >>> import re >>> re.findall('foo', b'foobar') ['foo'] ...can throw exceptions in python 3.x: >>> import re >>> re.findall('foo', b'foobar') Traceback (most recent call last): File "", line 1, in File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall return _compile(pattern, flags).findall(string) TypeError: cannot use a string pattern on a bytes-like object Exceptions like this resulted in a many, if not most shell tests failing under python 3. At first, I tried to go one-by-one to the site of each failure, and correct by checking instance type and re-encoding as necessary, but this only led to even more str.encode() calls littering the code, which just seemed like a code-smell. (Wiki "code smell" if you don't know the term.) What ultimately seemed like a better approach was to try to weed out as many existing spurious str.encode() and str.decode() calls as I could. I'm not sure I got it as clean as it could be, but I think what I'm submitting is much cleaner than the original code. Testing: To test the changes, I ran the e2e shell tests
[Impala-ASF-CR] IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15053 ) Change subject: IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5479/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58f52a262ac7d0af259d5bcda72ada93a851d3b2 Gerrit-Change-Number: 15053 Gerrit-PatchSet: 8 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 22:13:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343: WIP - Make impala-shell compatible with python 3.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: WIP - Make impala-shell compatible with python 3. .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5558/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 5 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 22:12:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343: WIP - Make impala-shell compatible with python 3.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: WIP - Make impala-shell compatible with python 3. .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/15132/4/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/15132/4/shell/impala_shell.py@341 PS4, Line 341: flake8: E129 visually indented line with same indent as next logical line -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 4 Gerrit-Owner: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 21:19:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3343: WIP - Make impala-shell compatible with python 3.
Hello Grant Henke, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15132 to look at the new patch set (#4). Change subject: IMPALA-3343: WIP - Make impala-shell compatible with python 3. .. IMPALA-3343: WIP - Make impala-shell compatible with python 3. This is a WIP patch for making the impala-shell code cross-compatible with python 2 and python 3. The goal is wind up with a version of the shell that will pass all python e2e tests repsective of the version of python used to launch the shell, under the assumption that the test framework itself will continue to run with python 2.7.x, irrespective of the shell version being tested. Notable changes for reviewers to consider: - With regard to validating the patch, my assumption is that simply passing the existing set of e2e shell tests is sufficient to confirm that the shell is functioning properly. No new tests were added. - Many of the simpler changes derive from the fact that a few built-in functions and/or types have either been removed or have else changed in python 3.x, E.g., xrange and basestring are both gone, dict.iteritems() has been removed, dict.items() behaves differently, the unicode() function and the method str.decode() have both been removed, etc. Also, catching exceptions using "Exception, e" no longer works, and (as most know), using print() as a function is required now. - A new pytest command line option was added in conftest.py to enable a user to specify a path to an alternate impala-shell executable to test. It's possible to use this to point to an instance of the impala-shell that was installed as a standalone python package in a separate virtualenv. The target virtualenv may be based on either python3 or python2. However, this has no effect on the version of python used to run the test framework, which remains tied to python 2.7.x for the foreseeable future. - $IMPALA_HOME/bin/set-pythonpath.sh was updated to properly use the thrift-11 gen-py files if USE_THRIFT11_GEN_PY is set to "true". This is required for testing a version of the impala-shell in a python3-based virtualenv. - The wording of the header changed a bit to include the python version used to run the shell. Starting Impala Shell on Python 3.7.5 with no authentication Opened TCP connection to localhost:21000 ... OR Starting Impala Shell on Python 2.7.12 with no authentication Opened TCP connection to localhost:21000 ... - By far, the biggest hassle has been juggling str versus unicode versus bytes data types. Python 2.x was fairly loose and inconsistent in how it dealt with strings. As a quick demo of what I mean: Python 2.7.12 (default, Nov 12 2018, 14:36:49) [GCC 5.4.0 20160609] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> d = 'like a duck' >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8') True ...and yet there are weird unexpected gotchas. >>> d.decode('utf-8') == d.encode('utf-8') True >>> d.encode('utf-8') == bytearray(d, 'utf-8') True >>> d.decode('utf-8') == bytearray(d, 'utf-8') # fails the eq property? False As a result of this, the way we handled strings in the impala-shell code had become equally loose and inconsistent -- mainly in the form of frequent and liberal use of str.encode() and str.decode() -- but things still just worked. In python3, there's a much clearer distinction between strings and bytes, and as such, much tighter type consistency is expected by standard libs like subprocess, re, sqlparse, prettytable, etc., which are used throughout the shell. Even simple calls that worked in python 2.x: >>> import re >>> re.findall('foo', b'foobar') ['foo'] ...can throw exceptions in python 3.x: >>> import re >>> re.findall('foo', b'foobar') Traceback (most recent call last): File "", line 1, in File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall return _compile(pattern, flags).findall(string) TypeError: cannot use a string pattern on a bytes-like object Exceptions like this resulted in a many, if not most shell tests failing under python 3. At first, I tried to go one-by-one to the site of each failure, and correct by checking instance type and re-encoding as necessary, but this only led to even more str.encode() calls littering the code, which just seemed like a code-smell. (Wiki "code smell" if you don't know the term.) What ultimately seemed like a better approach was to try to weed out as many existing spurious str.encode() and str.decode() calls as I could. I'm not sure I got it as clean as it could be, but I think what I'm submitting is much cleaner than the original code. Testing: To test the changes, I ran the e2e shell tests
[Impala-ASF-CR] IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/15137 ) Change subject: IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/15137/1/be/src/scheduling/cluster-membership-mgr.h File be/src/scheduling/cluster-membership-mgr.h: http://gerrit.cloudera.org:8080/#/c/15137/1/be/src/scheduling/cluster-membership-mgr.h@203 PS1, Line 203: /// Returns true if the 'be_desc' is in any of the 'executor_groups', false otherwise. Maybe note this is intended for use in DCHECKs http://gerrit.cloudera.org:8080/#/c/15137/1/be/src/scheduling/cluster-membership-mgr.cc File be/src/scheduling/cluster-membership-mgr.cc: http://gerrit.cloudera.org:8080/#/c/15137/1/be/src/scheduling/cluster-membership-mgr.cc@257 PS1, Line 257: if (be_desc.is_quiescing && !existing.is_quiescing && existing.is_executor) { Maybe also add a DCHECK that if 'existing.is_quiescing' is true then 'be_desc.is_quiescing' is also true, based on Bikram asying that once a node starts quiescing it will never transfer back. It would also of course be nice if we understood what other updates are possible and could document them, while you're here, but not a big deal if you don't want to spend time on figuring it out. http://gerrit.cloudera.org:8080/#/c/15137/1/be/src/scheduling/cluster-membership-mgr.cc@266 PS1, Line 266: } else { formatting -- To view, visit http://gerrit.cloudera.org:8080/15137 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id062e51df86315ac214d30db882736dbb7948a77 Gerrit-Change-Number: 15137 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 30 Jan 2020 21:19:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5477/ -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Thu, 30 Jan 2020 21:41:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15053 ) Change subject: IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5556/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58f52a262ac7d0af259d5bcda72ada93a851d3b2 Gerrit-Change-Number: 15053 Gerrit-PatchSet: 8 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 21:41:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343: WIP - Make impala-shell compatible with python 3.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: WIP - Make impala-shell compatible with python 3. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/15132/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15132/3//COMMIT_MSG@124 PS3, Line 124: No effort has been made yet to come up with a way to integrate testing of the That's a good point. > There's a part of me that wonders, since we're providing a publicly > pip-installable tarball on PyPI, is there even a need to keep generating the > tarball? Could using the locally available tarball simply be replaced with > "pip install impala-shell"? (Though I'm also sure there are ramifications to > doing that that are not occurring to me.) I can think of two reasons (not sure if they're good reasons, but they are reasons) * someone doesn't like using pip for some reason (the python packaging ecosystem is pretty fragmented). * someone has other infrastructure that consumes the tarball, e.g. deployment scripts, dockerfiles, packaging scripts. I believe my employer has some of those that wouldn't be easily convertible to use a different way of packaging the shell. That's no real reason for the Apache project to keep the maintenance burden, but it wouldn't be surprising if there were more consumers out there. -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 22:09:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5904: Add full tsan option and fix several TSAN bugs
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15116 ) Change subject: IMPALA-5904: Add full_tsan option and fix several TSAN bugs .. Patch Set 2: Code-Review+1 (4 comments) I had one more change requested. Assume others might want to take a look. http://gerrit.cloudera.org:8080/#/c/15116/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15116/1//COMMIT_MSG@10 PS1, Line 10: existing -tsan build flag. -full_tsan is equivalent to the current -tsan > From what I have read online, running with ignore_noninstrumented_modules d Interesting, thank you! http://gerrit.cloudera.org:8080/#/c/15116/2/be/src/runtime/bufferpool/reservation-tracker.h File be/src/runtime/bufferpool/reservation-tracker.h: http://gerrit.cloudera.org:8080/#/c/15116/2/be/src/runtime/bufferpool/reservation-tracker.h@322 PS2, Line 322: int64_t child_reservations_; We should probably change this too for consistency (I think the same considerations apply as for the other two). http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/bufferpool/system-allocator.cc File be/src/runtime/bufferpool/system-allocator.cc: http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/bufferpool/system-allocator.cc@124 PS1, Line 124: if (rc != 0) { > hmm I thought I did, but I guess not. removed this block entirely since the I assigned IMPALA-6259 to you since you're fixing it http://gerrit.cloudera.org:8080/#/c/15116/2/be/src/runtime/io/disk-io-mgr-stress.h File be/src/runtime/io/disk-io-mgr-stress.h: http://gerrit.cloudera.org:8080/#/c/15116/2/be/src/runtime/io/disk-io-mgr-stress.h@a100 PS2, Line 100: Reminds me of https://blog.regehr.org/archives/28 and https://www.kernel.org/doc/html/latest/process/volatile-considered-harmful.html - volatile is pretty much never useful for multithreaded code -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 31 Jan 2020 00:54:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15137 ) Change subject: IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15137 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id062e51df86315ac214d30db882736dbb7948a77 Gerrit-Change-Number: 15137 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 31 Jan 2020 01:58:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15137 ) Change subject: IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5480/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15137 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id062e51df86315ac214d30db882736dbb7948a77 Gerrit-Change-Number: 15137 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 31 Jan 2020 01:58:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15140 Change subject: IMPALA-9348: Add flag to disable column masking .. IMPALA-9348: Add flag to disable column masking Add flag --enable_column_masking which defaults to false to disable column masking feature until this feature becomes mature. Tests: - Recover legacy FE tests for verifying error messages. Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java M tests/authorization/test_ranger.py 10 files changed, 170 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/15140/1 -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang
[Impala-ASF-CR] IMPALA-3343: WIP - Make impala-shell compatible with python 3.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: WIP - Make impala-shell compatible with python 3. .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5478/ -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 5 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 31 Jan 2020 02:51:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8587: Show inherited privileges with Ranger show grant
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15111 ) Change subject: IMPALA-8587: Show inherited privileges with Ranger show grant .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5559/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15111 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4e679dc6fcf8d0b0e4e0fc2e9b335e2d8bc0899 Gerrit-Change-Number: 15111 Gerrit-PatchSet: 3 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 30 Jan 2020 22:25:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15137 ) Change subject: IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/15137/1/be/src/scheduling/cluster-membership-mgr.h File be/src/scheduling/cluster-membership-mgr.h: http://gerrit.cloudera.org:8080/#/c/15137/1/be/src/scheduling/cluster-membership-mgr.h@203 PS1, Line 203: /// Returns true if the 'be_desc' is in any of the 'executor_groups', false otherwise. > Maybe note this is intended for use in DCHECKs Done http://gerrit.cloudera.org:8080/#/c/15137/1/be/src/scheduling/cluster-membership-mgr.cc File be/src/scheduling/cluster-membership-mgr.cc: http://gerrit.cloudera.org:8080/#/c/15137/1/be/src/scheduling/cluster-membership-mgr.cc@257 PS1, Line 257: if (be_desc.is_quiescing && !existing.is_quiescing && existing.is_executor) { > Maybe also add a DCHECK that if 'existing.is_quiescing' is true then 'be_de Added the DCHECK. I tried debugging it, but couldn't really pinpoint the issue. It only happens on cluster startup. Removing nodes / adding nodes doesn't seem to trigger it. Only graceful shutdown. I think either way it is probably safest to make the blacklisting logic specific to quiescing nodes in case we change the triggers for an UPDATE at some time in the future. http://gerrit.cloudera.org:8080/#/c/15137/1/be/src/scheduling/cluster-membership-mgr.cc@266 PS1, Line 266: } else { > formatting Done -- To view, visit http://gerrit.cloudera.org:8080/15137 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id062e51df86315ac214d30db882736dbb7948a77 Gerrit-Change-Number: 15137 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 30 Jan 2020 23:07:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist
Hello Thomas Tauber-Marshall, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15137 to look at the new patch set (#2). Change subject: IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist .. IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist Currently, the ClusterMembershipMgr will remove a node from the blacklist whenever there is an "update" for a backend from the statestore. Updates are typically restricted to updates about the quiescing status of a node. The ClusterMembershipMgr should un-blacklist quiescing nodes since quiescing nodes are not part of any executor groups and will eventually be removed from the cluster membership. Thus, there is no reason they need to remain on the blacklist. However, other updates to a backend (e.g. updates that are not related to the quiescing status of a node) should not cause that node to be un-blacklisted. Doing so could cause a node to be un-blacklisted, but not added back to any executor groups, creating a state where a node is part of the cluster membership, but not part of any executor groups (or the blacklist). This patch fixes the aforementioned issue by only un-blacklisting an updated node in ClusterMembershipMgr::UpdateMembership when the node starts quiescing. Added some DCHECKs to ensure the consistency of the blacklist and the list of executor groups. Testing: * Ran core tests * Ran test_executor_groups.p, test_restart_services.py, and test_blacklist.py with --exploration_strategy=exhaustive locally Change-Id: Id062e51df86315ac214d30db882736dbb7948a77 --- M be/src/scheduling/cluster-membership-mgr.cc M be/src/scheduling/cluster-membership-mgr.h 2 files changed, 41 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/15137/2 -- To view, visit http://gerrit.cloudera.org:8080/15137 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id062e51df86315ac214d30db882736dbb7948a77 Gerrit-Change-Number: 15137 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/15137 ) Change subject: IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15137 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id062e51df86315ac214d30db882736dbb7948a77 Gerrit-Change-Number: 15137 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 30 Jan 2020 23:54:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege.
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/15123 ) Change subject: IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege. .. Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/15123/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15123/2//COMMIT_MSG@10 PS2, Line 10: Register all privilege requests made from the withClause analyzer : when analyze function throw AnalysisException. nit: can you instead briefly explain what it fixed http://gerrit.cloudera.org:8080/#/c/15123/2//COMMIT_MSG@19 PS2, Line 19: pre-review-test on Jenkins, including FE tests, BE tests, :EE tests, JDBS test and cluster tests. nit: passed all core tests http://gerrit.cloudera.org:8080/#/c/15123/2/fe/src/main/java/org/apache/impala/analysis/WithClause.java File fe/src/main/java/org/apache/impala/analysis/WithClause.java: http://gerrit.cloudera.org:8080/#/c/15123/2/fe/src/main/java/org/apache/impala/analysis/WithClause.java@82 PS2, Line 82: try { : view.getQueryStmt().analyze(viewAnalyzer); : } catch (AnalysisException e) { : // Register all privilege requests made from the root analyzer. : for (PrivilegeRequest req: withClauseAnalyzer.getPrivilegeReqs()) { : analyzer.registerPrivReq(req); : } : throw e; Its not quite clear why we only have to repeat this step if an exceptions is encountered. Can add a small comment here and/or refactor to the code so that privilege requests are made regardless of exception (maybe in a final block) http://gerrit.cloudera.org:8080/#/c/15123/2/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java File fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java: http://gerrit.cloudera.org:8080/#/c/15123/2/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@597 PS2, Line 597: // Select from non-existent database with "With" clause. : authorize("with t as (select id from nodb.alltypes) select * from t") : .error(selectError("nodb.alltypes")); : : // Select from non-existent table with "With" clause. : authorize("with t as (select id from functional.notbl) select * from t") : .error(selectError("functional.notbl")); do we have the same test but with privileges?? -- To view, visit http://gerrit.cloudera.org:8080/15123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6b657a7147a136198a9a97a679c9131ee814577 Gerrit-Change-Number: 15123 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 31 Jan 2020 00:47:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4224: execute separate join builds fragments
Hello Zoltan Borok-Nagy, Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14859 to look at the new patch set (#32). Change subject: IMPALA-4224: execute separate join builds fragments .. IMPALA-4224: execute separate join builds fragments This enables parallel plans with the join build in a separate fragment and fixes all of the ensuing fallout. After this change, mt_dop plans with joins have separate build fragments. There is still a 1:1 relationship between join nodes and builders, so the builders are only accessed by the join node's thread after it is handed off. This lets us defer the work required to make PhjBuilder and NljBuilder safe to be shared between nodes. Planner changes: * Combined the parallel and distributed planning code paths. * Misc fixes to generate reasonable thrift structures in the query exec requests, i.e. containing the right nodes. * Fixes to resource calculations for the separate build plans. ** Calculate separate join/build resource consumption. ** Simplified the resource estimation by calculating resource consumption for each fragment separately, and assuming that all fragments hit their peak resource consumption at the same time. IMPALA-9255 is the follow-on to make the resource estimation more accurate. Scheduler changes: * Various fixes to handle multiple TPlanExecInfos correctly, which are generated by the planner for the different cohorts. * Add logic to colocate build fragments with parent fragments. Runtime filter changes: * Build sinks now produce runtime filters, which required planner and coordinator fixes to handle. accordingly. DataSink changes: * Close the input plan tree before calling FlushFinal() to release resources. This depends on Send() not holding onto references to input batches, which was true except for NljBuilder. This invariant is documented. Join builder changes: * Add a common base class for PhjBuilder and NljBuilder with functions to handle synchronisation with the join node. * Close plan tree earlier in FragmentInstanceState::Exec() so that peak resource requirements are lower. * The NLJ always copies input batches, so that it can close its input tree. JoinNode changes: * Join node blocks waiting for build-side to be ready, then eventually signals that it's done, allowing the builder to be cleaned up. * NLJ and PHJ nodes handle both the integrated builder and the external builder. There is a 1:1 relationship between the node and the builder, so we don't deal with thread safety yet. * Buffer reservations are transferred between the builder and join node when running with the separate builder. This is not really necessary right now, since it is all single-threaded, but will be important for the shared broadcast. - The builder transfers memory for probe buffers to the join node at the end of each build phase. - At end of each probe phase, reservation needs to be handed back to builder (or released). ExecSummary changes: * The summary logic was modified to handle connecting fragments via join builds. The logic is an extension of what was used for exchanges. Testing: * Enable --unlock_mt_dop for end-to-end tests * Migrate some tests to run as part of end-to-end tests instead of custom cluster. * Add mt_dop dimension to various end-to-end tests to provide coverage of join queries, spill-to-disk and cancellation. * Ran a single node TPC-H and TPC-DS stress test with mt_dop=0 and mt_dop=4. Perf: * Ran TPC-H scale factor 30 locally with mt_dop=0. No significant change. Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58 --- M be/src/exec/CMakeLists.txt M be/src/exec/blocking-join-node.cc M be/src/exec/blocking-join-node.h M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/exec-node.h A be/src/exec/join-builder.cc A be/src/exec/join-builder.h M be/src/exec/nested-loop-join-builder.cc M be/src/exec/nested-loop-join-builder.h M be/src/exec/nested-loop-join-node.cc M be/src/exec/nested-loop-join-node.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/runtime/bufferpool/buffer-pool-internal.h M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/bufferpool/buffer-pool.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/initial-reservations.cc M be/src/runtime/row-batch.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/spillable-row-batch-queue.h M be/src/util/summary-util.cc M bin/run-all-tests.sh M common/thrift/DataSinks.thrift M common/thrift/ExecStats.thrift M
[Impala-ASF-CR] IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15137 ) Change subject: IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks// : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15137 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id062e51df86315ac214d30db882736dbb7948a77 Gerrit-Change-Number: 15137 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 30 Jan 2020 21:16:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343: WIP - Make impala-shell compatible with python 3.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15132 ) Change subject: IMPALA-3343: WIP - Make impala-shell compatible with python 3. .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5478/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15132 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462 Gerrit-Change-Number: 15132 Gerrit-PatchSet: 5 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 22:02:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15137 ) Change subject: IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5560/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15137 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id062e51df86315ac214d30db882736dbb7948a77 Gerrit-Change-Number: 15137 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 30 Jan 2020 23:52:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner
Norbert Luksa has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/15051 ) Change subject: IMPALA-9226: Improve string allocations of the ORC scanner .. IMPALA-9226: Improve string allocations of the ORC scanner Currently the OrcColumnReader copies values from the orc::StringVectorBatch one-by-one. Since ORC 1.6, the blob which contains the pointed values is moved to the StringVectorBatch, so we can copy it. This commit beside the above improvement also enables the LazyEncoding option for the ORC reader. This way, for stripes with DICTIONARY_ENCODING[_V2], EncodedStringVectorBatch contains the data in a dictionaryBlob from which the data can be acquired with the given indices and lengths. Tests: * Run ORC scanner tests (query_tests/test_scanners.py::TestOrc) and tpch query tests. * Tested performance on tpch.lineitem table with scale=25, running queries that selects min of string columns. Some results: col_name | encoding | before | after | speedup = l_comment DIRECT 16.42s 14.38s 14% l_shipinstruct DICTIONARY 5.26s3.80s 32% l_commitdate DICTIONARY 5.46s5.19s 5% all string col BOTH 39.06s 32.18s 21% The queries were run on a desktop PC with MT_DOP and NUM_NODES set to 1. * Also run TPC-H queries on the TPC-H benchmark where some queries' runtime improved by around 10-15%, while there were no regression for the others. Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f --- M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/orc-column-readers.cc M be/src/exec/orc-column-readers.h 4 files changed, 135 insertions(+), 42 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/15051/7 -- To view, visit http://gerrit.cloudera.org:8080/15051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f Gerrit-Change-Number: 15051 Gerrit-PatchSet: 7 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9149: part 2: Re-enable Ranger-related EE tests
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15088 ) Change subject: IMPALA-9149: part 2: Re-enable Ranger-related EE tests .. IMPALA-9149: part 2: Re-enable Ranger-related EE tests In IMPALA-9047, we disabled some Ranger-related FE and BE tests due to changes in Ranger's behavior after upgrading Ranger from 1.2 to 2.0. This patch aims to re-enable those disabled EE tests in tests/authorization/test_authorized_proxy.py and tests/authorization/test_ranger.py to increase Impala's test coverage of authorization via Ranger. The Ranger-related tests in test_authorized_proxy.py test Impala's delegation for clients. Two types of delegation are supported in Impala, i.e., a user can delegate the execution of a query to either 1) another user, or 2) a group of users. In the former case, Ranger will check whether or not the delegated user specified in the option 'authorized_proxy_user_config' possesses sufficient privileges to access the resources, whereas in the latter case, before checking the delegated group is granted sufficient privileges, Ranger will check with the help of Impala whether or not the delegated user specified in 'authorized_proxy_user_config' belongs to the delegated group specified in 'authorized_proxy_group_config' in the underlying OS. This type of delegation requires Impala to retrieve the groups the delegated user belongs to from the underlying OS and thus if the delegated user does not exist in the underlying OS, Impala would inform Ranger that the delegated user does not belong to any group, which in turn would fail the authorization even though in the policies on the Ranger server, the delegated user belongs to the delegated group and the delegated group is granted sufficient privileges. The re-enabled Ranger tests in test_authorized_proxy.py involve queries in which the delegated user, i.e., 'non_owner', does not exist in the underlying OS. We use 'non_owner' as the delegated user instead of getuser() so that we will have to explicitly grant 'non_owner' sufficient privileges of accessing the resources. To avoid the need for creating an actual delegated user and its corresponding delegated groups in the underlying OS when running the EE tests, we added to 'impalad_args' an additional option, i.e., 'use_customized_user_groups_mapper_for_ranger', which, when set to true, allows Impala to use a customized user-to-groups mapping when performing authorization via Ranger. On the other hand, we set the delegated user to getuser() when running the respective Sentry related tests to avoid the need for having to provide Sentry with a customized user-to-groups mapping. To re-enable test_legacy_catalog_ownership() in test_ranger.py, we removed in _test_ownership() a test query that was expected to fail the authorization in Ranger 1.2 but passes the authorization in Ranger 2.0. This is due to the fact that in Ranger 2.0, a user does not have to be explicitly granted the privileges of accessing a resource as long as the user is the owner of the resource. Testing: - Passed FE tests. - Passed the tests in test_authorized_proxy.py. - Passed the tests in test_ranger.py. Change-Id: I17420d7ff9beacd1b4d2ad72b68b8b54983e60cb Reviewed-on: http://gerrit.cloudera.org:8080/15088 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M tests/authorization/test_authorized_proxy.py M tests/authorization/test_ranger.py 7 files changed, 90 insertions(+), 103 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15088 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I17420d7ff9beacd1b4d2ad72b68b8b54983e60cb Gerrit-Change-Number: 15088 Gerrit-PatchSet: 6 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-9149: part 2: Re-enable Ranger-related EE tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15088 ) Change subject: IMPALA-9149: part 2: Re-enable Ranger-related EE tests .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15088 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17420d7ff9beacd1b4d2ad72b68b8b54983e60cb Gerrit-Change-Number: 15088 Gerrit-PatchSet: 5 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 30 Jan 2020 19:13:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15140 ) Change subject: IMPALA-9348: Add flag to disable column masking .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/15140/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java: http://gerrit.cloudera.org:8080/#/c/15140/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@193 PS1, Line 193: if (!isColumnMaskingEnabled_ authorizeColumnMask() is the legacy codes before IMPALA-9009. I just add the new flag inside this function. It doesn't belong to the new code of IMPALA-9009. It will block any access to columns with column masking policies if the flag is false. > BTW, the check for if (!isColumnMaskingEnabled_) is being done twice ..is > that needed ? Oops, I'll remove the check here. -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 31 Jan 2020 04:05:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15140 ) Change subject: IMPALA-9348: Add flag to disable column masking .. Patch Set 2: Code-Review+1 > Patch Set 2: > > (1 comment) > > l I see the reasoning for this now..you are reverting to the pre-IMPALA-9009 behavior which was calling the evaluateColumnMask() for each column and you have added the extra check. -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 31 Jan 2020 05:06:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15140 ) Change subject: IMPALA-9348: Add flag to disable column masking .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/15140/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java: http://gerrit.cloudera.org:8080/#/c/15140/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@285 PS3, Line 285: throw new AuthorizationException(String.format( > Thanks for making the change to avoid unnecessary function calls. One mino Yeah, I add back the check for isColumnMaskingEnabled_ anyway. -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 31 Jan 2020 07:13:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15140 ) Change subject: IMPALA-9348: Add flag to disable column masking .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5566/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 31 Jan 2020 07:56:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15140 ) Change subject: IMPALA-9348: Add flag to disable column masking .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5564/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 31 Jan 2020 04:57:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15140 ) Change subject: IMPALA-9348: Add flag to disable column masking .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/15140/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java: http://gerrit.cloudera.org:8080/#/c/15140/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@282 PS2, Line 282: if (!isColumnMaskingEnabled_ > Ignore this as I understand that you are reverting to the pre-IMPALA-9009 b Thanks! But we can exit sooner if the flag is True by moving "!isColumnMaskingEnabled_" to the caller to avoid 100 function calls. -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 31 Jan 2020 05:13:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15137 ) Change subject: IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5480/ -- To view, visit http://gerrit.cloudera.org:8080/15137 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id062e51df86315ac214d30db882736dbb7948a77 Gerrit-Change-Number: 15137 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 31 Jan 2020 06:45:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15053 ) Change subject: IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58f52a262ac7d0af259d5bcda72ada93a851d3b2 Gerrit-Change-Number: 15053 Gerrit-PatchSet: 8 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 31 Jan 2020 03:06:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15140 ) Change subject: IMPALA-9348: Add flag to disable column masking .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5563/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 31 Jan 2020 03:06:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15053 ) Change subject: IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes .. IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes Refactored code to move codegen code from aggregation exec nodes to their plan nodes. Added some TODOs that will be fixed in the next few patch. Testing: - Ran queries and confirmed manually that the codegened code works. - Ran all e2e tests for agg nodes and partition joins. Change-Id: I58f52a262ac7d0af259d5bcda72ada93a851d3b2 Reviewed-on: http://gerrit.cloudera.org:8080/15053 Reviewed-by: Csaba Ringhofer Tested-by: Impala Public Jenkins --- M be/src/exec/aggregation-node-base.cc M be/src/exec/aggregator.cc M be/src/exec/aggregator.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exprs/expr-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h 15 files changed, 448 insertions(+), 224 deletions(-) Approvals: Csaba Ringhofer: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/15053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I58f52a262ac7d0af259d5bcda72ada93a851d3b2 Gerrit-Change-Number: 15053 Gerrit-PatchSet: 9 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4224: part 2: refactor filter routing table to support sinks
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15083 ) Change subject: IMPALA-4224: part 2: refactor filter routing table to support sinks .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15083 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc53ae0167af45e32093a024ff2d6e2c6466c876 Gerrit-Change-Number: 15083 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 31 Jan 2020 04:33:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4224: part 2: refactor filter routing table to support sinks
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15083 ) Change subject: IMPALA-4224: part 2: refactor filter routing table to support sinks .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5481/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15083 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc53ae0167af45e32093a024ff2d6e2c6466c876 Gerrit-Change-Number: 15083 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 31 Jan 2020 04:33:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Hello Aman Sinha, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15140 to look at the new patch set (#4). Change subject: IMPALA-9348: Add flag to disable column masking .. IMPALA-9348: Add flag to disable column masking Add flag --enable_column_masking which defaults to false to disable column masking feature until this feature becomes mature. This patch recovers some legacy codes before IMPALA-9009 (mainly inside RangerAuthorizationChecker). When the flag is set to false, the legacy code paths are used. Tests: - Recover legacy FE tests for verifying error messages. Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java M tests/authorization/test_ranger.py 10 files changed, 170 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/15140/4 -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15140 ) Change subject: IMPALA-9348: Add flag to disable column masking .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/15140/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java: http://gerrit.cloudera.org:8080/#/c/15140/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@193 PS1, Line 193: if (!isColumnMaskingEnabled_ If column masking is disabled, it's not clear to me why authorizeColumnMask() is being called. Shouldn't the effect of setting this flag to False be that the new code is not called at all ? Unless I misunderstood the intent of the flag. BTW, the check for if (!isColumnMaskingEnabled_) is being done twice ..is that needed ? -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 31 Jan 2020 03:52:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Hello Aman Sinha, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15140 to look at the new patch set (#2). Change subject: IMPALA-9348: Add flag to disable column masking .. IMPALA-9348: Add flag to disable column masking Add flag --enable_column_masking which defaults to false to disable column masking feature until this feature becomes mature. This patch recovers some legacy codes before IMPALA-9009 (mainly inside RangerAuthorizationChecker). When the flag is set to false, the legacy code paths are used. Tests: - Recover legacy FE tests for verifying error messages. Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java M tests/authorization/test_ranger.py 10 files changed, 169 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/15140/2 -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Hello Aman Sinha, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15140 to look at the new patch set (#3). Change subject: IMPALA-9348: Add flag to disable column masking .. IMPALA-9348: Add flag to disable column masking Add flag --enable_column_masking which defaults to false to disable column masking feature until this feature becomes mature. This patch recovers some legacy codes before IMPALA-9009 (mainly inside RangerAuthorizationChecker). When the flag is set to false, the legacy code paths are used. Tests: - Recover legacy FE tests for verifying error messages. Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java M tests/authorization/test_ranger.py 10 files changed, 169 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/15140/3 -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15140 ) Change subject: IMPALA-9348: Add flag to disable column masking .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/15140/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java: http://gerrit.cloudera.org:8080/#/c/15140/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@285 PS3, Line 285: "Column masking is disabled by --enable_column_masking flag. Can't access " + Thanks for making the change to avoid unnecessary function calls. One minor nit: This error message says 'Column masking is disabled by ...' but this assumes that the caller is always calling this function when the flag is false. Another caller could potentially call this function even when the flag is true. Anyways, this is ok for now since I understand the disabling is a temporary thing until the feature is fully tested. -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 31 Jan 2020 06:06:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9280: Fix parsing of timestamp with dash before TZH
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15130 ) Change subject: IMPALA-9280: Fix parsing of timestamp with dash before TZH .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/15130/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15130/4//COMMIT_MSG@11 PS4, Line 11: if the negative sign is the only possible : separator character in the separator substring. I have the same comment here as in the parser: This doesn't cover the whole picture as we also don't include it if the TZH itself has a '+' char as a sign. http://gerrit.cloudera.org:8080/#/c/15130/4/be/src/runtime/datetime-iso-sql-format-parser.cc File be/src/runtime/datetime-iso-sql-format-parser.cc: http://gerrit.cloudera.org:8080/#/c/15130/4/be/src/runtime/datetime-iso-sql-format-parser.cc@296 PS4, Line 296: (*current_pos) nit: no need for the brackets. http://gerrit.cloudera.org:8080/#/c/15130/4/be/src/runtime/datetime-iso-sql-format-parser.cc@323 PS4, Line 323: // The last '-' of a separator sequence might be taken as a sign for timezone hour. : // UNLESS the last '-' is the only possible separator character in the separator : // substring (in which case it is not counted as the timezone hour's negative sign). I'd rephrase this comment a bit as it misses to mention that if the TZH starts with a '+' then we again don't take the last '-' of the separator sequence as the sign. Let's say something like this: "If the TZH token itself doesn't start with a sign and the separator sequence contains more than one separators then the last '-' of the separator sequence is taken as the sign of TZH". What do you think? -- To view, visit http://gerrit.cloudera.org:8080/15130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I24c636e75fd380f6ebd091bcb38bb60a274f0e00 Gerrit-Change-Number: 15130 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 31 Jan 2020 07:19:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6772: Enable test scanners fuzz for ORC
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15062 ) Change subject: IMPALA-6772: Enable test_scanners_fuzz for ORC .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5562/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15062 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3233e5d9f555029d954b5ddd5858ea194afc06bf Gerrit-Change-Number: 15062 Gerrit-PatchSet: 7 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 31 Jan 2020 03:15:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15140 ) Change subject: IMPALA-9348: Add flag to disable column masking .. Patch Set 2: (1 comment) l http://gerrit.cloudera.org:8080/#/c/15140/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java: http://gerrit.cloudera.org:8080/#/c/15140/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@282 PS2, Line 282: if (!isColumnMaskingEnabled_ If there are 100 columns, this check for isColumnMaskingEnabled_ will be done as many times until one of the columns (could be the last one) access throws an exception. Can't we exit sooner since the flag is False anyways ? -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 31 Jan 2020 04:40:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15140 ) Change subject: IMPALA-9348: Add flag to disable column masking .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/15140/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java: http://gerrit.cloudera.org:8080/#/c/15140/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@282 PS2, Line 282: if (!isColumnMaskingEnabled_ > If there are 100 columns, this check for isColumnMaskingEnabled_ will be do Ignore this as I understand that you are reverting to the pre-IMPALA-9009 behavior. -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 31 Jan 2020 05:08:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9348: Add flag to disable column masking
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15140 ) Change subject: IMPALA-9348: Add flag to disable column masking .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5565/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7c98042cfb831ba00d9d2179367cba257b77a0 Gerrit-Change-Number: 15140 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 31 Jan 2020 06:00:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4224: part 2: refactor filter routing table to support sinks
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15083 ) Change subject: IMPALA-4224: part 2: refactor filter routing table to support sinks .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/15083/6/be/src/runtime/coordinator-filter-state.h File be/src/runtime/coordinator-filter-state.h: http://gerrit.cloudera.org:8080/#/c/15083/6/be/src/runtime/coordinator-filter-state.h@167 PS6, Line 167: emplace > Based on the text from https://en.cppreference.com/w/cpp/container/unordere I switched to explicitly checking whether the entry exists before calling emplace(). http://gerrit.cloudera.org:8080/#/c/15083/6/be/src/runtime/coordinator-filter-state.h@167 PS6, Line 167: std::piecewise_construct, : std::forward_as_tuple(filter.filter_id), : std::forward_as_tuple > nit: I see it's moved from coordinator.cc but maybe we could just use std:: I wasn't able to avoid this incantation - I think the problem this incantation avoids is that FilterState is not movable because of the SpinLock. -- To view, visit http://gerrit.cloudera.org:8080/15083 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc53ae0167af45e32093a024ff2d6e2c6466c876 Gerrit-Change-Number: 15083 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Jan 2020 18:28:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4224: part 2: refactor filter routing table to support sinks
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15083 ) Change subject: IMPALA-4224: part 2: refactor filter routing table to support sinks .. Patch Set 7: Code-Review+1 carry +1 -- To view, visit http://gerrit.cloudera.org:8080/15083 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc53ae0167af45e32093a024ff2d6e2c6466c876 Gerrit-Change-Number: 15083 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Jan 2020 18:28:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4224: part 2: refactor filter routing table to support sinks
Hello Zoltan Borok-Nagy, Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15083 to look at the new patch set (#7). Change subject: IMPALA-4224: part 2: refactor filter routing table to support sinks .. IMPALA-4224: part 2: refactor filter routing table to support sinks This is some refactoring that will enable us to correctly construct a routing table where join build sinks produce filters: * The previous code relied (non-obviously) on producers being visited before consumers when walking the plan tree. This is no longer true when build sinks produce filters. The code is changed so it can construct the routing table entry when the first producer or consumer is visited (using GetOrCreateFilterState()) * The source plan node ID is added to TRuntimeFilterDesc, so that it's available when visiting consumer nodes. * The logic for adding a filter source to the routing table is factored out into a separate function AddFilterSource(), which can be used (in a later patch) for join build sinks. This change is not expected to change behaviour in any way. Change-Id: Idc53ae0167af45e32093a024ff2d6e2c6466c876 --- M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java 5 files changed, 80 insertions(+), 54 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/15083/7 -- To view, visit http://gerrit.cloudera.org:8080/15083 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idc53ae0167af45e32093a024ff2d6e2c6466c876 Gerrit-Change-Number: 15083 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9280: Time string with signed TZH token preceded by dash cannot be cast to timestamp
Hello Gabor Kaszab, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15130 to look at the new patch set (#3). Change subject: IMPALA-9280: Time string with signed TZH token preceded by dash cannot be cast to timestamp .. IMPALA-9280: Time string with signed TZH token preceded by dash cannot be cast to timestamp In ISO SQL datetime format, "-" is both separator character and sign in TZH token. This change make sure to NOT include the last "-" char as part of TZH token if the negative sign is the only possible separator character in the separator substring. Testing: * Add test to query_test/test_cast_with_format.py Change-Id: I24c636e75fd380f6ebd091bcb38bb60a274f0e00 --- M be/src/runtime/datetime-iso-sql-format-parser.cc M tests/query_test/test_cast_with_format.py 2 files changed, 9 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/15130/3 -- To view, visit http://gerrit.cloudera.org:8080/15130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I24c636e75fd380f6ebd091bcb38bb60a274f0e00 Gerrit-Change-Number: 15130 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9280: Fix parsing of timestamp with dash before TZH
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15130 ) Change subject: IMPALA-9280: Fix parsing of timestamp with dash before TZH .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/datetime-iso-sql-format-parser.cc File be/src/runtime/datetime-iso-sql-format-parser.cc: http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/datetime-iso-sql-format-parser.cc@332 PS2, Line 332: && cur_pos_move_count > cur_tok_move_count && *(*current_pos) != '+' > What I understand from your explanation, the number of separator chars in t Basically separator matching works in a way that if there is at least one separator in a position somewhere in the format then there should be at least one separator in the input string in that section. (By position I don't mean the exact index of that character, rather the relative location to the surrounding tokens). Those examples are correct. For more explanation please check the design doc for this feature (feel free to ask if you still have questions): https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/ -- To view, visit http://gerrit.cloudera.org:8080/15130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I24c636e75fd380f6ebd091bcb38bb60a274f0e00 Gerrit-Change-Number: 15130 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 18:44:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5904: Add full tsan option and fix several TSAN bugs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15116 ) Change subject: IMPALA-5904: Add full_tsan option and fix several TSAN bugs .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/15116/2/be/src/common/init.cc File be/src/common/init.cc: http://gerrit.cloudera.org:8080/#/c/15116/2/be/src/common/init.cc@423 PS2, Line 423: // https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual#non-instrumented-code), line too long (95 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 20:27:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5904: Add full tsan option and fix several TSAN bugs
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15116 ) Change subject: IMPALA-5904: Add full_tsan option and fix several TSAN bugs .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/15116/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15116/1//COMMIT_MSG@10 PS1, Line 10: existing -tsan build flag. -tsan_full is equivalent to the current -tsan > Is -tsan_full useful? Wonder if we should just get rid of it. >From what I have read online, running with ignore_noninstrumented_modules does >mask a considerable number of race conditions: https://github.com/google/sanitizers/issues/949 There might be some issues with 'ignore_noninstrumented_modules' as well, so it would be nice to have an option that runs without it. I think -tsan option will be useful for automated tests, -tsan_full might be more useful for developers who can manually filter out any noise produced by third-party libs / Java. http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/bufferpool/reservation-tracker.h File be/src/runtime/bufferpool/reservation-tracker.h: http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/bufferpool/reservation-tracker.h@316 PS1, Line 316: AtomicInt64 reservation_; > I think these members need some explanation that they can be read without h Done http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/bufferpool/system-allocator.cc File be/src/runtime/bufferpool/system-allocator.cc: http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/bufferpool/system-allocator.cc@124 PS1, Line 124: #if defined(ADDRESS_SANITIZER) || defined(THREAD_SANITIZER) > Did you run into this bug? I think it's meant to be fixed: https://bugs.llv hmm I thought I did, but I guess not. removed this block entirely since the LLVM bug is fixed. http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/coordinator-backend-state.cc@179 PS1, Line 179: unique_lock lock(lock_); > Could this cause a deadlock if we exit from one of the points below where ' Added some docs to make it clearer. Essentially the lambda is executed in an object destructor. Since C++ destroys objects in the reverse-order they were created, the lock gets acquired after all other scoped objects (including the lock) are destroyed. http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/coordinator.h@292 PS1, Line 292: AtomicBool has_called_wait_; // if true, Wait() was called > I think the writer should still hold wait_lock_? Yeah, since it guards against concurrent calls of Coordinator::Wait that would make sense. -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 20:28:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5477/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Thu, 30 Jan 2020 20:04:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist
Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15137 Change subject: IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist .. IMPALA-9342: Membership updates should only remove quiescing nodes from the blacklist Currently, the ClusterMembershipMgr will remove a node from the blacklist whenever there is an "update" for a backend from the statestore. Updates are typically restricted to updates about the quiescing status of a node. The ClusterMembershipMgr should un-blacklist quiescing nodes since quiescing nodes are not part of any executor groups and will eventually be removed from the cluster membership. Thus, there is no reason they need to remain on the blacklist. However, other updates to a backend (e.g. updates that are not related to the quiescing status of a node) should not cause that node to be un-blacklisted. Doing so could cause a node to be un-blacklisted, but not added back to any executor groups, creating a state where a node is part of the cluster membership, but not part of any executor groups (or the blacklist). This patch fixes the aforementioned issue by only un-blacklisting an updated node in ClusterMembershipMgr::UpdateMembership when the node starts quiescing. Added some DCHECKs to ensure the consistency of the blacklist and the list of executor groups. Testing: * Ran core tests * Ran test_executor_groups.p, test_restart_services.py, and test_blacklist.py with --exploration_strategy=exhaustive locally Change-Id: Id062e51df86315ac214d30db882736dbb7948a77 --- M be/src/scheduling/cluster-membership-mgr.cc M be/src/scheduling/cluster-membership-mgr.h 2 files changed, 32 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/15137/1 -- To view, visit http://gerrit.cloudera.org:8080/15137 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id062e51df86315ac214d30db882736dbb7948a77 Gerrit-Change-Number: 15137 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar
[Impala-ASF-CR] IMPALA-9149: part 2: Re-enable Ranger-related EE tests
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/15088 ) Change subject: IMPALA-9149: part 2: Re-enable Ranger-related EE tests .. Patch Set 5: > Patch Set 4: Code-Review+2 > > (1 comment) Thanks Csaba for the pointer! I will take note of this. :-) -- To view, visit http://gerrit.cloudera.org:8080/15088 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17420d7ff9beacd1b4d2ad72b68b8b54983e60cb Gerrit-Change-Number: 15088 Gerrit-PatchSet: 5 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 30 Jan 2020 16:09:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15053 ) Change subject: IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5552/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58f52a262ac7d0af259d5bcda72ada93a851d3b2 Gerrit-Change-Number: 15053 Gerrit-PatchSet: 7 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 20:10:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8755: Backend support for Z-ordering
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14080 ) Change subject: IMPALA-8755: Backend support for Z-ordering .. Patch Set 21: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5549/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14080 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0200748ce3e65ebc5d3530f794c0f80aa335a2ab Gerrit-Change-Number: 14080 Gerrit-PatchSet: 21 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Anonymous Coward (520) Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Jan 2020 16:37:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4224: part 2: refactor filter routing table to support sinks
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15083 ) Change subject: IMPALA-4224: part 2: refactor filter routing table to support sinks .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5551/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15083 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc53ae0167af45e32093a024ff2d6e2c6466c876 Gerrit-Change-Number: 15083 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Jan 2020 19:13:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/15053 ) Change subject: IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes .. Patch Set 7: (9 comments) http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/aggregator.h File be/src/exec/aggregator.h: http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/aggregator.h@132 PS6, Line 132: const std::string& name); > nit: indent +2 Done http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/grouping-aggregator.h File be/src/exec/grouping-aggregator.h: http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/grouping-aggregator.h@126 PS6, Line 126: ~GroupingAggregatorConfig() override {} > please add 'override' Done http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/grouping-aggregator.h@165 PS6, Line 165: > nit: double space Done http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/grouping-aggregator.h@176 PS6, Line 176: /// function. > nit: +1 / Done http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/hash-table.h@51 PS6, Line 51: class > ScalarExprsResultsRowLayout is a struct, which lead to the complaints by cl Done http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/hash-table.cc@951 PS6, Line 951: c > nit: extra space Done http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/non-grouping-aggregator.h File be/src/exec/non-grouping-aggregator.h: http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/non-grouping-aggregator.h@46 PS6, Line 46: ~NonGroupingAggregatorConfig() override {} > please add 'override' Done http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exec/non-grouping-aggregator.h@101 PS6, Line 101: referenc > typo: reference Done http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exprs/scalar-expr.h File be/src/exprs/scalar-expr.h: http://gerrit.cloudera.org:8080/#/c/15053/6/be/src/exprs/scalar-expr.h@203 PS6, Line 203: static std::string DebugString(const std::vector& exprs); : std::string DebugString(const std::str > optional: this could be a constructor in ScalarExprsResultsRowLayout good idea! -- To view, visit http://gerrit.cloudera.org:8080/15053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58f52a262ac7d0af259d5bcda72ada93a851d3b2 Gerrit-Change-Number: 15053 Gerrit-PatchSet: 7 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 19:27:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes
Hello Daniel Becker, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15053 to look at the new patch set (#7). Change subject: IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes .. IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes Refactored code to move codegen code from aggregation exec nodes to their plan nodes. Added some TODOs that will be fixed in the next few patch. Testing: - Ran queries and confirmed manually that the codegened code works. - Ran all e2e tests for agg nodes and partition joins. Change-Id: I58f52a262ac7d0af259d5bcda72ada93a851d3b2 --- M be/src/exec/aggregation-node-base.cc M be/src/exec/aggregator.cc M be/src/exec/aggregator.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exprs/expr-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h 15 files changed, 448 insertions(+), 224 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/15053/7 -- To view, visit http://gerrit.cloudera.org:8080/15053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I58f52a262ac7d0af259d5bcda72ada93a851d3b2 Gerrit-Change-Number: 15053 Gerrit-PatchSet: 7 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 2: (2 comments) Thanks for working on this. This is looking pretty good. I'm thinking through the edge cases where we want to override some versions, so I may have a couple more comments. In the meantime, I'm going to run an upstream verify job on this review. http://gerrit.cloudera.org:8080/#/c/15134/2/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/15134/2/bin/impala-config.sh@719 PS2, Line 719: export IMPALA_TOOLCHAIN_KUDU_MAVEN_REPOSITORY="file://${IMPALA_TOOLCHAIN}" Since this is disabled, I think we can set it to an empty string. If that works, use it. http://gerrit.cloudera.org:8080/#/c/15134/2/bin/impala-config.sh@722 PS2, Line 722: export IMPALA_KUDU_VERSION="3ba5ec5d0" : export IMPALA_KUDU_JAVA_VERSION="1.12.0-SNAPSHOT" One use case that we want to support is for someone to be able to override the IMPALA_KUDU_VERSION and IMPALA_KUDU_JAVA_VERSION and build against a custom Kudu. One might export these variables in bin/impala-config-branch.sh. So, IMPALA_KUDU_VERSION and IMPALA_KUDU_JAVA_VERSION should respect a preset if it is set. i.e. IMPALA_KUDU_VERSION=${IMPALA_KUDU_VERSION-"3ba5ec5d0"} IMPALA_KUDU_JAVA_VERSION=${IMPALA_KUDU_JAVA_VERSION-"1.12.0-SNAPSHOT"} -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Thu, 30 Jan 2020 20:04:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9280: Fix parsing of timestamp with dash before TZH
Hello Gabor Kaszab, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15130 to look at the new patch set (#4). Change subject: IMPALA-9280: Fix parsing of timestamp with dash before TZH .. IMPALA-9280: Fix parsing of timestamp with dash before TZH In ISO SQL datetime format, "-" is both separator character and sign in TZH token. This change make sure to NOT include the last "-" char as part of TZH token if the negative sign is the only possible separator character in the separator substring. Testing: * Add test to query_test/test_cast_with_format.py Change-Id: I24c636e75fd380f6ebd091bcb38bb60a274f0e00 --- M be/src/runtime/datetime-iso-sql-format-parser.cc M tests/query_test/test_cast_with_format.py 2 files changed, 9 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/15130/4 -- To view, visit http://gerrit.cloudera.org:8080/15130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I24c636e75fd380f6ebd091bcb38bb60a274f0e00 Gerrit-Change-Number: 15130 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner
Norbert Luksa has posted comments on this change. ( http://gerrit.cloudera.org:8080/15051 ) Change subject: IMPALA-9226: Improve string allocations of the ORC scanner .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/15051/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15051/6//COMMIT_MSG@35 PS6, Line 35: Also run TPC-H queries on the TPC-H benchmark where some :queries' runtime improved by around 10-15%, > You could also state explicitly whether other queries have regressed or not Done http://gerrit.cloudera.org:8080/#/c/15051/3/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15051/3/be/src/exec/hdfs-orc-scanner.cc@683 PS3, Line 683: > It is still used for CollectionValueBuilder. Oh, right. http://gerrit.cloudera.org:8080/#/c/15051/6/be/src/exec/orc-column-readers.h File be/src/exec/orc-column-readers.h: http://gerrit.cloudera.org:8080/#/c/15051/6/be/src/exec/orc-column-readers.h@202 PS6, Line 202: batch_ = static_cast(orc_batch); : if (orc_batch == nullptr) return Status::OK(); > Is it ok that batch_ is not set to null if orc_batch is null? Swapped the two lines. http://gerrit.cloudera.org:8080/#/c/15051/6/be/src/exec/orc-column-readers.h@210 PS6, Line 210: _- > Does Orc support mixed encoding, e.g. a stripe that has both dictionary and Yes, according to the ORC documentation it cannot change through stripes. I'm not sure if it's worth to add an extra variable just to make sure that this behaviour is preserved by the ORC lib. There are DCHECKs which should also fail in that case (dynamic casting to the Vector batch is based on the encoding). I think a comment to clarify this behaviour would be enough, but please raise your concern if you think otherwise. http://gerrit.cloudera.org:8080/#/c/15051/6/be/src/exec/orc-column-readers.h@211 PS6, Line 211: > "RETURN_IF_ERROR" + "return Status::OK()" could be simplified to "return". Done -- To view, visit http://gerrit.cloudera.org:8080/15051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f Gerrit-Change-Number: 15051 Gerrit-PatchSet: 7 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Jan 2020 17:17:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5904: Add full tsan option and fix several TSAN bugs
Hello Adar Dembo, Tim Armstrong, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15116 to look at the new patch set (#2). Change subject: IMPALA-5904: Add full_tsan option and fix several TSAN bugs .. IMPALA-5904: Add full_tsan option and fix several TSAN bugs This patch adds an additional build flag -full_tsan in addition to the existing -tsan build flag. -full_tsan is equivalent to the current -tsan behavior, and -tsan is changed to set the ignore_noninstrumented_modules flag to true. ignore_noninstrumented_modules causes TSAN to ignore any modules that are not TSAN-instrumented. This is necessary to get TSAN to play nicely with Java, since Java is not TSAN-instrumented (see https://wiki.openjdk.java.net/display/tsan/Main and JDK-8208520). While this might decrease the number of issues surfaced by TSAN, it drastically decreases the amount of noise produced by TSAN because the JVM is not running TSAN-instrumented code. Without this flag set to true, almost every single backend test fails with the error: WARNING: ThreadSanitizer: data race (pid=12939) Write of size 1 at 0x7fcbe379c4c6 by thread T31: #0 strncpy /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:650 (unifiedbetests+0x1b2a4ad) #1 (libjvm.so+0x90e706) This patch fixes various TSAN bugs (e.g. data races) reported while running backend tests and E2E against a TSAN build (it does not make Impala completely TSAN-clean). This patch makes the following changes: * Fixes several bugs involving issues with updating shared variables between threads * Fixes a few race conditions in test classes * Where possible, existing locks are used to fix any data races; in cases where the locking logic is non-trivial, atomics are used * There are a few places where variables are marked as 'volatile' presumably for synchronization purposes; TSAN flags these 'volatile' variables as unsafe, and according to https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rconc-volatile using 'volatile' for synchronization is dangerous; in these cases, the 'volatile' variables are changed to 'atomic' variables * This patch adds a suppression file (bin/tsan-suppresions.txt) similar to the UBSAN suppresion file (bin/ubsan-suppresions.txt) Testing: * Ran exhaustive tests * Ran core tests w/ ASAN build * Manually re-ran backend tests against a TSAN build and made sure the reported errors are gone Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 --- M CMakeLists.txt M be/CMakeLists.txt M be/src/common/init.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node.cc M be/src/rpc/thrift-thread.cc M be/src/rpc/thrift-thread.h M be/src/runtime/bufferpool/reservation-tracker.cc M be/src/runtime/bufferpool/reservation-tracker.h M be/src/runtime/bufferpool/system-allocator.cc M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/data-stream-test.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-stress.h M be/src/runtime/io/scan-range.cc M be/src/service/session-expiry-test.cc M be/src/statestore/statestore-test.cc M be/src/statestore/statestore.cc M be/src/util/runtime-profile-test.cc M be/src/util/stopwatch.h M bin/run-backend-tests.sh A bin/tsan-suppressions.txt M buildall.sh M tests/common/environ.py M tests/webserver/test_web_pages.py 30 files changed, 209 insertions(+), 99 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/15116/2 -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9280: Fix parsing of timestamp with dash before TZH
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15130 ) Change subject: IMPALA-9280: Fix parsing of timestamp with dash before TZH .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5553/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I24c636e75fd380f6ebd091bcb38bb60a274f0e00 Gerrit-Change-Number: 15130 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 20:46:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9280: Fix parsing of timestamp with dash before TZH
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15130 ) Change subject: IMPALA-9280: Fix parsing of timestamp with dash before TZH .. Patch Set 2: (4 comments) Thank you Gabor for your explanation. http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/datetime-iso-sql-format-parser.cc File be/src/runtime/datetime-iso-sql-format-parser.cc: http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/datetime-iso-sql-format-parser.cc@297 PS2, Line 297: int cur_pos_move_count = 0; > I think the algorithm used here could maybe be simplified. I don't want to Ack http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/datetime-iso-sql-format-parser.cc@332 PS2, Line 332: && cur_pos_move_count > cur_tok_move_count && *(*current_pos) != '+' > Counting the processed separator tokens doesn't seem the best approach for What I understand from your explanation, the number of separator chars in the time string does not need to be as much as separator chars in the pattern. What matter is there should be at least one separator char in the time string. Therefore, given a pattern "HH:MI-TZH": a) "08:00-01" is equal to "08:00 +01" b) "08:00-+01" is equal to "08:00 +01" c) "08:00--01" is equal to "08:00 -01" d) "08:00---01" is equal to "08:00 -01" And all above is also true for pattern "HH:MI--TZH", is that correct? http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/timestamp-test.cc File be/src/runtime/timestamp-test.cc: http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/timestamp-test.cc@1071 PS2, Line 1071: TEST(TimestampTest, IsoSqlWithTZ) { > Almost all of these formatting tests are in tests/query_test/test_cast_with Ack http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/timestamp-test.cc@1078 PS2, Line 1078: HH:MI-TZH > As far as I know we don't really support dateless timestamps even though Im Will follow example from tests/query_test/test_cast_with_format.py -- To view, visit http://gerrit.cloudera.org:8080/15130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I24c636e75fd380f6ebd091bcb38bb60a274f0e00 Gerrit-Change-Number: 15130 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 17:52:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9280: Fix parsing of timestamp with dash before TZH
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15130 ) Change subject: IMPALA-9280: Fix parsing of timestamp with dash before TZH .. Patch Set 4: (1 comment) Patch set 4 submitted. I only add one test case that is mentioned in JIRA since other corner case that I can think of has been covered by existing tests. http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/datetime-iso-sql-format-parser.cc File be/src/runtime/datetime-iso-sql-format-parser.cc: http://gerrit.cloudera.org:8080/#/c/15130/2/be/src/runtime/datetime-iso-sql-format-parser.cc@332 PS2, Line 332: > Basically separator matching works in a way that if there is at least one s Ack -- To view, visit http://gerrit.cloudera.org:8080/15130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I24c636e75fd380f6ebd091bcb38bb60a274f0e00 Gerrit-Change-Number: 15130 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 20:08:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes
Hello Daniel Becker, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15053 to look at the new patch set (#8). Change subject: IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes .. IMPALA-4080 [part 1]: Move codegen code from aggregation exec nodes to their plan nodes Refactored code to move codegen code from aggregation exec nodes to their plan nodes. Added some TODOs that will be fixed in the next few patch. Testing: - Ran queries and confirmed manually that the codegened code works. - Ran all e2e tests for agg nodes and partition joins. Change-Id: I58f52a262ac7d0af259d5bcda72ada93a851d3b2 --- M be/src/exec/aggregation-node-base.cc M be/src/exec/aggregator.cc M be/src/exec/aggregator.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exprs/expr-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h 15 files changed, 448 insertions(+), 224 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/15053/8 -- To view, visit http://gerrit.cloudera.org:8080/15053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I58f52a262ac7d0af259d5bcda72ada93a851d3b2 Gerrit-Change-Number: 15053 Gerrit-PatchSet: 8 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15051 ) Change subject: IMPALA-9226: Improve string allocations of the ORC scanner .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/15051/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15051/6//COMMIT_MSG@35 PS6, Line 35: Also run TPC-H queries on the TPC-H benchmark where some :queries' runtime improved by around 10-15%. You could also state explicitly whether other queries have regressed or not. -- To view, visit http://gerrit.cloudera.org:8080/15051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f Gerrit-Change-Number: 15051 Gerrit-PatchSet: 6 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Jan 2020 15:44:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8755: Backend support for Z-ordering
Norbert Luksa has uploaded a new patch set (#21). ( http://gerrit.cloudera.org:8080/14080 ) Change subject: IMPALA-8755: Backend support for Z-ordering .. IMPALA-8755: Backend support for Z-ordering This change depends on gerrit.cloudera.org/#/c/13955/ (Frontend support for Z-ordering) The commit adds a Comparator based on Z-ordering. See in detail: https://en.wikipedia.org/wiki/Z-order_curve The comparator instead of calculating the Z-values of the rows, looks for the column with the most significant dimension, and compares the values of this column only. The most significant dimension will be the one where the compared values have the highest different bits. The algorithm requires values of the same binary representation, therefore the values are converted into either uint32_t, uint63_t or uint128_t, the smallest in which all data fits. Comparing smaller types with bigger ones would make the bigger type much more dominant therefore the bits of these smaller types are shifted up. All primitive types (including string and floating point types) are supported. Testing: * Added unit tests. * Run manual tests, comparing 4-column values with 4-bit integers, for all possible combinations. Checked the result by calculating the Z-value for each comparison. * Tested performance on various data, getting great results for selective queries. An example: used the TPCH dataset's lineitem table with scale 25, where the sorting columns are l_partkey and l_suppkey, in that order. Run selective queries for the value range of the two columns, for both lexical and Z-ordering and compared the percentage of filtered pages and row groups. While queries with filters on the first column showed almost no difference, queries on the second column is in favour of Z-ordering: Ordering | Column | Filtered pages % | Filtered row groups % Lex. 1st ~99% ~90% Z-ord. 1st ~99% ~89% Lex. 2nd ~25% 0% Z-ord. 2nd ~97% 0% The only drawback is the sorting itself, taking ~4 times more than lexical sorting (eg. sorting for the dataset above took 14m for Lexical, and 55m for Z-ordering). Note however, that this is a one-time thing to do, sorting only happens once, when writing the data. Also, lexical ordering is supported by codegen, while it is not implemented for Z-ordering yet. Change-Id: I0200748ce3e65ebc5d3530f794c0f80aa335a2ab --- M be/src/exec/exchange-node.cc M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-sink.h M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/partial-sort-node.cc M be/src/exec/partial-sort-node.h M be/src/exec/sort-node.cc M be/src/exec/sort-node.h M be/src/exec/topn-node.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/sorter.cc M be/src/runtime/sorter.h M be/src/util/CMakeLists.txt A be/src/util/tuple-row-compare-test.cc M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M fe/src/main/java/org/apache/impala/analysis/TableDef.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java 18 files changed, 1,127 insertions(+), 95 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/14080/21 -- To view, visit http://gerrit.cloudera.org:8080/14080 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0200748ce3e65ebc5d3530f794c0f80aa335a2ab Gerrit-Change-Number: 14080 Gerrit-PatchSet: 21 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Anonymous Coward (520) Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5548/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Thu, 30 Jan 2020 13:41:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8755: Backend support for Z-ordering
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14080 ) Change subject: IMPALA-8755: Backend support for Z-ordering .. Patch Set 20: (1 comment) http://gerrit.cloudera.org:8080/#/c/14080/20/be/src/util/tuple-row-compare-test.cc File be/src/util/tuple-row-compare-test.cc: http://gerrit.cloudera.org:8080/#/c/14080/20/be/src/util/tuple-row-compare-test.cc@42 PS20, Line 42: desc nit: add underscore suffix -- To view, visit http://gerrit.cloudera.org:8080/14080 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0200748ce3e65ebc5d3530f794c0f80aa335a2ab Gerrit-Change-Number: 14080 Gerrit-PatchSet: 20 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Anonymous Coward (520) Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Jan 2020 14:58:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4224: part 2: refactor filter routing table to support sinks
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15083 ) Change subject: IMPALA-4224: part 2: refactor filter routing table to support sinks .. Patch Set 6: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/15083/6/be/src/runtime/coordinator-filter-state.h File be/src/runtime/coordinator-filter-state.h: http://gerrit.cloudera.org:8080/#/c/15083/6/be/src/runtime/coordinator-filter-state.h@167 PS6, Line 167: emplace Based on the text from https://en.cppreference.com/w/cpp/container/unordered_map/emplace this might cause an extra copying of filter when the key exists: "The element may be constructed even if there already is an element with the key in the container, in which case the newly constructed element will be destroyed immediately." http://gerrit.cloudera.org:8080/#/c/15083/6/be/src/runtime/coordinator-filter-state.h@167 PS6, Line 167: std::piecewise_construct, : std::forward_as_tuple(filter.filter_id), : std::forward_as_tuple nit: I see it's moved from coordinator.cc but maybe we could just use std::make_pair() to keep it simple. -- To view, visit http://gerrit.cloudera.org:8080/15083 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc53ae0167af45e32093a024ff2d6e2c6466c876 Gerrit-Change-Number: 15083 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Jan 2020 15:14:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15051 ) Change subject: IMPALA-9226: Improve string allocations of the ORC scanner .. Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/15051/3/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15051/3/be/src/exec/hdfs-orc-scanner.cc@683 PS3, Line 683: > As I see (and as you also suggested offline), we do not need to pass the me It is still used for CollectionValueBuilder. http://gerrit.cloudera.org:8080/#/c/15051/6/be/src/exec/orc-column-readers.h File be/src/exec/orc-column-readers.h: http://gerrit.cloudera.org:8080/#/c/15051/6/be/src/exec/orc-column-readers.h@202 PS6, Line 202: if (orc_batch == nullptr) return Status::OK(); : batch_ = static_cast(orc_batch); Is it ok that batch_ is not set to null if orc_batch is null? http://gerrit.cloudera.org:8080/#/c/15051/6/be/src/exec/orc-column-readers.h@210 PS6, Line 210: InitBlob Does Orc support mixed encoding, e.g. a stripe that has both dictionary and direct encoded encoded chunks? In that case if there were already dictionary encoded vectors, then we "forget" the dictionary blob here. If it is possible that there will be more dictionary encoded chunks, then blob_ will still point to a direct encoded blob instead of the dictionary. According to https://orc.apache.org/specification/ORCv2/ the same encoding will be used for a column in the whole stripe, so the issue above is not possible, but the code suggests that encoding can vary from ColumnVectorBatch to ColumnVectorBatch. It would be clearer to add a state to OrcStringColumnReader that stores the encoding, and treat it as an error if it changes within a stripe. In Parquet encoding can vary from page to page, it generally starts with dictionary encoded pages and falls back to "plain" when the dictionary gets too large, so readers (like me) may assume that Orc works similarly. http://gerrit.cloudera.org:8080/#/c/15051/6/be/src/exec/orc-column-readers.h@211 PS6, Line 211: return Status::OK(); "RETURN_IF_ERROR" + "return Status::OK()" could be simplified to "return". -- To view, visit http://gerrit.cloudera.org:8080/15051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f Gerrit-Change-Number: 15051 Gerrit-PatchSet: 6 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Jan 2020 15:20:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9149: part 2: Re-enable Ranger-related EE tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15088 ) Change subject: IMPALA-9149: part 2: Re-enable Ranger-related EE tests .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15088 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17420d7ff9beacd1b4d2ad72b68b8b54983e60cb Gerrit-Change-Number: 15088 Gerrit-PatchSet: 5 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 30 Jan 2020 14:23:31 + Gerrit-HasComments: No