[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Hello Lars Volker, Tianyi Wang, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8523 to look at the new patch set (#15). Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range data structures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - added backend scheduler tests - mixed-filesystems test covers tables/queries with multiple fs's. - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/util/CMakeLists.txt A be/src/util/flat_buffer.cc A be/src/util/flat_buffer.h M common/thrift/Frontend.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 21 files changed, 677 insertions(+), 245 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/15 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 15 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 14: Next change handles the merge. It was straightforward; the ec files are handled as regular hdfs files, so do not go through the prev. synthetic block generation. For testing, I re-ran core and s3 tests. -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 14 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 23 May 2018 05:39:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 12: (2 comments) http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/client-request-state.cc@975 PS12, Line 975: // RPCs and can block for a long time. Maybe we could add something short explaining this condition. E.g. "Ensure the parent query is cancelled if execution has started (if the query was not started, cancellation is handled by the async thread)." http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/impala-http-handler.cc@776 PS12, Line 776: request_state->operation_state() != TOperationState::ERROR_STATE > I guess another way to as it is what does this error mean "invalid query id Yeah the logic of this function is a bit weird. It looks like we actually return success with an empty summary, etc for archived DDL statements (and other things that wouldn't have had a coordinator when running). So it would be more consistent to do the same for a running query and would allow deleting some of this code. -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 12 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 May 2018 05:09:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6953: part 2: clean up DiskIoMgr
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10479 ) Change subject: IMPALA-6953: part 2: clean up DiskIoMgr .. Patch Set 2: Code-Review+2 (2 comments) carry http://gerrit.cloudera.org:8080/#/c/10479/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10479/1//COMMIT_MSG@11 PS1, Line 11: Made further mechanical changes that involved a lot of code motion: > Both of these changes make sense on their own, so i don't see the need to s Ok, I'll merge separately. http://gerrit.cloudera.org:8080/#/c/10479/1//COMMIT_MSG@15 PS1, Line 15: Remove some unused member > whats that? lol, done -- To view, visit http://gerrit.cloudera.org:8080/10479 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie82849652266b6660dd479689e5134273b172eb5 Gerrit-Change-Number: 10479 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 May 2018 04:20:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6953: part 2: clean up DiskIoMgr
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10479 ) Change subject: IMPALA-6953: part 2: clean up DiskIoMgr .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2529/ -- To view, visit http://gerrit.cloudera.org:8080/10479 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie82849652266b6660dd479689e5134273b172eb5 Gerrit-Change-Number: 10479 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 May 2018 04:20:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6953: part 1: clean up DiskIoMgr
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10245 ) Change subject: IMPALA-6953: part 1: clean up DiskIoMgr .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2528/ -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 10 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 May 2018 04:20:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6953: part 1: clean up DiskIoMgr
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10245 ) Change subject: IMPALA-6953: part 1: clean up DiskIoMgr .. Patch Set 10: Code-Review+2 forgot to rebase before merge -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 10 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 May 2018 04:19:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6953: part 2: clean up DiskIoMgr
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10479 to look at the new patch set (#2). Change subject: IMPALA-6953: part 2: clean up DiskIoMgr .. IMPALA-6953: part 2: clean up DiskIoMgr Follow-on changes. Made further mechanical changes that involved a lot of code motion: * Move code that manipulates RequestContext internal state to RequestContext * Don't use protected friend pattern that doesn't work as I thought * Remove some unused member variables and arguments * Move PerDiskState definition into .cc Change-Id: Ie82849652266b6660dd479689e5134273b172eb5 --- M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/scanner-context.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h 17 files changed, 527 insertions(+), 564 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/10479/2 -- To view, visit http://gerrit.cloudera.org:8080/10479 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie82849652266b6660dd479689e5134273b172eb5 Gerrit-Change-Number: 10479 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht
[Impala-ASF-CR] IMPALA-6953: part 1: clean up DiskIoMgr
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10245 ) Change subject: IMPALA-6953: part 1: clean up DiskIoMgr .. Patch Set 9: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2527/ -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 9 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 May 2018 04:19:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6953: part 1: clean up DiskIoMgr
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10245 ) Change subject: IMPALA-6953: part 1: clean up DiskIoMgr .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2527/ -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 9 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 May 2018 04:13:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6953: part 1: clean up DiskIoMgr
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10245 ) Change subject: IMPALA-6953: part 1: clean up DiskIoMgr .. Patch Set 9: Code-Review+2 carry -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 9 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 May 2018 04:12:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9966 ) Change subject: IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9966 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48fe80dfd9a1ed68a8f2b7038e5f42b5a3df3baa Gerrit-Change-Number: 9966 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 May 2018 01:38:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9966 ) Change subject: IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table .. IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table We realized that the libHDFS API call hdfsGetHedgedReadMetrics() crashes when the 'fs' argument passed to it is not a HDFS filesystem. There is an open bug for it on the HDFS side: HDFS-13417 However, it looks like we won't be getting a fix for it in the short term, so our only option at this point is to skip it. Testing: Made sure that enabling preads and scanning from S3 doesn't cause a crash. Also, added a custom cluster test to exercise the pread code path. We are unable to verify hedged reads in a minicluster, but we can at least exercise the code path to make sure that nothing breaks. Change-Id: I48fe80dfd9a1ed68a8f2b7038e5f42b5a3df3baa Reviewed-on: http://gerrit.cloudera.org:8080/9966 Reviewed-by: Sailesh MukilTested-by: Impala Public Jenkins --- M be/src/runtime/io/scan-range.cc A tests/custom_cluster/test_hedged_reads.py 2 files changed, 33 insertions(+), 1 deletion(-) Approvals: Sailesh Mukil: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9966 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I48fe80dfd9a1ed68a8f2b7038e5f42b5a3df3baa Gerrit-Change-Number: 9966 Gerrit-PatchSet: 4 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7044: Prevent overflow when computing Parquet block size
Lars Volker has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10483 Change subject: IMPALA-7044: Prevent overflow when computing Parquet block size .. IMPALA-7044: Prevent overflow when computing Parquet block size When writing Parquet files we compute a minimum block size based on the number of columns in the target table: 3 * page_size * num_cols For tables with a large number of columns (> ~10k), this value will get larger than 2GB. When we pass it to hdfsOpenFile() in HdfsTableSink::CreateNewTmpFile() it gets cast to a signed int32 and can overflow. To fix this we return an error if we detect that the minimum block size exceed 2GB. This change adds a test using CTAS into a table with 12k columns, making sure that Impala returns the correct error. Change-Id: I6e63420e5a093c0bbc789201771708865b16e138 --- M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M tests/query_test/test_insert_parquet.py 4 files changed, 34 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/10483/1 -- To view, visit http://gerrit.cloudera.org:8080/10483 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I6e63420e5a093c0bbc789201771708865b16e138 Gerrit-Change-Number: 10483 Gerrit-PatchSet: 1 Gerrit-Owner: Lars Volker
[Impala-ASF-CR] [DOCS] Added a link to impala kerberos doc in impala-shell options doc
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10482 Change subject: [DOCS] Added a link to impala kerberos doc in impala-shell options doc .. [DOCS] Added a link to impala kerberos doc in impala-shell options doc Change-Id: Ifbd542dc0e9051e022636dd1b6bebb69b9b22b08 --- M docs/topics/impala_shell_options.xml 1 file changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/10482/1 -- To view, visit http://gerrit.cloudera.org:8080/10482 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ifbd542dc0e9051e022636dd1b6bebb69b9b22b08 Gerrit-Change-Number: 10482 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-5168: Codegen HASH PARTITIONED KrpcDataStreamSender::Send()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10421 ) Change subject: IMPALA-5168: Codegen HASH_PARTITIONED KrpcDataStreamSender::Send() .. Patch Set 2: Tim, thanks for pointing that out. I tried the idea of splitting up each partition expression evaluation and hashing into a separate function. The idea is that if there are few expressions, the compiler should be smart enough to inline them so we will get the same benefit. If there are too many expressions, the compiler will do appropriate level of inlining. However, for some reasons, this doesn't quite pan out as expected. In particular, splitting each expression into a separate function resulted in slightly longer codegen time. The patch is here (https://github.com/michaelhkw/incubator-impala/commit/76d071f1bf2d82d4b0905ed2e64150f43198538f). I compared the level of regression with different wide tables using the query you suggested: widetable_250_cols: 5% regression widetable_1000_cols: 3% regression widetable_250_cols_big: 12% regression widetable_250_cols_big is created by concatenating multiple widetable_250_cols to a table with 80 rows. It's large enough to trigger 3 instances of scan nodes (instead of 1 in the case of widetable_250_col). The larger table triggers slightly different optimized code as there are 3 channels. In the case of widetable_250_col (smaller table), there is only 1 channel so the optimizer was smart enough to optimize away all the evaluate and hash functions as all rows are being added to the same channel. Anyhow, I am open to other suggestions (e.g. optimizing the patch posted above). Given the level regression observed above, do you think there are extra steps which we should take ? -- To view, visit http://gerrit.cloudera.org:8080/10421 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1c44cc9312c062cc7a5a4ac9156ceaa31fb887ff Gerrit-Change-Number: 10421 Gerrit-PatchSet: 2 Gerrit-Owner: Michael HoGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 May 2018 01:21:16 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Not able to create new tables when the statestore is offline
Alex Rodoni has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/10481 ) Change subject: [DOCS] Not able to create new tables when the statestore is offline .. [DOCS] Not able to create new tables when the statestore is offline Change-Id: I41216fcf60f94f596dae0109a5730ce792b28244 --- M docs/topics/impala_components.xml 1 file changed, 13 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/10481/2 -- To view, visit http://gerrit.cloudera.org:8080/10481 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I41216fcf60f94f596dae0109a5730ce792b28244 Gerrit-Change-Number: 10481 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] [DOCS] Not able to create new tables when the statestore is offline
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10481 Change subject: [DOCS] Not able to create new tables when the statestore is offline .. [DOCS] Not able to create new tables when the statestore is offline Change-Id: I41216fcf60f94f596dae0109a5730ce792b28244 --- M docs/topics/impala_components.xml 1 file changed, 10 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/10481/1 -- To view, visit http://gerrit.cloudera.org:8080/10481 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I41216fcf60f94f596dae0109a5730ce792b28244 Gerrit-Change-Number: 10481 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/impala-http-handler.cc@776 PS12, Line 776: request_state->operation_state() != TOperationState::ERROR_STATE > we actually want to check for queries that go through the admission contro I guess another way to as it is what does this error mean "invalid query id". Clearly this was not an invalid query id, or else there wouldn't be a request_state for it. So, what would be the appropriate error message for this case? put yet another way, if we just deleted this whole block, what would go wrong? -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 12 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 May 2018 00:24:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6953: part 2: clean up DiskIoMgr
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10479 ) Change subject: IMPALA-6953: part 2: clean up DiskIoMgr .. Patch Set 1: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/10479/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10479/1//COMMIT_MSG@11 PS1, Line 11: merging. Both of these changes make sense on their own, so i don't see the need to squash. But I don't feel strongly so do what you feel is best/easiest in this case. http://gerrit.cloudera.org:8080/#/c/10479/1//COMMIT_MSG@15 PS1, Line 15: RequestContRequestContext whats that? -- To view, visit http://gerrit.cloudera.org:8080/10479 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie82849652266b6660dd479689e5134273b172eb5 Gerrit-Change-Number: 10479 Gerrit-PatchSet: 1 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Comment-Date: Wed, 23 May 2018 00:19:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6917: Implement COMMENT ON TABLE/VIEW
Fredy Wijaya has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10478 Change subject: IMPALA-6917: Implement COMMENT ON TABLE/VIEW .. IMPALA-6917: Implement COMMENT ON TABLE/VIEW This patch implements updating comment on a table or view. Syntax: COMMENT ON TABLE t IS 'comment' COMMENT ON VIEW v IS 'comment' Testing: - Added new front-end tests - Ran all front-end tests - Added new end-to-end tests - Ran end-to-end DDL tests Change-Id: I497c17342f79ff7c99931fd8a0ddec0c79303dbf --- M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CommentOnDbStmt.java A fe/src/main/java/org/apache/impala/analysis/CommentOnTableOrViewStmt.java A fe/src/main/java/org/apache/impala/analysis/CommentOnTableStmt.java A fe/src/main/java/org/apache/impala/analysis/CommentOnViewStmt.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M tests/metadata/test_ddl.py M tests/metadata/test_ddl_base.py 12 files changed, 278 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/10478/4 -- To view, visit http://gerrit.cloudera.org:8080/10478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I497c17342f79ff7c99931fd8a0ddec0c79303dbf Gerrit-Change-Number: 10478 Gerrit-PatchSet: 4 Gerrit-Owner: Fredy Wijaya
[native-toolchain-CR] Upgrade Protobuf to 3.5.1
Michael Ho has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10480 Change subject: Upgrade Protobuf to 3.5.1 .. Upgrade Protobuf to 3.5.1 Protobuf 3.0.0+ has support for both v2 and v3 of protobuf syntax. Apparently, the current version of protobuf in toolchain (2.6.1) doesn't seem to recognize the map type documented in protobuf v2 syntax. Change-Id: I4e4e9dc301ff16f9aecc32d30c22b523dda3033c --- M buildall.sh 1 file changed, 6 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/80/10480/1 -- To view, visit http://gerrit.cloudera.org:8080/10480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4e4e9dc301ff16f9aecc32d30c22b523dda3033c Gerrit-Change-Number: 10480 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-6953: part 1: clean up DiskIoMgr
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10245 ) Change subject: IMPALA-6953: part 1: clean up DiskIoMgr .. Patch Set 7: (3 comments) http://gerrit.cloudera.org:8080/#/c/10245/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10245/8//COMMIT_MSG@12 PS8, Line 12: Make DiskQueue is now an encapsulated class. > Nit: grammar Done http://gerrit.cloudera.org:8080/#/c/10245/8/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/10245/8/be/src/runtime/io/disk-io-mgr.h@440 PS8, Line 440: /// Total bytes read by the IoMgr. Upon further examination, total_bytes_read_counter and read_timer appear to be unused, so I removed them. http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/request-context.h File be/src/runtime/io/request-context.h: http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/request-context.h@498 PS7, Line 498: inline void RequestContext::IncrementDiskThreadAfterDequeue(int disk_id) { > I meant could we even more the class definition to the .cc file? But it loo Ah, yeah that's a good idea, I fixed up those places where DiskIoMgr touched disk_states_. In the process of looking at this I realised that the "protected friend" thing I was doing didn't actually do what I expected, so I went away from that and just documented int. I separated that out into a follow-on patch to make it easier to review. I think I'll squash the two before merging to reduce code churn in the git history, if that works for you. -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 7 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 23:27:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6953: part 2: clean up DiskIoMgr
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10479 Change subject: IMPALA-6953: part 2: clean up DiskIoMgr .. IMPALA-6953: part 2: clean up DiskIoMgr Follow-on changes. I separated this out from the previous patch to make it easier to review, but I'll squash before merging. Made further mechanical changes that involved a lot of code motion: * Move code that manipulates RequestContext internal state to RequestContRequestContext * Don't use protected friend pattern that doesn't work as I thought * Remove some unused member variables and arguments * Move PerDiskState definition into .cc Change-Id: Ie82849652266b6660dd479689e5134273b172eb5 --- M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/scanner-context.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h 17 files changed, 527 insertions(+), 564 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/10479/1 -- To view, visit http://gerrit.cloudera.org:8080/10479 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie82849652266b6660dd479689e5134273b172eb5 Gerrit-Change-Number: 10479 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-6953: part 1: clean up DiskIoMgr
Hello Joe McDonnell, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10245 to look at the new patch set (#9). Change subject: IMPALA-6953: part 1: clean up DiskIoMgr .. IMPALA-6953: part 1: clean up DiskIoMgr There should be no behavioural changes as a result of this refactoring. Make DiskQueue an encapsulated class. Remove friend classes where possible, either by using public methods or moving code between classes. Move method into protected in some cases. Split GetNextRequestRange() into two methods that operate on DiskQueue and RequestContext state. The methods belong to the respective classes. Reduce transitive #include dependencies to hopefully help with build time. Testing: Ran core tests. Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 --- M be/src/common/global-flags.cc M be/src/exec/blocking-join-node.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node-mt.h M be/src/exec/hdfs-scan-node.h M be/src/exec/hdfs-scanner.h M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-table-sink.cc M be/src/exec/parquet-column-readers.cc M be/src/runtime/io/disk-io-mgr-internal.h M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/error-converter.cc M be/src/runtime/io/local-file-system.cc M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/row-batch.h 21 files changed, 590 insertions(+), 534 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/10245/9 -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 9 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9966 ) Change subject: IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table .. Patch Set 3: > preads add some overhead to HDFS, since it sends out redundant read > requests to replicas. Having it on by default might cause some > confusion while running multiple heavy concurrent workloads. You're talking about hedged reads, not preads. We can use pread with hedge reads on or off. So that can't be the blocker. -- To view, visit http://gerrit.cloudera.org:8080/9966 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48fe80dfd9a1ed68a8f2b7038e5f42b5a3df3baa Gerrit-Change-Number: 9966 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 23:16:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6953: clean up DiskIoMgr
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/10245 ) Change subject: IMPALA-6953: clean up DiskIoMgr .. Patch Set 8: Code-Review+2 (1 comment) Looks good to me. The commit message could use a once-over for grammar. http://gerrit.cloudera.org:8080/#/c/10245/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10245/8//COMMIT_MSG@12 PS8, Line 12: Make DiskQueue is now an encapsulated class. Nit: grammar -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 23:14:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6802 (part 4): Clean up authorization tests
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/10442 ) Change subject: IMPALA-6802 (part 4): Clean up authorization tests .. Patch Set 1: (11 comments) http://gerrit.cloudera.org:8080/#/c/10442/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java: http://gerrit.cloudera.org:8080/#/c/10442/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@104 PS1, Line 104: private static final String[] ALLTYPES_COLUMNS = (String []) ArrayUtils.addAll( nit: String[] http://gerrit.cloudera.org:8080/#/c/10442/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@923 PS1, Line 923: .ok(onServer(TPrivilegeLevel.INSERT)) We need to add REFRESH since VIEW_METADATA consists of SELECT, INSERT, and REFRESH http://gerrit.cloudera.org:8080/#/c/10442/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@927 PS1, Line 927: .error(accessError("functional")); Missing check .error(accessError("functional"), onDatabase...) http://gerrit.cloudera.org:8080/#/c/10442/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@933 PS1, Line 933: TTableName tableName = new TTableName("functional","alltypes"); nit: space after comma http://gerrit.cloudera.org:8080/#/c/10442/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@935 PS1, Line 935: authorize("describe functional.alltypes") describe uses ANY privilege, we need more test cases with all other privileges. http://gerrit.cloudera.org:8080/#/c/10442/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@961 PS1, Line 961: String [] locationString = new String[]{"Location:"}; nit: no space for array declaration http://gerrit.cloudera.org:8080/#/c/10442/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@989 PS1, Line 989: tableName = new TTableName("functional","alltypes_view"); nit: space after comma http://gerrit.cloudera.org:8080/#/c/10442/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@1013 PS1, Line 1013: tableName = new TTableName("functional","alltypes_view"); nit: space after comma http://gerrit.cloudera.org:8080/#/c/10442/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@1018 PS1, Line 1018: viewStrings); join L1018 with L1017 http://gerrit.cloudera.org:8080/#/c/10442/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@1040 PS1, Line 1040: authorize("describe functional.allcomplextypes.int_struct_col") describe uses ANY privilege, we need more test cases with all other privileges. http://gerrit.cloudera.org:8080/#/c/10442/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@1183 PS1, Line 1183: String [] requiredStrings, String [] excludedStrings, nit: fix indentation -- To view, visit http://gerrit.cloudera.org:8080/10442 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic4cf3a4751b41908ef81ec35d89a2713d9fa0dc4 Gerrit-Change-Number: 10442 Gerrit-PatchSet: 1 Gerrit-Owner: Adam HolleyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Comment-Date: Tue, 22 May 2018 23:11:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5392: Added all stack frames to ThreadInfo summary.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/10145 ) Change subject: IMPALA-5392: Added all stack frames to ThreadInfo summary. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/10145/2/fe/src/main/java/org/apache/impala/common/JniUtil.java File fe/src/main/java/org/apache/impala/common/JniUtil.java: http://gerrit.cloudera.org:8080/#/c/10145/2/fe/src/main/java/org/apache/impala/common/JniUtil.java@275 PS2, Line 275: \n > This entire output is rendered AS IS on CatalogUI page. If there is a new l I have checked again and I do not believe this is accurate. I see in the page source text like at java.lang.Thread.run(Thread.java:748) Number of locked synchronizers = 1 - java.util.concurrent.ThreadPoolExecutor$Worker@7cd1ac19 and page display 624) at java.lang.Thread.run(Thread.java:748) Number of locked synchronizers = 1 - java.util.concurrent.ThreadPoolExecutor$Worker@2d35442b -- To view, visit http://gerrit.cloudera.org:8080/10145 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80ab4aad03e0c1f01fecad6b87779531244c28b7 Gerrit-Change-Number: 10145 Gerrit-PatchSet: 2 Gerrit-Owner: Abhishek SharmaGerrit-Reviewer: Abhishek Sharma Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Charles Agnello Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 22:54:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Hello Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10060 to look at the new patch set (#13). Change subject: IMPALA-5216: Make admission control queuing async .. IMPALA-5216: Make admission control queuing async Implement asynchronous admission control queuing. This is achieved by running the admission control code-path in a separate thread. Major changes include: propagating cancellation to the admission control thread and dequeuing thread, and adding a new Query Operation State called "PENDING" that represents the state between completion of planning and starting of query execution. Testing: - Added a deterministic end to end test - Ran multiple stress tests successfully with a cancellation probability of 60% and with different values for the following parameters: max_requests, queue_wait_timeout_ms. Ensured that the impalad was in a valid state afterwards (no orphan fragments or wrong metrics). - Ran all exhaustive tests and ASAN core tests successfully. Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf --- M be/src/common/atomic.h M be/src/common/logging.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/debug-util.cc M be/src/util/debug-util.h M be/src/util/promise-test.cc M be/src/util/promise.h M common/thrift/ImpalaService.thrift M tests/authorization/test_authorization.py M tests/beeswax/impala_beeswax.py M tests/common/impala_connection.py M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_krpc_mem_usage.py M tests/hs2/hs2_test_suite.py M tests/hs2/test_hs2.py M tests/query_test/test_observability.py M www/query_backends.tmpl 29 files changed, 624 insertions(+), 219 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10060/13 -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 13 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 12: (3 comments) http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/client-request-state.h@319 PS12, Line 319: async_ > the async part seems redundant with "thread" to me, and inconsistent with o The "async" part i added was only to differentiate it from the thread that is executing the client Exec RPC and to make it sound similar to the method "ExecAsyncQueryOrDmlRequest()" which spawns it. http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/service/client-request-state.h@176 PS9, Line 176: or* GetCoordinator( > Maybe: coord_exec_called_, then it's very literal so less chance for other Done http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/impala-http-handler.cc@776 PS12, Line 776: request_state->operation_state() != TOperationState::ERROR_STATE > I guess that's for the case that the cancellation happens before the coordi we actually want to check for queries that go through the admission controller path and the problem is that a CTAS query is also one of them, but it is a DDL. Initially, the I was using a method "uses_admission_control()" that checked for those conditions, but I realized that a CTAS query only goes through that path if the table does not exist, which can only be determined while in CRS::Exec(). That eventually led me to define uses_admission_control() as checking for the existence of async_exec_thread_. Now that we got rid of that method, this was the next best alternative as I dont see other query types to be in this operation state without being unregistered and archived. Now that I dug more into it, I see there is a possibility where "compute states" queries can end up with ERROR_STATE, while the client does not close the query handle. In that case the highlighted conditional will let it pass, but would give the same output as that taken from an archived query, so seems harmless -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 12 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 22:43:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10396 ) Change subject: IMPALA-3134: Support different proc mem limits among impalads for admission control checks .. IMPALA-3134: Support different proc mem limits among impalads for admission control checks Currently the admission controller assumes that all backends have the same process mem limit as the impalad it itself is running on. With this patch the proc mem limit for each impalad is available to the admission controller and it uses it for making correct admisssion decisions. It currently works under the assumption that the per-process memory limit does not change dynamically. Testing: Added an e2e test. IMPALA-5662: Log the queuing reason for a query The queuing reason is now logged both while queuing for the first time and while trying to dequeue. Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Reviewed-on: http://gerrit.cloudera.org:8080/10396 Reviewed-by: Bikramjeet VigTested-by: Impala Public Jenkins --- M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/impala-server.cc M bin/start-impala-cluster.py M common/thrift/StatestoreService.thrift M tests/custom_cluster/test_admission_controller.py 9 files changed, 127 insertions(+), 38 deletions(-) Approvals: Bikramjeet Vig: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Gerrit-Change-Number: 10396 Gerrit-PatchSet: 6 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10396 ) Change subject: IMPALA-3134: Support different proc mem limits among impalads for admission control checks .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Gerrit-Change-Number: 10396 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 22:27:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/9966 ) Change subject: IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table .. Patch Set 3: Code-Review+2 preads add some overhead to HDFS, since it sends out redundant read requests to replicas. Having it on by default might cause some confusion while running multiple heavy concurrent workloads. Carry +2. -- To view, visit http://gerrit.cloudera.org:8080/9966 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48fe80dfd9a1ed68a8f2b7038e5f42b5a3df3baa Gerrit-Change-Number: 9966 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 22:12:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9966 ) Change subject: IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2526/ -- To view, visit http://gerrit.cloudera.org:8080/9966 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48fe80dfd9a1ed68a8f2b7038e5f42b5a3df3baa Gerrit-Change-Number: 9966 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 22:12:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5392: Added all stack frames to ThreadInfo summary.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/10145 ) Change subject: IMPALA-5392: Added all stack frames to ThreadInfo summary. .. Patch Set 10: > It looks like this got stalled. Jim, do you feel comfortable taking > this to +2, since you've already looked at it? Thanks for the reminder! I got my build working again, so I'll take another look at this. Abhishek, I'll try to get to this by 5pm Thursday GMT+7. THanks for your patience. -- To view, visit http://gerrit.cloudera.org:8080/10145 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80ab4aad03e0c1f01fecad6b87779531244c28b7 Gerrit-Change-Number: 10145 Gerrit-PatchSet: 10 Gerrit-Owner: Abhishek SharmaGerrit-Reviewer: Abhishek Sharma Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Charles Agnello Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 21:53:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9966 ) Change subject: IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table .. Patch Set 2: Code-Review+2 Do we know what is blocking us from setting --use_hdfs_pread=true by default (and even removing the old code path)? Not asking to do that with this change, but we should see if we can get there to reduce the test matrix. -- To view, visit http://gerrit.cloudera.org:8080/9966 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48fe80dfd9a1ed68a8f2b7038e5f42b5a3df3baa Gerrit-Change-Number: 9966 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 20:39:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10396 ) Change subject: IMPALA-3134: Support different proc mem limits among impalads for admission control checks .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2523/ -- To view, visit http://gerrit.cloudera.org:8080/10396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Gerrit-Change-Number: 10396 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 19:03:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10396 ) Change subject: IMPALA-3134: Support different proc mem limits among impalads for admission control checks .. Patch Set 5: Code-Review+2 Carrying Tim's +2 -- To view, visit http://gerrit.cloudera.org:8080/10396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Gerrit-Change-Number: 10396 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 19:02:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10396 to look at the new patch set (#5). Change subject: IMPALA-3134: Support different proc mem limits among impalads for admission control checks .. IMPALA-3134: Support different proc mem limits among impalads for admission control checks Currently the admission controller assumes that all backends have the same process mem limit as the impalad it itself is running on. With this patch the proc mem limit for each impalad is available to the admission controller and it uses it for making correct admisssion decisions. It currently works under the assumption that the per-process memory limit does not change dynamically. Testing: Added an e2e test. IMPALA-5662: Log the queuing reason for a query The queuing reason is now logged both while queuing for the first time and while trying to dequeue. Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e --- M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/impala-server.cc M bin/start-impala-cluster.py M common/thrift/StatestoreService.thrift M tests/custom_cluster/test_admission_controller.py 9 files changed, 127 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/10396/5 -- To view, visit http://gerrit.cloudera.org:8080/10396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Gerrit-Change-Number: 10396 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10396 ) Change subject: IMPALA-3134: Support different proc mem limits among impalads for admission control checks .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/10396/4/common/thrift/StatestoreService.thrift File common/thrift/StatestoreService.thrift: http://gerrit.cloudera.org:8080/#/c/10396/4/common/thrift/StatestoreService.thrift@75 PS4, Line 75: // The process memory limit of this backend. > "in bytes"? Done -- To view, visit http://gerrit.cloudera.org:8080/10396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Gerrit-Change-Number: 10396 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 19:01:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6953: clean up DiskIoMgr
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10245 ) Change subject: IMPALA-6953: clean up DiskIoMgr .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/request-context.h File be/src/runtime/io/request-context.h: http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/request-context.h@498 PS7, Line 498: inline void RequestContext::IncrementDiskThreadAfterDequeue(int disk_id) { > disk_states_ is now private, so not accessible outside RequestContext. I ca I meant could we even more the class definition to the .cc file? But it looks like it's still accessed directly by DiskIoMgr, right? -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 18:58:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10396 ) Change subject: IMPALA-3134: Support different proc mem limits among impalads for admission control checks .. Patch Set 4: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/10396/4/common/thrift/StatestoreService.thrift File common/thrift/StatestoreService.thrift: http://gerrit.cloudera.org:8080/#/c/10396/4/common/thrift/StatestoreService.thrift@75 PS4, Line 75: // The process memory limit of this backend. "in bytes"? -- To view, visit http://gerrit.cloudera.org:8080/10396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Gerrit-Change-Number: 10396 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 18:50:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9966 ) Change subject: IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/9966 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48fe80dfd9a1ed68a8f2b7038e5f42b5a3df3baa Gerrit-Change-Number: 9966 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 18:48:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6953: clean up DiskIoMgr
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10245 ) Change subject: IMPALA-6953: clean up DiskIoMgr .. Patch Set 8: Code-Review+1 carry +1 -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 18:45:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6953: clean up DiskIoMgr
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10245 ) Change subject: IMPALA-6953: clean up DiskIoMgr .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/disk-io-mgr-internal.h File be/src/runtime/io/disk-io-mgr-internal.h: http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/disk-io-mgr-internal.h@79 PS7, Line 79: / > pre-existing nit: should only be // Done http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/disk-io-mgr-internal.h@98 PS7, Line 98: > extra space Done http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/disk-io-mgr.h@395 PS7, Line 395: HandleWriteFinished > () Done http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/request-context.h File be/src/runtime/io/request-context.h: http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/request-context.h@196 PS7, Line 196: gets > get Done http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/request-context.h@498 PS7, Line 498: inline void RequestContext::IncrementDiskThreadAfterDequeue(int disk_id) { > I think this wrapper didn't exist in the old code. Is the PerDiskState com disk_states_ is now private, so not accessible outside RequestContext. I can't remember exactly why I had the forward declaration in protected, just checking now... ok, so I moved it to private and it compiles, so I'll go with that. -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 7 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 18:45:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6953: clean up DiskIoMgr
Hello Joe McDonnell, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10245 to look at the new patch set (#8). Change subject: IMPALA-6953: clean up DiskIoMgr .. IMPALA-6953: clean up DiskIoMgr There should be no behavioural changes as a result of this refactoring. Make DiskQueue is now an encapsulated class. Remove friend classes where possible, either by using public methods or moving code between classes. Move method into protected in some cases. Split GetNextRequestRange() into into two methods that operate on DiskQueue and RequestContext state. The methods belong to the respective classes. Reduce transitive #include dependencies to hopefully help with build time. Testing: Ran core tests. Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 --- M be/src/common/global-flags.cc M be/src/exec/blocking-join-node.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node-mt.h M be/src/exec/hdfs-scan-node.h M be/src/exec/hdfs-scanner.h M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-table-sink.cc M be/src/exec/parquet-column-readers.cc M be/src/runtime/io/disk-io-mgr-internal.h M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/error-converter.cc M be/src/runtime/io/local-file-system.cc M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/row-batch.h 21 files changed, 590 insertions(+), 534 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/10245/8 -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10396 to look at the new patch set (#4). Change subject: IMPALA-3134: Support different proc mem limits among impalads for admission control checks .. IMPALA-3134: Support different proc mem limits among impalads for admission control checks Currently the admission controller assumes that all backends have the same process mem limit as the impalad it itself is running on. With this patch the proc mem limit for each impalad is available to the admission controller and it uses it for making correct admisssion decisions. It currently works under the assumption that the per-process memory limit does not change dynamically. Testing: Added an e2e test. IMPALA-5662: Log the queuing reason for a query The queuing reason is now logged both while queuing for the first time and while trying to dequeue. Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e --- M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/impala-server.cc M bin/start-impala-cluster.py M common/thrift/StatestoreService.thrift M tests/custom_cluster/test_admission_controller.py 9 files changed, 127 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/10396/4 -- To view, visit http://gerrit.cloudera.org:8080/10396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Gerrit-Change-Number: 10396 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-2195: Improper handling of comments in queries
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/9933 ) Change subject: IMPALA-2195: Improper handling of comments in queries .. Patch Set 12: Code-Review+1 forgot to carry +1 -- To view, visit http://gerrit.cloudera.org:8080/9933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ac7cb5a30e6dda73ebe761d9f0eb9ba038e14a7 Gerrit-Change-Number: 9933 Gerrit-PatchSet: 12 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Comment-Date: Tue, 22 May 2018 18:41:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/9966 ) Change subject: IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table .. Patch Set 2: > Sailesh: Even though it doesn't count them as hedged reads, does it > work with pread as opposed to not work at all without pread? Yes, it does take the pread() code path (I verified that by adding debug log statements), but it doesn't count them as hedged reads. -- To view, visit http://gerrit.cloudera.org:8080/9966 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48fe80dfd9a1ed68a8f2b7038e5f42b5a3df3baa Gerrit-Change-Number: 9966 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 18:37:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 ) Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118 PS6, Line 118: FileBlock.createFbFileBlock(fbb, : loc.getOffset(), loc.getLength(), : (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS)); > I can try changing it to generating scan ranges in the backend later. I thi Sorry I didn't see your last comment. I didn't saw to L186... Then there shouldn't be a difference between synthesizing and using the existing block locations. I will do the backend-scanrange adoption -- To view, visit http://gerrit.cloudera.org:8080/10413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 Gerrit-Change-Number: 10413 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 22 May 2018 18:34:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 12: (5 comments) Looks nice. Please go ahead and rebase when you have a chance. Tim may want to take a final look at this point. Do we have sufficient test coverage for the various webserver/RPC handlers that can happen concurrently with admission control (previously, less cases were possible since the query handle wasn't returned until after admission control)? http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/scheduling/admission-controller.h@212 PS9, Line 212: us AdmitQuery(QuerySchedule* schedule, > I agree, but its kinda hard to think of a good name since the patch is litt Works for me. Or leaving as-is also okay. http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/client-request-state.h@319 PS12, Line 319: async_ the async part seems redundant with "thread" to me, and inconsistent with our usual naming (e.g. wait_thread_ in this class is the thing that calls Wait(). Another example: build_thread_ is the thing that does the join build asynchronously). But if you feel it's better, then leave it. http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/service/client-request-state.h@176 PS9, Line 176: or* GetCoordinator( > Yea, I kinda struggled with naming this as well. Maybe: coord_exec_called_, then it's very literal so less chance for other interpretations? http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/10060/12/be/src/service/impala-http-handler.cc@776 PS12, Line 776: request_state->operation_state() != TOperationState::ERROR_STATE I guess that's for the case that the cancellation happens before the coordinator is set? stepping back, what is this case trying to catch? Should it really just be checking for non query and dml stmt types? Anyway, fine to leave this alone for now. http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/service/impala-server.h@429 PS11, Line 429: It represents the set of queries that : /// are either queued or have started executing. Used primarily to identif > on second thought, even "submitted_queries" is not a good name for this. "i Yeah, I'm fine with adding the bit about cancellation as long as we also state that it tracks queries to be closed with the session (like your new comment does). It was just misleading without also mentioning that. -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 12 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 18:29:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 ) Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118 PS6, Line 118: FileBlock.createFbFileBlock(fbb, : loc.getOffset(), loc.getLength(), : (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS)); > The reason is not strong: Though EC reads are remote, we might still don't ok, so you were worried that synthetic division into blocks could differ from how a file is chunked into blocks on hdfs. Assuming the block size provided by FileStatus never changes (L186), when would you see that the two different chunking schemes differ? -- To view, visit http://gerrit.cloudera.org:8080/10413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 Gerrit-Change-Number: 10413 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 22 May 2018 18:27:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 ) Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118 PS6, Line 118: FileBlock.createFbFileBlock(fbb, : loc.getOffset(), loc.getLength(), : (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS)); > The reason is not strong: Though EC reads are remote, we might still don't I can try changing it to generating scan ranges in the backend later. I think it's fine to leave it as it is now and merge your change. -- To view, visit http://gerrit.cloudera.org:8080/10413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 Gerrit-Change-Number: 10413 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 22 May 2018 18:27:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 ) Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118 PS6, Line 118: FileBlock.createFbFileBlock(fbb, : loc.getOffset(), loc.getLength(), : (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS)); > I know this is merged, but just wanted to revisit why we're storing block l The reason is not strong: Though EC reads are remote, we might still don't want a block to be read as 2 blocks. Reading across block boundary might lead to more connections to name nodes and data nodes. -- To view, visit http://gerrit.cloudera.org:8080/10413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 Gerrit-Change-Number: 10413 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 22 May 2018 18:17:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7048: Failed test: query test.test parquet page index.TestHdfsParquetTableIndexWriter.test write index many columns tables
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/10476 ) Change subject: IMPALA-7048: Failed test: query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/10476/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10476/1//COMMIT_MSG@7 PS1, Line 7: query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables probably okay to just shorten this to 'test_write_index_many_columns_tables' -- To view, visit http://gerrit.cloudera.org:8080/10476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd3be70fb654a49dda44309a8914fe1f2b48a1af Gerrit-Change-Number: 10476 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Tue, 22 May 2018 18:15:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/9966 ) Change subject: IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table .. Patch Set 2: Sailesh: Even though it doesn't count them as hedged reads, does it work with pread as opposed to not work at all without pread? -- To view, visit http://gerrit.cloudera.org:8080/9966 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48fe80dfd9a1ed68a8f2b7038e5f42b5a3df3baa Gerrit-Change-Number: 9966 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 18:13:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Hello Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10060 to look at the new patch set (#12). Change subject: IMPALA-5216: Make admission control queuing async .. IMPALA-5216: Make admission control queuing async Implement asynchronous admission control queuing. This is achieved by running the admission control code-path in a separate thread. Major changes include: propagating cancellation to the admission control thread and dequeuing thread, and adding a new Query Operation State called "PENDING" that represents the state between completion of planning and starting of query execution. Testing: - Added a deterministic end to end test - Ran multiple stress tests successfully with a cancellation probability of 60% and with different values for the following parameters: max_requests, queue_wait_timeout_ms. Ensured that the impalad was in a valid state afterwards (no orphan fragments or wrong metrics). - Ran all exhaustive tests and ASAN core tests successfully. Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf --- M be/src/common/atomic.h M be/src/common/logging.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/debug-util.cc M be/src/util/debug-util.h M be/src/util/promise-test.cc M be/src/util/promise.h M common/thrift/ImpalaService.thrift M tests/authorization/test_authorization.py M tests/beeswax/impala_beeswax.py M tests/common/impala_connection.py M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_krpc_mem_usage.py M tests/hs2/hs2_test_suite.py M tests/hs2/test_hs2.py M tests/query_test/test_observability.py M www/query_backends.tmpl 29 files changed, 624 insertions(+), 219 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10060/12 -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 12 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 11: (30 comments) http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/scheduling/admission-controller.h@44 PS9, Line 44: if > add "... or the caller wants to cancel" or something like that about cancel Done http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/scheduling/admission-controller.h@204 PS9, Line 204: Returns immediately if rejected > let's say what the return value and promise value are for each case, i.e. s Done http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/scheduling/admission-controller.h@206 PS9, Line 206: ANCELLED > i.e. explain status in this case. Done http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/scheduling/admission-controller.h@212 PS9, Line 212: Promise> could create a typedef for that since it appears in several places and is k I agree, but its kinda hard to think of a good name since the patch is littered with permutations of {Admission, outcome, admit}, "AdmissionPromise" perhaps? http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/scheduling/admission-controller.cc@581 PS11, Line 581: result is set or the query is cancelled or it : // times out. > the "result is set" also happens in the cancel case. To clarify, how about Done http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/scheduling/admission-controller.cc@590 PS11, Line 590: SleepIfSetInDebugOptions(schedule->query_options(), SLEEP_AFTER_ADMISSION_OUTCOME_MS); > what's interesting about this time interval? Is it to give the dequeue loop Yes, this enables the dequeue thread to wake and get hold of the lock before this does. http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/scheduling/admission-controller.cc@607 PS11, Line 607: stats->Dequeue(*schedule, true); > nit: it was a bit hard to see that this block is really the same as the els Done http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/service/client-request-state.h@70 PS11, Line 70: /// the query to the Admission controller for asynchronous admission control. > is there some post condition around the operation state we can document? li yes, when this returns the operation state is either RUNNING_STATE or PENDING_STATE. Added this to the comment. http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/service/client-request-state.h@131 PS11, Line 131: and sets the : /// query status to CANCELLE > I thought you decided to hold off on that change. sorry, forgot to revert this comment. http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/service/client-request-state.h@153 PS11, Line 153: T > missing space before Done http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/service/client-request-state.h@286 PS11, Line 286: dequeuing thread > from this context, not clear what thread this is talking about. How about: Done http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/service/client-request-state.h@442 PS11, Line 442: rocessing a query or dml execution request and submitting it to the : /// admission controller > this is confusing since it doesn't actually do that work (it happens asynch Done http://gerrit.cloudera.org:8080/#/c/10060/11/be/src/service/client-request-state.h@508 PS11, Line 508: : /// Submits the QuerySchedule to the admission controller and on successful admission, : /// starts up the coordinator execution, makes it accessible by setting : /// 'coord_exec_started_' to true and advances operation_state_ to RUNNING. Handles : /// async cancellation of queries and cleans up state if needed. : void FinishExecQueryOrDmlRequest(); > let's move that to be right after ExecAsyncQueryOrDmlRequest() since it's s Done http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/service/client-request-state.h@153 PS9, Line 153: GetCoord > I think you ended up calling it GetCoordinator(), either update the comment Done http://gerrit.cloudera.org:8080/#/c/10060/9/be/src/service/client-request-state.h@173 PS9, Line 173: /// Only set for 'QUERY' and 'DML' statement types and is available only after Exec() > little hard to understand what "set" and "available" mean from just the int Done
[Impala-ASF-CR] IMPALA-6953: clean up DiskIoMgr
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10245 ) Change subject: IMPALA-6953: clean up DiskIoMgr .. Patch Set 7: Code-Review+1 (5 comments) I had already started looking earlier this morning. Looks generally like an improvement to me. Joe, if you want to do the review please do +2 otherwise I can do that. http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/disk-io-mgr-internal.h File be/src/runtime/io/disk-io-mgr-internal.h: http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/disk-io-mgr-internal.h@79 PS7, Line 79: / pre-existing nit: should only be // http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/disk-io-mgr-internal.h@98 PS7, Line 98: extra space http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/disk-io-mgr.h@395 PS7, Line 395: HandleWriteFinished () http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/request-context.h File be/src/runtime/io/request-context.h: http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/request-context.h@196 PS7, Line 196: gets get http://gerrit.cloudera.org:8080/#/c/10245/7/be/src/runtime/io/request-context.h@498 PS7, Line 498: inline void RequestContext::IncrementDiskThreadAfterDequeue(int disk_id) { > These functions got pulled out of the class body, but the logic shouldn't h I think this wrapper didn't exist in the old code. Is the PerDiskState completely hidden from outside the RequestContext? I guess not yet but maybe that's the direction you are going and so why you have this wrapper. -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 7 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 17:59:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 ) Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118 PS6, Line 118: FileBlock.createFbFileBlock(fbb, : loc.getOffset(), loc.getLength(), : (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS)); I know this is merged, but just wanted to revisit why we're storing block locations for ec files this way rather than synthesizing them via L137? -- To view, visit http://gerrit.cloudera.org:8080/10413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 Gerrit-Change-Number: 10413 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 22 May 2018 17:58:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6953: clean up DiskIoMgr
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/10245 ) Change subject: IMPALA-6953: clean up DiskIoMgr .. Patch Set 7: > (1 comment) > > Rebase. Do you have time to take a look Joe? Otherwise I can find > another reviewer. I'm taking a look. I should have a first pass done today. -- To view, visit http://gerrit.cloudera.org:8080/10245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a6e393f8c01d10143cbac91108af37f6498c1b1 Gerrit-Change-Number: 10245 Gerrit-PatchSet: 7 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 17:31:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 14: Looks like you'll have conflicts with this: https://gerrit.cloudera.org/#/c/10413/ -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 14 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 22 May 2018 16:29:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9943 ) Change subject: IMPALA-5706: Spilling sort optimisations .. Patch Set 14: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2522/ -- To view, visit http://gerrit.cloudera.org:8080/9943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9 Gerrit-Change-Number: 9943 Gerrit-PatchSet: 14 Gerrit-Owner: Gabor KaszabGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 15:45:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9943 ) Change subject: IMPALA-5706: Spilling sort optimisations .. Patch Set 14: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2522/ -- To view, visit http://gerrit.cloudera.org:8080/9943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9 Gerrit-Change-Number: 9943 Gerrit-PatchSet: 14 Gerrit-Owner: Gabor KaszabGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 22 May 2018 12:21:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5392: Added all stack frames to ThreadInfo summary.
Abhishek Sharma has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/10145 ) Change subject: IMPALA-5392: Added all stack frames to ThreadInfo summary. .. IMPALA-5392: Added all stack frames to ThreadInfo summary. The current implementation uses ThreadInfo.toString, which restricts the number of stack frames to 8. As a part of this fix, this particular constraint is removed. Now all stack frames are included in the summary. Change-Id: I80ab4aad03e0c1f01fecad6b87779531244c28b7 --- M fe/src/main/java/org/apache/impala/common/JniUtil.java 1 file changed, 85 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/10145/10 -- To view, visit http://gerrit.cloudera.org:8080/10145 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I80ab4aad03e0c1f01fecad6b87779531244c28b7 Gerrit-Change-Number: 10145 Gerrit-PatchSet: 10 Gerrit-Owner: Abhishek SharmaGerrit-Reviewer: Abhishek Sharma Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Charles Agnello Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong