[native-toolchain-CR] Remove CYRUS SASL
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15466 ) Change subject: Remove CYRUS_SASL .. Remove CYRUS_SASL CYRUS_SASL requires libdb4 which is not available in rhel8. Change-Id: I7f043c0c32dd26f3b4b7d7b16749ce310860d9c2 Reviewed-on: http://gerrit.cloudera.org:8080/15466 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M buildall.sh 1 file changed, 0 insertions(+), 9 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7f043c0c32dd26f3b4b7d7b16749ce310860d9c2 Gerrit-Change-Number: 15466 Gerrit-PatchSet: 3 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] Fix bison compilation with glibc 2.28
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15467 ) Change subject: Fix bison compilation with glibc 2.28 .. Fix bison compilation with glibc 2.28 Change-Id: Ie07da9fcebde4ae5003885f442d8856537f96f3a Reviewed-on: http://gerrit.cloudera.org:8080/15467 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M .gitignore M buildall.sh A source/bison/bison-3.0.4-patches/110-glibc-change-work-around.patch 3 files changed, 35 insertions(+), 3 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15467 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ie07da9fcebde4ae5003885f442d8856537f96f3a Gerrit-Change-Number: 15467 Gerrit-PatchSet: 3 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] Add support for centos8
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15465 ) Change subject: Add support for centos8 .. Add support for centos8 This commit adds a new centos8 docker image. Most of this is pretty straightforward with the exception of having to explicitly set our default python. This needs to happen early in the postinstall process since other tools (aws cli) depend on it. Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363 Reviewed-on: http://gerrit.cloudera.org:8080/15465 Reviewed-by: Tim Armstrong Reviewed-by: Laszlo Gaal Tested-by: Tim Armstrong --- M docker/all/assert-dependencies-present.py M docker/all/postinstall.sh A docker/redhat8.df M in-docker.py 4 files changed, 66 insertions(+), 3 deletions(-) Approvals: Tim Armstrong: Looks good to me, but someone else must approve; Verified Laszlo Gaal: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/15465 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363 Gerrit-Change-Number: 15465 Gerrit-PatchSet: 3 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Hector Acosta Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] Remove centos6, ubuntu12, ubuntu14 platforms
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15464 ) Change subject: Remove centos6, ubuntu12, ubuntu14 platforms .. Remove centos6, ubuntu12, ubuntu14 platforms This commit removes unused platform redhat6, and EOL platforms ubuntu12 and ubuntu14. Change-Id: Icef9293fc528bce3d60956cf3b879cf71e933403 Reviewed-on: http://gerrit.cloudera.org:8080/15464 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M Makefile D docker/redhat6.df D docker/ubuntu1404.df M in-docker.py 4 files changed, 0 insertions(+), 111 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15464 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Icef9293fc528bce3d60956cf3b879cf71e933403 Gerrit-Change-Number: 15464 Gerrit-PatchSet: 3 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] Compile thrift using the python in our toolchain.
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15468 ) Change subject: Compile thrift using the python in our toolchain. .. Compile thrift using the python in our toolchain. Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60 Reviewed-on: http://gerrit.cloudera.org:8080/15468 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M buildall.sh M source/python/build.sh M source/thrift/build.sh 3 files changed, 10 insertions(+), 3 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15468 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60 Gerrit-Change-Number: 15468 Gerrit-PatchSet: 4 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] Remove centos6, ubuntu12, ubuntu14 platforms
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15464 ) Change subject: Remove centos6, ubuntu12, ubuntu14 platforms .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15464 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icef9293fc528bce3d60956cf3b879cf71e933403 Gerrit-Change-Number: 15464 Gerrit-PatchSet: 2 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 05:22:43 + Gerrit-HasComments: No
[native-toolchain-CR] Compile thrift using the python in our toolchain.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15468 ) Change subject: Compile thrift using the python in our toolchain. .. Patch Set 3: Verified+1 I did a successful build for all non-RHEL8 platforms with build ID 178-a93f184dd5 (I ran into some infra issues for RHEL8). So this shouldn't regress things at the least. We can get a RHEL8 build later and confirm. -- To view, visit http://gerrit.cloudera.org:8080/15468 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60 Gerrit-Change-Number: 15468 Gerrit-PatchSet: 3 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 05:22:30 + Gerrit-HasComments: No
[native-toolchain-CR] Remove CYRUS SASL
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15466 ) Change subject: Remove CYRUS_SASL .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f043c0c32dd26f3b4b7d7b16749ce310860d9c2 Gerrit-Change-Number: 15466 Gerrit-PatchSet: 2 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 05:22:39 + Gerrit-HasComments: No
[native-toolchain-CR] Add support for centos8
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15465 ) Change subject: Add support for centos8 .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15465 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363 Gerrit-Change-Number: 15465 Gerrit-PatchSet: 2 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Hector Acosta Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 05:22:41 + Gerrit-HasComments: No
[native-toolchain-CR] Fix bison compilation with glibc 2.28
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15467 ) Change subject: Fix bison compilation with glibc 2.28 .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15467 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie07da9fcebde4ae5003885f442d8856537f96f3a Gerrit-Change-Number: 15467 Gerrit-PatchSet: 2 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 05:22:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9549: Handle catalogd startup delays when using local catalog
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15561 ) Change subject: IMPALA-9549: Handle catalogd startup delays when using local catalog .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py File tests/custom_cluster/test_catalog_wait.py: http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@86 PS1, Line 86: @SkipIfBuildType.not_dev_build > flake8: E302 expected 2 blank lines, found 1 Done http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@106 PS1, Line 106: ; > flake8: E703 statement ends with a semicolon Done http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@116 PS1, Line 116: ; > flake8: E703 statement ends with a semicolon Done http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@128 PS1, Line 128: ; > flake8: E703 statement ends with a semicolon Done -- To view, visit http://gerrit.cloudera.org:8080/15561 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b5a94c59f25927a169dcb58f310ce6b1044 Gerrit-Change-Number: 15561 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 26 Mar 2020 04:02:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9549: Handle catalogd startup delays when using local catalog
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15561 to look at the new patch set (#2). Change subject: IMPALA-9549: Handle catalogd startup delays when using local catalog .. IMPALA-9549: Handle catalogd startup delays when using local catalog Impalads should be tolerant of delays in catalogd startup. Currently, when running with the local catalog (use_local_catalog=true), impalad startup can fail when catalogd startup is delayed. What happens is that ImpalaServer's constructor calls ImpalaServer::UpdateCatalogMetrics(), which maintains two metrics counting the number of tables and databases. This is before the code in ImpalaServer::Start() that waits for the catalogd to start (added by IMPALA-4704), so there is no guarantee that catalogd is running. The UpdateCatalogMetrics() call ends up calling getDbs() in the frontend catalog. LocalCatalog::getDbs() tries to load the databases (and thus contact catalogd), and this call will fail if catalogd is not running. This fails startup. use_local_catalog=false is immune to this only because it does not contact catalogd in Catalog::getDbs(). This moves the UpdateCatalogMetrics() call from the ImpalaServer constructor to ImpalaServer::Start() after the impalad has already waited for the catalogd to start up. It also limits the call to run only in coordinators. Testing: - Added a test to custom_cluster.test_catalog_wait to delay catalogd start up by 60 seconds and verify that the impalads successfully start up. This test fails prior to this change. - Hand tested to verify that the metrics that are maintained by UpdateCatalogMetrics() are not meaningfully changed. They are not. Running coordinators and executors have the right count after they successfully start up. Change-Id: I1b5a94c59f25927a169dcb58f310ce6b1044 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/service/impala-server.cc M tests/custom_cluster/test_catalog_wait.py 4 files changed, 65 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/15561/2 -- To view, visit http://gerrit.cloudera.org:8080/15561 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b5a94c59f25927a169dcb58f310ce6b1044 Gerrit-Change-Number: 15561 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9549: Handle catalogd startup delays when using local catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15561 ) Change subject: IMPALA-9549: Handle catalogd startup delays when using local catalog .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py File tests/custom_cluster/test_catalog_wait.py: http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@86 PS1, Line 86: @SkipIfBuildType.not_dev_build flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@106 PS1, Line 106: ; flake8: E703 statement ends with a semicolon http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@116 PS1, Line 116: ; flake8: E703 statement ends with a semicolon http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@128 PS1, Line 128: ; flake8: E703 statement ends with a semicolon -- To view, visit http://gerrit.cloudera.org:8080/15561 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b5a94c59f25927a169dcb58f310ce6b1044 Gerrit-Change-Number: 15561 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 26 Mar 2020 03:59:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9549: Handle catalogd startup delays when using local catalog
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15561 Change subject: IMPALA-9549: Handle catalogd startup delays when using local catalog .. IMPALA-9549: Handle catalogd startup delays when using local catalog Impalads should be tolerant of delays in catalogd startup. Currently, when running with the local catalog (use_local_catalog=true), impalad startup can fail when catalogd startup is delayed. What happens is that ImpalaServer's constructor calls ImpalaServer::UpdateCatalogMetrics(), which maintains two metrics counting the number of tables and databases. This is before the code in ImpalaServer::Start() that waits for the catalogd to start (added by IMPALA-4704), so there is no guarantee that catalogd is running. The UpdateCatalogMetrics() call ends up calling getDbs() in the frontend catalog. LocalCatalog::getDbs() tries to load the databases (and thus contact catalogd), and this call will fail if catalogd is not running. This fails startup. use_local_catalog=false is immune to this only because it does not contact catalogd in Catalog::getDbs(). This moves the UpdateCatalogMetrics() call from the ImpalaServer constructor to ImpalaServer::Start() after the impalad has already waited for the catalogd to start up. It also limits the call to run only in coordinators. Testing: - Added a test to custom_cluster.test_catalog_wait to delay catalogd start up by 60 seconds and verify that the impalads successfully start up. This test fails prior to this change. - Hand tested to verify that the metrics that are maintained by UpdateCatalogMetrics() are not meaningfully changed. They are not. Running coordinators and executors have the right count after they successfully start up. Change-Id: I1b5a94c59f25927a169dcb58f310ce6b1044 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/service/impala-server.cc M tests/custom_cluster/test_catalog_wait.py 4 files changed, 64 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/15561/1 -- To view, visit http://gerrit.cloudera.org:8080/15561 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1b5a94c59f25927a169dcb58f310ce6b1044 Gerrit-Change-Number: 15561 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell
[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15454 ) Change subject: IMPALA-3766: optionally compress spilled data .. Patch Set 14: (8 comments) Thanks for the feedback, sorry to push out a rebase, I hope it isn't too bad looking at the PS13->14 diff http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool-counters.h File be/src/runtime/bufferpool/buffer-pool-counters.h: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool-counters.h@55 PS11, Line 55: g for writes to di > nit: Total bytes written to disk. (May be compressed) thanks, that's less ambiguous http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool.cc File be/src/runtime/bufferpool/buffer-pool.cc: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool.cc@736 PS11, Line 736: [this, page]( > if the write op fails then write_io_ops will not be incremented whereas pre It wasn't an intentional change, but I think it is an edge case that doesn't really matter (if the query failed, you probably don't care about about the exact number of writes). http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h File be/src/runtime/tmp-file-mgr.h: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@161 PS11, Line 161: using 4kb or smaller > nit: HOLE_PUNCH_BLOCK_SIZE_BYTES Done http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@180 PS11, Line 180: /// compression is used. > nit: although its obvious but maybe mention that -1 means compression_level Done http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@291 PS11, Line 291: // called once. Must be called with 'lock_' held. : Status CreateFiles() WARN_UNUSED > update comment thanks for catching http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@482 PS11, Line 482: return be > is this before or after compression? Done http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@65 PS11, Line 65: "(Advanced) Limit on the total bytes of compression buffers that will be used for " > nit:maybe mention that this limit is shared across all queries. Good point, updated. http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@454 PS11, Line 454: int64_t num_bytes, TmpFile** tmp_file, int64_t* file_offset) { : lock_guard lock(lock_); : int64_t scratch_range_bytes = : > shouldn't this check be done after the free_ranges_ recycle underneath? if Yes, i'm not sure what I was thinking when I changed this. -- To view, visit http://gerrit.cloudera.org:8080/15454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659 Gerrit-Change-Number: 15454 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 03:35:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15454 ) Change subject: IMPALA-3766: optionally compress spilled data .. Patch Set 14: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5548/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659 Gerrit-Change-Number: 15454 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 03:32:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data
Hello Sahil Takiar, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15454 to look at the new patch set (#14). Change subject: IMPALA-3766: optionally compress spilled data .. IMPALA-3766: optionally compress spilled data Enabled via --disk_spill_compression_codec, which uses the same syntax as the compression_codec query option. Recommended codecs are LZ4 and ZSTD. ZSTD supports specifying a compression level. The compression is done in TmpFileMgr using a temporary buffer. Allocation of disk space is reworked slightly so that the allocation can happen after compression. The default power-of-two disk block sizes would lead to a lot of internal fragmentation, so a new strategy for free space management, similar to that used in the data cache, can be used with --disk_spill_punch_holes=true. TmpFileMgr will allocate a range of the actual compressed size and punch holes in the file for each range that is no longer needed. UncompressedWriteIoBytes is added to the buffer pool profiles, so that you can see what degree of compression is achieved. Typically I saw ratios of 2-3x for LZ4 and ZSTD (with LZ4 toward the lower end and ZSTD toward the higher end). Limitations: The management of the compression buffer memory could be improved. Ideally it would be integrated with the buffer pool and use the buffer pool allocator instead of being done "on the side". We would probably want to do this before making this the default, for resource management and performance reasons (doing a malloc() directly does not use the caching supported by the buffer pool). Testing: * Run buffer pool spilling tests with different combinations of the new options. * Extend existing TmpFileMgr tests for file space allocation to run with hole punching enabled. * Switch a couple of spilling tests to use the new option. * Add a metrics test to check for scratch leaks. * Enable the new options by default for end-to-end dockerized tests to get additional coverage. * Add a unit test where allocating compression memory fails, both on the read and write path. * Ran a single-node stress test on TPC-DS SF 1 and TPC-H SF 10 The peak compression buffer usage was ~40MB. Perf: I ran this spilling query using an SSD as the scratch disk: set mem_limit=200m; select count(distinct l_partkey) from tpch30_parquet.lineitem; The time taken for the second run of each query was: No compression: 19.59s LZ4: 18.56s ZSTD: 20.59s Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659 --- M be/src/runtime/bufferpool/buffer-pool-counters.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-internal.h M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h M be/src/service/query-options.cc M be/src/util/parse-util.cc M be/src/util/parse-util.h M bin/jenkins/dockerized-impala-run-tests.sh M tests/custom_cluster/test_scratch_disk.py M tests/verifiers/metric_verifier.py 15 files changed, 849 insertions(+), 246 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/15454/14 -- To view, visit http://gerrit.cloudera.org:8080/15454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659 Gerrit-Change-Number: 15454 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] Compile thrift using the python in our toolchain.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15468 ) Change subject: Compile thrift using the python in our toolchain. .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15468 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60 Gerrit-Change-Number: 15468 Gerrit-PatchSet: 3 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 02:40:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15454 ) Change subject: IMPALA-3766: optionally compress spilled data .. Patch Set 12: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/5602/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/15454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659 Gerrit-Change-Number: 15454 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 01:39:07 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15511 ) Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table. .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/5601/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/15511 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 Gerrit-Change-Number: 15511 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 01:37:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/15454 ) Change subject: IMPALA-3766: optionally compress spilled data .. Patch Set 13: (8 comments) http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool-counters.h File be/src/runtime/bufferpool/buffer-pool-counters.h: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool-counters.h@55 PS11, Line 55: g for writes to di nit: Total bytes written to disk. (May be compressed) http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool.cc File be/src/runtime/bufferpool/buffer-pool.cc: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool.cc@736 PS11, Line 736: [this, page]( if the write op fails then write_io_ops will not be incremented whereas previously it was. Is that the expected behavior? Dont have a preference either way, just wanted to point out the diff in behavior. http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h File be/src/runtime/tmp-file-mgr.h: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@161 PS11, Line 161: using 4kb or smaller nit: HOLE_PUNCH_BLOCK_SIZE_BYTES http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@180 PS11, Line 180: THdfsCompression::type compr nit: although its obvious but maybe mention that -1 means compression_level_ is not used. http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@291 PS11, Line 291: // occurs. Returns an error only if no temporary files are usable or the scratch : /// limit is exceeded. Must be c update comment http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@482 PS11, Line 482: is this before or after compression? http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@65 PS11, Line 65: "(Advanced) Limit on the total bytes of compression buffers that will be used for " nit:maybe mention that this limit is shared across all queries. http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@454 PS11, Line 454: int64_t num_bytes, TmpFile** tmp_file, int64_t* file_offset) { : lock_guard lock(lock_); : int64_t scratch_range_bytes = : shouldn't this check be done after the free_ranges_ recycle underneath? if punch_holes() is false and we already have an allocated file that we can recycle then we dont need to add to current_bytes_allocated_ and we wont hit this limit. -- To view, visit http://gerrit.cloudera.org:8080/15454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659 Gerrit-Change-Number: 15454 Gerrit-PatchSet: 13 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 01:36:16 + Gerrit-HasComments: Yes
[native-toolchain-CR] Compile thrift using the python in our toolchain.
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15468 to look at the new patch set (#3). Change subject: Compile thrift using the python in our toolchain. .. Compile thrift using the python in our toolchain. Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60 --- M buildall.sh M source/python/build.sh M source/thrift/build.sh 3 files changed, 10 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/68/15468/3 -- To view, visit http://gerrit.cloudera.org:8080/15468 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60 Gerrit-Change-Number: 15468 Gerrit-PatchSet: 3 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15511 ) Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table. .. Patch Set 4: (6 comments) Patch set 4 submitted. This patch address some of comments from patch set 3, and add initial microbenchmarking code. After some exploration, I come to conclusion that quadratic probing does not fit well with robin hood hashing. We might be better off by dropping quadratic probing entirely and commit on robin hood linear probing only. http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG@12 PS3, Line 12: bucket rebalancing after every insert. > You told me before about short-circuiting the lookup but I didn't quite get I've been looking at this. It looks like short-circuiting lookup only possible in linear probing mode because it exploit that linear probing behavior. Therefore, we need to drop quadratic probing mode first before we can fully implement all improvement that is possible in Robin Hood, linear probing. http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG@14 PS3, Line 14: Instead of proactively swapping elements during insertion, the > Also I think when probing for lookup we might consider trying the "start in I'm still thinking how to do single pass insert. From what I understand, we should tweak the insertion pattern that is using Iterator. For example, FindProbeRow(), FindBuildRowBucket() should commit early whether they indeed plan to do insert or just testing existence. http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h File be/src/exec/hash-table.inline.h: http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@115 PS3, Line 115: new_distance <= target_d > Ah, I see what you mean. Somehow I was under impression that element that i Done http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@138 PS3, Line 138: // swapped back into temporary location at index curr_idx. > Sorry, this example is bad, because the final balanced bucked should be [-1 This check, unfortunately, need to stay because it is possible to swap with empty bucket in the case of quadratic probing. We should iron this out if we decide to drop quadratic probing. http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@162 PS3, Line 162: } else { > Wonder if we should consider storing the bucket distance (calculated during Patch Set 4 address O(1) distance measurement for linear probing. For quadratic probing, because the probe steps follow Triangular number pattern, we supposedly can do ~O(1) by doing its inverse function https://en.wikipedia.org/wiki/Triangular_number#Triangular_roots_and_tests_for_triangular_numbers However, I'm leaning towards dropping quadratic probing entirely, so I keep it as loop for quadratic probing. http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@374 PS3, Line 374: DCHECK(node_ != NULL); > > And hash-table suppose to be thread-safe for read access. Add rebalancing at the end of Iterator::SetTuple and it pass core tests. -- To view, visit http://gerrit.cloudera.org:8080/15511 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 Gerrit-Change-Number: 15511 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 01:25:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5600/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 13 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 26 Mar 2020 01:10:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data
Hello Sahil Takiar, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15454 to look at the new patch set (#12). Change subject: IMPALA-3766: optionally compress spilled data .. IMPALA-3766: optionally compress spilled data Enabled via --disk_spill_compression_codec, which uses the same syntax as the compression_codec query option. Recommended codecs are LZ4 and ZSTD. ZSTD supports specifying a compression level. The compression is done in TmpFileMgr using a temporary buffer. Allocation of disk space is reworked slightly so that the allocation can happen after compression. The default power-of-two disk block sizes would lead to a lot of internal fragmentation, so a new strategy for free space management, similar to that used in the data cache, can be used with --disk_spill_punch_holes=true. TmpFileMgr will allocate a range of the actual compressed size and punch holes in the file for each range that is no longer needed. UncompressedWriteIoBytes is added to the buffer pool profiles, so that you can see what degree of compression is achieved. Typically I saw ratios of 2-3x for LZ4 and ZSTD (with LZ4 toward the lower end and ZSTD toward the higher end). Limitations: The management of the compression buffer memory could be improved. Ideally it would be integrated with the buffer pool and use the buffer pool allocator instead of being done "on the side". We would probably want to do this before making this the default, for resource management and performance reasons (doing a malloc() directly does not use the caching supported by the buffer pool). Testing: * Run buffer pool spilling tests with different combinations of the new options. * Extend existing TmpFileMgr tests for file space allocation to run with hole punching enabled. * Switch a couple of spilling tests to use the new option. * Add a metrics test to check for scratch leaks. * Enable the new options by default for end-to-end dockerized tests to get additional coverage. * Add a unit test where allocating compression memory fails, both on the read and write path. * Ran a single-node stress test on TPC-DS SF 1 and TPC-H SF 10 The peak compression buffer usage was ~40MB. Perf: I ran this spilling query using an SSD as the scratch disk: set mem_limit=200m; select count(distinct l_partkey) from tpch30_parquet.lineitem; The time taken for the second run of each query was: No compression: 19.59s LZ4: 18.56s ZSTD: 20.59s Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659 --- M be/src/runtime/bufferpool/buffer-pool-counters.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-internal.h M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h M be/src/service/query-options.cc M be/src/util/parse-util.cc M be/src/util/parse-util.h M bin/jenkins/dockerized-impala-run-tests.sh M tests/custom_cluster/test_scratch_disk.py M tests/verifiers/metric_verifier.py 15 files changed, 845 insertions(+), 249 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/15454/12 -- To view, visit http://gerrit.cloudera.org:8080/15454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659 Gerrit-Change-Number: 15454 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15454 ) Change subject: IMPALA-3766: optionally compress spilled data .. Patch Set 9: (9 comments) http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h File be/src/runtime/bufferpool/buffer-pool-counters.h: http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h@59 PS9, Line 59: RuntimeProfile::Counter* uncompressed_bytes_written; > Do we need some tests for the new counters as well? I added tests for the bytes written counters. I didn't do that for the timers since the SCOPED_TIMER macro makes it fairly trivial - I manually inspected a profile. http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h@59 PS9, Line 59: RuntimeProfile::Counter* uncompressed_bytes_written; > I think adding the timers makes a lot of sense. I moved the uncompressed bytes to the backend level and made sure there were compression and encryption timers to both. The concern about unaccounted CPU time in an operator was the deciding factor for the second one. http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr-internal.h File be/src/runtime/tmp-file-mgr-internal.h: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr-internal.h@94 PS11, Line 94: This is also the : /// offset of the next chunk to allocate in the file. > maybe move this to the beginning of the comment block? I kinda think now that this sentence doesn't add much, so i deleted it instead of moving it. http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@586 PS11, Line 586: DCHECK(io_mgr_buffer->eosr()); : DCHECK_LE(io_mgr_buffer->len(), read_buffer.len()); : if (io_mgr_ > at this point, the compressed_.buffer() should be a nullptr, right? should I'm not sure what you mean, it has to be non-NULL if it's reading a compressed range. I guess read_buffer does need to be non-null, so I added a DCHECK for that http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@644 PS11, Line 644: return status; : } : > not sure I understand what this method is for, but why don't we need to dec This method is used if we wrote out a page from memory to disk, but didn't evict the page, and now are wanting to pin the page again. In that case we don't need to read the data from disk,but we *do* need to decrypt it if the buffer was encrypted in-place. With compression all the compreession and encryption was done in the separate buffer and the original one was unmodified. Updated the comment to be less vague. http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@779 PS11, Line 779: write_range_->SetData(buffer_to_write.data(), buffer_to_write.len()); > if this fails, do we still need to call FreeCompressedBuffer()? That's a good point. Adding FreeCompressedbuffer() on all the error paths was getting error-prone, so I switched to using a scoped exit trigger. http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@807 PS11, Line 807: > might be cleaner to have return true/false - true if the compression succee Done http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@833 PS11, Line 833: file_ = file; This was wrong since we are now allocating MaxOutputLen() - in an earlier patch it allocated a slightly smaller buffer. http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@835 PS11, Line 835: Status status = io_ctx->AddWriteRange(write_range_.get()); > is it possible for ProcessBlock to fail, but still write some data to the c That's a good point. I just switched to calling .Release() to be symmetrical with TryAllocate(), I don't think the indirection helped here. I'm not sure I can test this code path. The only case where I think ProcessBlock() could fail on the compression path is if the output buffer is too small, which we avoid by allocating MaxOutputLen(). So I'm not sure if this is actually reachable unless there's a weird internal error in a compressor -- To view, visit http://gerrit.cloudera.org:8080/15454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659 Gerrit-Change-Number: 15454 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 00:56:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.
Hello David Rorke, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15511 to look at the new patch set (#4). Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table. .. WIP IMPALA-9434: Implement Robin Hood Hash Table. Robin hood hashing reduces the variances of probe lengths by continually rebalancing elements. This patch is intended to be the first step towards full robin hood hash table implementation by doing bucket rebalancing after every insert. Instead of proactively swapping elements during insertion, the algorithm does the rebalancing by first letting the new element probe an empty bucket and claim it through PrepareBucketForInsert(). After the bucket at index curr_idx claimed, the rebalancing algorithm in function Rebalance() walk through the probe sequence again to find richer or empty bucket to swap with bucket[curr_idx]. The search and swapping continue until it swap with empty bucket or does not find any richer bucket along the probe sequence. It effectively treat bucket[curr_idx] as temporary variable during the rebalancing process. Rebalance() return the new index of the recently claimed bucket after rebalancing process done. The decision to not swap elements proactively is because the probe function is used for both insertion and lookup-only. Keeping probing routine decoupled from insertion routine allow us to invoke rebalance only after bucket claim, not on lookup-only. This patch also maintain compatibility with quadratic probing. However, it block us from leveraging some robin hood technique that only possible with linear probing such as lookup short-circuit (fast lookup on nonexistent keys). Testing: - Pass core tests Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/hash-table-benchmark.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/exprs/scalar-expr.h 6 files changed, 610 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/15511/4 -- To view, visit http://gerrit.cloudera.org:8080/15511 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 Gerrit-Change-Number: 15511 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15544 ) Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893 .. IMPALA-9548: UdfExecutorTest failures after HIVE-22893 HIVE-22893: "Enhance data size estimation for fields computed by UDFs" modified o.a.h.hive.ql.udf.UDFSubstr and added a dependency on a few new Hive classes located in the hive-exec jar. These classes include: o.a.h.hive.ql.plan.ColStatistics o.a.h.hive.ql.stats.estimator.StatEstimator o.a.h.hive.ql.stats.estimator.StatEstimatorProvider The test UdfExecutorTest#HiveStringsTest loads the class UDFSubstr and thus needs to load the aforementioned stats classes as well. shaded-deps/pom.xml selectively pulls in certain classes from the hive-exec jar, and excludes all others. This patch simply addes the necessary stats classes to load UDFSubstr. Thus, fixing UdfExecutorTest. Testing: * Ran core tests with CDP_BUILD_NUMBER=2244454 and USE_CDP_HIVE=true, validated that UdfExecutorTest now passes Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Reviewed-on: http://gerrit.cloudera.org:8080/15544 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M shaded-deps/pom.xml 1 file changed, 3 insertions(+), 0 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Gerrit-Change-Number: 15544 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15544 ) Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893 .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Gerrit-Change-Number: 15544 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Mar 2020 00:48:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true .. Patch Set 13: > Patch Set 11: > > (1 comment) > > There is only one failing test and it is caused by the change. Thanks Csaba! I have changed the way we generate the hash value of a query string as suggested in that link. -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 13 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 26 Mar 2020 00:38:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true
Fang-Yu Rao has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true .. IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true This patch bumps up the version of Guava libraries from 14.0.1 to 28.1-jre. Due to some changes in Guava's API's, we modify the call sites accordingly. Moreover, in order to instruct the Java classes under the directory of $IMPALA_HOME/common/yarn-extras to use the new Guava libraries, we explicitly added a dependency in the corresponding pom.xml file. On the other hand, we set DISABLE_SENTRY to true regardless of $USE_CDP_HIVE since Sentry's Guava version has not been bumped up yet and thus run-sentry-service.sh cannot be successfully executed. Recall that by setting DISABLE_SENTRY to true we also disable every Sentry-related test, which is fine from now on since Impala 3.4 was recently branched. The plan is to drop support for Sentry in the Impala 4 line. Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 --- M bin/impala-config.sh M common/yarn-extras/pom.xml M common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/catalog/Column.java M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java M fe/src/main/java/org/apache/impala/catalog/DataSource.java M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java M impala-parent/pom.xml 48 files changed, 103 insertions(+), 94 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/15214/13 -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 13 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-8005: Randomize partitioning exchanges.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15497 ) Change subject: IMPALA-8005: Randomize partitioning exchanges. .. Patch Set 4: Code-Review+1 (1 comment) LGTM. http://gerrit.cloudera.org:8080/#/c/15497/3/be/src/runtime/data-stream-test.cc File be/src/runtime/data-stream-test.cc: http://gerrit.cloudera.org:8080/#/c/15497/3/be/src/runtime/data-stream-test.cc@742 PS3, Line 742: ASSERT_EQ(result, true); > If I'm understanding this correctly, I think the probability of a false mat Yes, agree the probability of all receivers having a match for all received values is pretty small. I was suggesting to simplify the test but am good with the current end-to-end test. -- To view, visit http://gerrit.cloudera.org:8080/15497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1936e6cc3e8d66420a5a9301f49221ca38f3e468 Gerrit-Change-Number: 15497 Gerrit-PatchSet: 4 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 26 Mar 2020 00:30:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15524 ) Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3. .. Patch Set 4: (6 comments) first pass, i'm not really familiar with our python infra, so it might take me a while to go through this properly; left a few preliminary comments http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG@17 PS4, Line 17: With regard to validating the patch, my assumption is that simply passing : the existing set of e2e shell tests is sufficient to confirm that the shell : is functioning properly. No new tests were added. given that the main pain here was dealing with the encodings, would it make sense to add additional tests that deal with this? http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG@119 PS4, Line 119: No effort has been made at this point to come up with a way to integrate do we have plans to do this? http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@600 PS4, Line 600: try: : args = self.sanitise_input(args.decode('utf-8')) # python2 : except AttributeError: : args = self.sanitise_input(args) # python3 can we use the 'if sys.version_info.major == 2:' check here? http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@1254 PS4, Line 1254: # if filter(self.DML_REGEX.match, tokens): is_dml = True : if any(self.DML_REGEX.match(t) for t in tokens): was the first line suppose to be deleted? http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@1913 PS4, Line 1913: print(shell.CANCELLATION_MESSAGE, file=sys.stderr) : shell._reconnect_cancellation() : else: : print("Socket error %s: %s" % (e.errno, e)) what is this change for? http://gerrit.cloudera.org:8080/#/c/15524/4/tests/shell/test_shell_interactive.py File tests/shell/test_shell_interactive.py: http://gerrit.cloudera.org:8080/#/c/15524/4/tests/shell/test_shell_interactive.py@803 PS4, Line 803: we expect users to also : # any install 3rd upstream libraries from PyPI. nit: revise -- To view, visit http://gerrit.cloudera.org:8080/15524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615 Gerrit-Change-Number: 15524 Gerrit-PatchSet: 4 Gerrit-Owner: David Knupp Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 21:51:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15378 ) Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol .. Patch Set 15: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5599/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 Gerrit-Change-Number: 15378 Gerrit-PatchSet: 15 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 25 Mar 2020 21:38:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol
Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/15378 ) Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol .. Patch Set 15: (4 comments) http://gerrit.cloudera.org:8080/#/c/15378/14/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/15378/14/shell/impala_client.py@946 PS14, Line 946: suppress_error_on_cance > nit: make this a global string Done http://gerrit.cloudera.org:8080/#/c/15378/14/shell/impala_client.py@966 PS14, Line 966: > nit: same comment as above Done http://gerrit.cloudera.org:8080/#/c/15378/14/tests/custom_cluster/test_hs2_fault_injection.py File tests/custom_cluster/test_hs2_fault_injection.py: http://gerrit.cloudera.org:8080/#/c/15378/14/tests/custom_cluster/test_hs2_fault_injection.py@21 PS14, Line 21: import requests : > nit: insert newline between these two lines Done http://gerrit.cloudera.org:8080/#/c/15378/14/tests/custom_cluster/test_hs2_fault_injection.py@124 PS14, Line 124: > since this is an internal testing method, it should be prefixed with __ Done -- To view, visit http://gerrit.cloudera.org:8080/15378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 Gerrit-Change-Number: 15378 Gerrit-PatchSet: 15 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 25 Mar 2020 20:56:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol
Abhishek Rawat has uploaded a new patch set (#15). ( http://gerrit.cloudera.org:8080/15378 ) Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol .. IMPALA-9466: impala-shell client retry for hs2-http protocol Added retries for idempotent rpcs: OpenSession, PingImpalaHS2Service, GetResultSetMetadata, CloseImpalaOperation (non dmls), CancelOperation, GetOperationStatus, GetRuntimeProfile, GetExecSummary, GetLog Retries were also added to the 'set all' query execution and subsequent result fetch in the ImpalaHS2Client._open_session() The retries are only supported for hs2-http protocol and enabled by default. At most there are 3 retries for a failed rpc. There is a sleep duration of 'n' seconds after nth retry. Only failed rpcs due to an error in the http transport are retried and if an rpc failed because the server returned an error in the rpc response then such scenarios are not retriable. Improved error diagnostics by dumping stack trace when ImpalaShell. _execute_stmt() gets an 'Unknown Exception'. Testing: - Added a custom_cluster test which injects fault into the http transport and checks expected behavior from the various rpcs. Some of these tests leave the session in an open state and so these tests are not suitable for the e2e test framework which have metric verifiers expecting related metrics to be 0 at the end of the test. - Manually tested real world scenarios with impala-shell client communicating with an impala coordinator via a fault injecting istio mesh. - Manually tested dropping connections on an nginx ingress gateway by sending SIGTERM to all worker processes. Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 --- M shell/impala_client.py M shell/impala_shell.py M shell/shell_exceptions.py A tests/custom_cluster/test_hs2_fault_injection.py 4 files changed, 505 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/15378/15 -- To view, visit http://gerrit.cloudera.org:8080/15378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 Gerrit-Change-Number: 15378 Gerrit-PatchSet: 15 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15378 ) Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol .. Patch Set 14: Code-Review+1 (4 comments) A few more nits, otherwise LGTM. http://gerrit.cloudera.org:8080/#/c/15378/14/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/15378/14/shell/impala_client.py@946 PS14, Line 946: "TAPPLICATION_EXCEPTION" nit: make this a global string http://gerrit.cloudera.org:8080/#/c/15378/14/shell/impala_client.py@966 PS14, Line 966: "SERVER_ERROR" nit: same comment as above http://gerrit.cloudera.org:8080/#/c/15378/14/tests/custom_cluster/test_hs2_fault_injection.py File tests/custom_cluster/test_hs2_fault_injection.py: http://gerrit.cloudera.org:8080/#/c/15378/14/tests/custom_cluster/test_hs2_fault_injection.py@21 PS14, Line 21: import requests : from shell.ImpalaHttpClient import ImpalaHttpClient nit: insert newline between these two lines http://gerrit.cloudera.org:8080/#/c/15378/14/tests/custom_cluster/test_hs2_fault_injection.py@124 PS14, Line 124: expect_msg_retry since this is an internal testing method, it should be prefixed with __ same with expect_msg_no_retry -- To view, visit http://gerrit.cloudera.org:8080/15378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 Gerrit-Change-Number: 15378 Gerrit-PatchSet: 14 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 25 Mar 2020 20:12:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8005: Randomize partitioning exchanges.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15497 ) Change subject: IMPALA-8005: Randomize partitioning exchanges. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/15497/3/be/src/runtime/data-stream-test.cc File be/src/runtime/data-stream-test.cc: http://gerrit.cloudera.org:8080/#/c/15497/3/be/src/runtime/data-stream-test.cc@742 PS3, Line 742: if (map_query_1[i] != map_query_2[i]) { > It depends on what the core purpose of the test is. I am wondering if it i If I'm understanding this correctly, I think the probability of a false match should be very small. The odds of a value being hashed to the same receiver for both query1 and query2 is roughly 1/#receivers = 1/4. A false positive for this test would require all the values on all the receivers to be the same from query1 to query2. So, the probability of a false match is (1/4)^(#distinct values). It looks like the Sender uses a distinct value for each send, so there should be about 1024 distinct values (we should double check that). (1/4)^1024 is very small. Doing the check on the sender side would also work. I like the end-to-end aspect of this, but a direct comparison on the sender side would also be ok. -- To view, visit http://gerrit.cloudera.org:8080/15497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1936e6cc3e8d66420a5a9301f49221ca38f3e468 Gerrit-Change-Number: 15497 Gerrit-PatchSet: 3 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 25 Mar 2020 19:57:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15544 ) Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893 .. Patch Set 3: GVO failed due to an unrelated error. Filed IMPALA-9552 -- To view, visit http://gerrit.cloudera.org:8080/15544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Gerrit-Change-Number: 15544 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 19:52:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15544 ) Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893 .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Gerrit-Change-Number: 15544 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 19:49:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15544 ) Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893 .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5547/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Gerrit-Change-Number: 15544 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 19:49:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15544 ) Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893 .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5546/ -- To view, visit http://gerrit.cloudera.org:8080/15544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Gerrit-Change-Number: 15544 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 19:41:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15524 ) Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3. .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG@121 PS4, Line 121: processes. > It's true -- I actually had one more JIRA filed for that: Maybe it's getting a new sqlparse version from somewhere? http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@1790 PS4, Line 1790: if isinstance(options.output_delimiter, str): > Ah, interesting. I guess I was just working from our existing test cases -- I'm OK with deferring it, I don't know if it actually works on master. I doubt anyone is relying on it. -- To view, visit http://gerrit.cloudera.org:8080/15524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615 Gerrit-Change-Number: 15524 Gerrit-PatchSet: 4 Gerrit-Owner: David Knupp Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 18:26:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true .. Patch Set 11: (1 comment) There is only one failing test and it is caused by the change. http://gerrit.cloudera.org:8080/#/c/15214/11/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java File fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java: http://gerrit.cloudera.org:8080/#/c/15214/11/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java@765 PS11, Line 765: hasher.putString(queryStr, Charset.forName("UTF-8")); We investigated the failed test with Fang-Yu and it is caused by returning different hash values here after the change, see https://stackoverflow.com/questions/21287714/hashing-issue-between-guava-versions/21291270#21291270 -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 11 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 25 Mar 2020 18:19:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true .. Patch Set 11: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5545/ -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 11 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 25 Mar 2020 18:07:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15454 ) Change subject: IMPALA-3766: optionally compress spilled data .. Patch Set 9: (7 comments) http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h File be/src/runtime/bufferpool/buffer-pool-counters.h: http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h@59 PS9, Line 59: RuntimeProfile::Counter* uncompressed_bytes_written; > I think adding the timers makes a lot of sense. Do we need some tests for the new counters as well? http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr-internal.h File be/src/runtime/tmp-file-mgr-internal.h: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr-internal.h@94 PS11, Line 94: This is also the : /// offset of the next chunk to allocate in the file. maybe move this to the beginning of the comment block? http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@586 PS11, Line 586: DCHECK(io_mgr_buffer->eosr()); : DCHECK_LE(io_mgr_buffer->len(), read_buffer.len()); : if (io_mgr_ at this point, the compressed_.buffer() should be a nullptr, right? should we add a DCHECK for that? http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@644 PS11, Line 644: return status; : } : not sure I understand what this method is for, but why don't we need to decrypt the data if the data is compressed? http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@779 PS11, Line 779: write_range_->SetData(buffer_to_write.data(), buffer_to_write.len()); if this fails, do we still need to call FreeCompressedBuffer()? http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@807 PS11, Line 807: might be cleaner to have return true/false - true if the compression succeeded, false otherwise. I think it currently relies on checking if 'compressed_len_' has been set, which seems more fragile. i think it simplifies the caller as well, you could re-write it to something like this: MemRange buffer_to_write = buffer; if (parent_->tmp_file_mgr_->compression_enabled()) { if (TryCompress(buffer)) { buffer_to_write = MemRange(compressed_.buffer(), compressed_len_); } } http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@835 PS11, Line 835: Status status = io_ctx->AddWriteRange(write_range_.get()); is it possible for ProcessBlock to fail, but still write some data to the compressed buffer? if it is, then will this DCHECK? at this point compressed_len_ should still be -1 right? which means 'DCHECK(is_compressed())' will fail because 'is_compressed()' should return false? -- To view, visit http://gerrit.cloudera.org:8080/15454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659 Gerrit-Change-Number: 15454 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 17:34:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9472,IMPALA-9473: Add per-partition metrics for data cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15382 ) Change subject: IMPALA-9472,IMPALA-9473: Add per-partition metrics for data cache .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5598/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56a57d75ff11f00ebc85b85bcaf104fb8108c478 Gerrit-Change-Number: 15382 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 25 Mar 2020 17:34:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9472,IMPALA-9473: Add per-partition metrics for data cache
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15382 ) Change subject: IMPALA-9472,IMPALA-9473: Add per-partition metrics for data cache .. Patch Set 3: Just stacked this on top of the latest LIRS -- To view, visit http://gerrit.cloudera.org:8080/15382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56a57d75ff11f00ebc85b85bcaf104fb8108c478 Gerrit-Change-Number: 15382 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 25 Mar 2020 16:52:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9472,IMPALA-9473: Add per-partition metrics for data cache
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15382 to look at the new patch set (#3). Change subject: IMPALA-9472,IMPALA-9473: Add per-partition metrics for data cache .. IMPALA-9472,IMPALA-9473: Add per-partition metrics for data cache This adds two sets of metrics. The first is per-partition metrics to track the performance of the underlying filesystem for the data cache. It keeps histograms of read, write, and eviction latency for each data cache partition along with another metric recording the path for the partition. These are exposed as the following metrics: impala-server.io-mgr.remote-data-cache-partition-$0.path impala-server.io-mgr.remote-data-cache-partition-$0.read-latency impala-server.io-mgr.remote-data-cache-partition-$0.write-latency impala-server.io-mgr.remote-data-cache-partition-$0.eviction-latency This also adds metrics to keep counts of hits, misses, and entries in the data cache. Since reducing the latency of IO is an important feature of the data cache, the absolute count of hits and misses is as important as the hit bytes and miss bytes. This adds the following metrics: impala-server.io-mgr.remote-data-cache-hit-count impala-server.io-mgr.remote-data-cache-miss-count impala-server.io-mgr.remote-data-cache-num-entries Testing: - Hand testing to verify the per-partition latency histograms - Modified custom_cluster/test_data_cache.py to also test the counts. Change-Id: I56a57d75ff11f00ebc85b85bcaf104fb8108c478 --- M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/runtime/io/hdfs-file-reader.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M common/thrift/metrics.json M tests/custom_cluster/test_data_cache.py 7 files changed, 205 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/15382/3 -- To view, visit http://gerrit.cloudera.org:8080/15382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I56a57d75ff11f00ebc85b85bcaf104fb8108c478 Gerrit-Change-Number: 15382 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9107: Add timestamp to maven logging options.
Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15537 ) Change subject: IMPALA-9107: Add timestamp to maven logging options. .. IMPALA-9107: Add timestamp to maven logging options. We found that using awk to add a timestamp to the maven log can fail if gawk is not installed. It seems better to configure maven to add the timestamp itself. Sample output: Running mvn -U -Dorg.slf4j.simpleLogger.showDateTime=true -Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss -B install -DskipTests Directory /home/dknupp/Impala/ext-data-source 16:37:16 [INFO] Scanning for projects... 16:37:16 [INFO] 16:37:16 [INFO] Reactor Build Order: 16:37:16 [INFO] 16:37:16 [INFO] Apache Impala External Data Source [pom] 16:37:16 [INFO] Apache Impala External Data Source API [jar] 16:37:16 [INFO] Apache Impala External Data Source Sample [jar] 16:37:16 [INFO] Apache Impala External Data Source Test Library [jar] 16:37:17 [INFO] 16:37:17 [INFO] < org.apache.impala:impala-data-source > 16:37:17 [INFO] Building Apache Impala External Data Source 1.0-SNAPSHOT [1/4] 16:37:17 [INFO] [ pom ]- [etc...] Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e Reviewed-on: http://gerrit.cloudera.org:8080/15537 Reviewed-by: Laszlo Gaal Tested-by: Impala Public Jenkins --- M bin/mvn-quiet.sh 1 file changed, 4 insertions(+), 2 deletions(-) Approvals: Laszlo Gaal: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/15537 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e Gerrit-Change-Number: 15537 Gerrit-PatchSet: 4 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal
[Impala-ASF-CR] IMPALA-9538 Bump up linux-syscall-support.h
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15510 ) Change subject: IMPALA-9538 Bump up linux-syscall-support.h .. Patch Set 12: I cherry-picked this to Kudu here: https://gerrit.cloudera.org/#/c/15557/ It looks like it conflicts with a different, larger, Kudu patch but it probably makes sense to merge this ahead. -- To view, visit http://gerrit.cloudera.org:8080/15510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 Gerrit-Change-Number: 15510 Gerrit-PatchSet: 12 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 16:35:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9538 Bump up linux-syscall-support.h
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15510 ) Change subject: IMPALA-9538 Bump up linux-syscall-support.h .. IMPALA-9538 Bump up linux-syscall-support.h Bump up linux-syscall-support.h to newest version which support aarch64 Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 Reviewed-on: http://gerrit.cloudera.org:8080/15510 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M be/src/gutil/linux_syscall_support.h M be/src/gutil/spinlock_linux-inl.h M be/src/kudu/util/debug-util.cc 3 files changed, 1,746 insertions(+), 891 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 Gerrit-Change-Number: 15510 Gerrit-PatchSet: 12 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9538 Bump up linux-syscall-support.h
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15510 ) Change subject: IMPALA-9538 Bump up linux-syscall-support.h .. Patch Set 11: Verified+1 Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 Gerrit-Change-Number: 15510 Gerrit-PatchSet: 11 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 16:13:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9538 Bump up linux-syscall-support.h
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15510 ) Change subject: IMPALA-9538 Bump up linux-syscall-support.h .. Patch Set 10: Verified+1 I'll carry the verification since nothing changed between patches and it compiled OK -- To view, visit http://gerrit.cloudera.org:8080/15510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 Gerrit-Change-Number: 15510 Gerrit-PatchSet: 10 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 16:12:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15454 ) Change subject: IMPALA-3766: optionally compress spilled data .. Patch Set 11: (2 comments) http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h File be/src/runtime/bufferpool/buffer-pool-counters.h: http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h@59 PS9, Line 59: RuntimeProfile::Counter* uncompressed_bytes_written; > That's a good point. This is a per-operator counter versus a per-backend co I think adding the timers makes a lot of sense. For uncompressed bytes counter, I'm not sure. I could go either way. It looks like most "Buffer pool" counters are exposed per-operator, and are then aggregated in the per-backend profiles ("Per Node Profiles")? I agree its probably not super useful to understand the compression ratio per-operator. http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/tmp-file-mgr.cc@63 PS9, Line 63: d) Limit on the total bytes of compressio > Updated the comment to explain (I think that's what you were getting at, ri yeah updated docs for this flag look good, thanks -- To view, visit http://gerrit.cloudera.org:8080/15454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659 Gerrit-Change-Number: 15454 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 16:12:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9538 Bump up linux-syscall-support.h
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15510 ) Change subject: IMPALA-9538 Bump up linux-syscall-support.h .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 Gerrit-Change-Number: 15510 Gerrit-PatchSet: 10 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 16:12:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15544 ) Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893 .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Gerrit-Change-Number: 15544 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 14:44:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15544 ) Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893 .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5546/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Gerrit-Change-Number: 15544 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 14:44:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5545/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 11 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 25 Mar 2020 13:46:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 11 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 25 Mar 2020 13:46:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 10 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 25 Mar 2020 13:44:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9545 Decide cacheline size of aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/1 ) Change subject: IMPALA-9545 Decide cacheline size of aarch64 .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5544/ -- To view, visit http://gerrit.cloudera.org:8080/1 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f Gerrit-Change-Number: 1 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 25 Mar 2020 11:17:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9545 Decide cacheline size of aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/1 ) Change subject: IMPALA-9545 Decide cacheline size of aarch64 .. Patch Set 2: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/5597/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/1 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f Gerrit-Change-Number: 1 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 25 Mar 2020 07:03:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15524 ) Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3. .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG@121 PS4, Line 121: processes. > I'm running into an issue where if I run the packaged shell with python3, s It's true -- I actually had one more JIRA filed for that: https://issues.apache.org/jira/browse/IMPALA-9362 It is weird that the pip-installed version doesn't have this issue: $ impala-shell Starting Impala with no authentication using Python 3.7.6 <-- note version Opened TCP connection to localhost:21000 Connected to localhost:21000 Server version: impalad version 3.4.0-SNAPSHOT DEBUG (build cc91c...) *** Welcome to the Impala shell. (Impala Shell v3.4.0-SNAPSHOT (cc91c66) built on Tue Mar 24 09:54:50 PDT 2020) The SET command shows the current value of all shell and query options. *** [localhost:21000] default> http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@1790 PS4, Line 1790: if isinstance(options.output_delimiter, str): > This seems to barf if I pass in a unicode delimiter. I don't think this is Ah, interesting. I guess I was just working from our existing test cases -- I guess we don't have one for this. We can try to resolve it if you think it's important. http://gerrit.cloudera.org:8080/#/c/15524/4/shell/make_shell_tarball.sh File shell/make_shell_tarball.sh: http://gerrit.cloudera.org:8080/#/c/15524/4/shell/make_shell_tarball.sh@52 PS4, Line 52: if [ "${USE_THRIFT11_GEN_PY:-}" == "false" ]; then > Does it make sense to allow overriding this? Would a shell built with old t Actually, that's a good point. Other tests pass, but unicode-related tests don't. We should probably disallow this. http://gerrit.cloudera.org:8080/#/c/15524/4/shell/make_shell_tarball.sh@53 PS4, Line 53: # thrift 0.9.3-p7 > Not sure if we need this comment? Ack -- To view, visit http://gerrit.cloudera.org:8080/15524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615 Gerrit-Change-Number: 15524 Gerrit-PatchSet: 4 Gerrit-Owner: David Knupp Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 06:28:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9545 Decide cacheline size of aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/1 ) Change subject: IMPALA-9545 Decide cacheline size of aarch64 .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5544/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/1 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f Gerrit-Change-Number: 1 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 25 Mar 2020 06:23:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9545 Decide cacheline size of aarch64
zhaoren...@hotmail.com has uploaded this change for review. ( http://gerrit.cloudera.org:8080/1 Change subject: IMPALA-9545 Decide cacheline size of aarch64 .. IMPALA-9545 Decide cacheline size of aarch64 ARM64's L3 cacheline size is different according to CPU vendor's architecture. If user defined CACHELINESIZE_AARCH64 in impala-config-local.sh, then we will use that value, if user did not define it, then we will get the value from OS, if fail, then we will use the default value 64. Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f --- M CMakeLists.txt M be/src/gutil/port.h M buildall.sh 3 files changed, 24 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/1/2 -- To view, visit http://gerrit.cloudera.org:8080/1 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f Gerrit-Change-Number: 1 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward