[Impala-ASF-CR] IMPALA-6990: TestClientSsl.test tls v12 failing due to Python SSL error
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10529 ) Change subject: IMPALA-6990: TestClientSsl.test_tls_v12 failing due to Python SSL error .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I92c66ecaeb94b0c83ee6f1396c082709c21b3187 Gerrit-Change-Number: 10529 Gerrit-PatchSet: 4 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 31 May 2018 05:29:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6990: TestClientSsl.test tls v12 failing due to Python SSL error
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10529 ) Change subject: IMPALA-6990: TestClientSsl.test_tls_v12 failing due to Python SSL error .. IMPALA-6990: TestClientSsl.test_tls_v12 failing due to Python SSL error When we upgraded to thrift-0.9.3, the TSSLSocket.py logic changed quite a bit. Our RHEL7 machines come equipped with Python 2.7.5. Looking at these comments, that means that we'll be unable to create a 'SSLContext' but be able to explicitly specify ciphers: https://github.com/apache/thrift/blob/88591e32e710a0524327153c8b629d5b461e35e0/lib/py/src/transport/TSSLSocket.py#L37-L41 # SSLContext is not available for Python < 2.7.9 _has_ssl_context = sys.hexversion >= 0x020709F0 # ciphers argument is not available for Python < 2.7.0 _has_ciphers = sys.hexversion >= 0x020700F0 If we cannot create a 'SSLContext', then we cannot use TLSv1.2 and have to use TLSv1: https://github.com/apache/thrift/blob/88591e32e710a0524327153c8b629d5b461e35e0/lib/py/src/transport/TSSLSocket.py#L48-L49 # For python >= 2.7.9, use latest TLS that both client and server # supports. # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. # For python < 2.7.9, use TLS 1.0 since TLSv1_X nor OP_NO_SSLvX is # unavailable. _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else \ ssl.PROTOCOL_TLSv1 Our custom cluster test forces the server to use TLSv1.2 and also forces a specific cipher: https://github.com/apache/impala/blob/2f22a6f67ff363a0832a7ceee5d0020c8fd9b15a/tests/custom_cluster/test_client_ssl.py#L118-L119 So this combination of configuration values causes a failure in RHEL7 because we only allow a specific cipher which works with TLSv1.2, but the client cannot use TLSv1.2 due to the Python version as mentioned above. We've not noticed these failures on older-than-RHEL7-systems since the OpenSSL versions on those systems don't support TLSv1.2. (< OpenSSL 1.0.1) To fix this, we need to change the Python version on RHEL 7 to be >= Python 2.7.9. This patch skips the test if an older version of Python than 2.7.9 is detected. Change-Id: I92c66ecaeb94b0c83ee6f1396c082709c21b3187 Reviewed-on: http://gerrit.cloudera.org:8080/10529 Reviewed-by: Sailesh Mukil Tested-by: Impala Public Jenkins --- M tests/custom_cluster/test_client_ssl.py 1 file changed, 6 insertions(+), 6 deletions(-) Approvals: Sailesh Mukil: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I92c66ecaeb94b0c83ee6f1396c082709c21b3187 Gerrit-Change-Number: 10529 Gerrit-PatchSet: 5 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10531 ) Change subject: IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize(). .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2572/ -- To view, visit http://gerrit.cloudera.org:8080/10531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f Gerrit-Change-Number: 10531 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 31 May 2018 03:32:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6990: TestClientSsl.test tls v12 failing due to Python SSL error
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10529 ) Change subject: IMPALA-6990: TestClientSsl.test_tls_v12 failing due to Python SSL error .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2571/ -- To view, visit http://gerrit.cloudera.org:8080/10529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I92c66ecaeb94b0c83ee6f1396c082709c21b3187 Gerrit-Change-Number: 10529 Gerrit-PatchSet: 4 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 31 May 2018 02:17:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6990: TestClientSsl.test tls v12 failing due to Python SSL error
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/10529 ) Change subject: IMPALA-6990: TestClientSsl.test_tls_v12 failing due to Python SSL error .. Patch Set 4: Code-Review+2 (5 comments) Thanks for the reviews! Upgrading to +2 since 2 reviewers +1'd it. http://gerrit.cloudera.org:8080/#/c/10529/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10529/3//COMMIT_MSG@9 PS3, Line 9: When we upgraded to thrift-0.9.3, the TSSLSocket.py logic changed quite > To be super clear: we used to ignore this silently but now TSSLSocket.py br I explained this in a previous top level comment. http://gerrit.cloudera.org:8080/#/c/10529/3//COMMIT_MSG@13 PS3, Line 13: https://github.com/apache/thrift/blob/88591e32e710a0524327153c8b629d5b461e35e0/lib/py/src/transport/TSSLSocket.py#L37-L41 > You can press "y" in the github UI to get code a permalink UI. Or specify a Done http://gerrit.cloudera.org:8080/#/c/10529/3/tests/custom_cluster/test_client_ssl.py File tests/custom_cluster/test_client_ssl.py: http://gerrit.cloudera.org:8080/#/c/10529/3/tests/custom_cluster/test_client_ssl.py@36 PS3, Line 36: HAS_LEGACY_OPENSSL = getattr(ssl, "OPENSSL_VERSION_NUMBER", None) > It's super confusing that you'd change a constant on line 43. Can you just Done http://gerrit.cloudera.org:8080/#/c/10529/3/tests/custom_cluster/test_client_ssl.py@45 PS3, Line 45: > If you want, you can use "getattr" with a default to avoid this. Done http://gerrit.cloudera.org:8080/#/c/10529/3/tests/custom_cluster/test_client_ssl.py@132 PS3, Line 132: statestored_args=SSL_WILDCARD_ARGS, > Consider just inlining Done -- To view, visit http://gerrit.cloudera.org:8080/10529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I92c66ecaeb94b0c83ee6f1396c082709c21b3187 Gerrit-Change-Number: 10529 Gerrit-PatchSet: 4 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 31 May 2018 02:17:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6990: TestClientSsl.test tls v12 failing due to Python SSL error
Hello Michael Brown, Philip Zeyliger, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10529 to look at the new patch set (#4). Change subject: IMPALA-6990: TestClientSsl.test_tls_v12 failing due to Python SSL error .. IMPALA-6990: TestClientSsl.test_tls_v12 failing due to Python SSL error When we upgraded to thrift-0.9.3, the TSSLSocket.py logic changed quite a bit. Our RHEL7 machines come equipped with Python 2.7.5. Looking at these comments, that means that we'll be unable to create a 'SSLContext' but be able to explicitly specify ciphers: https://github.com/apache/thrift/blob/88591e32e710a0524327153c8b629d5b461e35e0/lib/py/src/transport/TSSLSocket.py#L37-L41 # SSLContext is not available for Python < 2.7.9 _has_ssl_context = sys.hexversion >= 0x020709F0 # ciphers argument is not available for Python < 2.7.0 _has_ciphers = sys.hexversion >= 0x020700F0 If we cannot create a 'SSLContext', then we cannot use TLSv1.2 and have to use TLSv1: https://github.com/apache/thrift/blob/88591e32e710a0524327153c8b629d5b461e35e0/lib/py/src/transport/TSSLSocket.py#L48-L49 # For python >= 2.7.9, use latest TLS that both client and server # supports. # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. # For python < 2.7.9, use TLS 1.0 since TLSv1_X nor OP_NO_SSLvX is # unavailable. _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else \ ssl.PROTOCOL_TLSv1 Our custom cluster test forces the server to use TLSv1.2 and also forces a specific cipher: https://github.com/apache/impala/blob/2f22a6f67ff363a0832a7ceee5d0020c8fd9b15a/tests/custom_cluster/test_client_ssl.py#L118-L119 So this combination of configuration values causes a failure in RHEL7 because we only allow a specific cipher which works with TLSv1.2, but the client cannot use TLSv1.2 due to the Python version as mentioned above. We've not noticed these failures on older-than-RHEL7-systems since the OpenSSL versions on those systems don't support TLSv1.2. (< OpenSSL 1.0.1) To fix this, we need to change the Python version on RHEL 7 to be >= Python 2.7.9. This patch skips the test if an older version of Python than 2.7.9 is detected. Change-Id: I92c66ecaeb94b0c83ee6f1396c082709c21b3187 --- M tests/custom_cluster/test_client_ssl.py 1 file changed, 6 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/10529/4 -- To view, visit http://gerrit.cloudera.org:8080/10529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I92c66ecaeb94b0c83ee6f1396c082709c21b3187 Gerrit-Change-Number: 10529 Gerrit-PatchSet: 4 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-7012: Fix NPE when parsing unexpected tokens
Tianyi Wang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/10512 ) Change subject: IMPALA-7012: Fix NPE when parsing unexpected tokens .. IMPALA-7012: Fix NPE when parsing unexpected tokens Currently some token are not added to tokenIdMap in the sql scanner and an NPE will be thrown if such are parsed as unexpected tokens. This patch fixes this bug and adds a case to ParserTest. Change-Id: I9c846fdfb22ba37bfc3b1985b9a044014ab58968 --- M fe/src/main/jflex/sql-scanner.flex M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java 3 files changed, 35 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/10512/2 -- To view, visit http://gerrit.cloudera.org:8080/10512 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9c846fdfb22ba37bfc3b1985b9a044014ab58968 Gerrit-Change-Number: 10512 Gerrit-PatchSet: 2 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10550 to look at the new patch set (#4). Change subject: PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans .. PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans Revert to the pre-IMPALA-3905 algorithm for deciding when to return a batch from an Avro scan. The post-IMPALA-3905 algorithm is bad for wide tables where there are only a small number of rows per Avro block. Optimise memory transfer for selective scans - don't attach unused decompression buffers to the output batch. Combined with the previous change, this dramatically reduces the amount of memory transferred out of scanner threads for selective scans of wide tables. Cap the maximum row batch queue size at 5 * the number of active scanner threads, so that we guarantee lower amounts of memory in the row batch queue with fewer scanner threads, rather than just achieving it statistically. It does not reduce the default significantly on typical server configurations that would have 24+ cores except under high concurrency or low memory environments where the number of scanner threads is limited. We should evaluate reducing the default further or otherwise better controlling memory consumption in a follow-up, based on experiments. Includes some observability improvements including additional counters that will help diagnose issues like this more easily: * Add counters to give some insight into row batch queue. Here's an excerpt: - RowBatchBytesEnqueued: 20.89 MB (21903380) - RowBatchQueueCapacity: 5 (5) - RowBatchQueueGetWaitTime: 59.187ms - RowBatchQueuePeakMemoryUsage: 8.85 MB (9279347) - RowBatchQueuePutWaitTime: 0.000ns - RowBatchesEnqueued: 6 (6) * Don't create AverageScannerThreadConcurrency for MT scan node where it's not actually used. * Track the row batch queue memory consumption against a sub-tracker HDFS_SCAN_NODE (id=2): Reservation=48.00 MB OtherMemory=588.00 KB Total=48.57 MB Peak=48.62 MB Queued Batches: Total=588.00 KB Peak=637.00 KB Ran the repro in the JIRA. Memory consumption was reduced from ~500MB to ~220MB on my system. TODO: - running stress test on a single node against Avro - Running exhaustive tests Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b --- M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/scan-node.h M be/src/runtime/mem-pool.cc M be/src/runtime/mem-pool.h M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/util/blocking-queue.h 13 files changed, 274 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/10550/4 -- To view, visit http://gerrit.cloudera.org:8080/10550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b Gerrit-Change-Number: 10550 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/10531 ) Change subject: IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize(). .. Patch Set 1: Code-Review+2 Given that I'm seeing this on 2.x (using the old HBase), the code we copied from HBase should be tracked as a separate issue. It seems reasonable to handle this null appropriately. -- To view, visit http://gerrit.cloudera.org:8080/10531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f Gerrit-Change-Number: 10531 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 31 May 2018 01:29:35 + Gerrit-HasComments: No
[Impala-ASF-CR] PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10550 ) Change subject: PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/10550/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10550/2//COMMIT_MSG@18 PS2, Line 18: Cap the maximum row batch queue size at 5 * the number of active : scanner threads. > adding this heuristic on top of the current one seems hard to understand ho I want to get a tighter upper bound on the amount of memory in the queue and some guarantee that scaling down the scanner thread count (automatically or manually) will actually reduce the amount of queued memory. We could leave this out of the change - it's not essential - but it felt like it should include some harder guarantees about queue size to make it easier to reason about. We can usually control memory consumption from queueing batches by automatically limiting num_scanner_threads based on concurrency and availability of reservation (or by setting in manually), but there's no reason, with the current code, that a single scanner thread couldn't fill up the whole queue, e.g. if the consumer thread was off doing some other work for a long time. http://gerrit.cloudera.org:8080/#/c/10550/2//COMMIT_MSG@32 PS2, Line 32: Track the row batch queue memory consumption against a sub-tracker > what does that look like in terms of the memtracker hierarchy? Added the MemTracker dump here. http://gerrit.cloudera.org:8080/#/c/10550/2/be/src/exec/hdfs-avro-scanner.cc File be/src/exec/hdfs-avro-scanner.cc: http://gerrit.cloudera.org:8080/#/c/10550/2/be/src/exec/hdfs-avro-scanner.cc@496 PS2, Line 496: Check : // AtCapacity() at the end of the loop to guarantee that we process at least one row : // so that we mke progress even if the batch starts off with AtCapacity() == true. > that's kinda confusing because it's counter to our usual which is that once Done http://gerrit.cloudera.org:8080/#/c/10550/2/be/src/exec/hdfs-avro-scanner.cc@568 PS2, Line 568: keep_current_chunk = true to > but line 573 passes false. i think you meant false here. For some reason my brain likes reversing things sometimes. http://gerrit.cloudera.org:8080/#/c/10550/2/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/10550/2/be/src/exec/hdfs-scan-node.cc@238 PS2, Line 238: row_batch->SetMemTracker(row_batches_mem_tracker_); > which mem-tracker are these accounted to before this line? They're accounted against ExecNode::mem_tracker_ - see HdfsScanner::ProcessSplit() then transferred to the caller of GetNext() with AcquireState() http://gerrit.cloudera.org:8080/#/c/10550/2/be/src/runtime/mem-tracker.cc File be/src/runtime/mem-tracker.cc: http://gerrit.cloudera.org:8080/#/c/10550/2/be/src/runtime/mem-tracker.cc@243 PS2, Line 243: } > why is it that we need this new interface now but didn't need it in other c The motivation for doing it in this patch specifically is to avoid adding more contention on the global MemTracker consumption counters (process and pool level) in case that started to become more of a bottleneck. I think this is a better solution in general than separately calling Consume() and Release() because it avoids the unnecessary updates of common ancestor counters and also avoids the window of inconsistency where the memory isn't accounted against the query (which can be hard to reason about). We should probably switch to using this in other places where it applies, e.g. the various DataStream* classes in a follow-on. -- To view, visit http://gerrit.cloudera.org:8080/10550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b Gerrit-Change-Number: 10550 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 31 May 2018 01:27:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10550 to look at the new patch set (#3). Change subject: PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans .. PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans Revert to the pre-IMPALA-3905 algorithm for deciding when to return a batch from an Avro scan. The post-IMPALA-3905 algorithm is bad for wide tables where there are only a small number of rows per Avro block. Optimise memory transfer for selective scans - don't attach unused decompression buffers to the output batch. Combined with the previous change, this dramatically reduces the amount of memory transferred out of scanner threads for selective scans of wide tables. Cap the maximum row batch queue size at 5 * the number of active scanner threads, so that we guarantee lower amounts of memory in the row batch queue with fewer scanner threads, rather than just achieving it statistically. It does not reduce the default significantly on typical server configurations that would have 24+ cores except under high concurrency or low memory environments where the number of scanner threads is limited. We should evaluate reducing the default further or otherwise better controlling memory consumption in a follow-up, based on experiments. Includes some observability improvements including additional counters that will help diagnose issues like this more easily: * Add counters to give some insight into row batch queue. Here's an excerpt: - RowBatchBytesEnqueued: 20.89 MB (21903380) - RowBatchQueueCapacity: 5 (5) - RowBatchQueueGetWaitTime: 59.187ms - RowBatchQueuePeakMemoryUsage: 8.85 MB (9279347) - RowBatchQueuePutWaitTime: 0.000ns - RowBatchesEnqueued: 6 (6) * Don't create AverageScannerThreadConcurrency for MT scan node where it's not actually used. * Track the row batch queue memory consumption against a sub-tracker HDFS_SCAN_NODE (id=2): Reservation=48.00 MB OtherMemory=588.00 KB Total=48.57 MB Peak=48.62 MB Queued Batches: Total=588.00 KB Peak=637.00 KB Ran the repro in the JIRA. Memory consumption was reduced from ~500MB to ~220MB on my system. TODO: - running stress test on a single node against Avro - Running exhaustive tests Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b --- M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/scan-node.h M be/src/runtime/mem-pool.cc M be/src/runtime/mem-pool.h M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/util/blocking-queue.h 13 files changed, 272 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/10550/3 -- To view, visit http://gerrit.cloudera.org:8080/10550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b Gerrit-Change-Number: 10550 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9986 ) Change subject: IMPALA-3307: Add support for IANA time-zone db .. Patch Set 10: (1 comment) Yeah I agree with Phil that doing JNI to java to unzip a file may be the least of all evils - we wouldn't be adding another dependency and there's no risk of DOSing the NameNode. I doubt it would take the NameNode down, but it could disrupt other things happening on the cluster unnecessarily. http://gerrit.cloudera.org:8080/#/c/9986/10/testdata/tzdb/2017c/Africa/Abidjan File testdata/tzdb/2017c/Africa/Abidjan: http://gerrit.cloudera.org:8080/#/c/9986/10/testdata/tzdb/2017c/Africa/Abidjan@1 PS10, Line 1: ../Atlantic/St_Helena > We're adding a ton of files. Do we need such a big database for our testing We'll also need to exclude these files from RAT checks (and document how they are licensed). -- To view, visit http://gerrit.cloudera.org:8080/9986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 Gerrit-Change-Number: 9986 Gerrit-PatchSet: 10 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 31 May 2018 00:30:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7092: Re-enable EC tests broken by HDFS-13539
Tianyi Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10553 Change subject: IMPALA-7092: Re-enable EC tests broken by HDFS-13539 .. IMPALA-7092: Re-enable EC tests broken by HDFS-13539 This patch re-enables previously OOM-failed tests broken by HDFS-13539 and attributes other disabled EC tests to their root cause, replacing the "fix_later" tag. There are 2 types of failed tests remaining that are not inherently EC incompatible: 1. HDFS caching is currently buggy under EC: HDFS-9879. 2. The unit of an EC scan range is a block group, which is of 3X the regular block size in the mini-cluster. It breaks some tests and is tracked in IMPALA-7098. Change-Id: I38df02a5cb1e9f6edcd4713eb8c13ba4081754e6 --- M tests/common/skip.py M tests/custom_cluster/test_admission_controller.py M tests/metadata/test_explain.py M tests/query_test/test_hdfs_caching.py M tests/query_test/test_insert.py M tests/query_test/test_insert_parquet.py M tests/query_test/test_mt_dop.py M tests/query_test/test_nested_types.py M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py M tests/query_test/test_scanners.py 11 files changed, 15 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/10553/1 -- To view, visit http://gerrit.cloudera.org:8080/10553 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I38df02a5cb1e9f6edcd4713eb8c13ba4081754e6 Gerrit-Change-Number: 10553 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-5380: [DOCS] Added additional filesystems supported in URI
Alex Rodoni has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/10551 ) Change subject: IMPALA-5380: [DOCS] Added additional filesystems supported in URI .. IMPALA-5380: [DOCS] Added additional filesystems supported in URI Change-Id: I11d57efd564fb5779539a88d7b209cee93d313c6 --- M docs/topics/impala_authorization.xml 1 file changed, 9 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/10551/2 -- To view, visit http://gerrit.cloudera.org:8080/10551 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I11d57efd564fb5779539a88d7b209cee93d313c6 Gerrit-Change-Number: 10551 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-5380: [DOCS] Added additional filesystems supported in URI
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10551 Change subject: IMPALA-5380: [DOCS] Added additional filesystems supported in URI .. IMPALA-5380: [DOCS] Added additional filesystems supported in URI Change-Id: I11d57efd564fb5779539a88d7b209cee93d313c6 --- M docs/topics/impala_authorization.xml 1 file changed, 9 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/10551/1 -- To view, visit http://gerrit.cloudera.org:8080/10551 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I11d57efd564fb5779539a88d7b209cee93d313c6 Gerrit-Change-Number: 10551 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-6990: TestClientSsl.test tls v12 failing due to Python SSL error
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/10529 ) Change subject: IMPALA-6990: TestClientSsl.test_tls_v12 failing due to Python SSL error .. Patch Set 3: Code-Review+1 ...pending Phil's comments, which are fair. The Python version comparison looks like the accepted idiom (Impyla does it, for example) -- To view, visit http://gerrit.cloudera.org:8080/10529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I92c66ecaeb94b0c83ee6f1396c082709c21b3187 Gerrit-Change-Number: 10529 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 30 May 2018 23:58:50 + Gerrit-HasComments: No
[Impala-ASF-CR] PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10550 ) Change subject: PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans .. Patch Set 2: Here's an example of the profile with the new additions: HDFS_SCAN_NODE (id=0):(Total: 60.443ms, non-child: 60.443ms, % non-child: 100.00%) - AverageHdfsReadThreadConcurrency: 0.00 - AverageScannerThreadConcurrency: 0.00 - BytesRead: 9.36 MB (9816312) - BytesReadDataNodeCache: 0 - BytesReadLocal: 9.36 MB (9816312) - BytesReadRemoteUnexpected: 0 - BytesReadShortCircuit: 9.36 MB (9816312) - BytesSkipped: 0 - CachedFileHandlesHitCount: 0 (0) - CachedFileHandlesMissCount: 2 (2) - CollectionItemsRead: 0 (0) - DecompressionTime: 21.409ms - MaxCompressedTextFileLength: 0 - NumDisksAccessed: 1 (1) - NumScannerThreadReservationsDenied: 0 (0) - NumScannerThreadsStarted: 1 (1) - PeakMemoryUsage: 24.99 MB (26205298) - PerReadThreadRawHdfsThroughput: 3.80 GB/sec - RemoteScanRanges: 0 (0) - RowBatchBytesEnqueued: 20.89 MB (21903380) - RowBatchQueueCapacity: 5 (5) - RowBatchQueueGetWaitTime: 59.187ms - RowBatchQueuePeakMemoryUsage: 8.85 MB (9279347) - RowBatchQueuePutWaitTime: 0.000ns - RowBatchesEnqueued: 6 (6) - RowsRead: 200.00K (20) - RowsReturned: 747 (747) - RowsReturnedRate: 12.36 K/sec - ScanRangesComplete: 1 (1) - ScannerThreadsInvoluntaryContextSwitches: 1 (1) - ScannerThreadsTotalWallClockTime: 63.681ms - MaterializeTupleTime(*): 38.682ms - ScannerThreadsSysTime: 0.000ns - ScannerThreadsUserTime: 60.000ms - ScannerThreadsVoluntaryContextSwitches: 4 (4) - TotalRawHdfsOpenFileTime(*): 1.449ms - TotalRawHdfsReadTime(*): 2.408ms - TotalReadThroughput: 0.00 /sec -- To view, visit http://gerrit.cloudera.org:8080/10550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b Gerrit-Change-Number: 10550 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 May 2018 23:14:54 + Gerrit-HasComments: No
[Impala-ASF-CR] PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10550 ) Change subject: PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans .. Patch Set 2: I'm still testing out the change but it would be good if you could take a look to make sure that the approach makes sense. -- To view, visit http://gerrit.cloudera.org:8080/10550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b Gerrit-Change-Number: 10550 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 May 2018 23:13:21 + Gerrit-HasComments: No
[Impala-ASF-CR] PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans
Tim Armstrong has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/10550 ) Change subject: PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans .. PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans Revert to the pre-IMPALA-3905 algorithm for deciding when to return a batch from an Avro scan. The post-IMPALA-3905 algorithm is bad for wide tables where there are only a small number of rows per Avro block. Optimise memory transfer for selective scans - don't attach unused decompression buffers to the output batch. Combined with the previous change, this dramatically reduces the amount of memory transferred out of scanner threads for selective scans of wide tables. Cap the maximum row batch queue size at 5 * the number of active scanner threads. This means that num_scanner_threads gives better control over memory consumption. It does not reduce the default significantly on typical server configurations that would have 24+ cores except under high concurrency or low memory environments where the number of scanner threads is limited. We should evaluate reducing the default further or otherwise better controlling memory consumption in a follow-up, based on experiments. Includes some observability improvements including additional counters that will help diagnose issues like this more easily: * Add counters to give some insight into row batch queue. * Don't create AverageScannerThreadConcurrency for MT scan node where it's not actually used. * Track the row batch queue memory consumption against a sub-tracker Testing: Ran the repro in the JIRA. Memory consumption was reduced from ~500MB to ~220MB on my system. TODO: - running stress test on a single node against Avro - Running exhaustive tests Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b --- M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/scan-node.h M be/src/runtime/mem-pool.cc M be/src/runtime/mem-pool.h M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/util/blocking-queue.h 13 files changed, 272 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/10550/2 -- To view, visit http://gerrit.cloudera.org:8080/10550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b Gerrit-Change-Number: 10550 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-6990: TestClientSsl.test tls v12 failing due to Python SSL error
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/10529 ) Change subject: IMPALA-6990: TestClientSsl.test_tls_v12 failing due to Python SSL error .. Patch Set 3: > > (5 comments) > > > > My main question is to confirm that we understand why the Python > > upgrade broke this. Before it was also broken, but we were > silently > > ignoring that? > > > > The rest is python nits, which you can ignore if you'd like. > > I did some more digging to answer the questions and this is what > I've come up with (I've posted this in the JIRA as well, but > pasting here for better context): > > I missed a detail which was that this test never ran on RHEL6 due > to all our RHEL6 machines having OpenSSL 1.0.0 which doesn't > support TLSv1.2, causing them to be skipped. > > On RHEL7, this used to work before the Thrift upgrade because the > old Thrift cpp library (0.9.0) was somehow accepting TLSv1 > connections even though we explicitly set TLSv1.2 on the server. > I'm unable to figure out why that was happening, and it looks like > a bug, but I'll keep looking. It could be a bug in the Python 'ssl' > library, or the Thrift 0.9.0 python library, or the Thrift 0.9.0 > CPP library, or even OpenSSL. > > In Thrift 0.9.3, we explicitly select TLSv1.2 if that's what the > user specified which fixes the above mentioned bug. Our test caught > this bug, since the client side doesn't support TLSv1.2 unless we > are equipped with Python 2.7.9 or up. > > As for a weaker test, we already run test_ssl() which is a weaker > test as it doesn't enforce any ciphers or TLS versions which allows > the client and server to negotiate a protocol that they're both > aware of. > > So to summarize, things weren't working as expected before and we > were passing the test when it should have been failing, but the > upgrade fixed the original problem and caused the test to start > failing. > > Carry +1. Thanks for the review! I'll address the nits in a separate patchset. -- To view, visit http://gerrit.cloudera.org:8080/10529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I92c66ecaeb94b0c83ee6f1396c082709c21b3187 Gerrit-Change-Number: 10529 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 30 May 2018 22:32:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6990: TestClientSsl.test tls v12 failing due to Python SSL error
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/10529 ) Change subject: IMPALA-6990: TestClientSsl.test_tls_v12 failing due to Python SSL error .. Patch Set 3: Code-Review+1 > (5 comments) > > My main question is to confirm that we understand why the Python > upgrade broke this. Before it was also broken, but we were silently > ignoring that? > > The rest is python nits, which you can ignore if you'd like. I did some more digging to answer the questions and this is what I've come up with (I've posted this in the JIRA as well, but pasting here for better context): I missed a detail which was that this test never ran on RHEL6 due to all our RHEL6 machines having OpenSSL 1.0.0 which doesn't support TLSv1.2, causing them to be skipped. On RHEL7, this used to work before the Thrift upgrade because the old Thrift cpp library (0.9.0) was somehow accepting TLSv1 connections even though we explicitly set TLSv1.2 on the server. I'm unable to figure out why that was happening, and it looks like a bug, but I'll keep looking. It could be a bug in the Python 'ssl' library, or the Thrift 0.9.0 python library, or the Thrift 0.9.0 CPP library, or even OpenSSL. In Thrift 0.9.3, we explicitly select TLSv1.2 if that's what the user specified which fixes the above mentioned bug. Our test caught this bug, since the client side doesn't support TLSv1.2 unless we are equipped with Python 2.7.9 or up. As for a weaker test, we already run test_ssl() which is a weaker test as it doesn't enforce any ciphers or TLS versions which allows the client and server to negotiate a protocol that they're both aware of. So to summarize, things weren't working as expected before and we were passing the test when it should have been failing, but the upgrade fixed the original problem and caused the test to start failing. Carry +1. Thanks for the review! -- To view, visit http://gerrit.cloudera.org:8080/10529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I92c66ecaeb94b0c83ee6f1396c082709c21b3187 Gerrit-Change-Number: 10529 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 30 May 2018 22:32:42 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-7061: Rework HBase splitting and assignment
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10545 ) Change subject: IMPALA-7061: Rework HBase splitting and assignment .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2570/ -- To view, visit http://gerrit.cloudera.org:8080/10545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I3d639128a856254a6ccb93d6750f531974b5f897 Gerrit-Change-Number: 10545 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 30 May 2018 22:25:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6119: Fix issue with multiple partitions sharing same location
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10543 ) Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same location .. Patch Set 1: (13 comments) http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@189 PS1, Line 189: in String as nit: remove, kinda obvious http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@189 PS1, Line 189: :: nit:Java style http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@191 PS1, Line 191: private final HashMap> locationToPartMap_ = Unfortunate that a lot of state maintained in this class and is pretty confusing and can lead to subtle bugs if we forget to update/remove from one of these data structures. We need to clean these up eventually. http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1105 PS1, Line 1105: partition.getLocation() can be inferred from the partition object? http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1204 PS1, Line 1204: :: nit:Java http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1205 PS1, Line 1205: maintains updates http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1207 PS1, Line 1207: chang nit: update? http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1214 PS1, Line 1214: Adds 'location' as key and 'partition' as value to locationToPartMap_. May be just say "Adds 'partition' to locationTo..." Rest is implicit. http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1216 PS1, Line 1216: insertToLocationPartMap call it addToLocationPartMap may be? http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1468 PS1, Line 1468: a Map that contains all the HdfsPartitions :* that point to the same locations as the ones received I think the earlier description is more clear. http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2434 PS1, Line 2434: change nit: call it updatePartitionLocation()? http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2435 PS1, Line 2435: } else { : partition.setLocation(location); : } I don't think non HdfsTable is possible here. Look at catalog_.getHdfsPartition() http://gerrit.cloudera.org:8080/#/c/10543/1/tests/metadata/test_partition_metadata.py File tests/metadata/test_partition_metadata.py: http://gerrit.cloudera.org:8080/#/c/10543/1/tests/metadata/test_partition_metadata.py@159 PS1, Line 159: assert data.split('\t') == ['21', '6'] Could you drop one of the partitions that point to the same partition dir and make sure the other partition still counts it? -- To view, visit http://gerrit.cloudera.org:8080/10543 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a Gerrit-Change-Number: 10543 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 30 May 2018 22:10:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6035: Add query options to limit thread reservation
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10365 ) Change subject: IMPALA-6035: Add query options to limit thread reservation .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/admission-controller.cc@426 PS7, Line 426: max_thread_reservation = make_pair(, e.second.thread_reservation); I think the use of pairs hurts readability (since fields don't have meaningful names) but given that the new pair is declared right here, I don't feel strongly in this case. So if you and Bikram feel this is clearest, leave it. http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/admission-controller.cc@432 PS7, Line 432: chedule->query_options().__isset.buffer_pool_limit : && schedule->query_options() query_opts http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/admission-controller.cc@440 PS7, Line 440: else if Preexisting, but I think it's non-obvious why this is "else if" with the buffer pool option check. I had to do a double take. How about adding a brief explanation (maybe just reference InitBufferPoolState())? http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/admission-controller.cc@461 PS7, Line 461: else if given that these aren't really mutually exclusive (unlike the buffer pool limit vs mem_limit thing), how about just use 'if' here? The "return true" pattern is used throughout the exclude downstream checks. http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/query-schedule.h File be/src/scheduling/query-schedule.h: http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/query-schedule.h@49 PS7, Line 49: input to a BackendState input to AdmissionController and a BackendState (especially given that the thread reservation has no execution-side counterpart) -- To view, visit http://gerrit.cloudera.org:8080/10365 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007 Gerrit-Change-Number: 10365 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 May 2018 21:53:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7044: Prevent overflow when computing Parquet block size
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10483 ) Change subject: IMPALA-7044: Prevent overflow when computing Parquet block size .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-parquet-table-writer.cc@795 PS4, Line 795: num_cols it's a bit misleading to say "number of columns in the target table" and give this value since the value is the columns in the parquet file not the table (when partitioned). http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-parquet-table-writer.cc@796 PS4, Line 796: } The preexisting block size logic is pretty convoluted. It appears that this check is completely redundant with the one you are adding to hdfs-table-writer and the check in HdfsParquetTableWriter::InitNewFile(). I'm not opposed to adding this since the error message is more informative, but I don't think that was your intent, so wanted to check that logic. And am wondering if we could consolidate the checks rather than having them sprinkled around (like move this check to HdfsParquetTableWriter::InitNewFile(). hmm, but then it might be too late since we'd have already hit the "generic" 2GB check you're adding to HdfsTableWriter. http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-parquet-table-writer.cc@925 PS4, Line 925: num_cols maybe call this num_file_columns to be clear that it's the file columns not the table columns. http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-parquet-table-writer.cc@979 PS4, Line 979: a table with " << num_cols << " columns not your change, but this is also confused. http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-table-sink.cc@371 PS4, Line 371: output_partition->partition_descriptor->block_size(); in what cases can this be > 2GB? I'm wondering if we should just cap this value to 2GB, and only issue the 2GB error for the cases where user can actually do something about it. -- To view, visit http://gerrit.cloudera.org:8080/10483 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e63420e5a093c0bbc789201771708865b16e138 Gerrit-Change-Number: 10483 Gerrit-PatchSet: 4 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Wed, 30 May 2018 21:19:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5552: Add support for authorized proxy groups
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/10510 ) Change subject: IMPALA-5552: Add support for authorized proxy groups .. Patch Set 8: Code-Review+1 (1 comment) Carry Vuk's +1 http://gerrit.cloudera.org:8080/#/c/10510/5/fe/src/main/java/org/apache/impala/service/JniFrontend.java File fe/src/main/java/org/apache/impala/service/JniFrontend.java: http://gerrit.cloudera.org:8080/#/c/10510/5/fe/src/main/java/org/apache/impala/service/JniFrontend.java@635 PS5, Line 635: sues.apach > thx for the info.. yeah that's ugly. if there is a jira to either fix this Done -- To view, visit http://gerrit.cloudera.org:8080/10510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb Gerrit-Change-Number: 10510 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 30 May 2018 21:10:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6020: [DOCS] REFRESH statement cannot detect HDFS block movement
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/10470 ) Change subject: IMPALA-6020: [DOCS] REFRESH statement cannot detect HDFS block movement .. Patch Set 1: Juan, You also filed this downstream. Could you comment whether to keep or remove the passage? -- To view, visit http://gerrit.cloudera.org:8080/10470 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib6de1469e0f375511fd415e79074f4ec72cf109b Gerrit-Change-Number: 10470 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Juan Yu Gerrit-Comment-Date: Wed, 30 May 2018 21:10:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5552: Add support for authorized proxy groups
Fredy Wijaya has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/10510 ) Change subject: IMPALA-5552: Add support for authorized proxy groups .. IMPALA-5552: Add support for authorized proxy groups The patch adds support for mapping of users to a list of proxy groups. The following flags are added in impalad: - authorized_proxy_group_config - authorized_proxy_group_config_delimiter Example: --authorized_proxy_group_config=hue=group1,group2;user1=* This feature is not supported on Shell-based Hadoop groups mapping providers. Testing: - Added FE unit test to check for groups mapping provider - Added BE unit test for the parsing logic - Added a new test in test_authorization.py - Ran all end-to-end test_authorization.py Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb --- M be/src/service/CMakeLists.txt M be/src/service/frontend.cc M be/src/service/frontend.h A be/src/service/impala-server-test.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/service/JniFrontendTest.java M tests/authorization/test_authorization.py 11 files changed, 385 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/10510/8 -- To view, visit http://gerrit.cloudera.org:8080/10510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb Gerrit-Change-Number: 10510 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5552: Add support for authorized proxy groups
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10510 ) Change subject: IMPALA-5552: Add support for authorized proxy groups .. Patch Set 7: Code-Review+1 (1 comment) thx for the changes. lgtm, but let others settle on their reviews as well. http://gerrit.cloudera.org:8080/#/c/10510/5/fe/src/main/java/org/apache/impala/service/JniFrontend.java File fe/src/main/java/org/apache/impala/service/JniFrontend.java: http://gerrit.cloudera.org:8080/#/c/10510/5/fe/src/main/java/org/apache/impala/service/JniFrontend.java@635 PS5, Line 635: startsWith > Unfortunately there's no way to do this at the moment. I agree this is very thx for the info.. yeah that's ugly. if there is a jira to either fix this api or for impala to use something else, please add it. in either case, pls add a comment about why this is done (e.g., // HACK: ... ) -- To view, visit http://gerrit.cloudera.org:8080/10510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb Gerrit-Change-Number: 10510 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 30 May 2018 20:13:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 16: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 16 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 May 2018 19:34:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5552: Add support for authorized proxy groups
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/10510 ) Change subject: IMPALA-5552: Add support for authorized proxy groups .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/10510/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/10510/5/be/src/service/impala-server.cc@406 PS5, Line 406: thorized_proxy_user_config or FLAGS_authorized_proxy_group_config > can be specific to to the authorized_proxy_config for which this method was I initially did that with lambda: https://gerrit.cloudera.org/c/10510/2/be/src/service/impala-server.cc#323. Let me know what you think about it and I can use the old code back. http://gerrit.cloudera.org:8080/#/c/10510/5/be/src/service/impala-server.cc@412 PS5, Line 412: st string& config: proxy_confi > if ws is included around a group or user string, e.g., " * ", will (and s Make sense I'll update it. http://gerrit.cloudera.org:8080/#/c/10510/5/be/src/service/impala-server.cc@416 PS5, Line 416: > can use a literal pair constructor: { ... } Done http://gerrit.cloudera.org:8080/#/c/10510/5/be/src/service/impala-server.cc@1383 PS5, Line 1383: if (user == "*" | > its an error, no need to wrap. Done http://gerrit.cloudera.org:8080/#/c/10510/5/fe/src/main/java/org/apache/impala/service/JniFrontend.java File fe/src/main/java/org/apache/impala/service/JniFrontend.java: http://gerrit.cloudera.org:8080/#/c/10510/5/fe/src/main/java/org/apache/impala/service/JniFrontend.java@635 PS5, Line 635: startsWith > This makes the error messages in the security package part of the api. Is t Unfortunately there's no way to do this at the moment. I agree this is very ugly. This is another example in Impala that relies on the same exact exception message: https://github.com/apache/impala/blob/e9bd917a218b5e1717fced983f70f64850c6e02f/fe/src/main/java/org/apache/impala/util/RequestPoolService.java#L302 The source in Hadoop API: https://github.com/apache/hadoop/blob/47c31ff16b452d47afc6ffc1cf936ac2de9b788d/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java#L198-L200 -- To view, visit http://gerrit.cloudera.org:8080/10510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb Gerrit-Change-Number: 10510 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 30 May 2018 18:55:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5552: Add support for authorized proxy groups
Fredy Wijaya has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/10510 ) Change subject: IMPALA-5552: Add support for authorized proxy groups .. IMPALA-5552: Add support for authorized proxy groups The patch adds support for mapping of users to a list of proxy groups. The following flags are added in impalad: - authorized_proxy_group_config - authorized_proxy_group_config_delimiter Example: --authorized_proxy_group_config=hue=group1,group2;user1=* This feature is not supported on Shell-based Hadoop groups mapping providers. Testing: - Added FE unit test to check for groups mapping provider - Added BE unit test for the parsing logic - Added a new test in test_authorization.py - Ran all end-to-end test_authorization.py Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb --- M be/src/service/CMakeLists.txt M be/src/service/frontend.cc M be/src/service/frontend.h A be/src/service/impala-server-test.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/service/JniFrontendTest.java M tests/authorization/test_authorization.py 11 files changed, 382 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/10510/7 -- To view, visit http://gerrit.cloudera.org:8080/10510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb Gerrit-Change-Number: 10510 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-2751: Matching quotes are not required in comments
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/10541 ) Change subject: IMPALA-2751: Matching quotes are not required in comments .. Patch Set 2: I ran all shell tests on both Python 2.6 and Python 2.7. This should also fix IMPALA-7089. -- To view, visit http://gerrit.cloudera.org:8080/10541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2feae34026a7e63f3d31489f757f093a73ca5d2c Gerrit-Change-Number: 10541 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Wed, 30 May 2018 18:53:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2751: Matching quotes are not required in comments
Fredy Wijaya has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10541 Change subject: IMPALA-2751: Matching quotes are not required in comments .. IMPALA-2751: Matching quotes are not required in comments This patch fixes the issue where non-matching quotes inside comments will cause the shell to not terminate. The fix is to strip any SQL comments before sending to shlex since shlex does not understand SQL comments and will raise an exception when it sees unmatched quotes regardless whether the quotes are in the comments or not. Testing: - Added new shell tests - Ran all end-to-end shell tests on Python 2.6 and Python 2.7 Change-Id: I2feae34026a7e63f3d31489f757f093a73ca5d2c --- M shell/impala_shell.py M tests/shell/test_shell_commandline.py M tests/shell/test_shell_interactive.py 3 files changed, 26 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/10541/2 -- To view, visit http://gerrit.cloudera.org:8080/10541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I2feae34026a7e63f3d31489f757f093a73ca5d2c Gerrit-Change-Number: 10541 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy Wijaya
[Impala-ASF-CR] IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/10531 ) Change subject: IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize(). .. Patch Set 1: (1 comment) I think handling the null is the right thing to do. It bothers me a bit that we got the list of HRegionLocation from HBase and then one of these calls fails. http://gerrit.cloudera.org:8080/#/c/10531/1/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java: http://gerrit.cloudera.org:8080/#/c/10531/1/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@711 PS1, Line 711: This is copied from org.apache.hadoop.hbase.client.HTable in HBase 0.95 This is where the HRegionLocation is coming from, maybe this needs to be updated for the new HBase? We had a TODO here, but we removed it in IMPALA-4082. -- To view, visit http://gerrit.cloudera.org:8080/10531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f Gerrit-Change-Number: 10531 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 30 May 2018 18:53:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6990: TestClientSsl.test tls v12 failing due to Python SSL error
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10529 ) Change subject: IMPALA-6990: TestClientSsl.test_tls_v12 failing due to Python SSL error .. Patch Set 3: Code-Review+1 (5 comments) My main question is to confirm that we understand why the Python upgrade broke this. Before it was also broken, but we were silently ignoring that? The rest is python nits, which you can ignore if you'd like. http://gerrit.cloudera.org:8080/#/c/10529/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10529/3//COMMIT_MSG@9 PS3, Line 9: When we upgraded to thrift-0.9.3, the TSSLSocket.py logic changed quite To be super clear: we used to ignore this silently but now TSSLSocket.py broke this? i.e., why is it that the test started failing after the Thrift change... http://gerrit.cloudera.org:8080/#/c/10529/3//COMMIT_MSG@13 PS3, Line 13: https://github.com/apache/thrift/blob/master/lib/py/src/transport/TSSLSocket.py#L37-L41 You can press "y" in the github UI to get code a permalink UI. Or specify a tag instead of "master". http://gerrit.cloudera.org:8080/#/c/10529/3/tests/custom_cluster/test_client_ssl.py File tests/custom_cluster/test_client_ssl.py: http://gerrit.cloudera.org:8080/#/c/10529/3/tests/custom_cluster/test_client_ssl.py@36 PS3, Line 36: HAS_TLSV12_INCOMPATIBLE_PYTHON = True It's super confusing that you'd change a constant on line 43. Can you just put line 43 here? The same comment sort of applies to the other stuff, but you've not changed it. http://gerrit.cloudera.org:8080/#/c/10529/3/tests/custom_cluster/test_client_ssl.py@45 PS3, Line 45: # Old ssl module versions don't even have OPENSSL_VERSION_NUMBER as a member. If you want, you can use "getattr" with a default to avoid this. e.g.: getattr(ssl, "OPENSSL_VERSION_NUMBER", None) Feel free to ignore since this was pre-existing. http://gerrit.cloudera.org:8080/#/c/10529/3/tests/custom_cluster/test_client_ssl.py@132 PS3, Line 132: @pytest.mark.skipif(HAS_TLSV12_INCOMPATIBLE_PYTHON, reason=SKIP_TLSV12_MSG) Consider just inlining @pytest.mark.skipif(sys.version_info < REQUIRED_MIN_PYTHON_VERSION_FOR_TLSV12, reason="Python version too old to allow Thrift client to use TLSv1.2") -- To view, visit http://gerrit.cloudera.org:8080/10529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I92c66ecaeb94b0c83ee6f1396c082709c21b3187 Gerrit-Change-Number: 10529 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 30 May 2018 18:51:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6714: [DOCS] ORC file format support
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/10525 ) Change subject: IMPALA-6714: [DOCS] ORC file format support .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/10525/1/docs/topics/impala_orc.xml File docs/topics/impala_orc.xml: http://gerrit.cloudera.org:8080/#/c/10525/1/docs/topics/impala_orc.xml@133 PS1, Line 133: select * from Could you make all SQL keywords in uppercase as in the Hive examples below? -- To view, visit http://gerrit.cloudera.org:8080/10525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib1ee23ed844653c274babdce5a332dbe5c79b630 Gerrit-Change-Number: 10525 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Wed, 30 May 2018 18:45:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR](2.x) IMPALA-7061: Rework HBase splitting and assignment
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10545 ) Change subject: IMPALA-7061: Rework HBase splitting and assignment .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2570/ -- To view, visit http://gerrit.cloudera.org:8080/10545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I3d639128a856254a6ccb93d6750f531974b5f897 Gerrit-Change-Number: 10545 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 30 May 2018 18:39:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 16: Code-Review+2 Tim, do you have any more comments? -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 16 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 May 2018 18:37:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10060 ) Change subject: IMPALA-5216: Make admission control queuing async .. Patch Set 16: > - GetOperationStatus() after rejection should show reason. > => GetOperationStatus contains the operation state and the status > message. Unfortunately for a CANCELLED status the message is empty. > Instead we can take care of this coverage after IMPALA-1262 where > we can check for operation state to verify this scenario. I assume the CANCELLED case is covered by your sleeps. Is the GetOperationStatus() (or beeswax equivalent) after rejection/timeout covered by some test? (Presumably the admission controller test?) -- To view, visit http://gerrit.cloudera.org:8080/10060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf Gerrit-Change-Number: 10060 Gerrit-PatchSet: 16 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 May 2018 18:35:26 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-7061: Rework HBase splitting and assignment
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10545 ) Change subject: IMPALA-7061: Rework HBase splitting and assignment .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I3d639128a856254a6ccb93d6750f531974b5f897 Gerrit-Change-Number: 10545 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 30 May 2018 18:17:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5552: Add support for authorized proxy groups
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10510 ) Change subject: IMPALA-5552: Add support for authorized proxy groups .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/10510/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10510/5//COMMIT_MSG@11 PS5, Line 11: impalad > question(s) on the design, since I've not seen this feature (or the related This is consistent with the existing user delegation flags. For what it's worth, we have a many options (including HDFS's core-site.xml and hadoop-site.xml) where effectively you need coherent configs across the cluster. We assume people use tools to make it happen. -- To view, visit http://gerrit.cloudera.org:8080/10510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb Gerrit-Change-Number: 10510 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 30 May 2018 18:17:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR](2.x) IMPALA-7061: Rework HBase splitting and assignment
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/10545 ) Change subject: IMPALA-7061: Rework HBase splitting and assignment .. Patch Set 1: 2.x cherrypick required moving some files around, but no major changes otherwise. Dataload + fe tests succeeded locally. -- To view, visit http://gerrit.cloudera.org:8080/10545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I3d639128a856254a6ccb93d6750f531974b5f897 Gerrit-Change-Number: 10545 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 30 May 2018 18:17:17 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-7061: Rework HBase splitting and assignment
Hello Philip Zeyliger, Impala Public Jenkins, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/10545 to review the following change. Change subject: IMPALA-7061: Rework HBase splitting and assignment .. IMPALA-7061: Rework HBase splitting and assignment Some frontend PlannerTests rely on HBase tables being arranged in a deterministic way. Specifically, the HBase tables need to be split with specific region boundaries and those regions need to be assigned to specific HBase region servers. Currently, the tables are created without splits and testdata/bin/split-hbase.sh runs Java code in HBaseTestDataRegionAssignment to split and assign the tables. This runs during dataload via testdata/bin/create-load-data.sh and during tests with bin/run-all-tests.sh. There are problems with both parts of this process. The table splitting is flaky. Since significant time can pass between the assignments and the tests, rebalancing means the assignments are not always stable. This changes the process so that the HBase tables are created with the splits already specified via the HBase shell. The splits remain stable over time. PlannerTestBase runs the assignment code in HBaseTestDataRegionAssignment at the start of the PlannerTests. This makes the assignments deterministic. No other tests depends on the exact assignments, so this does not regress anything. Testing: - Local testing - Ran gerrit-verify-dryrun-external 2.x does not have minicluster profiles, so the HBaseTestDataRegionAssignment.java is in the regular fe/src/test directories. Change-Id: I3d639128a856254a6ccb93d6750f531974b5f897 Reviewed-on: http://gerrit.cloudera.org:8080/10447 Reviewed-by: Philip Zeyliger Tested-by: Impala Public Jenkins (cherry picked from commit 9a5410570e25431813b96e00f7b91db44f672f38) --- M bin/run-all-tests.sh A fe/src/test/java/org/apache/impala/datagenerator/HBaseTestDataRegionAssignment.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java M testdata/bin/create-load-data.sh M testdata/bin/generate-schema-statements.py D testdata/bin/split-hbase.sh M testdata/datasets/functional/functional_schema_template.sql D testdata/src/main/java/org/apache/impala/datagenerator/HBaseTestDataRegionAssigment.java 8 files changed, 165 insertions(+), 390 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/10545/1 -- To view, visit http://gerrit.cloudera.org:8080/10545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: newchange Gerrit-Change-Id: I3d639128a856254a6ccb93d6750f531974b5f897 Gerrit-Change-Number: 10545 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] [DOCS] Fixed an inconsistent table ordering recommendations with STRAIGHT JOIN
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10519 ) Change subject: [DOCS] Fixed an inconsistent table ordering recommendations with STRAIGHT_JOIN .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10519 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ff2245b6d8e3160a10aae64f622813a40121ee8 Gerrit-Change-Number: 10519 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 May 2018 17:50:03 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Fixed an inconsistent table ordering recommendations with STRAIGHT JOIN
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10519 ) Change subject: [DOCS] Fixed an inconsistent table ordering recommendations with STRAIGHT_JOIN .. [DOCS] Fixed an inconsistent table ordering recommendations with STRAIGHT_JOIN Change-Id: I9ff2245b6d8e3160a10aae64f622813a40121ee8 Reviewed-on: http://gerrit.cloudera.org:8080/10519 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M docs/topics/impala_perf_joins.xml 1 file changed, 4 insertions(+), 5 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10519 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I9ff2245b6d8e3160a10aae64f622813a40121ee8 Gerrit-Change-Number: 10519 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5552: Add support for authorized proxy groups
Fredy Wijaya has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/10510 ) Change subject: IMPALA-5552: Add support for authorized proxy groups .. IMPALA-5552: Add support for authorized proxy groups The patch adds support for mapping of users to a list of proxy groups. The following flags are added in impalad: - authorized_proxy_group_config - authorized_proxy_group_config_delimiter Example: --authorized_proxy_group_config=hue=group1,group2;user1=* This feature is not supported on Shell-based Hadoop groups mapping providers. Testing: - Added FE unit test to check for groups mapping provider - Added BE unit test for the parsing logic - Added a new test in test_authorization.py - Ran all end-to-end test_authorization.py Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb --- M be/src/service/CMakeLists.txt M be/src/service/frontend.cc M be/src/service/frontend.h A be/src/service/impala-server-test.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/service/JniFrontendTest.java M tests/authorization/test_authorization.py 11 files changed, 367 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/10510/6 -- To view, visit http://gerrit.cloudera.org:8080/10510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb Gerrit-Change-Number: 10510 Gerrit-PatchSet: 6 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5552: Add support for authorized proxy groups
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10510 ) Change subject: IMPALA-5552: Add support for authorized proxy groups .. Patch Set 5: (6 comments) http://gerrit.cloudera.org:8080/#/c/10510/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10510/5//COMMIT_MSG@11 PS5, Line 11: impalad question(s) on the design, since I've not seen this feature (or the related, user delegation support). are these flags only relevant for coordinators? why are these flags added to impalad? is there a case for different impalads having different auth lists? if there is no case, then this looks like it opens up opportunities for misconfiguration. perhaps catalogd is a better place for this? (I could be wrong on this since I didn't see how these proxies are used) http://gerrit.cloudera.org:8080/#/c/10510/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/10510/5/be/src/service/impala-server.cc@406 PS5, Line 406: --authorized_proxy_user_config flag or --authorized_proxy_group_config can be specific to to the authorized_proxy_config for which this method was called. http://gerrit.cloudera.org:8080/#/c/10510/5/be/src/service/impala-server.cc@412 PS5, Line 412: parsed_allowed_users_or_groups if ws is included around a group or user string, e.g., " * ", will (and should) that ws be trimmed from begin/end? http://gerrit.cloudera.org:8080/#/c/10510/5/be/src/service/impala-server.cc@416 PS5, Line 416: make_pair( can use a literal pair constructor: { ... } http://gerrit.cloudera.org:8080/#/c/10510/5/be/src/service/impala-server.cc@1383 PS5, Line 1383: RETURN_IF_ERROR(s); its an error, no need to wrap. http://gerrit.cloudera.org:8080/#/c/10510/5/fe/src/main/java/org/apache/impala/service/JniFrontend.java File fe/src/main/java/org/apache/impala/service/JniFrontend.java: http://gerrit.cloudera.org:8080/#/c/10510/5/fe/src/main/java/org/apache/impala/service/JniFrontend.java@635 PS5, Line 635: startsWith This makes the error messages in the security package part of the api. Is there any existing alternative to avoid this? If not, is there a JIRA filed to change that? -- To view, visit http://gerrit.cloudera.org:8080/10510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb Gerrit-Change-Number: 10510 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 30 May 2018 17:47:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [DOCS] Fixed an inconsistent table ordering recommendations with STRAIGHT JOIN
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10519 ) Change subject: [DOCS] Fixed an inconsistent table ordering recommendations with STRAIGHT_JOIN .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/301/ -- To view, visit http://gerrit.cloudera.org:8080/10519 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ff2245b6d8e3160a10aae64f622813a40121ee8 Gerrit-Change-Number: 10519 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 May 2018 17:41:09 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] A typo fix in runtime filtering doc
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10544 ) Change subject: [DOCS] A typo fix in runtime filtering doc .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I43006540ed24c957c9ad2ec54707bb9c678d6fb3 Gerrit-Change-Number: 10544 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 30 May 2018 17:13:34 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] A typo fix in runtime filtering doc
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10544 ) Change subject: [DOCS] A typo fix in runtime filtering doc .. [DOCS] A typo fix in runtime filtering doc Change-Id: I43006540ed24c957c9ad2ec54707bb9c678d6fb3 Reviewed-on: http://gerrit.cloudera.org:8080/10544 Reviewed-by: Sailesh Mukil Tested-by: Impala Public Jenkins --- M docs/topics/impala_runtime_filtering.xml 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Sailesh Mukil: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I43006540ed24c957c9ad2ec54707bb9c678d6fb3 Gerrit-Change-Number: 10544 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] [DOCS] A typo fix in runtime filtering doc
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10544 ) Change subject: [DOCS] A typo fix in runtime filtering doc .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/300/ -- To view, visit http://gerrit.cloudera.org:8080/10544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I43006540ed24c957c9ad2ec54707bb9c678d6fb3 Gerrit-Change-Number: 10544 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 30 May 2018 17:07:17 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] A typo fix in runtime filtering doc
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/10544 ) Change subject: [DOCS] A typo fix in runtime filtering doc .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I43006540ed24c957c9ad2ec54707bb9c678d6fb3 Gerrit-Change-Number: 10544 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 30 May 2018 16:55:25 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] A typo fix in runtime filtering doc
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10544 Change subject: [DOCS] A typo fix in runtime filtering doc .. [DOCS] A typo fix in runtime filtering doc Change-Id: I43006540ed24c957c9ad2ec54707bb9c678d6fb3 --- M docs/topics/impala_runtime_filtering.xml 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/10544/1 -- To view, visit http://gerrit.cloudera.org:8080/10544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I43006540ed24c957c9ad2ec54707bb9c678d6fb3 Gerrit-Change-Number: 10544 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-6812: Fix flaky Kudu scan tests
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10503 ) Change subject: IMPALA-6812: Fix flaky Kudu scan tests .. Patch Set 1: > I'm sympathetic to the problem that people may still encounter > these test failures if they run the tests through a means other > than run-all-tests.sh > Perhaps one option is to make kudu_read_mode a query option, so that it can be set in the query options by the python code. Thomas, can you think of any meaningful Impala/Kudu integration/functional testing that we'll miss out if we change the tests to all rn with READ_AT_SNAPSHOT. It does seem like we need to enable this in order to get predictable scan results, but I also agree with Tim that it's best to test in the way most similar to the defaults. -- To view, visit http://gerrit.cloudera.org:8080/10503 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I70df84f2cbc663107f2ad029565d3c15bdfbd47c Gerrit-Change-Number: 10503 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 30 May 2018 16:42:44 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Fixed an inconsistent table ordering recommendations with STRAIGHT JOIN
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10519 ) Change subject: [DOCS] Fixed an inconsistent table ordering recommendations with STRAIGHT_JOIN .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10519 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ff2245b6d8e3160a10aae64f622813a40121ee8 Gerrit-Change-Number: 10519 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 May 2018 16:34:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9943 ) Change subject: IMPALA-5706: Spilling sort optimisations .. Patch Set 17: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9 Gerrit-Change-Number: 9943 Gerrit-PatchSet: 17 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 May 2018 16:26:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9943 ) Change subject: IMPALA-5706: Spilling sort optimisations .. Patch Set 17: Sorry for the delay here, got busy with some unplanned things. -- To view, visit http://gerrit.cloudera.org:8080/9943 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9 Gerrit-Change-Number: 9943 Gerrit-PatchSet: 17 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 May 2018 16:26:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/9986 ) Change subject: IMPALA-3307: Add support for IANA time-zone db .. Patch Set 10: (5 comments) > > > > > > Uploaded patch set 9. > > > > > > > > > > Patch -set 9 contains the following changes: > > > > > - Added a full timezone db to testdata/tzdb. > > > > > - End-to-end tests and BE-tests were changed to use this > > > timezone > > > > > db. This was necessary because some timezone-tests were > > failing > > > > on > > > > > older jenkins workers that had an older tzdata package > > > installed. > > > > > > > > It might be a good idea to store the timezone-db files in one > > > .tar > > > > file and extract them before running the tests. What do you > > > think? > > > > > > I agree, .taring or compressing the tz db would be much better, > > if > > > it does not make the code too complicated. Having less file > would > > > make the review more readable, >From a review ability perspective, there's absolutely no need for this commit >to have 221 timezone files. You can test it with ~3, or do a separate commit. >i.e., it's neither here nor there. > > > > Alternatively we can store timezone files in a JAR archive > instead. > > The BE can call into the java FE to extract files from it. > > Tim, Dan, what do you think? The Yarn equivalent here has this notion of a "distributed cache" which is to say it stores the files locally and re-uses them across jobs. I can't tell if we should be worried that all impalads, at boot time, will slam HDFS with reading the timezonedb. I think reading ~200 files per impalad times 200 impalad daemons may be a lot of HDFS metadata load, but maybe it's comparable to what we do for queries anyway. I certainly think that tar or jar is a better way to go. Since this is happening at boot, we can probably still fork to tar, which we can assume is available, or use Java, which has tar libraries and native support for zip. http://gerrit.cloudera.org:8080/#/c/9986/10//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9986/10//COMMIT_MSG@35 PS10, Line 35: specify an HDFS/S3/ADLS location that contains the shared compiled In what format? Does it add to the host one or override it? My /usr/share/zoneinfo is full of symlinks which HDFS doesn't support in some configurations (and S3 certainly doesn't). http://gerrit.cloudera.org:8080/#/c/9986/10//COMMIT_MSG@41 PS10, Line 41: - The name of the coordinator node’s local time-zone is saved to the Is it easy to tell what the local time zone is of an impalad node? (E.g., do we log it?) http://gerrit.cloudera.org:8080/#/c/9986/10//COMMIT_MSG@45 PS10, Line 45: - Introduces a new startup flag (--hdfs_zone_abbrev_conf) to impalad What's the distinction between this and --hdfs_zone_info_dir? Do we need both? http://gerrit.cloudera.org:8080/#/c/9986/10/testdata/tzdb/2017c/Africa/Abidjan File testdata/tzdb/2017c/Africa/Abidjan: http://gerrit.cloudera.org:8080/#/c/9986/10/testdata/tzdb/2017c/Africa/Abidjan@1 PS10, Line 1: ../Atlantic/St_Helena We're adding a ton of files. Do we need such a big database for our testing purposes? http://gerrit.cloudera.org:8080/#/c/9986/10/tests/custom_cluster/test_shared_tzdb.py File tests/custom_cluster/test_shared_tzdb.py: http://gerrit.cloudera.org:8080/#/c/9986/10/tests/custom_cluster/test_shared_tzdb.py@38 PS10, Line 38: cls.ImpalaTestMatrix.add_constraint(lambda v: Add a comment about what this is trying to do? -- To view, visit http://gerrit.cloudera.org:8080/9986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 Gerrit-Change-Number: 9986 Gerrit-PatchSet: 10 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 30 May 2018 15:59:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6987: [DOCS] Refactor the INVALIDATE METADATA and REFRESH docs
Dimitris Tsirogiannis has posted comments on this change. ( http://gerrit.cloudera.org:8080/10339 ) Change subject: IMPALA-6987: [DOCS] Refactor the INVALIDATE METADATA and REFRESH docs .. Patch Set 4: Sorry for the delay. Taking a look today. -- To view, visit http://gerrit.cloudera.org:8080/10339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2124e14900d0f82569c061cc46006447bb054b36 Gerrit-Change-Number: 10339 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 30 May 2018 15:40:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/9986 ) Change subject: IMPALA-3307: Add support for IANA time-zone db .. Patch Set 10: > > > > > Uploaded patch set 9. > > > > > > > > Patch -set 9 contains the following changes: > > > > - Added a full timezone db to testdata/tzdb. > > > > - End-to-end tests and BE-tests were changed to use this > > timezone > > > > db. This was necessary because some timezone-tests were > failing > > > on > > > > older jenkins workers that had an older tzdata package > > installed. > > > > > > It might be a good idea to store the timezone-db files in one > > .tar > > > file and extract them before running the tests. What do you > > think? > > > > I agree, .taring or compressing the tz db would be much better, > if > > it does not make the code too complicated. Having less file would > > make the review more readable, and would also make the tz db > > consume much less space on hdfs, as the many small files will be > > rounded up to hdfs block size. > > Extracting files from a .tar file can be tricky. Probably we would > have to add libtar library to the native-toolchain to handle .tar > files. > > Alternatively we can store timezone files in a JAR archive instead. > The BE can call into the java FE to extract files from it. Tim, Dan, what do you think? -- To view, visit http://gerrit.cloudera.org:8080/9986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 Gerrit-Change-Number: 9986 Gerrit-PatchSet: 10 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 30 May 2018 15:34:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/9986 ) Change subject: IMPALA-3307: Add support for IANA time-zone db .. Patch Set 10: > > > > Uploaded patch set 9. > > > > > > Patch -set 9 contains the following changes: > > > - Added a full timezone db to testdata/tzdb. > > > - End-to-end tests and BE-tests were changed to use this > timezone > > > db. This was necessary because some timezone-tests were failing > > on > > > older jenkins workers that had an older tzdata package > installed. > > > > It might be a good idea to store the timezone-db files in one > .tar > > file and extract them before running the tests. What do you > think? > > I agree, .taring or compressing the tz db would be much better, if > it does not make the code too complicated. Having less file would > make the review more readable, and would also make the tz db > consume much less space on hdfs, as the many small files will be > rounded up to hdfs block size. Extracting files from a .tar file can be tricky. Probably we would have to add libtar library to the native-toolchain to handle .tar files. Alternatively we can store timezone files in a JAR archive instead. The BE can call into the java FE to extract files from it. -- To view, visit http://gerrit.cloudera.org:8080/9986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 Gerrit-Change-Number: 9986 Gerrit-PatchSet: 10 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 30 May 2018 15:25:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6119: Fix issue with multiple partitions sharing same location
Gabor Kaszab has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10543 Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same location .. IMPALA-6119: Fix issue with multiple partitions sharing same location When multiple partitions point to the same location and a new data file is added to any of them then the expected behaviour is that this new file is added to the other partitions pointing to the same location as well. Apparently, this is not the case and right after the insertion the new file is only visible in the partition where it was inserted to and an invalidate metadata is needed to resolve this inconsistency. This fix addresses this issue with keeping track of a mapping between locations and the HdfsPartitions pointing to it. When new files are inserted into a partition then all the other partition's metadata are reloaded that point to the same location as the one where the files are inserted. Testing: There was an existing test that covered partitions pointing to the same location. However, after each insert it executed a refresh to reload the metadata for the entire table. This reload was removed to cover the changes of this fix. Another test is introduced to cover the case when the location of a partition is altered. Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M tests/metadata/test_partition_metadata.py 3 files changed, 107 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/10543/1 -- To view, visit http://gerrit.cloudera.org:8080/10543 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a Gerrit-Change-Number: 10543 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab
[Impala-ASF-CR] IMPALA-7090: Limit the size of expr created by EqualityDisjunctsToInRule
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10528 ) Change subject: IMPALA-7090: Limit the size of expr created by EqualityDisjunctsToInRule .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10528 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie40c3210271a9e3c7f1b2b869a8c2ec8bacaa72a Gerrit-Change-Number: 10528 Gerrit-PatchSet: 3 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 30 May 2018 09:28:27 + Gerrit-HasComments: No
[Impala-ASF-CR] Revert IMPALA-2751: Matching quotes are not requirerd in comments
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10537 ) Change subject: Revert IMPALA-2751: Matching quotes are not requirerd in comments .. Revert IMPALA-2751: Matching quotes are not requirerd in comments This patch is causing a large number of builds to fail, see IMPALA-7089. Change-Id: Id9995a91408d86a5ae1ecd70d07b02622ae26b43 Reviewed-on: http://gerrit.cloudera.org:8080/10537 Reviewed-by: Philip Zeyliger Tested-by: Impala Public Jenkins --- M shell/impala_shell.py M tests/shell/test_shell_interactive.py 2 files changed, 5 insertions(+), 25 deletions(-) Approvals: Philip Zeyliger: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10537 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Id9995a91408d86a5ae1ecd70d07b02622ae26b43 Gerrit-Change-Number: 10537 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] Revert IMPALA-2751: Matching quotes are not requirerd in comments
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10537 ) Change subject: Revert IMPALA-2751: Matching quotes are not requirerd in comments .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10537 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id9995a91408d86a5ae1ecd70d07b02622ae26b43 Gerrit-Change-Number: 10537 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 30 May 2018 08:12:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6338: Disable more flaky bloom filter tests
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10530 ) Change subject: IMPALA-6338: Disable more flaky bloom filter tests .. IMPALA-6338: Disable more flaky bloom filter tests Until IMPALA-6338 is fixed, temporarily disable tests that are affected by it - any test that has a 'limit' and relies on the contents of the runtime profile. This patch disables the runtime profile check for all such tests in bloom_filter.test Change-Id: Ifc9da892efa3b27d63056ad8e3befac82808ffdb Reviewed-on: http://gerrit.cloudera.org:8080/10530 Reviewed-by: Bikramjeet Vig Tested-by: Impala Public Jenkins --- M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test 1 file changed, 12 insertions(+), 8 deletions(-) Approvals: Bikramjeet Vig: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/10530 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ifc9da892efa3b27d63056ad8e3befac82808ffdb Gerrit-Change-Number: 10530 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-6338: Disable more flaky bloom filter tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10530 ) Change subject: IMPALA-6338: Disable more flaky bloom filter tests .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10530 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc9da892efa3b27d63056ad8e3befac82808ffdb Gerrit-Change-Number: 10530 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 30 May 2018 08:00:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5552: Add support for authorized proxy groups
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/10510 ) Change subject: IMPALA-5552: Add support for authorized proxy groups .. Patch Set 5: (2 comments) > Patch Set 3: > > Update the commit message - IMPALA-5552 vs 5522 Fixed the typo. http://gerrit.cloudera.org:8080/#/c/10510/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/10510/2/be/src/service/impala-server.cc@319 PS2, Line 319: } : : if (!FLAGS_authorized_proxy_group_config.empty()) { : AddAuthorizedProxyConfig( : aut > The error can be more generic saying that there's something invalid about t Done. I made the error more generic and removed the use of lambda. http://gerrit.cloudera.org:8080/#/c/10510/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java File fe/src/main/java/org/apache/impala/service/JniFrontend.java: http://gerrit.cloudera.org:8080/#/c/10510/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@623 PS2, Line 623: > For common Impala deployments, do you know what's in use? That's weird that CM defaults to ShellBasedUnixGroupsMapping and I think that needs to be updated to use the JNI with fallback implementation instead. FYI, Sentry also relies calls the same API to get the groups: https://github.com/apache/sentry/blob/97f666345c1c068fed56a29512f9dcea77e44c8a/sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupMappingService.java#L60 I added a check to fail hard when the groups mapping provider is a shell-based implementation. -- To view, visit http://gerrit.cloudera.org:8080/10510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb Gerrit-Change-Number: 10510 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 30 May 2018 06:28:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5552: Add support for authorized proxy groups
Fredy Wijaya has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/10510 ) Change subject: IMPALA-5552: Add support for authorized proxy groups .. IMPALA-5552: Add support for authorized proxy groups The patch adds support for mapping of users to a list of proxy groups. The following flags are added in impalad: - authorized_proxy_group_config - authorized_proxy_group_config_delimiter Example: --authorized_proxy_group_config=hue=group1,group2;user1=* This feature is not supported on Shell-based Hadoop groups mapping providers. Testing: - Added FE unit test to check for groups mapping provider - Added BE unit test for the parsing logic - Added a new test in test_authorization.py - Ran all end-to-end test_authorization.py Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb --- M be/src/service/CMakeLists.txt M be/src/service/frontend.cc M be/src/service/frontend.h A be/src/service/impala-server-test.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/test/java/org/apache/impala/service/JniFrontendTest.java M tests/authorization/test_authorization.py 11 files changed, 364 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/10510/5 -- To view, visit http://gerrit.cloudera.org:8080/10510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6953f89c293b06b72f523b11802232133d9d6cbb Gerrit-Change-Number: 10510 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7090: Limit the size of expr created by EqualityDisjunctsToInRule
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10528 ) Change subject: IMPALA-7090: Limit the size of expr created by EqualityDisjunctsToInRule .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2568/ -- To view, visit http://gerrit.cloudera.org:8080/10528 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie40c3210271a9e3c7f1b2b869a8c2ec8bacaa72a Gerrit-Change-Number: 10528 Gerrit-PatchSet: 3 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 30 May 2018 06:07:11 + Gerrit-HasComments: No