[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15524 )

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
..

Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG@121
PS4, Line 121: processes.
> I'm running into an issue where if I run the packaged shell with python3, s
It's true -- I actually had one more JIRA filed for that:
https://issues.apache.org/jira/browse/IMPALA-9362

It is weird that the pip-installed version doesn't have this issue:

  $ impala-shell
  Starting Impala with no authentication using Python 3.7.6 <-- note version
  Opened TCP connection to localhost:21000
  Connected to localhost:21000
  Server version: impalad version 3.4.0-SNAPSHOT DEBUG (build cc91c...)
  ***
  Welcome to the Impala shell.
  (Impala Shell v3.4.0-SNAPSHOT (cc91c66) built on Tue Mar 24 09:54:50 PDT 2020)

  The SET command shows the current value of all shell and query options.
  ***
  [localhost:21000] default>

http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@1790
PS4, Line 1790: if isinstance(options.output_delimiter, str):
> This seems to barf if I pass in a unicode delimiter. I don't think this is
Ah, interesting. I guess I was just working from our existing test cases -- I guess we don't have one for this. We can try to resolve it if you think it's important.

http://gerrit.cloudera.org:8080/#/c/15524/4/shell/make_shell_tarball.sh
File shell/make_shell_tarball.sh:

http://gerrit.cloudera.org:8080/#/c/15524/4/shell/make_shell_tarball.sh@52
PS4, Line 52: if [ "${USE_THRIFT11_GEN_PY:-}" == "false" ]; then
> Does it make sense to allow overriding this? Would a shell built with old t
Actually, that's a good point. Other tests pass, but unicode-related tests don't. We should probably disallow this.
http://gerrit.cloudera.org:8080/#/c/15524/4/shell/make_shell_tarball.sh@53 PS4, Line 53: # thrift 0.9.3-p7 > Not sure if we need this comment? Ack -- To view, visit http://gerrit.cloudera.org:8080/15524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615 Gerrit-Change-Number: 15524 Gerrit-PatchSet: 4 Gerrit-Owner: David Knupp Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 06:28:27 + Gerrit-HasComments: Yes
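For readers following the output_delimiter comment in the review above, here is a minimal Python 3 sketch of one way a delimiter option could be normalized so that both byte strings and unicode strings are accepted. This is not the code in shell/impala_shell.py; the function name normalize_delimiter and the single-character rule are assumptions made only for illustration.

```python
def normalize_delimiter(raw):
    """Return the delimiter as a one-character text string.

    Hypothetical helper, not part of impala-shell: accepts either bytes
    (decoded as UTF-8) or str, and rejects anything that is not exactly
    one character long after decoding.
    """
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    if len(text) != 1:
        raise ValueError("delimiter must be exactly one character")
    return text
```

Checking the length after decoding, rather than before, is what lets a multi-byte UTF-8 delimiter count as a single character.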
[Impala-ASF-CR] IMPALA-9545 Decide cacheline size of aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/1 ) Change subject: IMPALA-9545 Decide cacheline size of aarch64 .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5544/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/1 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f Gerrit-Change-Number: 1 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 25 Mar 2020 06:23:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9545 Decide cacheline size of aarch64
zhaoren...@hotmail.com has uploaded this change for review. ( http://gerrit.cloudera.org:8080/1 )

Change subject: IMPALA-9545 Decide cacheline size of aarch64
..

IMPALA-9545 Decide cacheline size of aarch64

The L3 cacheline size on ARM64 varies with the CPU vendor's architecture. If the user defines CACHELINESIZE_AARCH64 in impala-config-local.sh, that value is used. Otherwise the value is read from the OS, and if that fails, the default value of 64 is used.

Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f
---
M CMakeLists.txt
M be/src/gutil/port.h
M buildall.sh
3 files changed, 24 insertions(+), 0 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/1/2
--
To view, visit http://gerrit.cloudera.org:8080/1
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f
Gerrit-Change-Number: 1
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward
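The fallback order described in the IMPALA-9545 commit message (user override, then OS, then the default of 64) can be sketched as follows. This is only an illustration in Python, not the CMake or shell logic in the patch. The function names are hypothetical, and the use of the sysconf name SC_LEVEL3_CACHE_LINESIZE is an assumption, since the message does not say how the OS is queried.

```python
import os

DEFAULT_CACHELINE_SIZE = 64  # default named in the commit message


def probe_os_cacheline_size():
    """Ask the OS for a cacheline size; return None if it cannot be determined.

    Assumption: the value is read via sysconf. The patch may use another source.
    """
    sysconf = getattr(os, "sysconf", None)
    names = getattr(os, "sysconf_names", {})
    name = "SC_LEVEL3_CACHE_LINESIZE"
    if sysconf is None or name not in names:
        return None
    try:
        value = sysconf(name)
    except (OSError, ValueError):
        return None
    return value if value > 0 else None


def resolve_cacheline_size(env=os.environ, probe=probe_os_cacheline_size):
    """Apply the three-step order: override, then OS probe, then default."""
    override = env.get("CACHELINESIZE_AARCH64", "").strip()
    if override:
        return int(override)
    detected = probe()
    return detected if detected else DEFAULT_CACHELINE_SIZE
```

Passing the environment and the probe as arguments keeps each of the three branches easy to exercise without touching the real machine configuration.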
[Impala-ASF-CR] IMPALA-9373: more tactical IWYU fixes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15248 ) Change subject: IMPALA-9373: more tactical IWYU fixes .. Patch Set 11: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15248 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8de71866bdf3211e53560d9bfe930e7657c4d7f1 Gerrit-Change-Number: 15248 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 25 Mar 2020 03:37:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9373: more tactical IWYU fixes
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15248 )

Change subject: IMPALA-9373: more tactical IWYU fixes
..

IMPALA-9373: more tactical IWYU fixes

This is a grab-bag of fixes that I did with a mix of manual inspection. The techniques used were:
* Getting preprocessor output for a few files by modifying command lines from compile_commands.json to include -E. This is revealing because you see all the random unrelated cruft that gets pulled in. A useful one-liner to extract an (approximate) list of headers from preprocessor output is:
    grep '^#.*h' be/src/util/CMakeFiles/Util.dir/os-info.cc.i | \
      grep -o '".*"' | sort -u
* Looking at the IWYU recommendations for guidance on what headers can be removed (and what needs to be added).
* Grepping for includes of headers, especially in other headers where they become viral. An example one-liner to find these:
    git grep -l 'include.*' | grep '\.h$'

Non-exhaustive list of changes made:
---
Unnest classes from TmpFileMgr so we can forward-declare them. This lets us remove tmp-file-mgr.h from buffer-pool.h and query-state.h, which are both widely included headers in the codebase.

Also remove webserver.h from other headers, since it pulls in openssl-util.h and consequently a lot of openssl headers.

Avoid including runtime/multi-precision.h in other headers. It pulls in a lot of boost multiprecision headers that are only needed for internal implementations of math and decimal operations. This required replacing some references to int128_t with __int128_t, which I don't think significantly hurts code readability.

Also remove references to decimal-util.h where they're not needed, since it transitively pulls in multi-precision.h.

Reduce includes of boost/date_time modules, which are transitively included in many places via timestamp-value.h.

Remove transitive dependencies of timestamp-value.h to avoid pulling in remaining boost date_time headers where not needed.
Dependent headers are: scalar-expr-evaluator.h, expr-value.h

Remove references to debug-util.h in other headers, because it pulls in a lot of thread headers.

Remove references to llvm-codegen.h where possible, because it pulls in many llvm headers.

Other opportunities:
* boost/algorithm/string.hpp includes many string algorithms and pulls in a lot of headers.
* util/string-parser.h is a giant header with many dependencies.
* There's lots of redundancy between boost and standard c++ headers. Both pull in vast numbers of utility headers for C++ metaprogramming and similar things. If we reduced virality of boost headers this would help a lot, and also if we switch to equivalent standard headers where possible (e.g. unordered_map, unordered_set, function, bind, etc).

Compile time with clang/ASAN:
-
Before:
real 9m6.311s
user 62m25.006s
sys 2m44.798s
After:
real 8m17.073s
user 55m38.425s
sys 2m25.808s

Change-Id: I8de71866bdf3211e53560d9bfe930e7657c4d7f1
Reviewed-on: http://gerrit.cloudera.org:8080/15248
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
---
M be/src/benchmarks/atod-benchmark.cc M be/src/benchmarks/bloom-filter-benchmark.cc M be/src/benchmarks/overflow-benchmark.cc M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/codegen/llvm-codegen.cc M be/src/common/init.cc M be/src/common/logging.cc M be/src/common/logging.h M be/src/common/status.cc M be/src/common/thread-debug-info-test.cc M be/src/common/thread-debug-info.h M be/src/exec/aggregator.cc M be/src/exec/blocking-plan-root-sink.cc M be/src/exec/buffered-plan-root-sink.cc M be/src/exec/catalog-op-executor.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/filter-context.cc M be/src/exec/filter-context.h M be/src/exec/grouping-aggregator.cc M be/src/exec/hash-table-test.cc M be/src/exec/hdfs-avro-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M
be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/join-builder.cc M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scanner.cc M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-table-sink.h M be/src/exec/orc-column-readers.cc M be/src/exec/orc-column-readers.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-common.h M be/src/exec/parquet/parquet-version-test.cc M be/src/exec/partitioned-hash-join-builde
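The header-extraction one-liner in the IMPALA-9373 commit message above can also be sketched in Python. This is a rough equivalent, not a tool from the Impala tree. It relies on the linemarker format that GCC and Clang emit with -E (`# <line> "<file>" <flags>`), and the function name is made up for illustration. It is slightly stricter than the grep version because it only keeps paths ending in .h or .hpp.

```python
import re

# GCC/Clang linemarkers look like:  # 1 "/usr/include/stdio.h" 1 3 4
LINEMARKER = re.compile(r'^#\s*(?:line\s+)?\d+\s+"([^"]+)"')


def headers_from_preprocessed(lines):
    """Return the sorted, de-duplicated header paths named in linemarkers."""
    found = set()
    for line in lines:
        match = LINEMARKER.match(line)
        if match and match.group(1).endswith((".h", ".hpp")):
            found.add(match.group(1))
    return sorted(found)
```

A typical use would be to open a preprocessed .i file and pass the file object directly, since the function only needs an iterable of lines.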
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15306 ) Change subject: IMPALA-8690: Add LIRS cache eviction algorithm .. Patch Set 20: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5596/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 20 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 25 Mar 2020 03:33:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15306 ) Change subject: IMPALA-8690: Add LIRS cache eviction algorithm .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5595/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 19 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 25 Mar 2020 03:30:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5594/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 10 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 25 Mar 2020 03:16:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Hello Thomas Tauber-Marshall, Sahil Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15306

to look at the new patch set (#20).

Change subject: IMPALA-8690: Add LIRS cache eviction algorithm
..

IMPALA-8690: Add LIRS cache eviction algorithm

One concern for the data cache is that the LRU eviction algorithm is susceptible to being flushed by large scans of low priority data. This implements the LIRS algorithm described in "LIRS: An Efficient Low Inter-reference Recency Set Replacement Policy to Improve Buffer Cache Performance" by Song Jiang / Xiaodong Zhang 2002. LIRS is a scan-resistant eviction algorithm with a low performance penalty relative to LRU.

This introduces the startup flag data_cache_eviction_policy to control which eviction policy to use. The only two options are LRU and LIRS, with the default continuing to be LRU.

To accommodate the new algorithm and associated tests, some code moved around:
1. The RLCacheShard implementation moved from util/cache/cache.cc to util/cache/rl-cache.cc.
2. The backend cache tests were split into multiple files.
   util/cache/cache-test.h contains shared cache testing code.
   util/cache/cache-test.cc contains generic tests that should work for any algorithm.
   util/cache/rl-cache-test.cc contains RLCacheShard-specific tests.
   util/cache/lirs-cache-test.cc contains LIRS-specific tests.
3. To make it easy for clients of the cache code to customize the cache eviction algorithm, the public interface changed from using a template to taking the policy as an argument.
4. Cache::MemoryType is removed.
5. Cache adds an Init() method to verify the validity of startup flags.

Testing:
- Added LIRS-specific backend cache tests (lirs-cache-test)
- Ran TPC-DS with a very small cache and concurrency to test corner cases with the LIRS eviction policy
- Parameterized data-cache-test to run for both LRU and LIRS
- Added LIRS equivalents for tests in custom_cluster/test_data_cache.py
- Ran cache-bench with LRU and LIRS. The results are:

Test case           | Algorithm | Lookups / sec | Hit rate
ZIPFIAN ratio=1.00x | LRU       | 11.31M        | 99.9%
ZIPFIAN ratio=1.00x | LIRS      | 10.09M        | 99.8%
ZIPFIAN ratio=3.00x | LRU       | 11.36M        | 95.9%
ZIPFIAN ratio=3.00x | LIRS      | 9.27M         | 96.4%
UNIFORM ratio=1.00x | LRU       | 7.46M         | 99.8%
UNIFORM ratio=1.00x | LIRS      | 6.93M         | 99.8%
UNIFORM ratio=3.00x | LRU       | 5.63M         | 33.3%
UNIFORM ratio=3.00x | LIRS      | 3.24M         | 33.3%

The takeaway is that LIRS is a bit slower on lookups and quite a bit slower on inserts. However, they both are still doing millions of operations per second, so it should not be a bottleneck for the data cache.
Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/util/cache/CMakeLists.txt M be/src/util/cache/cache-bench.cc M be/src/util/cache/cache-internal.h M be/src/util/cache/cache-test.cc A be/src/util/cache/cache-test.h M be/src/util/cache/cache.cc M be/src/util/cache/cache.h A be/src/util/cache/lirs-cache-test.cc A be/src/util/cache/lirs-cache.cc A be/src/util/cache/rl-cache-test.cc A be/src/util/cache/rl-cache.cc M bin/rat_exclude_files.txt M tests/custom_cluster/test_data_cache.py 16 files changed, 2,665 insertions(+), 844 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/15306/20 -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 20 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall
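To make the motivation in the IMPALA-8690 commit message concrete, here is a small simulation of how a plain LRU cache gets flushed by a large scan. It is a toy model in Python, not the Impala data cache. The key names, the capacity, and the function simulate_lru are all invented for the example.

```python
from collections import OrderedDict


def simulate_lru(capacity, accesses):
    """Replay a sequence of key accesses against an LRU cache.

    Returns a list of booleans, one per access: True for a hit, False for a miss.
    """
    cache = OrderedDict()
    hits = []
    for key in accesses:
        if key in cache:
            cache.move_to_end(key)  # mark as most recently used
            hits.append(True)
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used key
            hits.append(False)
    return hits


hot = ["h1", "h2", "h3"]
scan = ["s%d" % i for i in range(10)]  # one pass over cold data
trace = hot + hot + scan + hot
result = simulate_lru(capacity=4, accesses=trace)
```

In this trace the second pass over the hot keys hits, but after the ten-key scan the final pass misses on every hot key, because the scan pushed them all out. A scan-resistant policy such as LIRS aims to keep keys with short reuse distances resident in that situation.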
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15306 )

Change subject: IMPALA-8690: Add LIRS cache eviction algorithm
..

Patch Set 18:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/15306/18/be/src/util/cache/lirs-cache.cc
File be/src/util/cache/lirs-cache.cc:

http://gerrit.cloudera.org:8080/#/c/15306/18/be/src/util/cache/lirs-cache.cc@77
PS18, Line 77: // If the key has only been accessed once, its reuse distance is considered infinite.
> if i'm reading the hdfs-file-reader.cc and data-cache.cc code correctly, it
This ends up being a problem mainly in corner cases. You are right that it would be better to handle this directly. I changed Lookup to take a LookupBehavior, which is either NORMAL or NO_UPDATE. It defaults to NORMAL. NO_UPDATE does not change any priorities. This replaces the defunct CacheBehavior argument. I added tests for NO_UPDATE in lirs-cache-test.cc and rl-cache-test.cc.

Here's a quick rundown of why this doesn't impact LIRS much. There are a few different cases:

Case 1: If there is a hit in ReadDataCache, then the entry is PROTECTED or UNPROTECTED, and we won't go on to WriteDataCache.

Case 2: If there is a miss in ReadDataCache and it is completely missing from the cache, then when we switch over to WriteDataCache, its lookup doesn't change the cache at all. There is no entry in the cache, so it doesn't matter. (First corner case: if something else inserted an entry between us doing ReadDataCache and WriteDataCache, this would count as a lookup and could bump its priority.)

Case 3: If there is a miss in ReadDataCache and it is a TOMBSTONE entry, then Lookup doesn't modify anything. Lookup doesn't move TOMBSTONE entries around.

Case 4: If there is a partial hit where we wanted to read length 1024 and there was length 512, then the Lookup from ReadDataCache will bump the priority of the original 512 entry (correctly), then it would go try to insert the full 1024 length entry.
The Lookup in WriteDataCache would bump its priority again. This is another corner case. http://gerrit.cloudera.org:8080/#/c/15306/18/be/src/util/cache/lirs-cache.cc@150 PS18, Line 150: ref_count > nit: could you add some docs for 'ref_count' its not clear to me when it ne I added a comment down at the ref_count field that gives a description. http://gerrit.cloudera.org:8080/#/c/15306/18/be/src/util/cache/lirs-cache.cc@411 PS18, Line 411: HandleTable table_; > nit: add docs Added a one-line comment -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 18 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 25 Mar 2020 02:46:07 + Gerrit-HasComments: Yes
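The NORMAL versus NO_UPDATE distinction described in the review reply above can be illustrated with a toy LRU cache. This is a Python sketch of the idea only; the real change is in Impala's C++ cache code, and the class TinyLruCache and its methods are hypothetical names. The only thing taken from the review is that a NO_UPDATE lookup returns the entry without changing any priorities.

```python
from collections import OrderedDict
from enum import Enum


class LookupBehavior(Enum):
    NORMAL = 1      # a hit refreshes the entry's priority
    NO_UPDATE = 2   # a hit leaves priorities untouched


class TinyLruCache:
    """Toy LRU cache used only to show the effect of LookupBehavior."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # oldest entry first

    def insert(self, key, value):
        if key in self._entries:
            self._entries.move_to_end(key)
        self._entries[key] = value
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the oldest entry

    def lookup(self, key, behavior=LookupBehavior.NORMAL):
        if key not in self._entries:
            return None
        if behavior is LookupBehavior.NORMAL:
            self._entries.move_to_end(key)
        return self._entries[key]

    def keys_oldest_first(self):
        return list(self._entries)
```

With a capacity of two, inserting "a" and "b", looking up "a" with NO_UPDATE, and then inserting "c" evicts "a". The same sequence with a NORMAL lookup evicts "b" instead, which is the double-bump effect the review discussion is trying to avoid on the write path.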
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Hello Thomas Tauber-Marshall, Sahil Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15306

to look at the new patch set (#19).

Change subject: IMPALA-8690: Add LIRS cache eviction algorithm
..

IMPALA-8690: Add LIRS cache eviction algorithm

One concern for the data cache is that the LRU eviction algorithm is susceptible to being flushed by large scans of low priority data. This implements the LIRS algorithm described in "LIRS: An Efficient Low Inter-reference Recency Set Replacement Policy to Improve Buffer Cache Performance" by Song Jiang / Xiaodong Zhang 2002. LIRS is a scan-resistant eviction algorithm with a low performance penalty relative to LRU.

This introduces the startup flag data_cache_eviction_policy to control which eviction policy to use. The only two options are LRU and LIRS, with the default continuing to be LRU.

To accommodate the new algorithm and associated tests, some code moved around:
1. The RLCacheShard implementation moved from util/cache/cache.cc to util/cache/rl-cache.cc.
2. The backend cache tests were split into multiple files.
   util/cache/cache-test.h contains shared cache testing code.
   util/cache/cache-test.cc contains generic tests that should work for any algorithm.
   util/cache/rl-cache-test.cc contains RLCacheShard-specific tests.
   util/cache/lirs-cache-test.cc contains LIRS-specific tests.
3. To make it easy for clients of the cache code to customize the cache eviction algorithm, the public interface changed from using a template to taking the policy as an argument.
4. Cache::MemoryType is removed.
5. Cache adds an Init() method to verify the validity of startup flags.

Testing:
- Added LIRS-specific backend cache tests (lirs-cache-test)
- Ran TPC-DS with a very small cache and concurrency to test corner cases with the LIRS eviction policy
- Parameterized data-cache-test to run for both LRU and LIRS
- Added LIRS equivalents for tests in custom_cluster/test_data_cache.py
- Ran cache-bench with LRU and LIRS. The results are:

Test case           | Algorithm | Lookups / sec | Hit rate
ZIPFIAN ratio=1.00x | LRU       | 11.31M        | 99.9%
ZIPFIAN ratio=1.00x | LIRS      | 10.09M        | 99.8%
ZIPFIAN ratio=3.00x | LRU       | 11.36M        | 95.9%
ZIPFIAN ratio=3.00x | LIRS      | 9.27M         | 96.4%
UNIFORM ratio=1.00x | LRU       | 7.46M         | 99.8%
UNIFORM ratio=1.00x | LIRS      | 6.93M         | 99.8%
UNIFORM ratio=3.00x | LRU       | 5.63M         | 33.3%
UNIFORM ratio=3.00x | LIRS      | 3.24M         | 33.3%

The takeaway is that LIRS is a bit slower on lookups and quite a bit slower on inserts. However, they both are still doing millions of operations per second, so it should not be a bottleneck for the data cache.
Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/util/cache/CMakeLists.txt M be/src/util/cache/cache-bench.cc M be/src/util/cache/cache-internal.h M be/src/util/cache/cache-test.cc A be/src/util/cache/cache-test.h M be/src/util/cache/cache.cc M be/src/util/cache/cache.h A be/src/util/cache/lirs-cache-test.cc A be/src/util/cache/lirs-cache.cc A be/src/util/cache/rl-cache-test.cc A be/src/util/cache/rl-cache.cc M bin/rat_exclude_files.txt M tests/custom_cluster/test_data_cache.py 16 files changed, 2,665 insertions(+), 844 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/15306/19 -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 19 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true
Fang-Yu Rao has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true .. IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true This patch bumps up the version of Guava libraries from 14.0.1 to 28.1-jre. Due to some changes in Guava's API's, we modify the call sites accordingly. Moreover, in order to instruct the Java classes under the directory of $IMPALA_HOME/common/yarn-extras to use the new Guava libraries, we explicitly added a dependency in the corresponding pom.xml file. On the other hand, we set DISABLE_SENTRY to true regardless of $USE_CDP_HIVE since Sentry's Guava version has not been bumped up yet and thus run-sentry-service.sh cannot be successfully executed. Recall that by setting DISABLE_SENTRY to true we also disable every Sentry-related test, which is fine from now on since Impala 3.4 was recently branched. The plan is to drop support for Sentry in the Impala 4 line. 
Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 --- M bin/impala-config.sh M common/yarn-extras/pom.xml M common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/catalog/Column.java M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java M fe/src/main/java/org/apache/impala/catalog/DataSource.java M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M 
fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java M impala-parent/pom.xml 48 files changed, 103 insertions(+), 94 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/15214/10 -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 10 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-9538 Bump up linux-syscall-support.h
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15510 ) Change subject: IMPALA-9538 Bump up linux-syscall-support.h .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5593/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 Gerrit-Change-Number: 15510 Gerrit-PatchSet: 10 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 02:14:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15533 ) Change subject: IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688 .. IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688 This patch bumps up CDP_BUILD_NUMBER to 2244454 which contains a change introduced by RANGER-2688. Due to this change, we added to ranger-admin-site.xml.template a cookie-related configuration so that the Ranger server could be properly started. Testing: Verified that the data loading passes and that all the Ranger-related FE and E2E tests are successful - when $USE_CDP_HIVE is false, and - when $USE_CDP_HIVE is true. Change-Id: I7750f73834368c7109965e78b147238fc6316f49 Reviewed-on: http://gerrit.cloudera.org:8080/15533 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M bin/impala-config.sh M testdata/cluster/ranger/ranger-admin-site.xml.template 2 files changed, 13 insertions(+), 9 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7750f73834368c7109965e78b147238fc6316f49 Gerrit-Change-Number: 15533 Gerrit-PatchSet: 5 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15533 ) Change subject: IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688 .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7750f73834368c7109965e78b147238fc6316f49 Gerrit-Change-Number: 15533 Gerrit-PatchSet: 4 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 25 Mar 2020 02:10:39 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15511 )

Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table.
..

Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h
File be/src/exec/hash-table.inline.h:

http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@138
PS3, Line 138: if (curr_bucket->filled) {
> I just remember that it is possible to swap with an empty bucket, especiall
Sorry, this example is bad, because the final balanced buckets should be
[-1, 1, 1, 2]

A better example might be a table of size 8 with quadratic probing and incoming elements 1, 1, 1, 1, 7, 7.

-- To view, visit http://gerrit.cloudera.org:8080/15511 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 Gerrit-Change-Number: 15511 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 01:50:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.
David Rorke has posted comments on this change. ( http://gerrit.cloudera.org:8080/15511 ) Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG@14 PS3, Line 14: Instead of proactively swapping elements during insertion, the > Yes, my intuition is also that doing the additional work to create a separa Also I think when probing for lookup we might consider trying the "start in the middle / smart search" approach described here: https://programming.guide/robin-hood-hashing.html The tradeoff is it might be less cache friendly. But it might be best to explore this as a follow on change later just to keep the scope of the initial change manageable. -- To view, visit http://gerrit.cloudera.org:8080/15511 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 Gerrit-Change-Number: 15511 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 01:43:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.
David Rorke has posted comments on this change. ( http://gerrit.cloudera.org:8080/15511 ) Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table. .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG@14 PS3, Line 14: Instead of proactively swapping elements during insertion, the > Having separate Probe() function that is special for insert might remove th Yes, my intuition is also that doing the additional work to create a separate single pass probe for insert and also making sure we maintain the distance invariant in all cases are both worthwhile. http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h File be/src/exec/hash-table.inline.h: http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@162 PS3, Line 162: inline int64_t HashTable::BucketDistance( > I'll see if I can put O(1) distance measurement, at least for the linear pr Wonder if we should consider storing the bucket distance (calculated during insert/rebalance) in the bucket itself. This might work against the goals of IMPALA-7635 but with a carefully packed memory layout maybe we could still squeeze in a byte sized value for the distance. Seems intuitively like a byte should be enough to store most distances, and could fall back to calculating the distance via probe in rare cases where the distance won't fit in a byte. Of course if you can figure out a good enough O(1) approach that doesn't require any storage that might be simpler. Regarding linear vs quadratic I'd also hope that robin-hood doesn't require quadratic. Maybe we should just do some benchmarking and see if there's a clear winner between different probing algorithms and if so go with that. 
-- To view, visit http://gerrit.cloudera.org:8080/15511 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 Gerrit-Change-Number: 15511 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 01:36:37 + Gerrit-HasComments: Yes
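The byte-sized distance idea from the comment above could look roughly like the sketch below. It is only an illustration in Python, not code from the patch: the names `DISTANCE_OVERFLOW`, `encode_distance` and `bucket_distance` are hypothetical, and the choice of 255 as the "does not fit" marker is an assumption.

```python
# Hypothetical sketch: cache each bucket's probe distance in a single byte.
# The value 255 is reserved (an assumption) to mean "did not fit in a byte".
DISTANCE_OVERFLOW = 255


def encode_distance(distance):
    """Return the byte value to store in the bucket for this distance."""
    if distance < DISTANCE_OVERFLOW:
        return distance
    return DISTANCE_OVERFLOW


def bucket_distance(stored_byte, recompute_by_probe):
    """Return the bucket's distance, re-probing only when the byte overflowed.

    recompute_by_probe is a caller-supplied function standing in for the
    existing probe-based distance calculation.
    """
    if stored_byte != DISTANCE_OVERFLOW:
        return stored_byte  # common case: O(1) read of the cached byte
    return recompute_by_probe()  # rare case: fall back to probing
```

With this layout the common case reads one byte per bucket, and the probe-based calculation only runs for the rare buckets whose distance is 255 or more.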
[Impala-ASF-CR] IMPALA-9538 Bump up linux-syscall-support.h
zhaoren...@hotmail.com has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/15510 ) Change subject: IMPALA-9538 Bump up linux-syscall-support.h .. IMPALA-9538 Bump up linux-syscall-support.h Bump up linux-syscall-support.h to the newest version, which supports aarch64. Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 --- M be/src/gutil/linux_syscall_support.h M be/src/gutil/spinlock_linux-inl.h M be/src/kudu/util/debug-util.cc 3 files changed, 1,746 insertions(+), 891 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/15510/10 -- To view, visit http://gerrit.cloudera.org:8080/15510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 Gerrit-Change-Number: 15510 Gerrit-PatchSet: 10 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15511 )

Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table.
..

Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h
File be/src/exec/hash-table.inline.h:

http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@138
PS3, Line 138: if (curr_bucket->filled) {
> Good point, I must have miss this while I refine my code. Will change this
I just remembered that it is possible to swap with an empty bucket, especially in quadratic_probing mode.

Consider a hash table with 4 buckets, quadratic probing, and incoming elements 1, 1, 2, in that order. Let's say -1 represents an empty bucket and the hash of an element is equal to the element itself. Initially the buckets look like this:
[-1, -1, -1, -1]

After 1 and 1 are inserted into the table, the buckets look like this:
[-1, 1, 1, -1]

Now, when we want to insert 2 into the table, this algorithm will put 2 at the last empty bucket in the probe sequence, that is, index 3:
[-1, 1, 1, 2]

Then we rebalance by swapping 2 at index 3 with 1 at index 2:
[-1, 1, 2, 1]

Now the element 1 at index 3 is misplaced, because if we follow quadratic_probing, the appropriate probe sequence for 1 should be 1, 2, 0 (that is, (2 + 2) mod 4), 3, and so on. So we need to swap 1 at index 3 with the empty bucket at index 0:
[1, 1, 2, -1]

At this point, bucket index 3 is an empty bucket and the rebalancing finishes.

This is kind of counterintuitive, because most literature shows that robin-hood should be paired with linear probing. And with linear probing, this case will not happen.
-- To view, visit http://gerrit.cloudera.org:8080/15511 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 Gerrit-Change-Number: 15511 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 01:28:31 + Gerrit-HasComments: Yes
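The probe order used in the example above (1, 2, 0, 3 for an element whose home bucket is 1) can be reproduced with a short Python sketch. It assumes the step grows by one on every probe, which is what the "(2 + 2) mod 4" in the message implies; the function names are hypothetical and this is not the code under review.

```python
def quadratic_probe_sequence(home, num_buckets):
    """Return bucket indexes in probe order, starting at the home bucket.

    Assumes the step grows by one per probe (1, 2, 3, ...), as inferred
    from the example in the review comment.
    """
    sequence = []
    idx = home
    for step in range(1, num_buckets + 1):
        sequence.append(idx)
        idx = (idx + step) % num_buckets
    return sequence


def linear_probe_sequence(home, num_buckets):
    """Return bucket indexes in probe order for plain linear probing."""
    return [(home + i) % num_buckets for i in range(num_buckets)]


print(quadratic_probe_sequence(1, 4))  # [1, 2, 0, 3]
print(linear_probe_sequence(1, 4))     # [1, 2, 3, 0]
```

Under quadratic probing, index 0 comes before index 3 for home bucket 1, so an element sitting at index 3 while index 0 is empty is out of place. Under linear probing, index 3 comes before index 0, so the same layout would be valid.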
[Impala-ASF-CR] [WIP]IMPALA-9538 Bump up linux-syscall-support.h
zhaoren...@hotmail.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/15510 ) Change subject: [WIP]IMPALA-9538 Bump up linux-syscall-support.h .. Patch Set 9: Hi, Tim, you are right, it is from https://chromium.googlesource.com/linux-syscall-support/ And I will remove WIP asap. Thanks -- To view, visit http://gerrit.cloudera.org:8080/15510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 Gerrit-Change-Number: 15510 Gerrit-PatchSet: 9 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Mar 2020 01:28:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15524 )

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
..

Patch Set 4:

(5 comments)

I did a pass over it. I'm not sure that I fully understood all the details, so might want to do another pass later on, but I think relying on the end-to-end tests to validate this makes sense.

One thing was that I couldn't get the built tarball to work with python 3 on my system.

http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG@121
PS4, Line 121: processes.
I'm running into an issue where if I run the packaged shell with python3, sqlparse barfs:

tarmstrong@tarmstrong-box2:~/Impala/impala$ ./shell/build/impala-shell-3.4.0-SNAPSHOT/impala-shell -B --output_delimiter='??'
Traceback (most recent call last):
  File "/home/tarmstrong/Impala/impala/shell/build/impala-shell-3.4.0-SNAPSHOT/impala_shell.py", line 35, in
    import sqlparse
  File "", line 971, in _find_and_load
  File "", line 955, in _find_and_load_unlocked
  File "", line 656, in _load_unlocked
  File "", line 626, in _load_backward_compatible
  File "/home/tarmstrong/Impala/impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/__init__.py", line 13, in
  File "", line 971, in _find_and_load
  File "", line 955, in _find_and_load_unlocked
  File "", line 656, in _load_unlocked
  File "", line 626, in _load_backward_compatible
  File "/home/tarmstrong/Impala/impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/engine/__init__.py", line 8, in
  File "", line 971, in _find_and_load
  File "", line 951, in _find_and_load_unlocked
  File "", line 894, in _find_spec
  File "", line 1157, in find_spec
  File "", line 1131, in _get_spec
  File "", line 1112, in _legacy_get_spec
  File "", line 441, in spec_from_loader
  File "", line 544, in spec_from_file_location
  File "/home/tarmstrong/Impala/impala/shell/build/impala-shell-3.4.0-SNAPSHOT/ext-py/sqlparse-0.1.19-py2.7.egg/sqlparse/lexer.py", line 84
    except Exception, err:
    ^

I guess the shell tarball only supports python 2 at the moment and the pip-installed package can run with newer versions of the dependency?

http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@1633
PS4, Line 1633: if not isinstance(input_string, str):
nit: could express as a single if with an "and" clause instead of nested ifs

OK to ignore

http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@1790
PS4, Line 1790: if isinstance(options.output_delimiter, str):
This seems to barf if I pass in a unicode delimiter. I don't think this is important but wanted to pass it along.

$ ./shell/build/impala-shell-3.4.0-SNAPSHOT/impala-shell -B --output_delimiter='??'
/home/tarmstrong/Impala/impala/shell/build/impala-shell-3.4.0-SNAPSHOT/lib/option_parser.py:325: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if '--live_progress' in sys.argv and '--disable_live_progress' in sys.argv:
/home/tarmstrong/Impala/impala/shell/build/impala-shell-3.4.0-SNAPSHOT/lib/option_parser.py:328: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if '--verbose' in sys.argv and '--quiet' in sys.argv:
Traceback (most recent call last):
  File "/home/tarmstrong/Impala/impala/shell/build/impala-shell-3.4.0-SNAPSHOT/impala_shell.py", line 1940, in
    impala_shell_main()
  File "/home/tarmstrong/Impala/impala/shell/build/impala-shell-3.4.0-SNAPSHOT/impala_shell.py", line 1791, in impala_shell_main
    delim_sequence = bytearray(options.output_delimiter, 'utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 0: ordinal not in range(128)
http://gerrit.cloudera.org:8080/#/c/15524/4/shell/make_shell_tarball.sh File shell/make_shell_tarball.sh: http://gerrit.cloudera.org:8080/#/c/15524/4/shell/make_shell_tarball.sh@52 PS4, Line 52: if [ "${USE_THRIFT11_GEN_PY:-}" == "false" ]; then Does it make sense to allow overriding this? Would a shell built with old thrift even work? http://gerrit.cloudera.org:8080/#/c/15524/4/shell/make_shell_tarball.sh@53 PS4, Line 53: # thrift 0.9.3-p7 Not sure if we need this comment? -- To view, visit http://gerrit.cloudera.org:8080/15524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615 Gerrit-Change-Number: 15524 Gerrit-PatchSet: 4 Gerrit-Owner: David Knupp Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: D
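The `UnicodeDecodeError` reported for `bytearray(options.output_delimiter, 'utf-8')` in the message above is what Python 2 raises when a byte string containing non-ASCII bytes is passed together with an encoding. One possible way to handle both interpreters is sketched below. This is not the fix adopted in the patch; the helper name `delimiter_to_bytes` is hypothetical.

```python
def delimiter_to_bytes(delimiter):
    """Return the delimiter as a bytearray on both Python 2 and Python 3.

    Hypothetical helper for illustration; not part of impala_shell.py.
    """
    if isinstance(delimiter, bytes):
        # On Python 2, command-line arguments are already byte strings.
        # Passing an encoding here would trigger an implicit ASCII decode,
        # which fails for bytes such as 0xf0.
        return bytearray(delimiter)
    # Text (unicode) strings have to be encoded explicitly.
    return bytearray(delimiter, 'utf-8')
```

On Python 3 the command-line value arrives as text, so only the second branch is taken there. On Python 2, `bytes` is an alias for `str`, so the first branch covers the failing case.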
[Impala-ASF-CR] IMPALA-9401: primitive include-what-you-use script and mappings
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15552 ) Change subject: IMPALA-9401: primitive include-what-you-use script and mappings .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5592/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15552 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaf5f9ba865313afb0c581e6482514ef7f1c65367 Gerrit-Change-Number: 15552 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Wed, 25 Mar 2020 00:46:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9401: primitive include-what-you-use script and mappings
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15552 ) Change subject: IMPALA-9401: primitive include-what-you-use script and mappings .. IMPALA-9401: primitive include-what-you-use script and mappings This is a cleaned up version of the script I used to run include-what-you-use on the Impala codebase. The helper script assumes you have built IWYU and have a Kudu source checkout, and can then run IWYU on the entire codebase. Some mappings files are used to improve the quality of the IWYU output. There are still incorrect recommendations made, but this is sufficient to fix many common issues. Change-Id: Iaf5f9ba865313afb0c581e6482514ef7f1c65367 --- A bin/iwyu/iwyu.sh A bin/iwyu/iwyu_mappings.imp 2 files changed, 112 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/15552/1 -- To view, visit http://gerrit.cloudera.org:8080/15552 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iaf5f9ba865313afb0c581e6482514ef7f1c65367 Gerrit-Change-Number: 15552 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-8870: Bump up guava version from 14.0.1 to 28.1-jre
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up guava version from 14.0.1 to 28.1-jre .. Patch Set 8: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/15214/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15214/8//COMMIT_MSG@17 PS8, Line 17: DISABLE_SENTRY Please also mention this in the first line. http://gerrit.cloudera.org:8080/#/c/15214/8//COMMIT_MSG@19 PS8, Line 19: and thus run-sentry-service.sh cannot be successfully executed. Can you add some more information about this? For example: - This disables every Sentry related test. - The plan is to remove Sentry support in the Impala 4 line. Since Impala 3.4 was recently branched, it is ok to break Sentry from now on. -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 8 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 24 Mar 2020 23:58:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9537: Add LDAP auth to the webui
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15538 )

Change subject: IMPALA-9537: Add LDAP auth to the webui
..

Patch Set 1:

(3 comments)

This makes sense to me as the authentication piece of the solution. I had a few readability comments but no concerns about the logic or testing.

http://gerrit.cloudera.org:8080/#/c/15538/1/be/src/util/webserver.cc
File be/src/util/webserver.cc:

http://gerrit.cloudera.org:8080/#/c/15538/1/be/src/util/webserver.cc@122
PS1, Line 122: DEFINE_bool(webserver_require_ldap, false,
Maybe in the help briefly explain the interaction between the different kinds of auth - is it that clients need to authenticate with only one of the enabled mechanisms?

edit: oh i guess setting both is disallowed

http://gerrit.cloudera.org:8080/#/c/15538/1/be/src/util/webserver.cc@581
PS1, Line 581: AddCookie(request_info, &response_headers);
It feels a little weird that we don't set authenticated = true here. The control flow doesn't require it, so we don't need to add unnecessary logic. Maybe it would be clearer with a comment, or if the ldap and spnego branches were made obviously mutually exclusive. E.g.

  if (!authenticated && FLAGS_spnego) {
  } else if (!authenticated && FLAGS_ldap) {
  }

or

  if (!authenticated) {
    if (FLAGS_spnego) {
    } else if (FLAGS_ldap) {
    }
  }

http://gerrit.cloudera.org:8080/#/c/15538/1/fe/src/test/java/org/apache/impala/customcluster/CustomClusterRunner.java
File fe/src/test/java/org/apache/impala/customcluster/CustomClusterRunner.java:

http://gerrit.cloudera.org:8080/#/c/15538/1/fe/src/test/java/org/apache/impala/customcluster/CustomClusterRunner.java@59
PS1, Line 59: LOG.info(IOUtils.toString(p.getInputStream()));
I guess this is good to aid debugging. Maybe merits a one line comment to explain what it's doing?
-- To view, visit http://gerrit.cloudera.org:8080/15538 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e92481929f2f06898b8496233ab4134792c9f10 Gerrit-Change-Number: 15538 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 23:31:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8870: Bump up guava version from 14.0.1 to 28.1-jre
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up guava version from 14.0.1 to 28.1-jre .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5591/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 8 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 24 Mar 2020 22:53:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9373: more tactical IWYU fixes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15248 ) Change subject: IMPALA-9373: more tactical IWYU fixes .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15248 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8de71866bdf3211e53560d9bfe930e7657c4d7f1 Gerrit-Change-Number: 15248 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 24 Mar 2020 22:46:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9373: more tactical IWYU fixes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15248 ) Change subject: IMPALA-9373: more tactical IWYU fixes .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5542/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15248 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8de71866bdf3211e53560d9bfe930e7657c4d7f1 Gerrit-Change-Number: 15248 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 24 Mar 2020 22:46:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9373: more tactical IWYU fixes
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15248 ) Change subject: IMPALA-9373: more tactical IWYU fixes .. Patch Set 10: Code-Review+2 This looks good to me. -- To view, visit http://gerrit.cloudera.org:8080/15248 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8de71866bdf3211e53560d9bfe930e7657c4d7f1 Gerrit-Change-Number: 15248 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 24 Mar 2020 22:14:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15524 ) Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3. .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5590/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615 Gerrit-Change-Number: 15524 Gerrit-PatchSet: 4 Gerrit-Owner: David Knupp Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 22:13:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8870: Bump up guava version from 14.0.1 to 28.1-jre
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up guava version from 14.0.1 to 28.1-jre .. Patch Set 8: Hi all, please review the revised patch and let me know if you have any additional suggestions or comments. Thanks! -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 8 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 24 Mar 2020 22:13:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8870: Bump up guava version from 14.0.1 to 28.1-jre
Fang-Yu Rao has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/15214 ) Change subject: IMPALA-8870: Bump up guava version from 14.0.1 to 28.1-jre .. IMPALA-8870: Bump up guava version from 14.0.1 to 28.1-jre This patch bumps up the version of Guava libraries from 14.0.1 to 28.1-jre. Due to some changes in Guava's API's, we modify the call sites accordingly. Moreover, in order to instruct the Java classes under the directory of $IMPALA_HOME/common/yarn-extras to use the new Guava libraries, we also explicitly added a dependency in the corresponding pom.xml file. On the other hand, we set DISABLE_SENTRY to true regardless of $USE_CDP_HIVE since Sentry's Guava version has not been bumped up yet and thus run-sentry-service.sh cannot be successfully executed. Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 --- M bin/impala-config.sh M common/yarn-extras/pom.xml M common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java M 
fe/src/main/java/org/apache/impala/analysis/NullLiteral.java M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/catalog/Column.java M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java M fe/src/main/java/org/apache/impala/catalog/DataSource.java M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java M impala-parent/pom.xml 48 files changed, 103 insertions(+), 94 deletions(-) git pull 
ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/15214/8 -- To view, visit http://gerrit.cloudera.org:8080/15214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68 Gerrit-Change-Number: 15214 Gerrit-PatchSet: 8 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-9107: Add timestamp to maven logging options.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15537 ) Change subject: IMPALA-9107: Add timestamp to maven logging options. .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15537 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e Gerrit-Change-Number: 15537 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Tue, 24 Mar 2020 21:58:47 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15511 )

Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table.
..

Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h
File be/src/exec/hash-table.inline.h:

http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@374
PS3, Line 374: table_->PrepareBucketForInsert(bucket_idx_, hash);
> And hash-table suppose to be thread-safe for read access.
Yeah, it definitely needs to be thread-safe if multiple threads are reading from it, but mutations (like SetTuple()) do not need to be thread-safe. We can also document SetTuple() as invalidating other iterators, because we don't depend on that either.

SetTuple() is only used by the hash aggregator - see be/src/exec/grouping-aggregator-ir.cc. The algorithm is basically this:

  it = ht->FindBucket(...);
  if (found in hash table) {
    // Merge into the existing intermediate tuple
    UpdateTuple(it, input_row)
  } else {
    // Try to construct a new intermediate tuple
    new_tuple = TryConstructTuple()
    if (tuple construction failed due to OOM) {
      SpillRow(input_row)
    } else {
      it.SetTuple(new_tuple)
      UpdateTuple(it, input_row)
    }
  }

Instead of FindBucket()/SetTuple() you could do Find()/Insert() but that would probe the hash table twice.

-- To view, visit http://gerrit.cloudera.org:8080/15511 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 Gerrit-Change-Number: 15511 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 21:42:28 + Gerrit-HasComments: Yes
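The single-probe find-or-fill pattern described in the message above can be illustrated with a small Python sketch. `TinyTable` is a hypothetical linear-probing table, not Impala's HashTable; it assumes the table never fills up and that keys are never `None`, and it leaves out the OOM/spill branch entirely.

```python
class TinyTable(object):
    """Hypothetical linear-probing table used only to illustrate the idea."""

    def __init__(self, num_buckets):
        self.keys = [None] * num_buckets    # None marks an empty bucket
        self.tuples = [None] * num_buckets  # intermediate aggregate values
        self.probes = 0                     # probe passes, for illustration

    def find_bucket(self, key):
        """Return (index, found): the key's bucket, or the empty one to fill."""
        self.probes += 1
        idx = hash(key) % len(self.keys)
        while self.keys[idx] is not None and self.keys[idx] != key:
            idx = (idx + 1) % len(self.keys)
        return idx, self.keys[idx] == key

    def set_tuple(self, idx, key, value):
        """Fill the bucket located by find_bucket without probing again."""
        self.keys[idx] = key
        self.tuples[idx] = value


def aggregate(table, rows):
    """Sum values per key using one probe pass per input row."""
    for key, value in rows:
        idx, found = table.find_bucket(key)
        if not found:
            table.set_tuple(idx, key, 0)  # construct a new intermediate tuple
        table.tuples[idx] += value        # merge the row into the tuple
```

For example, `aggregate(TinyTable(8), [(1, 10), (9, 5), (1, 2)])` performs three probe passes for three rows. A separate find followed by an insert would probe again for each of the two new keys.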
[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
Hello Abhishek Rawat, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15524 to look at the new patch set (#4). Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3. ..
IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
This is the main patch for making the impala-shell cross-compatible with python 2 and python 3. The goal is to wind up with a version of the shell that will pass python e2e tests irrespective of the version of python used to launch the shell, under the assumption that the test framework itself will continue to run with python 2.7.x for the time being.
Notable changes for reviewers to consider:
- With regard to validating the patch, my assumption is that simply passing the existing set of e2e shell tests is sufficient to confirm that the shell is functioning properly. No new tests were added.
- A new pytest command line option was added in conftest.py to enable a user to specify a path to an alternate impala-shell executable to test. It's possible to use this to point to an instance of the impala-shell that was installed as a standalone python package in a separate virtualenv. Example usage:
    USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=//bin/impala-shell -sv shell/test_shell_commandline.py
  The target virtualenv may be based on either python3 or python2. However, this has no effect on the version of python used to run the test framework, which remains tied to python 2.7.x for the foreseeable future.
- The $IMPALA_HOME/bin/impala-shell.sh now sets up the impala-shell python environment independently from bin/set-pythonpath.sh. The default version of thrift is thrift-0.11.0 (See IMPALA-9489).
- The wording of the header changed a bit to include the python version used to run the shell.
    Starting Impala Shell with no authentication using Python 3.7.5
    Opened TCP connection to localhost:21000
    ...
    OR
    Starting Impala Shell with LDAP-based authentication using Python 2.7.12
    Opened TCP connection to localhost:21000
    ...
- By far, the biggest hassle has been juggling str versus unicode versus bytes data types. Python 2.x was fairly loose and inconsistent in how it dealt with strings. As a quick demo of what I mean:
    Python 2.7.12 (default, Nov 12 2018, 14:36:49)
    [GCC 5.4.0 20160609] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> d = 'like a duck'
    >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8')
    True
  ...and yet there are weird unexpected gotchas.
    >>> d.decode('utf-8') == d.encode('utf-8')
    True
    >>> d.encode('utf-8') == bytearray(d, 'utf-8')
    True
    >>> d.decode('utf-8') == bytearray(d, 'utf-8')  # fails the eq property?
    False
  As a result, this inconsistency was reflected in the way we handled strings in the impala-shell code, but things still just worked.
  In python3, there's a much clearer distinction between strings and bytes, and as such, much tighter type consistency is expected by libraries like subprocess, re, sqlparse, prettytable, etc., which are used throughout the shell. Even simple calls that worked in python 2.x:
    >>> import re
    >>> re.findall('foo', b'foobar')
    ['foo']
  ...can throw exceptions in python 3.x:
    >>> import re
    >>> re.findall('foo', b'foobar')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall
        return _compile(pattern, flags).findall(string)
    TypeError: cannot use a string pattern on a bytes-like object
  Exceptions like this resulted in many, if not most, shell tests failing under python 3. What ultimately seemed like a better approach was to try to weed out as many existing spurious str.encode() and str.decode() calls as I could, and try to implement what has colloquially been called a "unicode sandwich" -- namely, "bytes on the outside, unicode on the inside, encode/decode at the edges."
  The primary spot in the shell where we call decode() now is when sanitising input...
    args = self.sanitise_input(args.decode('utf-8'))
  ...and also whenever a library like re required it. Similarly, str.encode() is primarily used where a library like readline or csv requires it.
- PYTHONIOENCODING needs to be set to utf-8 to override the default setting for python 2. Without this, piping or redirecting stdout results in unicode errors.
- from __future__ import unicode_literals was added throughout
Testing: To test the changes, I ran the e2e shell tests the way we always do (against the normal build tarball), and then I set up a python 3 virtual env with the shell installed as a package, and manually ran the tests against that. No effort has been made at
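As a rough illustration of the "unicode sandwich" described in the commit message, the Python 3 sketch below decodes once on the way in, works on text in the middle, and encodes once on the way out. The helper names (read_query, sanitise_text, write_result) are made up for this example and are not the shell's actual functions; in particular, sanitise_text is only a stand-in for whatever sanitise_input() really does.

```python
import io


def read_query(raw_bytes, encoding="utf-8"):
    """Input edge: decode bytes to text exactly once."""
    return raw_bytes.decode(encoding)


def sanitise_text(text):
    """Inside: operate on text only (drop trailing ';' and collapse whitespace)."""
    return " ".join(text.rstrip("; \t\n").split())


def write_result(stream, text, encoding="utf-8"):
    """Output edge: encode text to bytes exactly once."""
    stream.write(text.encode(encoding))


# Bytes arrive from the outside world, text is used in the middle,
# and bytes leave again at the other edge.
out = io.BytesIO()
query = sanitise_text(read_query("select  'ünïcode' ;".encode("utf-8")))
write_result(out, query)
```

Keeping every encode() and decode() call at these two edges is what makes it practical to audit the rest of the code for accidental str/bytes mixing.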
[Impala-ASF-CR] IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15533 ) Change subject: IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688 .. Patch Set 4: Last GVO failed due to IMPALA-9550 -- To view, visit http://gerrit.cloudera.org:8080/15533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7750f73834368c7109965e78b147238fc6316f49 Gerrit-Change-Number: 15533 Gerrit-PatchSet: 4 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 24 Mar 2020 21:23:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9107: Add timestamp to maven logging options.
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/15537 ) Change subject: IMPALA-9107: Add timestamp to maven logging options. .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15537 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e Gerrit-Change-Number: 15537 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Tue, 24 Mar 2020 21:20:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15533 ) Change subject: IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688 .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5541/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7750f73834368c7109965e78b147238fc6316f49 Gerrit-Change-Number: 15533 Gerrit-PatchSet: 4 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 24 Mar 2020 21:18:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15533 ) Change subject: IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688 .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7750f73834368c7109965e78b147238fc6316f49 Gerrit-Change-Number: 15533 Gerrit-PatchSet: 4 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 24 Mar 2020 21:18:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15476 ) Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 5 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 21:14:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15476 ) Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. IMPALA-9443: [DOCS] Make tables optically pleasing and complete - Replaced ellipses in example columns with sample output - Fixed table formatting problems - Exhumed varname styles - Reverted table formatting at line 292 to published version formatting - Fixed table formatting at 831 Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Reviewed-on: http://gerrit.cloudera.org:8080/15476 Tested-by: Impala Public Jenkins Reviewed-by: Tim Armstrong --- M docs/topics/impala_perf_stats.xml M docs/topics/impala_show.xml 2 files changed, 90 insertions(+), 89 deletions(-) Approvals: Impala Public Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 6 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9547: retry accept in test shell commandline
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15541 ) Change subject: IMPALA-9547: retry accept in test_shell_commandline .. IMPALA-9547: retry accept in test_shell_commandline This is a point solution to this particular socket.accept() call failing. The more general problem is described in https://www.python.org/dev/peps/pep-0475/ and fixed in Python 3.5. Change-Id: Icc9cab98b059042855ca9149427d079951471be0 Reviewed-on: http://gerrit.cloudera.org:8080/15541 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M tests/shell/test_shell_commandline.py 1 file changed, 10 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Icc9cab98b059042855ca9149427d079951471be0 Gerrit-Change-Number: 15541 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins
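For context on the PEP 475 problem, a point fix of this kind usually amounts to a small retry loop around the blocking call. The sketch below is written for Python 3 and is only an assumption about the general shape of such a fix, not the code that was merged; accept_with_retry and max_attempts are hypothetical names. On Python 3.5 and later the interpreter already retries interrupted system calls, so the loop would rarely trigger there.

```python
import errno


def accept_with_retry(server_sock, max_attempts=5):
    """Call accept(), retrying when the call is interrupted by a signal (EINTR)."""
    for attempt in range(max_attempts):
        try:
            return server_sock.accept()
        except OSError as e:
            # Re-raise anything that is not EINTR, and give up after the last attempt.
            if e.errno != errno.EINTR or attempt == max_attempts - 1:
                raise
```

Any object with an accept() method works here, which also makes the helper easy to exercise with a fake socket in a unit test.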
[Impala-ASF-CR] IMPALA-9547: retry accept in test shell commandline
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15541 ) Change subject: IMPALA-9547: retry accept in test_shell_commandline .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icc9cab98b059042855ca9149427d079951471be0 Gerrit-Change-Number: 15541 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 24 Mar 2020 20:31:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15524 ) Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3. .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5589/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615 Gerrit-Change-Number: 15524 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 20:29:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
David Knupp has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/15524 ) Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3. ..
IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
This is the main patch for making the impala-shell cross-compatible with python 2 and python 3. The goal is to wind up with a version of the shell that will pass python e2e tests irrespective of the version of python used to launch the shell, under the assumption that the test framework itself will continue to run with python 2.7.x for the time being.
Notable changes for reviewers to consider:
- With regard to validating the patch, my assumption is that simply passing the existing set of e2e shell tests is sufficient to confirm that the shell is functioning properly. No new tests were added.
- A new pytest command line option was added in conftest.py to enable a user to specify a path to an alternate impala-shell executable to test. It's possible to use this to point to an instance of the impala-shell that was installed as a standalone python package in a separate virtualenv. Example usage:
    USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=//bin/impala-shell -sv shell/test_shell_commandline.py
  The target virtualenv may be based on either python3 or python2. However, this has no effect on the version of python used to run the test framework, which remains tied to python 2.7.x for the foreseeable future.
- The $IMPALA_HOME/bin/impala-shell.sh now sets up the impala-shell python environment independently from bin/set-pythonpath.sh. The default version of thrift is thrift-0.11.0 (See IMPALA-9489).
- The wording of the header changed a bit to include the python version used to run the shell.
    Starting Impala Shell with no authentication using Python 3.7.5
    Opened TCP connection to localhost:21000
    ...
    OR
    Starting Impala Shell with LDAP-based authentication using Python 2.7.12
    Opened TCP connection to localhost:21000
    ...
- By far, the biggest hassle has been juggling str versus unicode versus bytes data types. Python 2.x was fairly loose and inconsistent in how it dealt with strings. As a quick demo of what I mean:
    Python 2.7.12 (default, Nov 12 2018, 14:36:49)
    [GCC 5.4.0 20160609] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> d = 'like a duck'
    >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8')
    True
  ...and yet there are weird unexpected gotchas.
    >>> d.decode('utf-8') == d.encode('utf-8')
    True
    >>> d.encode('utf-8') == bytearray(d, 'utf-8')
    True
    >>> d.decode('utf-8') == bytearray(d, 'utf-8')  # fails the eq property?
    False
  As a result, this inconsistency was reflected in the way we handled strings in the impala-shell code, but things still just worked.
  In python3, there's a much clearer distinction between strings and bytes, and as such, much tighter type consistency is expected by libraries like subprocess, re, sqlparse, prettytable, etc., which are used throughout the shell. Even simple calls that worked in python 2.x:
    >>> import re
    >>> re.findall('foo', b'foobar')
    ['foo']
  ...can throw exceptions in python 3.x:
    >>> import re
    >>> re.findall('foo', b'foobar')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall
        return _compile(pattern, flags).findall(string)
    TypeError: cannot use a string pattern on a bytes-like object
  Exceptions like this resulted in many, if not most, shell tests failing under python 3. What ultimately seemed like a better approach was to try to weed out as many existing spurious str.encode() and str.decode() calls as I could, and try to implement what has colloquially been called a "unicode sandwich" -- namely, "bytes on the outside, unicode on the inside, encode/decode at the edges."
  The primary spot in the shell where we call decode() now is when sanitising input...
    args = self.sanitise_input(args.decode('utf-8'))
  ...and also whenever a library like re required it. Similarly, str.encode() is primarily used where a library like readline or csv requires it.
- PYTHONIOENCODING needs to be set to utf-8 to override the default setting for python 2. Without this, piping or redirecting stdout results in unicode errors.
- from __future__ import unicode_literals was added throughout
Testing: To test the changes, I ran the e2e shell tests the way we always do (against the normal build tarball), and then I set up a python 3 virtual env with the shell installed as a package, and manually ran the tests against that. No effort has been made at this point to come up with a way to integrate testing of the shell in a python3 environment into
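The re.findall() failure quoted in the commit message comes down to Python 3 requiring the pattern and the subject to be the same type. The short sketch below shows one way to keep them aligned; findall_text is a hypothetical helper for illustration, not something taken from the patch.

```python
import re


def findall_text(pattern, data, encoding="utf-8"):
    """Apply a str pattern to data that may arrive as bytes, decoding it first."""
    if isinstance(data, bytes):
        data = data.decode(encoding)
    return re.findall(pattern, data)


# A str pattern against decoded bytes, and a bytes pattern against bytes, both work.
text_matches = findall_text('foo', b'foobar')
bytes_matches = re.findall(b'foo', b'foobar')

# Mixing a str pattern with a bytes subject raises TypeError on Python 3.
try:
    re.findall('foo', b'foobar')
    mixed_types_raised = False
except TypeError:
    mixed_types_raised = True
```

Decoding at the call site like this is the "whenever a library like re required it" case mentioned above; the alternative is to compile a bytes pattern and stay in bytes throughout.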
[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15511 ) Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table. .. Patch Set 3: (6 comments)
Hi Tim, thanks for your valuable feedback!
http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG@12 PS3, Line 12: robin hood algorithm to keep probe function of hash-table intact.
> It's worth noting that this isn't the full robin-hood hashing approach, bec
You told me before about short-circuiting the lookup, but I didn't quite get it then. Now that you put it that way, I understand better what you mean. Yes, we can short-circuit the lookup if we maintain the distance invariant. Right now, the invariant is not maintained because an insert through an iterator does not trigger a rebalance. I'll check the code again and see if I can do something about it.
http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG@14 PS3, Line 14: Instead of proactively swapping elements during insertion, the
> I guess we can revisit this to see what the impact is, it seems like it pro
Having a separate Probe() function specifically for insert might remove the redundant second pass. I'll see if this is possible.
http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h File be/src/exec/hash-table.inline.h: http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@115 PS3, Line 115: target_distance == 0 ||
> Why do we need this condition? My understanding was that you always want to
Ah, I see what you mean. Somehow I was under the impression that an element that is already perfectly placed should not be evicted. Will remove this condition.
http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@138 PS3, Line 138: if (curr_bucket->filled) {
> Is it possible for this to be false? Why would there be an empty bucket *be
Good point, I must have missed this while refining my code.
Will replace this with a DCHECK.
http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@162 PS3, Line 162: inline int64_t HashTable::BucketDistance(
> I guess calculating this for linear probing is relatively straightforward,
I'll see if I can add an O(1) distance measurement, at least for linear probing. I did try that before, and it failed PlannerTests, but that was probably because my implementation was still incorrect.
http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@374 PS3, Line 374: table_->PrepareBucketForInsert(bucket_idx_, hash);
> You could also rebalance there, I think, to benefit the agg. Or combine the
If we rebalance here, it will reorder elements in the buckets_ array, which in turn breaks the iteration order. And the hash table is supposed to be thread-safe for read access. I'm worried that if there are two iterators in use, where one only reads and the other writes through this SetTuple() function, the latter iterator will break the former as well. Please correct me if my assumption is wrong, or if there is some guarantee that such a case is impossible. I don't think I have fully understood the use case of this function.
-- To view, visit http://gerrit.cloudera.org:8080/15511 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 Gerrit-Change-Number: 15511 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 19:03:33 + Gerrit-HasComments: Yes
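To make the O(1) distance idea and the eviction rule discussed in this thread concrete, here is a small Python sketch of textbook Robin Hood insertion with linear probing. It is not the code under review: bucket_distance, robin_hood_insert, and the home_of callback are illustrative names, keys are stored directly in a list, and the table is assumed to always have at least one empty bucket.

```python
def bucket_distance(bucket_idx, home_idx, num_buckets):
    """Probe distance under linear probing, computed in O(1) with wraparound."""
    return (bucket_idx - home_idx) % num_buckets


def robin_hood_insert(buckets, key, home_of):
    """Insert key, evicting any resident that sits closer to its home bucket."""
    n = len(buckets)
    idx = home_of(key) % n
    dist = 0  # how far the key being placed has travelled from its home
    while buckets[idx] is not None:
        resident = buckets[idx]
        resident_dist = bucket_distance(idx, home_of(resident) % n, n)
        if resident_dist < dist:
            # The resident is "richer" (closer to home): swap and keep placing it.
            buckets[idx], key = key, resident
            dist = resident_dist
        idx = (idx + 1) % n
        dist += 1
    buckets[idx] = key


# Example with an identity hash: 0, 8 and 16 all hash to bucket 0, and 1 to bucket 1.
buckets = [None] * 8
for k in (0, 8, 1, 16):
    robin_hood_insert(buckets, k, home_of=lambda x: x)
```

Note that a resident at distance 0 is still evicted once the incoming key has travelled further, which matches the reviewer's point about dropping the `target_distance == 0` special case.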
[Impala-ASF-CR] IMPALA-9537: Add LDAP auth to the webui
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15538 ) Change subject: IMPALA-9537: Add LDAP auth to the webui .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/15538/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15538/1//COMMIT_MSG@9 PS1, Line 9: This patch adds a startup flag --webserver_require_ldap, which if set I think we might also need some kind of authorisation solution too? I.e. only allow privileged users to view the web UI, since it has potentially sensitive info. Maybe I'm missing how this could be achieved though. -- To view, visit http://gerrit.cloudera.org:8080/15538 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e92481929f2f06898b8496233ab4134792c9f10 Gerrit-Change-Number: 15538 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 18:40:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15533 ) Change subject: IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688 .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5538/ -- To view, visit http://gerrit.cloudera.org:8080/15533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7750f73834368c7109965e78b147238fc6316f49 Gerrit-Change-Number: 15533 Gerrit-PatchSet: 3 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 24 Mar 2020 18:22:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15378 ) Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5588/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 Gerrit-Change-Number: 15378 Gerrit-PatchSet: 14 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 24 Mar 2020 18:20:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol
Abhishek Rawat has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/15378 ) Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol ..
IMPALA-9466: impala-shell client retry for hs2-http protocol
Added retries for idempotent rpcs: OpenSession, PingImpalaHS2Service, GetResultSetMetadata, CloseImpalaOperation (non dmls), CancelOperation, GetOperationStatus, GetRuntimeProfile, GetExecSummary, GetLog
Retries were also added to the 'set all' query execution and subsequent result fetch in ImpalaHS2Client._open_session().
The retries are only supported for the hs2-http protocol and are enabled by default. A failed rpc is tried at most 3 times, with a wait of at least 2 seconds between tries. Only rpcs that failed due to an error in the http transport are retried; if an rpc failed because the server returned an error in the rpc response, it is not retried.
Improved error diagnostics by dumping the stack trace when ImpalaShell._execute_stmt() gets an 'Unknown Exception'.
Testing:
- Added a custom_cluster test which injects faults into the http transport and checks expected behavior from the various rpcs. Some of these tests leave the session in an open state, so they are not suitable for the e2e test framework, which has metric verifiers expecting related metrics to be 0 at the end of the test.
- Manually tested real world scenarios with impala-shell client communicating with an impala coordinator via a fault injecting istio mesh.
- Manually tested dropping connections on an nginx ingress gateway by sending SIGTERM to all worker processes.
Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 --- M shell/impala_client.py M shell/impala_shell.py M shell/shell_exceptions.py A tests/custom_cluster/test_hs2_fault_injection.py 4 files changed, 498 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/15378/14 -- To view, visit http://gerrit.cloudera.org:8080/15378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 Gerrit-Change-Number: 15378 Gerrit-PatchSet: 14 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar
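The retry policy described in the commit message (at most 3 tries, a wait of at least 2 seconds, transport errors only) can be sketched roughly as below. This is an illustration under stated assumptions, not the code in shell/impala_client.py: TransportError, RpcError, and call_with_retries are hypothetical names, and the sleep function is injectable purely to keep the sketch testable.

```python
import time


class TransportError(Exception):
    """Hypothetical stand-in for a failure in the HTTP transport."""


class RpcError(Exception):
    """Hypothetical stand-in for an error returned in the rpc response."""


def call_with_retries(rpc, max_tries=3, wait_secs=2.0, sleep=time.sleep):
    """Run an idempotent rpc, retrying only on transport-level failures."""
    for attempt in range(1, max_tries + 1):
        try:
            return rpc()
        except TransportError:
            if attempt == max_tries:
                raise  # out of tries: surface the transport error
            sleep(wait_secs)
        # RpcError (a server-side error response) is not caught, so it
        # propagates immediately without a retry.
```

Restricting this wrapper to idempotent rpcs matters because a transport failure does not tell the client whether the server already processed the request.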
[Impala-ASF-CR] IMPALA-9107: Add timestamp to maven logging options.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15537 ) Change subject: IMPALA-9107: Add timestamp to maven logging options. .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5587/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15537 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e Gerrit-Change-Number: 15537 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Tue, 24 Mar 2020 17:29:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9107: Add timestamp to maven logging options.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15537 ) Change subject: IMPALA-9107: Add timestamp to maven logging options. .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5540/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15537 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e Gerrit-Change-Number: 15537 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Tue, 24 Mar 2020 16:59:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9107: Add timestamp to maven logging options.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15537 ) Change subject: IMPALA-9107: Add timestamp to maven logging options. .. Patch Set 3: This looks good to me. I'm going to run gerrit-verify-dryrun-external to make sure everything builds, and also so I can look at the output. I will +2 when that comes back. -- To view, visit http://gerrit.cloudera.org:8080/15537 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e Gerrit-Change-Number: 15537 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Tue, 24 Mar 2020 16:59:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9107: Add timestamp to maven logging options.
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15537 ) Change subject: IMPALA-9107: Add timestamp to maven logging options. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/15537/2/bin/mvn-quiet.sh File bin/mvn-quiet.sh: http://gerrit.cloudera.org:8080/#/c/15537/2/bin/mvn-quiet.sh@34 PS2, Line 34: LOGGING_OPTIONS="-Dorg.slf4j.simpleLogger.showDateTime \ : -Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss" > Hmm. I might have pushed the wrong commit? It worked when I ran it. Anyway, Done -- To view, visit http://gerrit.cloudera.org:8080/15537 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e Gerrit-Change-Number: 15537 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Tue, 24 Mar 2020 16:49:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9107: Add timestamp to maven logging options.
Hello Laszlo Gaal, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15537 to look at the new patch set (#3). Change subject: IMPALA-9107: Add timestamp to maven logging options. .. IMPALA-9107: Add timestamp to maven logging options. We found that using awk to add a timestamp to the maven log can fail if gawk is not installed. It seems better to configure maven to add the timestamp itself. Sample output: Running mvn -U -Dorg.slf4j.simpleLogger.showDateTime=true -Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss -B install -DskipTests Directory /home/dknupp/Impala/ext-data-source 16:37:16 [INFO] Scanning for projects... 16:37:16 [INFO] 16:37:16 [INFO] Reactor Build Order: 16:37:16 [INFO] 16:37:16 [INFO] Apache Impala External Data Source [pom] 16:37:16 [INFO] Apache Impala External Data Source API [jar] 16:37:16 [INFO] Apache Impala External Data Source Sample [jar] 16:37:16 [INFO] Apache Impala External Data Source Test Library [jar] 16:37:17 [INFO] 16:37:17 [INFO] < org.apache.impala:impala-data-source > 16:37:17 [INFO] Building Apache Impala External Data Source 1.0-SNAPSHOT [1/4] 16:37:17 [INFO] [ pom ]- [etc...] Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e --- M bin/mvn-quiet.sh 1 file changed, 4 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/15537/3 -- To view, visit http://gerrit.cloudera.org:8080/15537 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e Gerrit-Change-Number: 15537 Gerrit-PatchSet: 3 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal
[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15544 ) Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893 .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Gerrit-Change-Number: 15544 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 16:35:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9107: Add timestamp to maven logging options.
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15537 ) Change subject: IMPALA-9107: Add timestamp to maven logging options. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/15537/2/bin/mvn-quiet.sh File bin/mvn-quiet.sh: http://gerrit.cloudera.org:8080/#/c/15537/2/bin/mvn-quiet.sh@34 PS2, Line 34: LOGGING_OPTIONS = -Dorg.slf4j.simpleLogger.showDateTime : DATETIME_FORMAT = -Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss > You'll need to get rid of the space between the equals and the value. Might Hmm. I might have pushed the wrong commit? It worked when I ran it. Anyway, making your change now. -- To view, visit http://gerrit.cloudera.org:8080/15537 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e Gerrit-Change-Number: 15537 Gerrit-PatchSet: 2 Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Tue, 24 Mar 2020 16:25:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15511 ) Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table. .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG@12 PS3, Line 12: robin hood algorithm to keep probe function of hash-table intact. It's worth noting that this isn't the full robin-hood hashing approach, because we're not short-circuiting the lookup based on the invariant (i.e. once the max distance of a hash value you'd seen on the probe exceeds your current distance, you know the key cannot be in the hash table). I think this may be a problem if you do benchmarks where most of the lookups are not present in the hash table. This might be significant for some workloads. But I guess for hash joins, this will give most of the benefit, if we assume that missing values are mostly filtered out by bloom filters in the scan. So if we see a big impact, we can look at optimising it further. http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG@14 PS3, Line 14: Instead of proactively swapping elements during insertion, the I guess we can revisit this to see what the impact is, it seems like it probably means that insert is somewhat more expensive because of the two passes. http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h File be/src/exec/hash-table.inline.h: http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@115 PS3, Line 115: target_distance == 0 || Why do we need this condition? My understanding was that you always want to maintain the invariant that the richer key gets bumped. http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@138 PS3, Line 138: if (curr_bucket->filled) { Is it possible for this to be false? 
Why would there be an empty bucket *before* the current position of the key (given that we don't support deletion from the hash table). I.e. Maybe this should be a DCHECK to enforce the invariant. http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@162 PS3, Line 162: inline int64_t HashTable::BucketDistance( I guess calculating this for linear probing is relatively straightforward, but not sure if there's a good way to calculate it for quadratic probing (quadratic probing + robin hood hashing might not be a great combination). We did the quadratic probing because of a tendency for the CRC-based hash to cluster. I don't know if robin-hood hashing is enough to counter-act that tendency. http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@374 PS3, Line 374: table_->PrepareBucketForInsert(bucket_idx_, hash); You could also rebalance there, I think, to benefit the agg. Or combine the rebalancing with PrepareBucketForInsert -- To view, visit http://gerrit.cloudera.org:8080/15511 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986 Gerrit-Change-Number: 15511 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 16:23:27 + Gerrit-HasComments: Yes
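The short-circuit mentioned in the first comment above can be illustrated with a minimal sketch. It assumes linear probing and a toy table of bare keys; the class `RobinHoodTable` and its methods are hypothetical names for illustration and are not taken from Impala's be/src/exec/hash-table code. On insert, the key that has travelled further from its home slot keeps the slot (the "richer" resident is bumped). On lookup, the probe can stop as soon as it meets a resident that is closer to its own home slot than the probe has already travelled, since the invariant says the key would have displaced that resident.

```python
class RobinHoodTable:
    """Toy open-addressing table: linear probing plus Robin Hood insertion.

    Hypothetical illustration only; not Impala's HashTable.
    """

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.slots = [None] * capacity  # each slot holds a key or None

    def _home(self, key):
        # Slot the key would occupy if there were no collisions.
        return hash(key) % self.capacity

    def _distance(self, key, idx):
        # How far `key` sits from its home slot when stored at `idx`.
        return (idx - self._home(key)) % self.capacity

    def insert(self, key):
        idx, dist = self._home(key), 0
        for _ in range(self.capacity):
            resident = self.slots[idx]
            if resident is None:
                self.slots[idx] = key
                return
            if resident == key:
                return  # already present
            resident_dist = self._distance(resident, idx)
            if resident_dist < dist:
                # The resident is "richer" (closer to home): take its slot
                # and continue inserting the displaced key instead.
                self.slots[idx] = key
                key, dist = resident, resident_dist
            idx = (idx + 1) % self.capacity
            dist += 1
        raise RuntimeError("table is full")

    def contains(self, key):
        idx = self._home(key)
        for dist in range(self.capacity):
            resident = self.slots[idx]
            if resident is None:
                return False
            if resident == key:
                return True
            if self._distance(resident, idx) < dist:
                # Short-circuit: had `key` been inserted, it would have
                # bumped this richer resident, so it cannot be further on.
                return False
            idx = (idx + 1) % self.capacity
        return False
```

With small integer keys (whose Python hash is the integer itself), a lookup for an absent key that collides with a full run of entries stops at the first richer resident instead of scanning to the next empty slot. That is the case the comment expects to matter when most probed keys are missing from the table.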
[Impala-ASF-CR] [WIP]IMPALA-9538 Bump up linux-syscall-support.h
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15510 ) Change subject: [WIP]IMPALA-9538 Bump up linux-syscall-support.h .. Patch Set 9: (1 comment) This looks good to me. I can +2 once it's not a WIP. We should also ideally get this into Kudu's copy of gutil so we don't diverge. http://gerrit.cloudera.org:8080/#/c/15510/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15510/9//COMMIT_MSG@9 PS9, Line 9: Bump up linux-syscall-support.h to newest version Can you link to the repo/commit where you got this from? It looks like this matches commit fd00dbb from https://chromium.googlesource.com/linux-syscall-support/ -- To view, visit http://gerrit.cloudera.org:8080/15510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 Gerrit-Change-Number: 15510 Gerrit-PatchSet: 9 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 16:02:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15544 ) Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893 .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5586/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Gerrit-Change-Number: 15544 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 24 Mar 2020 15:39:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9547: retry accept in test_shell_commandline
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15541 ) Change subject: IMPALA-9547: retry accept in test_shell_commandline .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icc9cab98b059042855ca9149427d079951471be0 Gerrit-Change-Number: 15541 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 24 Mar 2020 15:30:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9547: retry accept in test_shell_commandline
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15541 ) Change subject: IMPALA-9547: retry accept in test_shell_commandline .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5539/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icc9cab98b059042855ca9149427d079951471be0 Gerrit-Change-Number: 15541 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 24 Mar 2020 15:30:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9547: retry accept in test_shell_commandline
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15541 ) Change subject: IMPALA-9547: retry accept in test_shell_commandline .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icc9cab98b059042855ca9149427d079951471be0 Gerrit-Change-Number: 15541 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 24 Mar 2020 15:29:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15544 ) Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893 .. Patch Set 1: Going to wait for https://gerrit.cloudera.org/#/c/15533/ to get merged first, since that does the GBN upgrade. -- To view, visit http://gerrit.cloudera.org:8080/15544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Gerrit-Change-Number: 15544 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 24 Mar 2020 14:59:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893
Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15544 ) Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893 .. IMPALA-9548: UdfExecutorTest failures after HIVE-22893 HIVE-22893: "Enhance data size estimation for fields computed by UDFs" modified o.a.h.hive.ql.udf.UDFSubstr and added a dependency on a few new Hive classes located in the hive-exec jar. These classes include: o.a.h.hive.ql.plan.ColStatistics o.a.h.hive.ql.stats.estimator.StatEstimator o.a.h.hive.ql.stats.estimator.StatEstimatorProvider The test UdfExecutorTest#HiveStringsTest loads the class UDFSubstr and thus needs to load the aforementioned stats classes as well. shaded-deps/pom.xml selectively pulls in certain classes from the hive-exec jar, and excludes all others. This patch adds the stats classes needed to load UDFSubstr, which fixes UdfExecutorTest. Testing: * Ran core tests with CDP_BUILD_NUMBER=2244454 and USE_CDP_HIVE=true, validated that UdfExecutorTest now passes Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf --- M shaded-deps/pom.xml 1 file changed, 3 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/15544/1 -- To view, visit http://gerrit.cloudera.org:8080/15544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf Gerrit-Change-Number: 15544 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar
[Impala-ASF-CR] IMPALA-9547: retry accept in test_shell_commandline
Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/15541 ) Change subject: IMPALA-9547: retry accept in test_shell_commandline .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/15541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icc9cab98b059042855ca9149427d079951471be0 Gerrit-Change-Number: 15541 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 24 Mar 2020 14:20:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15533 ) Change subject: IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688 .. Patch Set 3: The GVO hit an unrelated issue, IMPALA-9547. -- To view, visit http://gerrit.cloudera.org:8080/15533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7750f73834368c7109965e78b147238fc6316f49 Gerrit-Change-Number: 15533 Gerrit-PatchSet: 3 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 24 Mar 2020 13:27:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15533 ) Change subject: IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688 .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7750f73834368c7109965e78b147238fc6316f49 Gerrit-Change-Number: 15533 Gerrit-PatchSet: 3 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 24 Mar 2020 13:25:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15533 ) Change subject: IMPALA-9546: Update ranger-admin-site.xml.template after RANGER-2688 .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5538/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7750f73834368c7109965e78b147238fc6316f49 Gerrit-Change-Number: 15533 Gerrit-PatchSet: 3 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 24 Mar 2020 13:25:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8980: Remove functional*.alltypesinsert from EE tests
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15529 ) Change subject: IMPALA-8980: Remove functional*.alltypesinsert from EE tests .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/15529/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15529/1//COMMIT_MSG@11 PS1, Line 11: -Swapped out the Reset table and Drop partition SETUP tags to Truncate table QUERY statement. > There are 2 more tests where we use SETUP. Should I swap them out too and r Yes, I would like to remove SETUP completely. The advantage of using normal query blocks is that they: - are executed in the order you see in the .test file, while SETUP always runs before the QUERY - use normal Impala SQL syntax, while SETUP adds new commands. SETUP seems like a fossil from the early days of Impala, when it couldn't write tables and had to rely on Hive. http://gerrit.cloudera.org:8080/#/c/15529/1/testdata/workloads/functional-query/queries/QueryTest/insert.test File testdata/workloads/functional-query/queries/QueryTest/insert.test: http://gerrit.cloudera.org:8080/#/c/15529/1/testdata/workloads/functional-query/queries/QueryTest/insert.test@a675 PS1, Line 675: : : : : : : : : : : : : : : : > This test case was added when IMPALA-89 got resolved which implies that it I agree, this test was "buggy" and passed for the wrong reason, but I would prefer to fix it to test how Impala works instead of deleting it. -- To view, visit http://gerrit.cloudera.org:8080/15529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I257e936868917a2fcc6c030f6c855b247e8a0eea Gerrit-Change-Number: 15529 Gerrit-PatchSet: 1 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 24 Mar 2020 13:13:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP]IMPALA-9538 Bump up linux-syscall-support.h
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15510 ) Change subject: [WIP]IMPALA-9538 Bump up linux-syscall-support.h .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 Gerrit-Change-Number: 15510 Gerrit-PatchSet: 9 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 24 Mar 2020 11:23:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8980: Remove functional*.alltypesinsert from EE tests
Adam Tamas has posted comments on this change. ( http://gerrit.cloudera.org:8080/15529 ) Change subject: IMPALA-8980: Remove functional*.alltypesinsert from EE tests .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/15529/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15529/1//COMMIT_MSG@11 PS1, Line 11: -Swapped out the Reset table and Drop partition SETUP tags to Truncate table QUERY statement. > Do we still use SETUP anywhere in the tests? If not, then it would be great There are 2 more tests where we use SETUP. Should I swap them out too and remove this part? http://gerrit.cloudera.org:8080/#/c/15529/1/testdata/workloads/functional-query/queries/QueryTest/insert.test File testdata/workloads/functional-query/queries/QueryTest/insert.test: http://gerrit.cloudera.org:8080/#/c/15529/1/testdata/workloads/functional-query/queries/QueryTest/insert.test@a675 PS1, Line 675: : : : : : : : : : : : : : : : > Instead of deleting we could also for test the current behavior. This test case was added when IMPALA-89 was resolved, which implies that it should behave like Hive, which would not clear the table in this case either. That is why I consider it a wrongly added test rather than a bug. -- To view, visit http://gerrit.cloudera.org:8080/15529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I257e936868917a2fcc6c030f6c855b247e8a0eea Gerrit-Change-Number: 15529 Gerrit-PatchSet: 1 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 24 Mar 2020 09:23:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. IMPALA-9183: Convert disjunctive predicates to conjunctive normal form Added an expression rewrite rule to convert a disjunctive predicate to conjunctive normal form (CNF). Converting to CNF enables multi-table predicates that were only evaluated by a Join operator to be converted into either single-table conjuncts that are eligible for predicate pushdown to the scan operator or other multi-table conjuncts that are eligible to be pushed to a Join below. This helps improve performance for such queries. Since converting to CNF expands the number of expressions, we place a limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr) that are considered. Once the MAX_CNF_EXPRS limit (default is unlimited) is exceeded, whatever expression was supplied to the rule is returned without further transformation. A setting of -1 or 0 allows an unlimited number of CNF exprs to be created, up to int32 max. Another option ENABLE_CNF_REWRITES enables or disables the entire rewrite. This is False by default until we have done more thorough functional testing (tracking JIRA IMPALA-9539). 
Examples of rewrites: original: (a AND b) OR c rewritten: (a OR c) AND (b OR c) original: (a AND b) OR (c AND d) rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d) original: NOT(a OR b) rewritten: NOT(a) AND NOT(b) Testing: - Added new unit tests with variations of disjunctive predicates and verified their Explain plans - Manually tested the result correctness on impala shell by running these queries with ENABLE_CNF_REWRITES enabled and disabled - Added TPC-H q7, q19 and TPC-DS q13 with the CNF rewrite enabled - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor shows almost 5x improvement: Original baseline: 47.5 sec With this patch and CNF rewrite enabled: 9.4 sec Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Reviewed-on: http://gerrit.cloudera.org:8080/15462 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java A fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test 12 files changed, 831 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 10 Gerrit-Owner: Aman 
Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
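The three rewrite examples in the commit message above can be reproduced with a small sketch. It is not the ConvertToCNFRule implementation: the tuple encoding and the helper names `to_cnf`, `_distribute` and `show` are assumptions made for illustration, and the handling of NOT over AND and of double negation goes beyond what the commit message states. The MAX_CNF_EXPRS limit is not modelled.

```python
# Expressions are either a column/predicate name (str) or a tuple:
#   ("and", left, right), ("or", left, right), ("not", child)
# Hypothetical encoding for illustration; not Impala's Expr classes.

def _distribute(left, right):
    """Return the CNF of (left OR right), where both sides are already CNF."""
    if not isinstance(left, str) and left[0] == "and":
        return ("and", _distribute(left[1], right), _distribute(left[2], right))
    if not isinstance(right, str) and right[0] == "and":
        return ("and", _distribute(left, right[1]), _distribute(left, right[2]))
    return ("or", left, right)


def to_cnf(expr):
    if isinstance(expr, str):
        return expr
    op = expr[0]
    if op == "not":
        inner = expr[1]
        if isinstance(inner, str):
            return expr
        if inner[0] == "not":
            # Assumption: double negation is dropped.
            return to_cnf(inner[1])
        if inner[0] == "or":
            # De Morgan: NOT(a OR b) -> NOT(a) AND NOT(b)
            return ("and", to_cnf(("not", inner[1])), to_cnf(("not", inner[2])))
        # Assumption: NOT(a AND b) -> NOT(a) OR NOT(b)
        return to_cnf(("or", ("not", inner[1]), ("not", inner[2])))
    left, right = to_cnf(expr[1]), to_cnf(expr[2])
    if op == "and":
        return ("and", left, right)
    # op == "or": push the OR below any AND on either side.
    return _distribute(left, right)


def show(expr):
    """Render an expression in the notation used by the commit message."""
    if isinstance(expr, str):
        return expr
    if expr[0] == "not":
        return "NOT(" + show(expr[1]) + ")"
    joiner = " AND " if expr[0] == "and" else " OR "
    parts = []
    for child in expr[1:]:
        text = show(child)
        # Parenthesize a binary child only when its operator differs.
        if (not isinstance(child, str) and child[0] in ("and", "or")
                and child[0] != expr[0]):
            text = "(" + text + ")"
        parts.append(text)
    return joiner.join(parts)
```

Calling `show(to_cnf(("or", ("and", "a", "b"), ("and", "c", "d"))))` yields the four-clause form from the second example. The clause count grows multiplicatively with nested ANDs under an OR, which is the expansion the MAX_CNF_EXPRS limit is meant to bound.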
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 09:18:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9426 Download Python dependencies even skipping bootstrap toolchain
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15297 ) Change subject: IMPALA-9426 Download Python dependencies even skipping bootstrap toolchain .. IMPALA-9426 Download Python dependencies even skipping bootstrap toolchain Download the Python dependencies even when the toolchain bootstrap is skipped. When SKIP_TOOLCHAIN_BOOTSTRAP=true is set, the Python dependencies still need to be downloaded, and the toolchain build process does not download them automatically. Change-Id: I012314793ffb521001951ab7ec3d7a3ba737c405 Reviewed-on: http://gerrit.cloudera.org:8080/15297 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M bin/impala-config.sh M buildall.sh 2 files changed, 10 insertions(+), 4 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/15297 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I012314793ffb521001951ab7ec3d7a3ba737c405 Gerrit-Change-Number: 15297 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9426 Download Python dependencies even skipping bootstrap toolchain
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15297 ) Change subject: IMPALA-9426 Download Python dependencies even skipping bootstrap toolchain .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15297 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I012314793ffb521001951ab7ec3d7a3ba737c405 Gerrit-Change-Number: 15297 Gerrit-PatchSet: 6 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 07:45:45 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP]IMPALA-9538 Bump up linux-syscall-support.h
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15510 ) Change subject: [WIP]IMPALA-9538 Bump up linux-syscall-support.h .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5585/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058 Gerrit-Change-Number: 15510 Gerrit-PatchSet: 9 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 24 Mar 2020 07:05:14 + Gerrit-HasComments: No