[Impala-ASF-CR] IMPALA-9357: Fix race condition in alter database event
Vihang Karajgaonkar has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/15260 ) Change subject: IMPALA-9357: Fix race condition in alter_database event .. IMPALA-9357: Fix race condition in alter_database event The race condition is exposed intermittently, on certain builds which causes test_event_processing::test_self_events test to fail. This happens because we are checking for self-event identifiers in the Db object without taking a lock. When a DDL like 'comment on database is 'test'' is executed, it is possible that the event processor thread is triggered as soon as the ALTER_DATABASE event is generated. This may cause event processor fail the self-event detection since the self-event identifiers are not yet added to the Db object. The fix adds a Db lock similar to Table lock. Alter db operations in CatalogOpExecutor now take db locks instead of metastoreDdlLock_ which makes it consistent with table locking protocol. Testing: 1. Ran existing tests for events processor. 2. This test was failing on centos6 frequently (failed in 1/3 times). After the fix I ran the test in a loop for 24 hrs (197 iterations) and the test didn't fail. 3. Ran core tests with CDP and CDH builds. Change-Id: I472fd8a55740769ee5cdb84e48422a4ab39a8d1e --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M tests/custom_cluster/test_concurrent_ddls.py M tests/custom_cluster/test_event_processing.py 7 files changed, 241 insertions(+), 145 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/15260/12 -- To view, visit http://gerrit.cloudera.org:8080/15260 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I472fd8a55740769ee5cdb84e48422a4ab39a8d1e Gerrit-Change-Number: 15260 Gerrit-PatchSet: 12 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9492: Fix test unescaped string partition failing on S3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15475 ) Change subject: IMPALA-9492: Fix test_unescaped_string_partition failing on S3 .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5525/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I63d149c9bdec52c2e1c0b25c8c3f0448cf7bdadb Gerrit-Change-Number: 15475 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 19 Mar 2020 04:17:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15306 ) Change subject: IMPALA-8690: Add LIRS cache eviction algorithm .. Patch Set 18: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5524/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 18 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 19 Mar 2020 04:02:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15306 ) Change subject: IMPALA-8690: Add LIRS cache eviction algorithm .. Patch Set 17: (1 comment) http://gerrit.cloudera.org:8080/#/c/15306/17/be/src/util/cache/lirs-cache.cc File be/src/util/cache/lirs-cache.cc: http://gerrit.cloudera.org:8080/#/c/15306/17/be/src/util/cache/lirs-cache.cc@202 PS17, Line 202: // TODO: Switch back to std library atomics when this works. > Will file one right now Filed https://issues.apache.org/jira/browse/IMPALA-9533 -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 17 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 19 Mar 2020 03:41:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9492: Fix test unescaped string partition failing on S3
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15475 ) Change subject: IMPALA-9492: Fix test_unescaped_string_partition failing on S3 .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/15475/2/tests/metadata/test_recover_partitions.py File tests/metadata/test_recover_partitions.py: http://gerrit.cloudera.org:8080/#/c/15475/2/tests/metadata/test_recover_partitions.py@366 PS2, Line 366: self.client, "CREATE TABLE %s (i int) PARTITIONED BY (p string)" % fq_tbl_name) : double_quote = '"' : single_quote = "'" : back_slash = '\\' : parts = [ : single_quote, : double_quote, : back_slash + single_quote, : back_slash + double_quote, : back_slash + back_slash + single_quote, : back_slash + back_slash + double_quote : ] : for i in range(len(parts)): : # When creating partition directories, Hive replaces special characters in : # partition value string using the %xx escape. e.g. p=' will become p=%27. : hex_part = urllib.parse.quote(parts[i]) : self.create_fs_partition(tbl_location, 'p=%s' % hex_part, 'file_%d' % i, str(i)) : : self.execute_query_expect_success( : self.client, "ALTER TABLE %s RECOVER PARTITIONS" % fq_tbl_name) : result = self.execute_query_expect_success( > optional: we could also call urllib.parse.quote() in create_fs_partition() Good point! Change these to use six.moves.urllib.parse.quote() in this test. Using it in create_fs_partition() will incorrectly quote the '=', i.e. p=' becomes p%3D%27 but not p=%27. -- To view, visit http://gerrit.cloudera.org:8080/15475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I63d149c9bdec52c2e1c0b25c8c3f0448cf7bdadb Gerrit-Change-Number: 15475 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 19 Mar 2020 03:32:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9492: Fix test unescaped string partition failing on S3
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15475 to look at the new patch set (#3). Change subject: IMPALA-9492: Fix test_unescaped_string_partition failing on S3 .. IMPALA-9492: Fix test_unescaped_string_partition failing on S3 test_unescaped_string_partition in metadata/test_recover_partitions.py use hdfs clients to create four partition directories with special characters, i.e. single quote, double quotes and back slash. It aims to test on whether ALTER TABLE RECOVER PARTITIONS can recognize those directories correctly. However, when running against s3, only two directories are created as expected, which causes the failure. The reason is that when running against s3, we use hadoop cli for operations. A shell command will be launched for each operation. Passing arguments through shell results in duplicate unescaping. So the 4 dirs, [p=', p=", p=\', p=\"] finally became [p=', p=", p=', p="], resulting in two distinct directories. When the test running against hdfs, we use webhdfs_client so don't have this issue. Actually, we shouldn't use special characters in partition paths. Hive converts them to their ascii hex values when creating partition directories. E.g. partition paths of [p=', p=", p=\', p=\"] are [p=%27, p=%22, p=%5C%27, p=%5C%22]. We should follow this rule when creating directories in test. Also we won't have the above shell issue on s3 anymore. Tests: - Added two more special partitions in test_unescaped_string_partition. - Ran test_unescaped_string_partition in S3. Change-Id: I63d149c9bdec52c2e1c0b25c8c3f0448cf7bdadb --- M tests/metadata/test_recover_partitions.py 1 file changed, 19 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/15475/3 -- To view, visit http://gerrit.cloudera.org:8080/15475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I63d149c9bdec52c2e1c0b25c8c3f0448cf7bdadb Gerrit-Change-Number: 15475 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Hello Thomas Tauber-Marshall, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15306 to look at the new patch set (#18). Change subject: IMPALA-8690: Add LIRS cache eviction algorithm .. IMPALA-8690: Add LIRS cache eviction algorithm One concern for the data cache is that the LRU eviction algorithm is suceptible to being flushed by large scans of low priority data. This implements the LIRS algorithm described in "LIRS: An Efficient Low Inter-reference Recency Set Replacement Policy to Improve Buffer Cache Performance" by Song Jiang / Xiaodon Xhang 2002. LIRS is a scan-resistent eviction algorithm with low performance penalty to LRU. This introduces the startup flag data_cache_eviction_policy to control which eviction policy to use. The only two options are LRU and LIRS, with the default continuing to be LRU. To accomodate the new algorithm and associated tests, some code moved around: 1. The RLCacheShard implementation moved from util/cache/cache.cc to util/cache/rl-cache.cc. 2. The backend cache tests were split into multiple files. util/cache/cache-test.h contains shared cache testing code. util/cache/cache-test.cc contains generic tests that should work for any algorithm. util/cache/rl-cache-test.cc are RLCacheShard specific tests util/cache/lirs-cache-test.cc are LIRS specific tests 3. To make it easy for clients of the cache code to customize the cache eviction algorithm, the public interface changed from using a template to taking the policy as an argument. 4. Cache::MemoryType is removed. 5. Cache adds an Init() method to verify the validity of startup flags Testing: - Added LIRS specific backend cache tests (lirs-cache-test) - Ran TPC-DS with a very small cache and concurrency to test corner cases with the LIRS eviction policy - Parameterized data-cache-test to run for both LRU and LIRS - Added LIRS equivalents for tests in custom_cluster/test_data_cache.py - Ran cache-bench with LRU and LIRS. The results are: Test case | Algorithm | Lookups / sec | Hit rate ZIPFIAN ratio=1.00x | LRU | 11.31M| 99.9% ZIPFIAN ratio=1.00x | LIRS | 10.09M| 99.8% ZIPFIAN ratio=3.00x | LRU | 11.36M| 95.9% ZIPFIAN ratio=3.00x | LIRS | 9.27M| 96.4% UNIFORM ratio=1.00x | LRU | 7.46M| 99.8% UNIFORM ratio=1.00x | LIRS | 6.93M| 99.8% UNIFORM ratio=3.00x | LRU | 5.63M| 33.3% UNIFORM ratio=3.00x | LIRS | 3.24M| 33.3% The takeaway is that LIRS is a bit slower on lookups and quite a bit slower on inserts. However, they both are still doing millions of operations per second, so it should not be a bottleneck for the data cache. Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/util/cache/CMakeLists.txt M be/src/util/cache/cache-bench.cc M be/src/util/cache/cache-internal.h M be/src/util/cache/cache-test.cc A be/src/util/cache/cache-test.h M be/src/util/cache/cache.cc M be/src/util/cache/cache.h A be/src/util/cache/lirs-cache-test.cc A be/src/util/cache/lirs-cache.cc A be/src/util/cache/rl-cache-test.cc A be/src/util/cache/rl-cache.cc M bin/rat_exclude_files.txt M tests/custom_cluster/test_data_cache.py 16 files changed, 2,595 insertions(+), 827 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/15306/18 -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 18 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15306 ) Change subject: IMPALA-8690: Add LIRS cache eviction algorithm .. Patch Set 17: (11 comments) http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc File be/src/util/cache/lirs-cache.cc: http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@83 PS15, Line 83: UNPROTECTED (HIR resident), and TOMBSTONE (HIR nonresident) > not sure I understand the difference between PROTECTED and TOMBSTONE. also, Yeah, this needed more clarity. I reworked this paragraph. Long story short: PROTECTED and UNPROTECTED entries have actual data in the cache. TOMBSTONE entries are metadata-only and do not have any actual data in the cache. http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@84 PS15, Line 84: // The size of the UNPROTECTED and PROTECTED categories is determined by the : // lirs_unprotected_percentage parameter, which is typically in the 1-5% range. : // The PROTECTED entries make up the bulk of the cache (90+%), and this contains the : // entries with the shortest interreference recency. > bit confused by the wording. the UNPROTECTED size is controlled by 'lirs_un I reworked this and reorganized this comment. http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@96 PS15, Line 96: UNINIT > is UINIT suppose to be UNINITIALIZED for short? might be good to state that After looking at this, there isn't much reason to shorten it. I changed this to UNINITIALIZED. http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@122 PS15, Line 122: DataResidency > mentioned this before, but would be nice to have an explanation of what res Added a description here along with the lifecycle. http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@327 PS15, Line 327: void TrimRecencyList(LIRSThreadState* tstate); : void EnforceTombstoneLimit(LIRSThreadState* tstate); : void EnforceProtectedCapacity(LIRSThreadState* tstate); : void EnforceUnprotectedCapacity(LIRSThreadState* tstate); > nit: docs Added descriptions http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@387 PS15, Line 387: LIRSCacheShard::~LIRSCacheShard() { > would it be cleaner to have an explicit close method that does all of this. Our current usage of this cache implementation is the DataCache, which lives for the lifetime of the Impala daemon. DataCache keeps a unique_ptr for its reference to the ShardedCache, and the ShardedCache keeps unique_ptrs for each CacheShard. DataCache::ReleaseResources() explicitly calls reset() on its unique_ptr. We could have a ReleaseResources() function here, plumb it through ShardedCache, and then call it from DataCache::ReleaseResources(). I think if we use the cache for something that has the lifecycle of a query, then a ReleaseResources() method is needed. I could go either way on this. http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@411 PS15, Line 411: // percentage >= 50.00%. This is far outside the range of what is useful for > maybe add this limitation to the flag docs Added a line to the flag docs to specify this limit. http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@423 PS15, Line 423: betwee > nit: typo Done http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@428 PS15, Line 428: Must be 0.5 or greater > same comment as above Added to the flag description. http://gerrit.cloudera.org:8080/#/c/15306/17/be/src/util/cache/lirs-cache.cc File be/src/util/cache/lirs-cache.cc: http://gerrit.cloudera.org:8080/#/c/15306/17/be/src/util/cache/lirs-cache.cc@166 PS17, Line 166: CAS > what does this stand for? compare-and-swap? Yes, it is compare and swap. Looking through the comments for these atomic swaps, I think it would be clearer to eliminate CAS. I'll go through the comments surrounding the atomic ops and double check. http://gerrit.cloudera.org:8080/#/c/15306/17/be/src/util/cache/lirs-cache.cc@202 PS17, Line 202: // TODO: Switch back to std library atomics when this works. > do we have a JIRA for this? Will file one right now -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 17 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 19 Mar 2020 03:15:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8361: Propagate predicates of outer-joined InlineView
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15047 ) Change subject: IMPALA-8361: Propagate predicates of outer-joined InlineView .. Patch Set 13: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5491/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c23a45aeb5dd1aa06a95c9aa8628ecbe37ef2c1 Gerrit-Change-Number: 15047 Gerrit-PatchSet: 13 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Thu, 19 Mar 2020 02:16:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8361: Propagate predicates of outer-joined InlineView
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15047 ) Change subject: IMPALA-8361: Propagate predicates of outer-joined InlineView .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c23a45aeb5dd1aa06a95c9aa8628ecbe37ef2c1 Gerrit-Change-Number: 15047 Gerrit-PatchSet: 13 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Thu, 19 Mar 2020 02:16:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8361: Propagate predicates of outer-joined InlineView
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15047 ) Change subject: IMPALA-8361: Propagate predicates of outer-joined InlineView .. Patch Set 12: Code-Review+2 (1 comment) > Patch Set 12: > > (26 comments) > > I'm really sorry that because I'm not familiar with gerrit, I found that none > of the replies were sent out. Thank you for reviewing my code. I apologize > again for not replying before No worry. Thanks for your patience on updating the patch! Carry on Tim's +1. Thanks for your contribution, Xianqing! http://gerrit.cloudera.org:8080/#/c/15047/10/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/15047/10/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1178 PS10, Line 1178: evalInInlineViewPreds.add(e); > For inline view we can fix by this, but real table also has this question. Let's fix this in IMPALA-9356. -- To view, visit http://gerrit.cloudera.org:8080/15047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c23a45aeb5dd1aa06a95c9aa8628ecbe37ef2c1 Gerrit-Change-Number: 15047 Gerrit-PatchSet: 12 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Thu, 19 Mar 2020 02:15:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8361: Propagate predicates of outer-joined InlineView
Xianqing He has posted comments on this change. ( http://gerrit.cloudera.org:8080/15047 ) Change subject: IMPALA-8361: Propagate predicates of outer-joined InlineView .. Patch Set 12: (26 comments) I'm really sorry that because I'm not familiar with gerrit, I found that none of the replies were sent out. Thank you for reviewing my code. I apologize again for not replying before http://gerrit.cloudera.org:8080/#/c/15047/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15047/7//COMMIT_MSG@9 PS7, Line 9: This is an improvement that tries > "This is an improvement that tries to propagate" Done http://gerrit.cloudera.org:8080/#/c/15047/7//COMMIT_MSG@11 PS7, Line 11: : For example: : SELECT * : FROM functional.alltypessmall a : LEFT JOIN ( : SELECT id, upper(string_col) AS upper_val, : length(string_col) AS len : FROM functional.alltypestiny : ) b ON a.id = b.id > Please take some time to write the commit message. It's very important for Done http://gerrit.cloudera.org:8080/#/c/15047/7//COMMIT_MSG@11 PS7, Line 11: : For example: : SELECT * : FROM functional.alltypessmall a : LEFT JOIN ( : SELECT id, upper(string_col) AS upper_val, : length(string_col) AS len : FROM functional.alltypestiny : ) b ON a.id = b.id > Agree with Quanlong regarding improving the commit message. In addition, I Done http://gerrit.cloudera.org:8080/#/c/15047/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15047/9//COMMIT_MSG@24 PS9, Line 24: , > nit: need space Done http://gerrit.cloudera.org:8080/#/c/15047/9//COMMIT_MSG@24 PS9, Line 24: some predicates that : must be evaluted at a join node can also be safely evaluted by the : outer-joined inline view. > I think you mean "some predicates that must be evaluted at a join node can Done http://gerrit.cloudera.org:8080/#/c/15047/9//COMMIT_MSG@30 PS9, Line 30: . > nit: need space Done http://gerrit.cloudera.org:8080/#/c/15047/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/15047/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1618 PS7, Line 1618: return true; > Pls add javadoc comment Done http://gerrit.cloudera.org:8080/#/c/15047/11/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/15047/11/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1588 PS11, Line 1588:* - For On-clause predicates, see canEvalOnClauseConjunct() for more info. > I think this part of the comment about on-clause predicates should be moved Done http://gerrit.cloudera.org:8080/#/c/15047/11/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1643 PS11, Line 1643: e.getIds(exprTids, null); > Maybe rename this to exprTids or similar, so that it's clearer how it is di Done http://gerrit.cloudera.org:8080/#/c/15047/11/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/15047/11/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@617 PS11, Line 617:* returned by getResultTupleId(). > Document that it only returns unique expressions? If some conjuncts are pushed to having predicates, it may appear duplicate predicates, eg AGGREGATE [FINALIZE] | group by: id | having: id = 12, id = 12 Here's a simple fix for this situation. http://gerrit.cloudera.org:8080/#/c/15047/9/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/15047/9/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@985 PS9, Line 985: picked up by getBoundPredicates() > I think this comment is stale after merging this patch. Could you update it Done http://gerrit.cloudera.org:8080/#/c/15047/7/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/15047/7/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1162 PS7, Line 1162: Get the unassigned conjuncts that can be eva > "Get the unassigned conjuncts that can be evaluated in inline view and retu Done http://gerrit.cloudera.org:8080/#/c/15047/7/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1163 PS7, Line 1163:* in 'evalInInlineViewPreds'. :* If a conjunct is not an On-clause predicate and is safe to > If a conjunct is
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15306 ) Change subject: IMPALA-8690: Add LIRS cache eviction algorithm .. Patch Set 17: (2 comments) still going through the lirs-cache code, but posted the comments i have so far http://gerrit.cloudera.org:8080/#/c/15306/17/be/src/util/cache/lirs-cache.cc File be/src/util/cache/lirs-cache.cc: http://gerrit.cloudera.org:8080/#/c/15306/17/be/src/util/cache/lirs-cache.cc@166 PS17, Line 166: CAS what does this stand for? compare-and-swap? http://gerrit.cloudera.org:8080/#/c/15306/17/be/src/util/cache/lirs-cache.cc@202 PS17, Line 202: // TODO: Switch back to std library atomics when this works. do we have a JIRA for this? -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 17 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 19 Mar 2020 01:05:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15463 ) Change subject: IMPALA-9530: query option to limit preagg memory .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5490/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 Gerrit-Change-Number: 15463 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Mar 2020 01:05:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15463 ) Change subject: IMPALA-9530: query option to limit preagg memory .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 Gerrit-Change-Number: 15463 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Mar 2020 01:05:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15463 ) Change subject: IMPALA-9530: query option to limit preagg memory .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5523/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 Gerrit-Change-Number: 15463 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Mar 2020 00:56:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8205: Support number of true and false statistics for boolean column
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14666 ) Change subject: IMPALA-8205: Support number of true and false statistics for boolean column .. Patch Set 8: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5489/ -- To view, visit http://gerrit.cloudera.org:8080/14666 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I991bee8e7fdc644d908289f5fe2ee8032cc2c431 Gerrit-Change-Number: 14666 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward <583424...@qq.com> Gerrit-Reviewer: Anonymous Coward <583424...@qq.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Thu, 19 Mar 2020 00:48:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15463 ) Change subject: IMPALA-9530: query option to limit preagg memory .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 Gerrit-Change-Number: 15463 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Mar 2020 00:12:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15463 ) Change subject: IMPALA-9530: query option to limit preagg memory .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/15463/2/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/15463/2/be/src/service/query-options.cc@923 PS2, Line 923: preaggregation bytes > nit: 'preaggregation bytes limit' Done http://gerrit.cloudera.org:8080/#/c/15463/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java File fe/src/main/java/org/apache/impala/planner/AggregationNode.java: http://gerrit.cloudera.org:8080/#/c/15463/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@581 PS2, Line 581: useStreamingPreagg_ > IIUC, the preagg_bytes_limit won't be applied if DISABLE_STREAMING_PREAGGRE That's true, that is the desired behaviour. I'm not aware of any use of the fallback mode, it was really a chicken bit from when we initially changed the behaviour. I mentioned in the commit message and we can mention in the docs. http://gerrit.cloudera.org:8080/#/c/15463/2/testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test File testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test: http://gerrit.cloudera.org:8080/#/c/15463/2/testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test@99 PS2, Line 99: > nit: whitespace Done -- To view, visit http://gerrit.cloudera.org:8080/15463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 Gerrit-Change-Number: 15463 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Mar 2020 00:09:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory
Hello Aman Sinha, Thomas Tauber-Marshall, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15463 to look at the new patch set (#3). Change subject: IMPALA-9530: query option to limit preagg memory .. IMPALA-9530: query option to limit preagg memory This adds an advanced PREAGG_BYTES_LIMIT query option that allows limiting the memory consumption of streaming preaggregation operators in a query. It works by setting a maximum reservation on each grouping aggregator in a preaggregation node. The aggregators switch to passthrough mode automatically when hitting this limit, the same as if they were hitting the query memory limit. This does not override the minimum reservation computed for the aggregation - if the limit is less than the minimum reservation, the minimum reservation is used as the limit instead. The default behaviour is unchanged. Testing: Add a planner test with estimates higher and lower than limit to ensure that resource estimates correctly reflect the option. Add an end-to-end test that verifies that the option forces passthrough when the memory limit is hit. Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/preagg-bytes-limit.test M testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test 9 files changed, 203 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/15463/3 -- To view, visit http://gerrit.cloudera.org:8080/15463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 Gerrit-Change-Number: 15463 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9357: Fix race condition in alter database event
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15260 ) Change subject: IMPALA-9357: Fix race condition in alter_database event .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5522/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15260 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I472fd8a55740769ee5cdb84e48422a4ab39a8d1e Gerrit-Change-Number: 15260 Gerrit-PatchSet: 11 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Wed, 18 Mar 2020 23:58:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15378 ) Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5521/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 Gerrit-Change-Number: 15378 Gerrit-PatchSet: 9 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 18 Mar 2020 23:49:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5520/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 23:43:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/15462/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15462/3//COMMIT_MSG@35 PS3, Line 35: Testing: > Maybe add Q13 to testdata/workloads/functional-planner/queries/PlannerTest/ I have added tpcds q13 to the tpcds-all.test and a duplicate q7, q19 to tpch-all.test. These qualify for the rewrite and all have a QUERYOPTIONS section with cnf rewrite enabled. For tpcds q13, there was no previous entry in tpcds-all.test (I see only 24 queries in that file out of 99 total, so the 'all' is misleading). Since the rewrite is disabled by default (at least temporarily), other tests in PlannerTest are not affected. However, when I ran it with it globally enabled, 5 out of 102 tests failed. Most of these were related to TPC-H q7, TPCH-q19 since they are run against different databases - e.g tpch_nested_parquet, tpch_kudu etc. Once notable change in the latest patch upload: I changed the default max_cnf_exprs to 0 (unlimited) because the above tests do generate hundreds of exprs. I cannot think of a good default at this time. If you think we should limit it by default, let me know. http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java File fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java: http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@37 PS3, Line 37: * > The single table rewrites might be useful for pushing predicates down into Agreed. I have removed the wording 'primarily intended for testing.' The default is still multi tables because otherwise we would see a lot of plan diffs. http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@132 PS3, Line 132: } else if (cpred.getOp() == CompoundPredicate.Operator.NOT) { > Consider factoring the logic of the two branches out into a common function Yes, my intent was to maintain the ordering of applying the distributive property which is left-to-right. I was able to factor out the code though, so it is moved to a separate function. http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@136 PS3, Line 136: // predicate: NOT (a OR b) > I think it would be more concise to use Arrays.asList() to construct the li Done. http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@156 PS3, Line 156: private Predicate createPredAndIncrementCount(Expr first_lhs, Expr second_lhs, > It would be good to factor this check out into a common function. Moved this code up so that it is common to both the OR and the NOT blocks. -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 23:28:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9357: Fix race condition in alter database event
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15260 ) Change subject: IMPALA-9357: Fix race condition in alter_database event .. Patch Set 11: I am commenting out the added statements in test_concurrent_ddls.py for now until IMPALA-9235 is fixed. -- To view, visit http://gerrit.cloudera.org:8080/15260 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I472fd8a55740769ee5cdb84e48422a4ab39a8d1e Gerrit-Change-Number: 15260 Gerrit-PatchSet: 11 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Wed, 18 Mar 2020 23:14:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9357: Fix race condition in alter database event
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15260 ) Change subject: IMPALA-9357: Fix race condition in alter_database event .. Patch Set 11: (2 comments) http://gerrit.cloudera.org:8080/#/c/15260/11/tests/custom_cluster/test_concurrent_ddls.py File tests/custom_cluster/test_concurrent_ddls.py: http://gerrit.cloudera.org:8080/#/c/15260/11/tests/custom_cluster/test_concurrent_ddls.py@24 PS11, Line 24: from tests.util.filesystem_utils import WAREHOUSE flake8: F401 'tests.util.filesystem_utils.WAREHOUSE' imported but unused http://gerrit.cloudera.org:8080/#/c/15260/11/tests/custom_cluster/test_concurrent_ddls.py@101 PS11, Line 101: f flake8: F841 local variable 'func_name' is assigned to but never used -- To view, visit http://gerrit.cloudera.org:8080/15260 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I472fd8a55740769ee5cdb84e48422a4ab39a8d1e Gerrit-Change-Number: 15260 Gerrit-PatchSet: 11 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Wed, 18 Mar 2020 23:14:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9357: Fix race condition in alter database event
Vihang Karajgaonkar has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/15260 ) Change subject: IMPALA-9357: Fix race condition in alter_database event .. IMPALA-9357: Fix race condition in alter_database event The race condition is exposed intermittently, on certain builds which causes test_event_processing::test_self_events test to fail. This happens because we are checking for self-event identifiers in the Db object without taking a lock. When a DDL like 'comment on database is 'test'' is executed, it is possible that the event processor thread is triggered as soon as the ALTER_DATABASE event is generated. This may cause event processor fail the self-event detection since the self-event identifiers are not yet added to the Db object. The fix adds a Db lock similar to Table lock. Alter db operations in CatalogOpExecutor now take db locks instead of metastoreDdlLock_ which makes it consistent with table locking protocol. Testing: 1. Ran existing tests for events processor. 2. This test was failing on centos6 frequently (failed in 1/3 times). After the fix I ran the test in a loop for 24 hrs (197 iterations) and the test didn't fail. 3. Ran core tests with CDP and CDH builds. Change-Id: I472fd8a55740769ee5cdb84e48422a4ab39a8d1e --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M tests/custom_cluster/test_concurrent_ddls.py M tests/custom_cluster/test_event_processing.py 7 files changed, 242 insertions(+), 145 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/15260/11 -- To view, visit http://gerrit.cloudera.org:8080/15260 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I472fd8a55740769ee5cdb84e48422a4ab39a8d1e Gerrit-Change-Number: 15260 Gerrit-PatchSet: 11 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9357: Fix race condition in alter database event
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15260 ) Change subject: IMPALA-9357: Fix race condition in alter_database event .. Patch Set 10: > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5474/ > > It looks like there is some race condition with dropFunction which > is causing this failure. I am investigating this further. Finally, I was able to nail down the race condition. It happens when a concurrent invalidate metadata is running along with create function statement. I have created IMPALA-9532 for this. -- To view, visit http://gerrit.cloudera.org:8080/15260 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I472fd8a55740769ee5cdb84e48422a4ab39a8d1e Gerrit-Change-Number: 15260 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Wed, 18 Mar 2020 23:07:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol
Abhishek Rawat has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/15378 ) Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol .. IMPALA-9466: impala-shell client retry for hs2-http protocol Added retries for idempotent rpcs: OpenSession, PingImpalaHS2Service, GetResultSetMetadata, CloseImpalaOperation (non dmls), CancelOperation, GetOperationStatus, GetRuntimeProfile, GetExecSummary, GetLog Retries were also added to the 'set all' query execution and subsequent result fetch in the ImpalaHS2Client._open_session() The retries are only supported for hs2-http protocol and enabled by default. At most there are 3 tries for a failed rpc with at least 2 second wait duration between tries. Only failed rpcs due to an error in the http transport are retried and if an rpc failed because the server returned an error in the rpc response then such scenarios are not retriable. Improved error diagnostics by dumping stack trace when ImpalaShell. _execute_stmt() gets an 'Unknown Exception'. Testing: - Added a custom_cluster test which injects fault into the http transport and checks expected behavior from the various rpcs. Some of these tests leave the session in an open state and so these tests are not suitable for the e2e test framework which have metric verifiers expecting related metrics to be 0 at the end of the test. - Manually tested real world scenarios with impala-shell client communicating with an impala coordinator via a fault injecting istio mesh. - Manually tested dropping connections on an nginx ingress gateway by sending SIGTERM to all worker processes. Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 --- M shell/impala_client.py M shell/impala_shell.py A tests/custom_cluster/test_hs2_fault_injection.py 3 files changed, 459 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/15378/9 -- To view, visit http://gerrit.cloudera.org:8080/15378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 Gerrit-Change-Number: 15378 Gerrit-PatchSet: 9 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol
Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/15378 ) Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol .. Patch Set 8: (9 comments) Addressed the comments so far... http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@649 PS7, Line 649: # Wait duration (seconds) between retries. > yeah, this might not matter much, but unless there is a reason to wait, not I think exponential backoff policy makes sense. http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@660 PS8, Line 660: return self.imp_service.OpenSession(open_session_req) > nit: should be 2 spaces instead of 4? - same with the other methods Done http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@661 PS8, Line 661: rpc = OpenSession > is it necessary to define this intermediate variable, can OpenSession just Yeah we can pass the rpc functions directly. Done. http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@678 PS8, Line 678: if not raise_error: : retry_msg = 'Num remaining tries: {0}'.format(self.max_tries - num_tries) : else: : retry_msg = '' > if I'm reading this correctly, if retries are enabled. the first two tries Done. http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@692 PS8, Line 692: except Exception, e: > might be a nit, but what about QueryCancelledByShellException and MissingTh Made the behavior consistent with do_hs2_rpc. http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@947 PS8, Line 947: Retries, if enabled, are attempted : for all exceptions other than QueryCancelledByShellException : and MissingThriftMethodException > why are we retrying all TApplicationException (except for UNKNOWN_METHOD)? I think for idempotent rpcs we could retry all exceptions. If we know retrying certain exceptions (such as MissingThriftMethodException) won't help, then we could skip them. http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@954 PS8, Line 954: max_tries = self.max_tries > Nit: echoing a comment that Sahil made earlier, seems like this could be re I think in this case we could have rpcs with retry_on_error=True/False and so max_tries would depend on retry_on_error. The other alternative would be to explicitly pass 'max_tries' as param to '_do_hs2_rpc' like you had suggested earlier. Let me know, if you guys have a preference. http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@958 PS8, Line 958: retry_msg = 'Num remaining tries: {0}'.format(max_tries - num_tries) : else: : retry_msg = '' > same comment as above Done http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@974 PS8, Line 974: rpc.__name__ > should this be included in the retry logic for the open session method? Yeah, I think we should. The issue is that there is no easy way to determine if execute rpc failed or fetch rpc. We could use a generic message. -- To view, visit http://gerrit.cloudera.org:8080/15378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 Gerrit-Change-Number: 15378 Gerrit-PatchSet: 8 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 18 Mar 2020 23:00:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15462 to look at the new patch set (#4). Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. IMPALA-9183: Convert disjunctive predicates to conjunctive normal form Added an expression rewrite rule to convert a disjunctive predicate to conjunctive normal form (CNF). Converting to CNF enables multi-table predicates that were only evaluated by a Join operator to be converted into either single-table conjuncts that are eligible for predicate pushdown to the scan operator or other multi-table conjuncts that are eligible to be pushed to a Join below. This helps improve performance for such queries. Since converting to CNF expands the number of expressions, we place a limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr) that are considered. Once the MAX_CNF_EXPRS limit (default is unlimited) is exceeded, whatever expression was supplied to the rule is returned without further transformation. A setting of -1 or 0 allows unlimited number of CNF exprs to be created upto int32 max. Another option ENABLE_CNF_REWRITES enables or disables the entire rewrite. This is False by default until we have done more thorough functional testing. Examples of rewrites: original: (a AND b) OR c rewritten: (a OR c) AND (b OR c) original: (a AND b) OR (c AND d) rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d) original: NOT(a OR b) rewritten: NOT(a) AND NOT(b) Testing: - Added new unit tests with variations of disjunctive predicates and verified their Explain plans - Manually tested the result correctness on impala shell by running these queries with ENABLE_CNF_REWRITES enabled and disabled - Added TPC-H q7, q19 and TPC-DS q13 with the CNF rewrite enabled - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor shows almost 5x improvement: Original baseline: 47.5 sec With this patch and CNF rewrite enabled: 9.4 sec Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java A fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test 12 files changed, 819 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/15462/4 -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9398: Fix shell history duplication when cmdloop breaks
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15345 ) Change subject: IMPALA-9398: Fix shell history duplication when cmdloop breaks .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If4faf46134f44d91e56748642f47d448707db53c Gerrit-Change-Number: 15345 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Vincent Tran Gerrit-Comment-Date: Wed, 18 Mar 2020 22:45:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15306 ) Change subject: IMPALA-8690: Add LIRS cache eviction algorithm .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5519/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 16 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 18 Mar 2020 22:43:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15306 ) Change subject: IMPALA-8690: Add LIRS cache eviction algorithm .. Patch Set 15: (9 comments) http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc File be/src/util/cache/lirs-cache.cc: http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@83 PS15, Line 83: UNPROTECTED (HIR resident), and TOMBSTONE (HIR nonresident) not sure I understand the difference between PROTECTED and TOMBSTONE. also, what does 'resident' mean in this context? http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@84 PS15, Line 84: // The size of the UNPROTECTED and PROTECTED categories is determined by the : // lirs_unprotected_percentage parameter, which is typically in the 1-5% range. : // The PROTECTED entries make up the bulk of the cache (90+%), and this contains the : // entries with the shortest interreference recency. bit confused by the wording. the UNPROTECTED size is controlled by 'lirs_unprotected_percentage' right? and the rest of the space is for the PROTECTED entries? what about the TOMBSTONE entries? http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@96 PS15, Line 96: UNINIT is UINIT suppose to be UNINITIALIZED for short? might be good to state that somewhere http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@122 PS15, Line 122: DataResidency mentioned this before, but would be nice to have an explanation of what residency means http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@327 PS15, Line 327: void TrimRecencyList(LIRSThreadState* tstate); : void EnforceTombstoneLimit(LIRSThreadState* tstate); : void EnforceProtectedCapacity(LIRSThreadState* tstate); : void EnforceUnprotectedCapacity(LIRSThreadState* tstate); nit: docs http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@387 PS15, Line 387: LIRSCacheShard::~LIRSCacheShard() { would it be cleaner to have an explicit close method that does all of this. it makes it easier to reason about when this actually gets called. i think we have been bitten in the past where big destructors don't get called when we expect them to http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@411 PS15, Line 411: // percentage >= 50.00%. This is far outside the range of what is useful for maybe add this limitation to the flag docs http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@423 PS15, Line 423: betwee nit: typo http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@428 PS15, Line 428: Must be 0.5 or greater same comment as above -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 15 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 18 Mar 2020 22:00:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Hello Thomas Tauber-Marshall, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15306 to look at the new patch set (#16). Change subject: IMPALA-8690: Add LIRS cache eviction algorithm .. IMPALA-8690: Add LIRS cache eviction algorithm One concern for the data cache is that the LRU eviction algorithm is suceptible to being flushed by large scans of low priority data. This implements the LIRS algorithm described in "LIRS: An Efficient Low Inter-reference Recency Set Replacement Policy to Improve Buffer Cache Performance" by Song Jiang / Xiaodon Xhang 2002. LIRS is a scan-resistent eviction algorithm with low performance penalty to LRU. This introduces the startup flag data_cache_eviction_policy to control which eviction policy to use. The only two options are LRU and LIRS, with the default continuing to be LRU. To accomodate the new algorithm and associated tests, some code moved around: 1. The RLCacheShard implementation moved from util/cache/cache.cc to util/cache/rl-cache.cc. 2. The backend cache tests were split into multiple files. util/cache/cache-test.h contains shared cache testing code. util/cache/cache-test.cc contains generic tests that should work for any algorithm. util/cache/rl-cache-test.cc are RLCacheShard specific tests util/cache/lirs-cache-test.cc are LIRS specific tests 3. To make it easy for clients of the cache code to customize the cache eviction algorithm, the public interface changed from using a template to taking the policy as an argument. 4. Cache::MemoryType is removed. 5. Cache adds an Init() method to verify the validity of startup flags Testing: - Added LIRS specific backend cache tests (lirs-cache-test) - Ran TPC-DS with a very small cache and concurrency to test corner cases with the LIRS eviction policy - Parameterized data-cache-test to run for both LRU and LIRS - Added LIRS equivalents for tests in custom_cluster/test_data_cache.py - Ran cache-bench with LRU and LIRS. The results are: Test case | Algorithm | Lookups / sec | Hit rate ZIPFIAN ratio=1.00x | LRU | 11.31M| 99.9% ZIPFIAN ratio=1.00x | LIRS | 10.09M| 99.8% ZIPFIAN ratio=3.00x | LRU | 11.36M| 95.9% ZIPFIAN ratio=3.00x | LIRS | 9.27M| 96.4% UNIFORM ratio=1.00x | LRU | 7.46M| 99.8% UNIFORM ratio=1.00x | LIRS | 6.93M| 99.8% UNIFORM ratio=3.00x | LRU | 5.63M| 33.3% UNIFORM ratio=3.00x | LIRS | 3.24M| 33.3% The takeaway is that LIRS is a bit slower on lookups and quite a bit slower on inserts. However, they both are still doing millions of operations per second, so it should not be a bottleneck for the data cache. Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 --- M be/src/runtime/io/data-cache-test.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/data-cache.h M be/src/util/cache/CMakeLists.txt M be/src/util/cache/cache-bench.cc M be/src/util/cache/cache-internal.h M be/src/util/cache/cache-test.cc A be/src/util/cache/cache-test.h M be/src/util/cache/cache.cc M be/src/util/cache/cache.h A be/src/util/cache/lirs-cache-test.cc A be/src/util/cache/lirs-cache.cc A be/src/util/cache/rl-cache-test.cc A be/src/util/cache/rl-cache.cc M bin/rat_exclude_files.txt M tests/custom_cluster/test_data_cache.py 16 files changed, 2,559 insertions(+), 827 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/15306/16 -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 16 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15306 ) Change subject: IMPALA-8690: Add LIRS cache eviction algorithm .. Patch Set 15: (6 comments) http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/runtime/io/data-cache.cc File be/src/runtime/io/data-cache.cc: http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/runtime/io/data-cache.cc@645 PS15, Line 645: if (handle.get() == nullptr) return false; > nit: 'if (UNLIKELY(' Added the UNLIKELY() When Insert into meta_cache fails, it calls the eviction callback. For DataCache, that punches a hole in the file, so the data is already gone by the time it returns nullptr. http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/cache.h File be/src/util/cache/cache.h: http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/cache.h@256 PS15, Line 256: the entry is immediately > nit: revise Fixed up that sentence. http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc File be/src/util/cache/lirs-cache.cc: http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@46 PS15, Line 46: entriess > nit: typo Done http://gerrit.cloudera.org:8080/#/c/15306/15/tests/custom_cluster/test_data_cache.py File tests/custom_cluster/test_data_cache.py: http://gerrit.cloudera.org:8080/#/c/15306/15/tests/custom_cluster/test_data_cache.py@55 PS15, Line 55: int_test_data_cache_deterministic > prefix with '__' instead of 'int' Good point, that's much clearer. Done. http://gerrit.cloudera.org:8080/#/c/15306/15/tests/custom_cluster/test_data_cache.py@81 PS15, Line 81: int_ > same comment as above Done http://gerrit.cloudera.org:8080/#/c/15306/15/tests/custom_cluster/test_data_cache.py@113 PS15, Line 113: int_ > same comment as above Done -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 15 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 18 Mar 2020 21:57:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15476 ) Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. Patch Set 4: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/572/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 4 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Mar 2020 21:36:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15476 to look at the new patch set (#4). Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. IMPALA-9443: [DOCS] Make tables optically pleasing and complete - Replaced ellipses in example columns with sample output - Fixed table formatting problems Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 --- M docs/topics/impala_perf_stats.xml M docs/topics/impala_show.xml 2 files changed, 97 insertions(+), 96 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/15476/4 -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 4 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15476 ) Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. Patch Set 4: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/572/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 4 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Mar 2020 21:29:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8690: Add LIRS cache eviction algorithm
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15306 ) Change subject: IMPALA-8690: Add LIRS cache eviction algorithm .. Patch Set 15: (6 comments) http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/runtime/io/data-cache.cc File be/src/runtime/io/data-cache.cc: http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/runtime/io/data-cache.cc@645 PS15, Line 645: if (handle.get() == nullptr) return false; nit: 'if (UNLIKELY(' it looks like the Insert into the "meta cache" can fail, even after data has been written to the cache file? do we need to try and delete the data from the cache file before returning false? http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/cache.h File be/src/util/cache/cache.h: http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/cache.h@256 PS15, Line 256: the entry is immediately nit: revise http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc File be/src/util/cache/lirs-cache.cc: http://gerrit.cloudera.org:8080/#/c/15306/15/be/src/util/cache/lirs-cache.cc@46 PS15, Line 46: entriess nit: typo http://gerrit.cloudera.org:8080/#/c/15306/15/tests/custom_cluster/test_data_cache.py File tests/custom_cluster/test_data_cache.py: http://gerrit.cloudera.org:8080/#/c/15306/15/tests/custom_cluster/test_data_cache.py@55 PS15, Line 55: int_test_data_cache_deterministic prefix with '__' instead of 'int' http://gerrit.cloudera.org:8080/#/c/15306/15/tests/custom_cluster/test_data_cache.py@81 PS15, Line 81: int_ same comment as above http://gerrit.cloudera.org:8080/#/c/15306/15/tests/custom_cluster/test_data_cache.py@113 PS15, Line 113: int_ same comment as above -- To view, visit http://gerrit.cloudera.org:8080/15306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8 Gerrit-Change-Number: 15306 Gerrit-PatchSet: 15 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 18 Mar 2020 21:19:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Kristine Hahn has removed Tamas Mate from this change. ( http://gerrit.cloudera.org:8080/15476 ) Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. Removed reviewer Tamas Mate. -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 3 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins
[native-toolchain-CR] Add support for centos8
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/15465 ) Change subject: Add support for centos8 .. Patch Set 2: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/15465/1/docker/all/postinstall.sh File docker/all/postinstall.sh: http://gerrit.cloudera.org:8080/#/c/15465/1/docker/all/postinstall.sh@31 PS1, Line 31: alternatives --set python /usr/bin/python2 > Kinda. Thanks for the explananation Hector, I get it now . http://gerrit.cloudera.org:8080/#/c/15465/1/docker/redhat8.df File docker/redhat8.df: http://gerrit.cloudera.org:8080/#/c/15465/1/docker/redhat8.df@48 PS1, Line 48: RUN postinstall.sh : : COPY ./all > Done Done -- To view, visit http://gerrit.cloudera.org:8080/15465 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363 Gerrit-Change-Number: 15465 Gerrit-PatchSet: 2 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Hector Acosta Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 20:34:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15476 ) Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. Patch Set 3: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/571/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 3 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Wed, 18 Mar 2020 20:29:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8205: Support number of true and false statistics for boolean column
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14666 ) Change subject: IMPALA-8205: Support number of true and false statistics for boolean column .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5489/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/14666 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I991bee8e7fdc644d908289f5fe2ee8032cc2c431 Gerrit-Change-Number: 14666 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward <583424...@qq.com> Gerrit-Reviewer: Anonymous Coward <583424...@qq.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 18 Mar 2020 20:22:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8718: Fixed AnalysisException in inline view with outer join complex type column
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13765 ) Change subject: IMPALA-8718: Fixed AnalysisException in inline view with outer join complex type column .. Patch Set 5: @Yongzhi can this be closed now that IMPALA-8718 is marked as resolved. -- To view, visit http://gerrit.cloudera.org:8080/13765 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ida65503ca4b1342b8fe0049753bc664da227dca9 Gerrit-Change-Number: 13765 Gerrit-PatchSet: 5 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 18 Mar 2020 20:23:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Hello Tamas Mate, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15476 to look at the new patch set (#3). Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. IMPALA-9443: [DOCS] Make tables optically pleasing and complete - Replaced ellipses in example columns with sample output - Fixed table formatting problems Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 --- M docs/topics/impala_perf_stats.xml M docs/topics/impala_show.xml 2 files changed, 96 insertions(+), 99 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/15476/3 -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 3 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15476 ) Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. Patch Set 3: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/571/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 3 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Wed, 18 Mar 2020 20:20:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8205: Support number of true and false statistics for boolean column
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14666 ) Change subject: IMPALA-8205: Support number of true and false statistics for boolean column .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/14666/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/14666/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@293 PS6, Line 293: "COUNT(CASE WHEN " + colRefSql + " = TRUE THEN 1 ELSE NULL END)"); > Hi Sahil, I do conduct a test to compare their performance; thanks for taking a look. in that case, lets just stick the the one you currently have since it looks similar to the null count -- To view, visit http://gerrit.cloudera.org:8080/14666 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I991bee8e7fdc644d908289f5fe2ee8032cc2c431 Gerrit-Change-Number: 14666 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward <583424...@qq.com> Gerrit-Reviewer: Anonymous Coward <583424...@qq.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 18 Mar 2020 20:17:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8205: Support number of true and false statistics for boolean column
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14666 ) Change subject: IMPALA-8205: Support number of true and false statistics for boolean column .. Patch Set 8: Code-Review+2 LGTM -- To view, visit http://gerrit.cloudera.org:8080/14666 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I991bee8e7fdc644d908289f5fe2ee8032cc2c431 Gerrit-Change-Number: 14666 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward <583424...@qq.com> Gerrit-Reviewer: Anonymous Coward <583424...@qq.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 18 Mar 2020 20:17:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Hello Tamas Mate, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15476 to look at the new patch set (#2). Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. IMPALA-9443: [DOCS] Make tables optically pleasing and complete - Replaced ellipses in example columns with sample output - Fixed table formatting problems Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 (cherry picked from commit 82ffc9ab96f5ac6ef418c362ac38189a4a722d97) --- M docs/topics/impala_perf_stats.xml M docs/topics/impala_show.xml 2 files changed, 95 insertions(+), 96 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/15476/2 -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 2 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15477 ) Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. Patch Set 1: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/570/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/15477 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4d0dada78b6ca88051845a54ca47cd9f490699b0 Gerrit-Change-Number: 15477 Gerrit-PatchSet: 1 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Mar 2020 19:16:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Kristine Hahn has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15477 Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. IMPALA-9443: [DOCS] Make tables optically pleasing and complete - Replaced ellipses in example columns with sample output - Fixed table formatting problems Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 fake changes Change-Id: I4d0dada78b6ca88051845a54ca47cd9f490699b0 --- M docs/topics/impala_perf_stats.xml M docs/topics/impala_show.xml 2 files changed, 96 insertions(+), 99 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/15477/1 -- To view, visit http://gerrit.cloudera.org:8080/15477 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4d0dada78b6ca88051845a54ca47cd9f490699b0 Gerrit-Change-Number: 15477 Gerrit-PatchSet: 1 Gerrit-Owner: Kristine Hahn
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15477 ) Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. Patch Set 1: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/570/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/15477 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4d0dada78b6ca88051845a54ca47cd9f490699b0 Gerrit-Change-Number: 15477 Gerrit-PatchSet: 1 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Mar 2020 19:08:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15463 ) Change subject: IMPALA-9530: query option to limit preagg memory .. Patch Set 2: (3 comments) mostly minor comments, overall LGTM http://gerrit.cloudera.org:8080/#/c/15463/2/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/15463/2/be/src/service/query-options.cc@923 PS2, Line 923: preaggregation bytes nit: 'preaggregation bytes limit' http://gerrit.cloudera.org:8080/#/c/15463/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java File fe/src/main/java/org/apache/impala/planner/AggregationNode.java: http://gerrit.cloudera.org:8080/#/c/15463/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@581 PS2, Line 581: useStreamingPreagg_ IIUC, the preagg_bytes_limit won't be applied if DISABLE_STREAMING_PREAGGREGATIONS=true, right? should that be documented somewhere? http://gerrit.cloudera.org:8080/#/c/15463/2/testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test File testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test: http://gerrit.cloudera.org:8080/#/c/15463/2/testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test@99 PS2, Line 99: nit: whitespace -- To view, visit http://gerrit.cloudera.org:8080/15463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 Gerrit-Change-Number: 15463 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 18:42:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15476 ) Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. Patch Set 1: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/569/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 1 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Mar 2020 18:34:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Kristine Hahn has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15476 Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. IMPALA-9443: [DOCS] Make tables optically pleasing and complete - Replaced ellipses in example columns with sample output - Fixed table formatting problems Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 --- M docs/topics/impala_perf_stats.xml M docs/topics/impala_show.xml 2 files changed, 95 insertions(+), 96 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/15476/1 -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 1 Gerrit-Owner: Kristine Hahn
[Impala-ASF-CR] IMPALA-9443: [DOCS] Make tables optically pleasing and complete
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15476 ) Change subject: IMPALA-9443: [DOCS] Make tables optically pleasing and complete .. Patch Set 1: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/569/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/15476 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83fd30b87730c82c87f6f7aee26d8cceb77b6308 Gerrit-Change-Number: 15476 Gerrit-PatchSet: 1 Gerrit-Owner: Kristine Hahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Mar 2020 18:25:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15378 ) Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol .. Patch Set 8: (8 comments) http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@649 PS7, Line 649: # Wait duration (seconds) between retries. > I picked up a duration and yes this might need some tuning. I think it's pr yeah, this might not matter much, but unless there is a reason to wait, not sure it is necessary. i feel like if the underlying issue is connection resets due to dynamic reload by NGINX, then there isn't really a need to wait. a exponential backoff policy for the sleep duration would be nice as well, but again not sure if it makes much a difference - e.g. first and second try don't wait at all, third try can wait 2 seconds http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@660 PS8, Line 660: return self.imp_service.OpenSession(open_session_req) nit: should be 2 spaces instead of 4? - same with the other methods http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@661 PS8, Line 661: rpc = OpenSession is it necessary to define this intermediate variable, can OpenSession just be passed directly to '_do_hs2_rpc' - same with all the other methods http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@678 PS8, Line 678: if not raise_error: : retry_msg = 'Num remaining tries: {0}'.format(self.max_tries - num_tries) : else: : retry_msg = '' if I'm reading this correctly, if retries are enabled. the first two tries will have: Num remaining tries: 2 Num remaining tries: 1 And the last try won't have anything. Would it be clearer if the last try had: Num remaining tries: 0 http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@692 PS8, Line 692: except Exception, e: might be a nit, but what about QueryCancelledByShellException and MissingThriftMethodException, they are retried here, but not in do_hs2_rpc http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@947 PS8, Line 947: Retries, if enabled, are attempted : for all exceptions other than QueryCancelledByShellException : and MissingThriftMethodException why are we retrying all TApplicationException (except for UNKNOWN_METHOD)? http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@958 PS8, Line 958: retry_msg = 'Num remaining tries: {0}'.format(max_tries - num_tries) : else: : retry_msg = '' same comment as above http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@974 PS8, Line 974: rpc.__name__ should this be included in the retry logic for the open session method? -- To view, visit http://gerrit.cloudera.org:8080/15378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 Gerrit-Change-Number: 15378 Gerrit-PatchSet: 8 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 18 Mar 2020 18:18:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9398: Fix shell history duplication when cmdloop breaks
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15345 ) Change subject: IMPALA-9398: Fix shell history duplication when cmdloop breaks .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/15345/1/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/15345/1/shell/impala_shell.py@1366 PS1, Line 1366: try: > Thanks Tamas! After your explanation, I think I understand why your patch f I actually think the bookkeeping idea is not a bad idea, but do we know if there are any other conditions that might lead us back into the preloop() method? -- To view, visit http://gerrit.cloudera.org:8080/15345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If4faf46134f44d91e56748642f47d448707db53c Gerrit-Change-Number: 15345 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Vincent Tran Gerrit-Comment-Date: Wed, 18 Mar 2020 18:15:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9042: Milestone 1: properly scan files that has full ACID schema
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15395 ) Change subject: IMPALA-9042: Milestone 1: properly scan files that has full ACID schema .. Patch Set 3: (13 comments) I think the approach makes sense. Thanks for adding so many tests. http://gerrit.cloudera.org:8080/#/c/15395/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15395/3//COMMIT_MSG@58 PS3, Line 58: * SHOW CREATE TABLE (show-create-table-full-acid.test) Can you also add describe and describe formatted tests? I assume these show hide row__id. http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/hdfs-orc-scanner.cc@181 PS3, Line 181: VLOG_QUERY << filename(); I assume you're going to remove the log lines? http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/hdfs-orc-scanner.cc@185 PS3, Line 185: if (scan_node_->hdfs_table()->IsFullAcid() && !schema_resolver_->IsFullAcid()) { This is going to be addressed at some point, right? So we can scan original files too. http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/orc-metadata-utils.h File be/src/exec/orc-metadata-utils.h: http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/orc-metadata-utils.h@46 PS3, Line 46: bool IsFullAcid() const; Maybe IsFileFullAcid()? Since it's going to get confusing in future about whether this is referring to the file or the table. http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/orc-metadata-utils.cc File be/src/exec/orc-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/orc-metadata-utils.cc@94 PS3, Line 94: if (table_idx == num_part_cols + 5) { I'd find it clearer if it was a named constant instead of the magic number http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/orc-metadata-utils.cc@101 PS3, Line 101: DCHECK DCHECK_GE http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/orc-metadata-utils.cc@231 PS3, Line 231: bool OrcSchemaResolver::IsFullAcid() const { Can you add a brief comment on the approach? It seems to be basically "if it walks like a duck and quacks like a duck, it's a duck". http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/runtime/descriptors.h File be/src/runtime/descriptors.h: http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/runtime/descriptors.h@333 PS3, Line 333: IsFullAcid IsFullAcidTable? http://gerrit.cloudera.org:8080/#/c/15395/3/fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java: http://gerrit.cloudera.org:8080/#/c/15395/3/fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java@94 PS3, Line 94: // analyzer.ensureTableNotTransactional(table_, "ALTER TABLE"); ? http://gerrit.cloudera.org:8080/#/c/15395/3/fe/src/main/java/org/apache/impala/analysis/Path.java File fe/src/main/java/org/apache/impala/analysis/Path.java: http://gerrit.cloudera.org:8080/#/c/15395/3/fe/src/main/java/org/apache/impala/analysis/Path.java@473 PS3, Line 473: convertToFullAcid convertToFullAcidFilePath? http://gerrit.cloudera.org:8080/#/c/15395/3/testdata/workloads/functional-query/queries/QueryTest/acid-negative.test File testdata/workloads/functional-query/queries/QueryTest/acid-negative.test: http://gerrit.cloudera.org:8080/#/c/15395/3/testdata/workloads/functional-query/queries/QueryTest/acid-negative.test@1 PS3, Line 1: # Why commented out? http://gerrit.cloudera.org:8080/#/c/15395/3/testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test File testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test: http://gerrit.cloudera.org:8080/#/c/15395/3/testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test@16 PS3, Line 16: QUERY Can you also add a LABELS section to make sure the labels for the ROW__ID components are sane. Any maybe also a test that selects individual elements of row__id. http://gerrit.cloudera.org:8080/#/c/15395/3/testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test@17 PS3, Line 17: select row__id.*, * from functional_orc_def.alltypestiny; Do we expect the values to be deterministic across data loads? -- To view, visit http://gerrit.cloudera.org:8080/15395 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic2e2afec00c9a5cf87f1d61b5fe52b0085844bcb Gerrit-Change-Number: 15395 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Mar 2020 18:08:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output
Thomas Tauber-Marshall has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15199 ) Change subject: IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output .. IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output This change modifies the output of the SHOW TABLE STATS and SHOW PARTITIONS for Kudu tables. - PARTITIONS: the #Row column has been removed - TABLE STATS: instead of showing partition informations it returns a resultset similar to HDFS table stats, #Rows, #Partitions, Size, Format and Location Example outputs can be seen in the doc changes. Testing: * kudu_stats.test is modified to verify the new result set * kudu_partition_ddl.test is modified to verify the new partitions style * Updated unit test with the new error message Change-Id: Ice4b8df65f0a53fe14b8fbe35d82c9887ab9a041 Reviewed-on: http://gerrit.cloudera.org:8080/15199 Reviewed-by: Thomas Tauber-Marshall Tested-by: Impala Public Jenkins --- M docs/topics/impala_compute_stats.xml M docs/topics/impala_show.xml M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test 8 files changed, 190 insertions(+), 149 deletions(-) Approvals: Thomas Tauber-Marshall: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/15199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ice4b8df65f0a53fe14b8fbe35d82c9887ab9a041 Gerrit-Change-Number: 15199 Gerrit-PatchSet: 9 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15199 ) Change subject: IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ice4b8df65f0a53fe14b8fbe35d82c9887ab9a041 Gerrit-Change-Number: 15199 Gerrit-PatchSet: 8 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 18 Mar 2020 17:59:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15378 ) Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/15378/8/shell/impala_client.py@954 PS8, Line 954: max_tries = self.max_tries Nit: echoing a comment that Sahil made earlier, seems like this could be refactored? -- To view, visit http://gerrit.cloudera.org:8080/15378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5 Gerrit-Change-Number: 15378 Gerrit-PatchSet: 8 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 18 Mar 2020 17:40:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9398: Fix shell history duplication when cmdloop breaks
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15345 ) Change subject: IMPALA-9398: Fix shell history duplication when cmdloop breaks .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5488/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If4faf46134f44d91e56748642f47d448707db53c Gerrit-Change-Number: 15345 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Vincent Tran Gerrit-Comment-Date: Wed, 18 Mar 2020 17:49:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9042: Milestone 1: properly scan files that has full ACID schema
Norbert Luksa has posted comments on this change. ( http://gerrit.cloudera.org:8080/15395 ) Change subject: IMPALA-9042: Milestone 1: properly scan files that has full ACID schema .. Patch Set 3: (6 comments) Thanks for working on this. Left some comments, mostly questions to get a better understanding of how ACID works in Impala. Later I will look at the tests as well. http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/orc-metadata-utils.cc File be/src/exec/orc-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/orc-metadata-utils.cc@110 PS3, Line 110: file_idx -= num_part_cols; nit: This and the same @115 could be moved outside the if. http://gerrit.cloudera.org:8080/#/c/15395/3/be/src/exec/orc-metadata-utils.cc@118 PS3, Line 118: table_idx += 1 + num_part_cols; Could you explain the values here? In this case the file_idx should be the original table_idx? http://gerrit.cloudera.org:8080/#/c/15395/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/15395/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@a297 PS3, Line 297: Removing this mean that we support write operations on full ACID tables? Do we really need to remove this one? http://gerrit.cloudera.org:8080/#/c/15395/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@224 PS3, Line 224: ensureTableNotFullAcid There are now two ensureTableNotFullAcid methods, that only differ in the parameters and the error msg thrown. Is this method and the INSERT_ONLY_ACID_TABLE_SUPPORTED_ERROR_MSG message used anymore? Can they be removed? http://gerrit.cloudera.org:8080/#/c/15395/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@237 PS3, Line 237: String nit: maybe change this string to enum? http://gerrit.cloudera.org:8080/#/c/15395/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@238 PS3, Line 238: throws AnalysisException { nit: missing space -- To view, visit http://gerrit.cloudera.org:8080/15395 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic2e2afec00c9a5cf87f1d61b5fe52b0085844bcb Gerrit-Change-Number: 15395 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Mar 2020 17:34:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15463 ) Change subject: IMPALA-9530: query option to limit preagg memory .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5518/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 Gerrit-Change-Number: 15463 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 17:34:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6505: Min-Max predicate push down in ORC scanner
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15403 ) Change subject: IMPALA-6505: Min-Max predicate push down in ORC scanner .. Patch Set 2: (3 comments) Just a few nits, otherwise lgtm. http://gerrit.cloudera.org:8080/#/c/15403/2/be/src/exec/hdfs-orc-scanner.h File be/src/exec/hdfs-orc-scanner.h: http://gerrit.cloudera.org:8080/#/c/15403/2/be/src/exec/hdfs-orc-scanner.h@316 PS2, Line 316: ScalarExprEvaluator* eval, nit: can it be const? http://gerrit.cloudera.org:8080/#/c/15403/2/be/src/exec/hdfs-orc-scanner.h@317 PS2, Line 317: orc::PredicateDataType* predicate_type, nit: we usually put output parameters at the end of the parameter list http://gerrit.cloudera.org:8080/#/c/15403/2/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15403/2/be/src/exec/hdfs-orc-scanner.cc@888 PS2, Line 888: // The date should be valid at this point. Then should we add a DCHECK? -- To view, visit http://gerrit.cloudera.org:8080/15403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I136622413db21e0941d238ab6aeea901a6464845 Gerrit-Change-Number: 15403 Gerrit-PatchSet: 2 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Mar 2020 17:25:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15463 ) Change subject: IMPALA-9530: query option to limit preagg memory .. Patch Set 2: Code-Review+1 (1 comment) Changes LGTM. http://gerrit.cloudera.org:8080/#/c/15463/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java File fe/src/main/java/org/apache/impala/planner/AggregationNode.java: http://gerrit.cloudera.org:8080/#/c/15463/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@581 PS1, Line 581: if (useStreamingPreagg_ && queryOptions.getPreagg_bytes_limit() > 0) { > > I think it would be better to have all the parameters set and then create >> One of the things I like about about the builder interface is that there's >> no need to pass in special values to indicate unset. There's some logic in >> the builder that depends on this (the max reservation is adjusted to >> LONG_MAX automatically when it's unset). I can work around that, but I tried >> changing this around and the control flow doesn't seem any simpler, so I >> didn't make the change in this PS. I see...yeah in that case keeping the code as-is is fine. Thanks for adding the test for the min reservation. -- To view, visit http://gerrit.cloudera.org:8080/15463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 Gerrit-Change-Number: 15463 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 17:12:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/15199 ) Change subject: IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ice4b8df65f0a53fe14b8fbe35d82c9887ab9a041 Gerrit-Change-Number: 15199 Gerrit-PatchSet: 8 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 18 Mar 2020 16:53:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory
Hello Aman Sinha, Thomas Tauber-Marshall, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15463 to look at the new patch set (#2). Change subject: IMPALA-9530: query option to limit preagg memory .. IMPALA-9530: query option to limit preagg memory This adds an advanced PREAGG_BYTES_LIMIT query option that allows limiting the memory consumption of preaggregation operators in a query. It works by setting a maximum reservation on each grouping aggregator in a preaggregation node. The aggregators switch to passthrough mode automatically when hitting this limit, the same as if they were hitting the query memory limit. This does not override the minimum reservation computed for the aggregation - if the limit is less than the minimum reservation, the minimum reservation is used as the limit instead. The default behaviour is unchanged. Testing: Add a planner test with estimates higher and lower than limit to ensure that resource estimates correctly reflect the option. Add an end-to-end test that verifies that the option forces passthrough when the memory limit is hit. Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/preagg-bytes-limit.test M testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test 9 files changed, 203 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/15463/2 -- To view, visit http://gerrit.cloudera.org:8080/15463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 Gerrit-Change-Number: 15463 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15463 ) Change subject: IMPALA-9530: query option to limit preagg memory .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/15463/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java File fe/src/main/java/org/apache/impala/planner/AggregationNode.java: http://gerrit.cloudera.org:8080/#/c/15463/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@581 PS1, Line 581: if (useStreamingPreagg_ && queryOptions.getPreagg_bytes_limit() > 0) { > I think it would be better to have all the parameters set and then create the > ResourceProfileBuilder (similar to the previous code). It seems more > intuitive control flow. For MaxMemReservation which was not set earlier, can > we pass in -1 or 0 for no reservation ? One of the things I like about about the builder interface is that there's no need to pass in special values to indicate unset. There's some logic in the builder that depends on this (the max reservation is adjusted to LONG_MAX automatically when it's unset). I can work around that, but I tried changing this around and the control flow doesn't seem any simpler, so I didn't make the change in this PS. long maxReservationBytes = 0; // This is a magic value meaning "unset". if (useStreamingPreagg_ && queryOptions.getPreagg_bytes_limit() > 0) { maxReservationBytes = Math.max(perInstanceMinMemReservation, queryOptions.getPreagg_bytes_limit()); // Aggregations should generally not use significantly more than the max reservation, // since the bulk of the memory is reserved. perInstanceMemEstimate = Math.min(perInstanceMemEstimate, maxReservationBytes); } ResourceProfileBuilder builder = new ResourceProfileBuilder() .setMemEstimateBytes(perInstanceMemEstimate) .setMinMemReservationBytes(perInstanceMinMemReservation) .setMaxMemReservationBytes(maxReservationBytes) .setSpillableBufferBytes(bufferSize) .setMaxRowBufferBytes(maxRowBufferSize); > Also, just to clarify .. should the spillable buffer size and max row buffer > size be upper bounded by pregg_bytes_limit ? Since the calculation for these > did not consider preagg_bytes_limit, could there be cases where it is > exceeded ? (although perhaps the backend makes some adjustments on its own). This is handled by taking the max of this and the min reservation - the min reservation is guaranteed to be enough to allocate all of the buffers required. I am missing a test for this though, so I'll add one. -- To view, visit http://gerrit.cloudera.org:8080/15463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 Gerrit-Change-Number: 15463 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 16:47:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15463 ) Change subject: IMPALA-9530: query option to limit preagg memory .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/15463/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java File fe/src/main/java/org/apache/impala/planner/AggregationNode.java: http://gerrit.cloudera.org:8080/#/c/15463/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@581 PS1, Line 581: if (useStreamingPreagg_ && queryOptions.getPreagg_bytes_limit() > 0) { I think it would be better to have all the parameters set and then create the ResourceProfileBuilder (similar to the previous code). It seems more intuitive control flow. For MaxMemReservation which was not set earlier, can we pass in -1 or 0 for no reservation ? Also, just to clarify .. should the spillable buffer size and max row buffer size be upper bounded by pregg_bytes_limit ? Since the calculation for these did not consider preagg_bytes_limit, could there be cases where it is exceeded ? (although perhaps the backend makes some adjustments on its own). -- To view, visit http://gerrit.cloudera.org:8080/15463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9 Gerrit-Change-Number: 15463 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 18 Mar 2020 15:20:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6360: Don't show full query statement on Impala WebUI by default
Norbert Luksa has posted comments on this change. ( http://gerrit.cloudera.org:8080/15288 ) Change subject: IMPALA-6360: Don't show full query statement on Impala WebUI by default .. Patch Set 15: Code-Review+1 (1 comment) Found a nit, otherwise lgtm. http://gerrit.cloudera.org:8080/#/c/15288/15/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/15288/15/tests/webserver/test_web_pages.py@422 PS15, Line 422: x nit: I would explicitly write it as "x " (notice the space after the x), like in the other test file. -- To view, visit http://gerrit.cloudera.org:8080/15288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib7109a0be5d1022b4f8d6e72441cf5dc1dc42605 Gerrit-Change-Number: 15288 Gerrit-PatchSet: 15 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Wed, 18 Mar 2020 15:09:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6360: Don't show full query statement on Impala WebUI by default
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15288 ) Change subject: IMPALA-6360: Don't show full query statement on Impala WebUI by default .. Patch Set 15: Code-Review+1 (2 comments) I'm fine with the change, just left some nits. Will +2 once they are covered. http://gerrit.cloudera.org:8080/#/c/15288/15/tests/custom_cluster/test_web_pages.py File tests/custom_cluster/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/15288/15/tests/custom_cluster/test_web_pages.py@114 PS15, Line 114: imput nit: typo http://gerrit.cloudera.org:8080/#/c/15288/15/tests/custom_cluster/test_web_pages.py@130 PS15, Line 130: part nit: partial -- To view, visit http://gerrit.cloudera.org:8080/15288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib7109a0be5d1022b4f8d6e72441cf5dc1dc42605 Gerrit-Change-Number: 15288 Gerrit-PatchSet: 15 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Wed, 18 Mar 2020 14:23:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9492: Fix test unescaped string partition failing on S3
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15475 ) Change subject: IMPALA-9492: Fix test_unescaped_string_partition failing on S3 .. Patch Set 2: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/15475/2/tests/metadata/test_recover_partitions.py File tests/metadata/test_recover_partitions.py: http://gerrit.cloudera.org:8080/#/c/15475/2/tests/metadata/test_recover_partitions.py@366 PS2, Line 366: double_quote = '"' : single_quote = "'" : back_slash = '\\' : char_to_ascii_hex = { : double_quote: '%22', : single_quote: '%27', : back_slash: '%5C' : } : parts = [ : single_quote, : double_quote, : back_slash + single_quote, : back_slash + double_quote, : back_slash + back_slash + single_quote, : back_slash + back_slash + double_quote : ] : for i in range(len(parts)): : # When creating partitions, Hive converts special characters to their ascii values : # in Hex. So we should follow the same rule in creating partition directories. : hex_part = ''.join([char_to_ascii_hex[c] for c in parts[i]]) : self.create_fs_partition(tbl_location, 'p=%s' % hex_part, 'file_%d' % i, str(i) optional: we could also call urllib.parse.quote() in create_fs_partition() or even in the filesystem_client -- To view, visit http://gerrit.cloudera.org:8080/15475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I63d149c9bdec52c2e1c0b25c8c3f0448cf7bdadb Gerrit-Change-Number: 15475 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 18 Mar 2020 14:16:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9492: Fix test unescaped string partition failing on S3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15475 ) Change subject: IMPALA-9492: Fix test_unescaped_string_partition failing on S3 .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5517/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I63d149c9bdec52c2e1c0b25c8c3f0448cf7bdadb Gerrit-Change-Number: 15475 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 18 Mar 2020 14:00:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9492: Fix test unescaped string partition failing on S3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15475 ) Change subject: IMPALA-9492: Fix test_unescaped_string_partition failing on S3 .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5516/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I63d149c9bdec52c2e1c0b25c8c3f0448cf7bdadb Gerrit-Change-Number: 15475 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 18 Mar 2020 13:45:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9492: Fix test unescaped string partition failing on S3
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15475 ) Change subject: IMPALA-9492: Fix test_unescaped_string_partition failing on S3 .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/15475/1/tests/metadata/test_recover_partitions.py File tests/metadata/test_recover_partitions.py: http://gerrit.cloudera.org:8080/#/c/15475/1/tests/metadata/test_recover_partitions.py@370 PS1, Line 370: : > flake8: E203 whitespace before ':' Done http://gerrit.cloudera.org:8080/#/c/15475/1/tests/metadata/test_recover_partitions.py@371 PS1, Line 371: : > flake8: E203 whitespace before ':' Done http://gerrit.cloudera.org:8080/#/c/15475/1/tests/metadata/test_recover_partitions.py@372 PS1, Line 372: : > flake8: E203 whitespace before ':' Done -- To view, visit http://gerrit.cloudera.org:8080/15475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I63d149c9bdec52c2e1c0b25c8c3f0448cf7bdadb Gerrit-Change-Number: 15475 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 18 Mar 2020 13:14:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9492: Fix test unescaped string partition failing on S3
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15475 to look at the new patch set (#2). Change subject: IMPALA-9492: Fix test_unescaped_string_partition failing on S3 .. IMPALA-9492: Fix test_unescaped_string_partition failing on S3 test_unescaped_string_partition in metadata/test_recover_partitions.py use hdfs clients to create four partition directories with special characters, i.e. single quote, double quotes and back slash. It aims to test on whether ALTER TABLE RECOVER PARTITIONS can recognize those directories correctly. However, when running against s3, only two directories are created as expected, which causes the failure. The reason is that when running against s3, we use hadoop cli for operations. A shell command will be launched for each operation. Passing arguments through shell results in duplicate unescaping. So the 4 dirs, [p=', p=", p=\', p=\"] finally became [p=', p=", p=', p="], resulting in two distinct directories. When the test running against hdfs, we use webhdfs_client so don't have this issue. Actually, we shouldn't use special characters in partition paths. Hive converts them to their ascii hex values when creating partition directories. E.g. partition paths of [p=', p=", p=\', p=\"] are [p=%27, p=%22, p=%5C%27, p=%5C%22]. We should follow this rule when creating directories in test. Also we won't have the above shell issue on s3 anymore. Tests: - Added two more special partitions in test_unescaped_string_partition. - Ran test_unescaped_string_partition in S3. Change-Id: I63d149c9bdec52c2e1c0b25c8c3f0448cf7bdadb --- M tests/metadata/test_recover_partitions.py 1 file changed, 23 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/15475/2 -- To view, visit http://gerrit.cloudera.org:8080/15475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I63d149c9bdec52c2e1c0b25c8c3f0448cf7bdadb Gerrit-Change-Number: 15475 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15199 ) Change subject: IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5487/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ice4b8df65f0a53fe14b8fbe35d82c9887ab9a041 Gerrit-Change-Number: 15199 Gerrit-PatchSet: 8 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 18 Mar 2020 13:04:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output
Tamas Mate has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/15199 ) Change subject: IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output .. IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output This change modifies the output of the SHOW TABLE STATS and SHOW PARTITIONS for Kudu tables. - PARTITIONS: the #Row column has been removed - TABLE STATS: instead of showing partition informations it returns a resultset similar to HDFS table stats, #Rows, #Partitions, Size, Format and Location Example outputs can be seen in the doc changes. Testing: * kudu_stats.test is modified to verify the new result set * kudu_partition_ddl.test is modified to verify the new partitions style * Updated unit test with the new error message Change-Id: Ice4b8df65f0a53fe14b8fbe35d82c9887ab9a041 --- M docs/topics/impala_compute_stats.xml M docs/topics/impala_show.xml M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test 8 files changed, 190 insertions(+), 149 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/15199/8 -- To view, visit http://gerrit.cloudera.org:8080/15199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ice4b8df65f0a53fe14b8fbe35d82c9887ab9a041 Gerrit-Change-Number: 15199 Gerrit-PatchSet: 8 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-9492: Fix test unescaped string partition failing on S3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15475 ) Change subject: IMPALA-9492: Fix test_unescaped_string_partition failing on S3 .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/15475/1/tests/metadata/test_recover_partitions.py File tests/metadata/test_recover_partitions.py: http://gerrit.cloudera.org:8080/#/c/15475/1/tests/metadata/test_recover_partitions.py@370 PS1, Line 370: flake8: E203 whitespace before ':' http://gerrit.cloudera.org:8080/#/c/15475/1/tests/metadata/test_recover_partitions.py@371 PS1, Line 371: flake8: E203 whitespace before ':' http://gerrit.cloudera.org:8080/#/c/15475/1/tests/metadata/test_recover_partitions.py@372 PS1, Line 372: flake8: E203 whitespace before ':' -- To view, visit http://gerrit.cloudera.org:8080/15475 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I63d149c9bdec52c2e1c0b25c8c3f0448cf7bdadb Gerrit-Change-Number: 15475 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Mar 2020 13:01:25 + Gerrit-HasComments: Yes
[native-toolchain-CR] Add support for centos8
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15465 ) Change subject: Add support for centos8 .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/15465 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363 Gerrit-Change-Number: 15465 Gerrit-PatchSet: 2 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Hector Acosta Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 06:18:04 + Gerrit-HasComments: No
[native-toolchain-CR] Fix bison compilation with glibc 2.28
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15467 ) Change subject: Fix bison compilation with glibc 2.28 .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15467 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie07da9fcebde4ae5003885f442d8856537f96f3a Gerrit-Change-Number: 15467 Gerrit-PatchSet: 2 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 06:16:52 + Gerrit-HasComments: No
[native-toolchain-CR] Compile thrift using the python in our toolchain.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15468 ) Change subject: Compile thrift using the python in our toolchain. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/15468/2/source/python/build.sh File source/python/build.sh: http://gerrit.cloudera.org:8080/#/c/15468/2/source/python/build.sh@42 PS2, Line 42: LDFLAGS= wrap ./configure --prefix=$LOCAL_INSTALL --enable-unicode=ucs4 Can you leave a brief comment explaining why this is needed? -- To view, visit http://gerrit.cloudera.org:8080/15468 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60 Gerrit-Change-Number: 15468 Gerrit-PatchSet: 2 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 06:15:53 + Gerrit-HasComments: Yes
[native-toolchain-CR] Remove centos6, ubuntu12, ubuntu14 platforms
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15464 ) Change subject: Remove centos6, ubuntu12, ubuntu14 platforms .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15464 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icef9293fc528bce3d60956cf3b879cf71e933403 Gerrit-Change-Number: 15464 Gerrit-PatchSet: 2 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 06:13:41 + Gerrit-HasComments: No
[native-toolchain-CR] Remove CYRUS SASL
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15466 ) Change subject: Remove CYRUS_SASL .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f043c0c32dd26f3b4b7d7b16749ce310860d9c2 Gerrit-Change-Number: 15466 Gerrit-PatchSet: 2 Gerrit-Owner: Hector Acosta Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 06:13:35 + Gerrit-HasComments: No