[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11359 ) Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Gerrit-Change-Number: 11359 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 05:30:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7499: build against CDH Kudu
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11363 ) Change subject: IMPALA-7499: build against CDH Kudu .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If6e1048438b6d09a1b38c58371d6212bb6dcc06c Gerrit-Change-Number: 11363 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Thu, 06 Sep 2018 04:09:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-376: add built-in functions for parsing JSON
Greg Rahn has posted comments on this change. ( http://gerrit.cloudera.org:8080/10950 ) Change subject: IMPALA-376: add built-in functions for parsing JSON .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py@514 PS13, Line 514: get_json_object > Ah, haven't considered the ANSI standards before. My initial motivation is I'm not opposed to having both functions, Hive & ANSI, but I wanted to raise awareness to see if folks felt it was necessary and having knowledge that the ANSI JSON functions are likely to also be implemented and that hopefully it could be done so in a way that makes the most sense in terms of sharing code, etc. -- To view, visit http://gerrit.cloudera.org:8080/10950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f Gerrit-Change-Number: 10950 Gerrit-PatchSet: 13 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 06 Sep 2018 04:03:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv
Alex Rodoni has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11386 ) Change subject: [DOCS] Additional updates to the fine-grained priv .. [DOCS] Additional updates to the fine-grained priv Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c Reviewed-on: http://gerrit.cloudera.org:8080/11386 Tested-by: Impala Public Jenkins Reviewed-by: Adam Holley Reviewed-by: Alex Rodoni --- M docs/shared/impala_common.xml M docs/topics/impala_authorization.xml M docs/topics/impala_grant.xml M docs/topics/impala_revoke.xml 4 files changed, 567 insertions(+), 345 deletions(-) Approvals: Impala Public Jenkins: Verified Adam Holley: Looks good to me, but someone else must approve Alex Rodoni: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/11386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c Gerrit-Change-Number: 11386 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/11386 ) Change subject: [DOCS] Additional updates to the fine-grained priv .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c Gerrit-Change-Number: 11386 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 06 Sep 2018 03:28:19 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv
Adam Holley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11386 ) Change subject: [DOCS] Additional updates to the fine-grained priv .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/11386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c Gerrit-Change-Number: 11386 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 06 Sep 2018 03:26:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7541. Avoid initializing Metrics for IncompleteTables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11393 ) Change subject: IMPALA-7541. Avoid initializing Metrics for IncompleteTables .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/602/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11393 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id0bcaa9c45a8cf75266d4b7c16cc14f0fd669b92 Gerrit-Change-Number: 11393 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 06 Sep 2018 03:21:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11358 ) Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/601/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 9 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 03:12:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11358 ) Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/600/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 8 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 03:07:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7540. Intern most repetitive strings and network addresses in catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11158 ) Change subject: IMPALA-7540. Intern most repetitive strings and network addresses in catalog .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/599/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26 Gerrit-Change-Number: 11158 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 06 Sep 2018 02:40:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11358 ) Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. IMPALA-7510. Support principals/privileges with LocalCatalog This enables support for Sentry authorization when LocalCatalog is enabled. The design is detailed in a change to the comment on CatalogdMetaProvider, but to recap it briefly here: At a high level, this patch takes the approach of duplicating the "v1" catalog flow for PRINCIPAL and PRIVILEGE catalog objects. Namely, the catalog daemon publishes complete objects into the statestore topic, and the impalad fully replicates them locally. I took this approach rather than trying to do fine-grained caching and invalidation for the following reasons: - The PRINCIPAL and PRIVILEGE metadata is typically many orders of magnitude smaller than table metadata. So, the benefit of fine-grained caching and eviction is not as great. - The PRINCIPAL and PRIVILEGE catalog objects are fairly tightly intertwined with relationships between them and backwards mappings maintained from groups back to principals. This logic is implemented by the AuthorizationPolicy class. Implementing similar mapping in a fine-grained caching approach would be a reasonable amount of work. - This bit of code is under some current flux as others are working on implementing more fine grained permissioning. Thus, trying to duplicate the logic in a "fetch-on-demand" implementation might turn out to be chasing somewhat of a moving target. In order to take this approach, the patch is organized as follows: - refactored some of the role/principal removal logic from ImpaladCatalog into AuthorizationPolicy. This makes it easier to perform the similar "subscribe" with less duplicate cdoe. - changed catalogd to publish PRINCIPAL and PRIVILEGE objects to v2 catalogs in addition to v1. - passed through LocalCatalog.getAuthPolicy to CatalogdMetaProvider, and added an AuthorizationPolicy member there. This member is maintained when we see PRINCIPAL and PRIVILEGE objects come via the catalog updates. - had to implement LocalCatalog.isReady() to ensure that we don't allow user access until the first topic update has been consumed. - additionally had to copy some other code from ImpaladCatalog to protect against various races -- we need a CatalogDeltaLog as well as careful sequencing of the order in which the objects apply. With this patch and the following one to enable UDF support, I was able to run the tests in tests/authorization successfully with LocalCatalog enabled. Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Reviewed-on: http://gerrit.cloudera.org:8080/11358 Reviewed-by: Todd Lipcon Tested-by: Todd Lipcon --- M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M tests/common/custom_cluster_test_suite.py 9 files changed, 272 insertions(+), 67 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 10 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7541. Avoid initializing Metrics for IncompleteTables
Hello Bharath Vissapragada, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/11393 to review the following change. Change subject: IMPALA-7541. Avoid initializing Metrics for IncompleteTables .. IMPALA-7541. Avoid initializing Metrics for IncompleteTables Each 'Metrics' object is relatively large, and incomplete tables don't support viewing metrics anyway. This makes the initialization of Metrics lazy for IncompleteTable. I compared a jmap -histo:live from the catalogd before and after this change, after having loaded one table. Object counts are reduced as follows: java.util.concurrent.atomic.LongAdder: 16760 -> 20 (-16740) com.codahale.metrics.LongAdderProxy$JdkProvider$1: 16760 -> 20 (-16740) java.util.concurrent.atomic.AtomicLong: 14686 -> 4641 (-10045) com.codahale.metrics.EWMA: 10056 -> 12 (-10044) java.util.concurrent.ConcurrentSkipListMap$Node: 3353 -> 5 (-3348) com.codahale.metrics.Meter: 3352 -> 4 (-3348) java.util.concurrent.locks.ReentrantReadWriteLock: 3353 -> 5 (-3348) com.codahale.metrics.ExponentiallyDecayingReservoir: 3352 -> 4 (-3348) java.util.concurrent.locks.ReentrantReadWriteLock$Sync$ThreadLocalHoldCounter: 3353 -> 5 (-3348) com.codahale.metrics.Histogram: 3352 -> 4 (-3348) java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync: 3352 -> 4 (-3348) java.util.concurrent.ConcurrentSkipListMap: 3352 -> 4 (-3348) java.util.concurrent.ConcurrentSkipListMap$HeadIndex: 3352 -> 4 (-3348) com.codahale.metrics.Timer: 3352 -> 4 (-3348) java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock: 3353 -> 5 (-3348) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock: 3353 -> 5 (-3348) java.util.concurrent.ConcurrentHashMap$Node: 38419 -> 35140 (-3279) java.util.concurrent.locks.ReentrantLock: 2277 -> 1161 (-1116) java.util.concurrent.locks.ReentrantLock$NonfairSync: 2309 -> 1193 (-1116) com.codahale.metrics.MetricRegistry: 1117 -> 1 (-1116) [Ljava.util.concurrent.ConcurrentHashMap$Node;: 1264 -> 148 (-1116) java.util.concurrent.ConcurrentHashMap: 1281 -> 165 (-1116) java.util.concurrent.CopyOnWriteArrayList: 1126 -> 10 (-1116) org.apache.impala.common.Metrics: 1117 -> 1 (-1116) [Ljava.lang.Object;: 4441 -> 3346 (-1095) (94 objects per table) With heap usage (bytes) reduced as follows: java.util.concurrent.atomic.LongAdder: 536320 -> 640 (-535680) com.codahale.metrics.EWMA: 482688 -> 576 (-482112) com.codahale.metrics.LongAdderProxy$JdkProvider$1: 402240 -> 480 (-401760) java.util.concurrent.atomic.AtomicLong: 352464 -> 111384 (-241080) com.codahale.metrics.ExponentiallyDecayingReservoir: 187712 -> 224 (-187488) com.codahale.metrics.Meter: 160896 -> 192 (-160704) java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync: 160896 -> 192 (-160704) java.util.concurrent.ConcurrentSkipListMap: 160896 -> 192 (-160704) java.util.concurrent.ConcurrentSkipListMap$HeadIndex: 107264 -> 128 (-107136) java.util.concurrent.ConcurrentHashMap$Node: 1229408 -> 1124480 (-104928) [Ljava.util.concurrent.ConcurrentHashMap$Node;: 532192 -> 442912 (-89280) java.util.concurrent.ConcurrentSkipListMap$Node: 80472 -> 120 (-80352) java.util.concurrent.locks.ReentrantReadWriteLock: 80472 -> 120 (-80352) com.codahale.metrics.Histogram: 80448 -> 96 (-80352) com.codahale.metrics.Timer: 80448 -> 96 (-80352) java.util.concurrent.ConcurrentHashMap: 81984 -> 10560 (-71424) java.util.concurrent.locks.ReentrantReadWriteLock$Sync$ThreadLocalHoldCounter: 53648 -> 80 (-53568) java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock: 53648 -> 80 (-53568) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock: 53648 -> 80 (-53568) java.util.concurrent.locks.ReentrantLock$NonfairSync: 73888 -> 38176 (-35712) com.codahale.metrics.MetricRegistry: 26808 -> 24 (-26784) java.util.concurrent.CopyOnWriteArrayList: 27024 -> 240 (-26784) java.util.concurrent.locks.ReentrantLock: 36432 -> 18576 (-17856) org.apache.impala.common.Metrics: 17872 -> 16 (-17856) [Ljava.lang.Object;: 260912 -> 243840 (-17072) This works out to a total of 3249KB saved (2981 bytes per table). In a large production environment with 100k tables this would save 284MB of heap and 9.4M objects. Change-Id: Id0bcaa9c45a8cf75266d4b7c16cc14f0fd669b92 --- M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java 4 files changed, 13 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/11393/1 -- To view, visit http://gerrit.cloudera.org:8080/11393 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id:
[Impala-ASF-CR] IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11357 ) Change subject: IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates .. IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates This is a follow-up to commit a90b42911f1753b14e4f9f581bad837e33dcf9bf which had an unintended side-effect even when LocalCatalog is not enabled: previously we would only re-instantiate AuthorizationChecker on "full" catalog updates; after that patch, we would re-instantiate it on every catalog update. While it appears that this is relatively cheap and didn't cause any obvious behavioral changes, it wasn't intended, and one of the goals of the --use_local_catalog flag being optional is that the existing code paths shouldn't be affected. So, this puts it back the way it was. Thanks to Adam Holley for noticing this and alerting me to my mistake. Change-Id: I47f5945fd64c0adf3fe7fd82076f64672358f962 Reviewed-on: http://gerrit.cloudera.org:8080/11357 Tested-by: Impala Public Jenkins Reviewed-by: Todd Lipcon --- M fe/src/main/java/org/apache/impala/service/Frontend.java 1 file changed, 6 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Verified Todd Lipcon: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/11357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I47f5945fd64c0adf3fe7fd82076f64672358f962 Gerrit-Change-Number: 11357 Gerrit-PatchSet: 8 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11358 ) Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. Patch Set 9: Skipping a jenkins re-verify since it was only whitespace/comment change edited. Forwarding +2. -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 9 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 02:39:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11358 ) Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. Patch Set 9: Verified+1 Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/11358/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11358/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@270 PS7, Line 270: @Override > nit: extra line here and below Done http://gerrit.cloudera.org:8080/#/c/11358/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@783 PS7, Line 783: ype.PR > nit: order Done -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 9 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 02:38:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11358 to look at the new patch set (#9). Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. IMPALA-7510. Support principals/privileges with LocalCatalog This enables support for Sentry authorization when LocalCatalog is enabled. The design is detailed in a change to the comment on CatalogdMetaProvider, but to recap it briefly here: At a high level, this patch takes the approach of duplicating the "v1" catalog flow for PRINCIPAL and PRIVILEGE catalog objects. Namely, the catalog daemon publishes complete objects into the statestore topic, and the impalad fully replicates them locally. I took this approach rather than trying to do fine-grained caching and invalidation for the following reasons: - The PRINCIPAL and PRIVILEGE metadata is typically many orders of magnitude smaller than table metadata. So, the benefit of fine-grained caching and eviction is not as great. - The PRINCIPAL and PRIVILEGE catalog objects are fairly tightly intertwined with relationships between them and backwards mappings maintained from groups back to principals. This logic is implemented by the AuthorizationPolicy class. Implementing similar mapping in a fine-grained caching approach would be a reasonable amount of work. - This bit of code is under some current flux as others are working on implementing more fine grained permissioning. Thus, trying to duplicate the logic in a "fetch-on-demand" implementation might turn out to be chasing somewhat of a moving target. In order to take this approach, the patch is organized as follows: - refactored some of the role/principal removal logic from ImpaladCatalog into AuthorizationPolicy. This makes it easier to perform the similar "subscribe" with less duplicate cdoe. - changed catalogd to publish PRINCIPAL and PRIVILEGE objects to v2 catalogs in addition to v1. - passed through LocalCatalog.getAuthPolicy to CatalogdMetaProvider, and added an AuthorizationPolicy member there. This member is maintained when we see PRINCIPAL and PRIVILEGE objects come via the catalog updates. - had to implement LocalCatalog.isReady() to ensure that we don't allow user access until the first topic update has been consumed. - additionally had to copy some other code from ImpaladCatalog to protect against various races -- we need a CatalogDeltaLog as well as careful sequencing of the order in which the objects apply. With this patch and the following one to enable UDF support, I was able to run the tests in tests/authorization successfully with LocalCatalog enabled. Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 --- M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M tests/common/custom_cluster_test_suite.py 9 files changed, 272 insertions(+), 67 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/11358/9 -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 9 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11358 to look at the new patch set (#8). Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. IMPALA-7510. Support principals/privileges with LocalCatalog This enables support for Sentry authorization when LocalCatalog is enabled. The design is detailed in a change to the comment on CatalogdMetaProvider, but to recap it briefly here: At a high level, this patch takes the approach of duplicating the "v1" catalog flow for PRINCIPAL and PRIVILEGE catalog objects. Namely, the catalog daemon publishes complete objects into the statestore topic, and the impalad fully replicates them locally. I took this approach rather than trying to do fine-grained caching and invalidation for the following reasons: - The PRINCIPAL and PRIVILEGE metadata is typically many orders of magnitude smaller than table metadata. So, the benefit of fine-grained caching and eviction is not as great. - The PRINCIPAL and PRIVILEGE catalog objects are fairly tightly intertwined with relationships between them and backwards mappings maintained from groups back to principals. This logic is implemented by the AuthorizationPolicy class. Implementing similar mapping in a fine-grained caching approach would be a reasonable amount of work. - This bit of code is under some current flux as others are working on implementing more fine grained permissioning. Thus, trying to duplicate the logic in a "fetch-on-demand" implementation might turn out to be chasing somewhat of a moving target. In order to take this approach, the patch is organized as follows: - refactored some of the role/principal removal logic from ImpaladCatalog into AuthorizationPolicy. This makes it easier to perform the similar "subscribe" with less duplicate cdoe. - changed catalogd to publish PRINCIPAL and PRIVILEGE objects to v2 catalogs in addition to v1. - passed through LocalCatalog.getAuthPolicy to CatalogdMetaProvider, and added an AuthorizationPolicy member there. This member is maintained when we see PRINCIPAL and PRIVILEGE objects come via the catalog updates. - had to implement LocalCatalog.isReady() to ensure that we don't allow user access until the first topic update has been consumed. - additionally had to copy some other code from ImpaladCatalog to protect against various races -- we need a CatalogDeltaLog as well as careful sequencing of the order in which the objects apply. With this patch and the following one to enable UDF support, I was able to run the tests in tests/authorization successfully with LocalCatalog enabled. Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 --- M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M tests/common/custom_cluster_test_suite.py 9 files changed, 274 insertions(+), 67 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/11358/8 -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 8 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11357 ) Change subject: IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I47f5945fd64c0adf3fe7fd82076f64672358f962 Gerrit-Change-Number: 11357 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 02:36:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11224 ) Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd .. Patch Set 6: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/598/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/11224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 Gerrit-Change-Number: 11224 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 02:31:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11358 ) Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 02:26:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11224 ) Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd .. Patch Set 5: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/597/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/11224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 Gerrit-Change-Number: 11224 Gerrit-PatchSet: 5 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 02:22:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11357 ) Change subject: IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I47f5945fd64c0adf3fe7fd82076f64672358f962 Gerrit-Change-Number: 11357 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 02:20:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11359 ) Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3121/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Gerrit-Change-Number: 11359 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 02:20:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11224 ) Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd .. Patch Set 6: (18 comments) http://gerrit.cloudera.org:8080/#/c/11224/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11224/4//COMMIT_MSG@19 PS4, Line 19: is nit: are http://gerrit.cloudera.org:8080/#/c/11224/4//COMMIT_MSG@20 PS4, Line 20: randomly stuffed into catalogd what does this mean? http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@210 PS4, Line 210: query only queries? http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@211 PS4, Line 211: evict just clarifying that this knob and the memory ones are independent. in other words, if you specify a timeout, tables will still be evicted regardless of how close you get to the memory threshold set below. http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@212 PS4, Line 212: unused nit: remove http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@225 PS4, Line 225: I understand this knob to establish a target lower bound for an eviction event. Why is this in terms of table fraction as opposed to a target based on memory fraction (mb or fraction)? http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@226 PS4, Line 226: ++ remove http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/exec/catalog-op-executor.h File be/src/exec/catalog-op-executor.h: http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/exec/catalog-op-executor.h@72 PS4, Line 72: table names nit: simplify to just tables http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2000 PS4, Line 2000: /** Because of the thrift spec, I was expecting to see usage count being used somewhere. Add a todo to make use of it. http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@46 PS4, Line 46: ize. stale? http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@62 PS4, Line 62: eviction http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@89 PS4, Line 89:*/ add an "_". http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@116 PS4, Line 116: rn new CatalogdTableInvalidator(catalog, invalidateTablesTimeoutSec, : i are these two validated anywhere (e.g., [0,1])? http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@140 PS4, Line 140: ldPool = is there somewhat stable site to link to where these names are given, perhaps for different jdk's? http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@146 PS4, Line 146: lastObservedGcCount_ = gcbean.getCollectionCount(); why not do this check before setting various class members above? http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@175 PS4, Line 175: getLastGcInfo() just starting here with the doc for this method, I see that null can be returned. I have not checked the rest of the chain following this invocation, but do we want to guard this? http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java File fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@94 PS4, Line 94: ile (true) { put this outside of the while loop. http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@656 PS4, Line 656: load what rule are you using to pair the load with refreshing the last-used-time? for example, why isn't the load in the true branch on L653 also refreshed? generalizing further, should the refresh be part of a Table
[Impala-ASF-CR] IMPALA-7540. Intern most repetitive strings and network addresses in catalog
Hello Bharath Vissapragada, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11158 to look at the new patch set (#2). Change subject: IMPALA-7540. Intern most repetitive strings and network addresses in catalog .. IMPALA-7540. Intern most repetitive strings and network addresses in catalog This adds interning to a bunch of repeated strings in catalog objects, including: - table name - DB name - owner - column names - input/output formats - parameter keys - common parameter values ("true", "false", etc) - HBase column family names Additionally, it interns TNetworkAddresses, so that each datanode host is only stored once rather than having its own copy in each table. I verified this patch using jxray on the development catalogd and impalad. The following lines are removed entirely from the "duplicate strings" report: Overhead # char[]s # objects Value 164K (0.3%) 2,635 2,635 "127.0.0.1" 97K (0.2%) 1,038 1,038 "__HIVE_DEFAULT_PARTITION__" 95K (0.2%) 1,111 1,111 "transient_lastDdlTime" 92K (0.1%) 1,975 1,975 "d" 70K (0.1%) 997 997"EXTERNAL_TABLE" 56K (< 0.1%)1,201 1,201 "todd" 54K (< 0.1%)998 998"EXTERNAL" 46K (< 0.1%)998 998"TRUE" 44K (< 0.1%)567 567"numFilesErasureCoded" 38K (< 0.1%)612 612"totalSize" 30K (< 0.1%)567 567"numFiles" The following are reduced substantially: Before: 72K (0.1%) 1,543 1,543 "1" After: 47K (< 0.1%)1,009 1,009 "1" A few large strings remain in the report that may be worth addressing, depending on whether we think production catalogs exhibit the same repetitions: 1) Avro schemas, eg: 204K (0.3%) 3 3 "{"fields": [{"type": ["boolean", "null"], "name": "bool_col1"}, {"type": ["int", "null"], "name": "tinyint_col1"}, {"type": ...[length 52429]" (in the development catalog there are multiple tables with the same Avro schema) 2) Partition location suffixes, eg: 144K (0.2%) 1,234 1,234 "many_blocks_num_blocks_per_partition_1" 17K (< 0.1%)230 230"year=2009/month=2" 17K (< 0.1%)230 230"year=2009/month=3" 17K (< 0.1%)230 230"year=2009/month=1" (in the development catalog lots of tables have the same partitioning layout) 3) Unsure (jxray isn't reporting the reference chain, but seems likely to be partition values): 49K (< 0.1%)1,058 1,058 "2010" 28K (< 0.1%)612 612"2009" 27K (< 0.1%)585 585"0" 22K (< 0.1%)71 899"" Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26 --- A fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java M fe/src/main/java/org/apache/impala/catalog/HBaseColumn.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java 5 files changed, 240 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/11158/2 -- To view, visit http://gerrit.cloudera.org:8080/11158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26 Gerrit-Change-Number: 11158 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd
Tianyi Wang has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/11224 ) Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd .. IMPALA-7448: Invalidate recently unused tables from catalogd This patch implements an automatic invalidation mechanism in catalogd. There are two invalidation strategies: 1. Periodically the HDFS tables that are not used in a configured period "invalidate_tables_timeout" is invalidated from catalogd. 2. If the old GC generation is almost full, a certain percentage of LRU tables are invalidated. This can be enabled by backend flag "invalidate_tables_on_memory_pressure". The table usage is reported by impalad to catalogd when the tables are used during planning. Tests on time-based invalidation is added. It is manually verified that the GC callback is called if strings are randomly stuffed into catalogd. Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-service-client-wrapper.h M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/common/global-flags.cc M be/src/exec/catalog-op-executor.cc M be/src/exec/catalog-op-executor.h M be/src/service/fe-support.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java A fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/FeSupport.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java A fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java A tests/custom_cluster/test_automatic_invalidation.py 25 files changed, 788 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/11224/6 -- To view, visit http://gerrit.cloudera.org:8080/11224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 Gerrit-Change-Number: 11224 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd
Tianyi Wang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/11224 ) Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd .. IMPALA-7448: Invalidate recently unused tables from catalogd This patch implements an automatic invalidation mechanism in catalogd. There are two invalidation strategies: 1. Periodically the HDFS tables that are not used in a configured period "invalidate_tables_timeout" is invalidated from catalogd. 2. If the old GC generation is almost full, a certain percentage of LRU tables are invalidated. This can be enabled by backend flag "invalidate_tables_on_memory_pressure". The table usage is reported by impalad to catalogd when the tables are used during planning. Tests on time-based invalidation is added. It is manually verified that the GC callback is called if strings are randomly stuffed into catalogd. Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-service-client-wrapper.h M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/common/global-flags.cc M be/src/exec/catalog-op-executor.cc M be/src/exec/catalog-op-executor.h M be/src/service/fe-support.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java A fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/FeSupport.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java A fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java A tests/custom_cluster/test_automatic_invalidation.py 25 files changed, 788 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/11224/5 -- To view, visit http://gerrit.cloudera.org:8080/11224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 Gerrit-Change-Number: 11224 Gerrit-PatchSet: 5 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] Replace some TODOs with specific JIRA references
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11392 ) Change subject: Replace some TODOs with specific JIRA references .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11392/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11392/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1322 PS1, Line 1322:* TODO(IMPALA-7533): currently this inherits from VersionedTableCacheKey. This means that, if line too long (96 > 90) -- To view, visit http://gerrit.cloudera.org:8080/11392 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8d008b62e11b8057cade503721046cd696c3ca77 Gerrit-Change-Number: 11392 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 01:23:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7499: build against CDH Kudu
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11363 ) Change subject: IMPALA-7499: build against CDH Kudu .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3120/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If6e1048438b6d09a1b38c58371d6212bb6dcc06c Gerrit-Change-Number: 11363 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Thu, 06 Sep 2018 00:46:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-376: add built-in functions for parsing JSON
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10950 ) Change subject: IMPALA-376: add built-in functions for parsing JSON .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py@514 PS13, Line 514: get_json_object > I'd prefer to make Hive compatibility be explicity brought up before decidi Ah, haven't considered the ANSI standards before. My initial motivation is to replace the Hive UDF get_json_object written in Java since it's out of control in memory management. There's a slight difference between the Hive get_json_object() and ANSI json_value(). json_value only returns scalar values, while get_json_object can return JSON string for a selected JSON object. Maybe the ANSI json_query() function is more similar. In Hulu, most of our Impala users are already using the get_json_object function. Some upper systems may also use this function. Since it's already a Hive convention, I think it worths to keep this name. My principle is "start from the user". Rename the function may cause trouble for them. However, the ANSI standards have more flexible syntax. What about creating a new JIRA for implementing the "returning " and "value on error" clauses? -- To view, visit http://gerrit.cloudera.org:8080/10950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f Gerrit-Change-Number: 10950 Gerrit-PatchSet: 13 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 06 Sep 2018 00:14:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5937: [DOCS] PARQUET READ STATISTICS and PARQUET DICTIONARY FILTERING
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/11355 ) Change subject: IMPALA-5937: [DOCS] PARQUET_READ_STATISTICS and PARQUET_DICTIONARY_FILTERING .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_dictionary_filtering.xml File docs/topics/impala_parquet_dictionary_filtering.xml: http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_dictionary_filtering.xml@87 PS4, Line 87: The NumDictFilteredRowGroups field in the query profile output shows the We should mention somewhere that row groups can be filtered out by parquet stats, and in such cases won't even be considered for dictionary filtering. http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_read_statistics.xml File docs/topics/impala_parquet_read_statistics.xml: http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_read_statistics.xml@49 PS4, Line 49: Parquet stores min/max stats which can be used to skip reading blocks if they don't The stats are (currently) used to skip row groups. "qualify a predicate" sounds unfamiliar to me. This blog post has a more detailed description: https://blog.cloudera.com/blog/2017/12/faster-performance-for-selective-queries/ http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_read_statistics.xml@61 PS4, Line 61: Of the numerical types: bool, integer, floating point Impala also supports other data types, but there are restrictions around the library and version that wrote them. I'll share some more docs with you momentarily. -- To view, visit http://gerrit.cloudera.org:8080/11355 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88fa8c4a64560711251076c50e1695f7f032f9c0 Gerrit-Change-Number: 11355 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Thu, 06 Sep 2018 00:11:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11359 ) Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/594/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Gerrit-Change-Number: 11359 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 00:04:55 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/595/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 00:00:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11358 ) Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. Patch Set 7: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/11358/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11358/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@270 PS7, Line 270: nit: extra line here and below http://gerrit.cloudera.org:8080/#/c/11358/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@783 PS7, Line 783: oreder nit: order http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java: http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@214 PS6, Line 214: // here. > It might be sufficient (or at least better) to do a sleeping loop checking makes sense to defer it -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 23:55:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Replace some TODOs with specific JIRA references
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11392 ) Change subject: Replace some TODOs with specific JIRA references .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/596/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11392 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8d008b62e11b8057cade503721046cd696c3ca77 Gerrit-Change-Number: 11392 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 23:50:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11359 ) Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Gerrit-Change-Number: 11359 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 23:44:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-376: add built-in functions for parsing JSON
Greg Rahn has posted comments on this change. ( http://gerrit.cloudera.org:8080/10950 ) Change subject: IMPALA-376: add built-in functions for parsing JSON .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py@514 PS13, Line 514: get_json_object > I just want to note that ANSI SQL also has a function called json_object() I'd prefer to make Hive compatibility be explicity brought up before deciding on implementing it hence me favoring ANSI standards first. -- To view, visit http://gerrit.cloudera.org:8080/10950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f Gerrit-Change-Number: 10950 Gerrit-PatchSet: 13 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 05 Sep 2018 23:41:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Replace some TODOs with specific JIRA references
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11392 ) Change subject: Replace some TODOs with specific JIRA references .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/11392/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11392/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@343 PS1, Line 343: // TODO(IMPALA-7534): there a race here if an invalidation comes in while we are nit: there is -- To view, visit http://gerrit.cloudera.org:8080/11392 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8d008b62e11b8057cade503721046cd696c3ca77 Gerrit-Change-Number: 11392 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 23:39:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11358 ) Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/593/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 23:28:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider
Hello Bharath Vissapragada, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11359 to look at the new patch set (#7). Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider .. IMPALA-7203. Support UDFs in CatalogdMetaProvider This adds support to fetch the list of function names within a database, and then to fetch the list of overloads for a given function name. These items are cached on the coordinator and invalidated when minimal function objects are seen on the minimal catalog topic stream. Aside from the straight-forward plumbing of this new RPC, it's worth noting that this patch changes the MetaProvider interface to provide Impala Functions directly instead of HMS Function objects. This means that we will now fully support all the types of functions supported in legacy catalogs. As such, test_udfs now passes. Making this change simplified the code in LocalDb but also means that DirectMetaProvider no longer supports fetching functions. When we move to trying to eliminate catalogd altogether at some point, we can revive this code. For now, I just throw an exception for the function-related code in DirectMetaProvider, as it's unused. The one other notable thing here is that TFunction now has optional fields where it used to have required ones. This was necessary in order to send invalidations as TCatalogObjects. Given that TFunction is an internal-only construct, this shouldn't raise compatibility issues. Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 --- M common/thrift/CatalogService.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/Function.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java 9 files changed, 232 insertions(+), 170 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/11359/7 -- To view, visit http://gerrit.cloudera.org:8080/11359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Gerrit-Change-Number: 11359 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Hello Bharath Vissapragada, Andrew Sherman, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11388 to look at the new patch set (#3). Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile This patch adds a Java wrapper for a RuntimeProfile object. The wrapper supports some basic operations like non-hierarchical counters and informational strings. During planning, a profile is created, and passed back to the backend as part of the ExecRequest. The backend then updates the query profile based on the info emitted from the frontend. This patch also adds the first use case for this profile information: the CatalogdMetaProvider emits counters for cache hits, misses, and fetch times, broken down by metadata category. The emitted profile is a bit of a superset of the existing 'timeline' functionality. However, it seems that some tools may parse the timeline in its current location in the profile, so moving it might be incompatible. I elected to leave that alone for now and just emit counters in the new profile. WIP for a few reasons: - needs tests - maybe the amount of information is too verbose to include in every profile? Should we only include such info on misses, since we assume hits don't contribute to the timing much? Should we not bother breaking down by category? - worth trying to include the byte sizes of metadata fetched? Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/util/runtime-profile.h M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/main/java/org/apache/impala/service/FrontendProfile.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java 9 files changed, 294 insertions(+), 50 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/11388/3 -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] Replace some TODOs with specific JIRA references
Todd Lipcon has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11392 Change subject: Replace some TODOs with specific JIRA references .. Replace some TODOs with specific JIRA references Change-Id: I8d008b62e11b8057cade503721046cd696c3ca77 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java 8 files changed, 30 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/11392/1 -- To view, visit http://gerrit.cloudera.org:8080/11392 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I8d008b62e11b8057cade503721046cd696c3ca77 Gerrit-Change-Number: 11392 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon
[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11387 ) Change subject: IMPALA-6568 add missing Query Compilation section to profiles. .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/592/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 05 Sep 2018 23:07:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11358 ) Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3119/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 22:53:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11357 ) Change subject: IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3118/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I47f5945fd64c0adf3fe7fd82076f64672358f962 Gerrit-Change-Number: 11357 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 22:52:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11358 to look at the new patch set (#7). Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. IMPALA-7510. Support principals/privileges with LocalCatalog This enables support for Sentry authorization when LocalCatalog is enabled. The design is detailed in a change to the comment on CatalogdMetaProvider, but to recap it briefly here: At a high level, this patch takes the approach of duplicating the "v1" catalog flow for PRINCIPAL and PRIVILEGE catalog objects. Namely, the catalog daemon publishes complete objects into the statestore topic, and the impalad fully replicates them locally. I took this approach rather than trying to do fine-grained caching and invalidation for the following reasons: - The PRINCIPAL and PRIVILEGE metadata is typically many orders of magnitude smaller than table metadata. So, the benefit of fine-grained caching and eviction is not as great. - The PRINCIPAL and PRIVILEGE catalog objects are fairly tightly intertwined with relationships between them and backwards mappings maintained from groups back to principals. This logic is implemented by the AuthorizationPolicy class. Implementing similar mapping in a fine-grained caching approach would be a reasonable amount of work. - This bit of code is under some current flux as others are working on implementing more fine grained permissioning. Thus, trying to duplicate the logic in a "fetch-on-demand" implementation might turn out to be chasing somewhat of a moving target. In order to take this approach, the patch is organized as follows: - refactored some of the role/principal removal logic from ImpaladCatalog into AuthorizationPolicy. This makes it easier to perform the similar "subscribe" with less duplicate cdoe. - changed catalogd to publish PRINCIPAL and PRIVILEGE objects to v2 catalogs in addition to v1. - passed through LocalCatalog.getAuthPolicy to CatalogdMetaProvider, and added an AuthorizationPolicy member there. This member is maintained when we see PRINCIPAL and PRIVILEGE objects come via the catalog updates. - had to implement LocalCatalog.isReady() to ensure that we don't allow user access until the first topic update has been consumed. - additionally had to copy some other code from ImpaladCatalog to protect against various races -- we need a CatalogDeltaLog as well as careful sequencing of the order in which the objects apply. With this patch and the following one to enable UDF support, I was able to run the tests in tests/authorization successfully with LocalCatalog enabled. Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 --- M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M tests/common/custom_cluster_test_suite.py 9 files changed, 274 insertions(+), 67 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/11358/7 -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11280 ) Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic updates .. Patch Set 12: Verified+1 Code-Review+2 Going to forward the +1 instead of reverifying the whole thing and waiting more hours. I differed the verified revision with the new rebased revision and the only change was to shell/impala_shell.py which is unrelated to the changes in this patch. -- To view, visit http://gerrit.cloudera.org:8080/11280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 Gerrit-Change-Number: 11280 Gerrit-PatchSet: 12 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 22:51:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11280 ) Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic updates .. IMPALA-7469. Invalidate LocalCatalog cache based on topic updates This implements cache invalidation inside CatalogdMetaProvider. The design is as follows: - when the catalogd collects updates into the statestore topic, it now adds an additional entry for each table and database. These additional entries are minimal - they only include the object's name, but no metadata. This new behavior is conditional on a new flag --catalog_topic_mode. The default mode is to keep the old style, but it can be configured to mixed (support both v1 and v2) or v2-only. - the old-style topic entries are prefixed with a '1:' whereas the new minimal entries are prefixed with a '2:'. The impalad will subscribe to one or the other prefix depending on whether it is running with --use_local_catalog. Thus, old impalads will not be confused by the new entries and vice versa. - when the impalad gets these topic updates, it forwards them through to the catalog implementation. The LocalCatalog implementation forwards them to the CatalogdMetaProvider, which uses them to invalidate cached metadata as appropriate. This patch includes some basic unit tests. I also did some manual testing by connecting to different impalads and verifying that a session connected to impalad #1 saw the effects of DDLs made by impalad #2 within a short period of time (the statestore topic update frequency). Existing end-to-end tests cover these code paths pretty thoroughly: - if we didn't automatically invalidate the cache on a coordinator in response to DDL operations, then any test which expects to "read its own writes" (eg access a table after creating one) would fail - if we didn't propagate invalidations via the statestore, then all of the tests that use sync_ddl would fail. I verified the test coverage above using some of the tests in test_ddl.py -- I selectively commented out a few of the invalidation code paths in the new code and verified that tests failed until I re-introduced them. Along the way I also improved test_ddl so that, when this code is broken, it properly fails with a timeout. It also has a bit of expanded coverage for both the SYNC_DDL and non-SYNC cases. I also wrote a new custom-cluster test for LocalCatalog that verifies a few of the specific edge cases like detecting catalogd restart, SYNC_DDL behavior in mixed mode, etc. One notable exception here is the implementation of INVALIDATE METADATA This turned out to be complex to implement, so I left a lengthy TODO describing the issue and filed a JIRA. Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 Reviewed-on: http://gerrit.cloudera.org:8080/11280 Reviewed-by: Todd Lipcon Tested-by: Todd Lipcon --- M be/src/catalog/catalog-server.cc M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java M fe/src/main/java/org/apache/impala/common/Pair.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/FeCatalogManager.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java M tests/common/custom_cluster_test_suite.py A tests/custom_cluster/test_local_catalog.py M tests/metadata/test_ddl.py 20 files changed, 825 insertions(+), 105 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/11280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 Gerrit-Change-Number: 11280 Gerrit-PatchSet: 13 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11280 ) Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic updates .. Patch Set 11: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 Gerrit-Change-Number: 11280 Gerrit-PatchSet: 11 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 22:42:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.
Andrew Sherman has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/11387 ) Change subject: IMPALA-6568 add missing Query Compilation section to profiles. .. IMPALA-6568 add missing Query Compilation section to profiles. The profile command is used to display low-level information about the most recent query. When a client makes request for the Profile, it sends a GetRuntimeProfile request for the last queryId to to the server. The queryId is used to find the ClientRequestState, an object which tracks information about the execution, including Profile data which is stored in several RuntimeProfile objects. The reply to the GetRuntimeProfile message contains the Profile, pretty-printed as a string. When a query is sent to the Front End to be compiled, the TExecRequest that is returned from createExecRequest() in the JVM contains a Timeline, which is a named sequence of events with timing information. This Timeline is added to the Summary Profile in the ClientRequestState. In the following cases the Front End was not setting the Timeline in the TExecRequest: - All DDL operations - Load data statements - Set statements - Explain statements And this meant that the profile would not contain the "Query Compilation" timeline. I refactored the main createExecRequest() method to - try to make the flow clearer, - allow the timeline to be set in the TExecRequest in only one place. - to set "Planning finished" in all timelines TESTING: Add a new test to test_observability.py which checks that the "Query Compilation" and "Planning finished" timelines appear in the profile for various queries designed to exercise the new code paths in createExecRequest. Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 --- M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/query_test/test_observability.py 2 files changed, 147 insertions(+), 58 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/3 -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11387 ) Change subject: IMPALA-6568 add missing Query Compilation section to profiles. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11387/1/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/11387/1/fe/src/main/java/org/apache/impala/service/Frontend.java@1167 PS1, Line 1167: timeline.markEvent("Planning finished"); > hmm, but even with the earlier 'planning finished', statements like LOAD DA OK, thanks, I think I see what you mean now. -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 05 Sep 2018 22:30:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11358 ) Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/11358/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11358/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@738 PS3, Line 738: > I read this as a bug originally until I saw that newCatalogVersion may be u changed to using a boxed Long which starts as null, and added a comment. lmk if it's better http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@235 PS6, Line 235: retried via > nit: pushed from I wonder where my wires got crossed there... http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@750 PS6, Line 750: sequencer > since its specific to auth, pls call this authSequencer or something like t Done http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@862 PS6, Line 862: LOG.error("Error adding privilege: ", e); > For the case that we got here via a direct ddl, wouldn't the caller want to yea I found this surprising as well. The weird thing is that in that case it would have succeeded but not be reflected in the local impalad. I'll add a TODO here that this error handling is suspicious, but not sure how to fix it (and it's pre-existing) http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java: http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@214 PS6, Line 214: // here. > is isReady sufficient here? from catalogmetaprovider, its ready when its ad It might be sufficient (or at least better) to do a sleeping loop checking isReady but would rather defer to a separate patch since it's unrelated (I have a JIRA for waiting properly at startup) -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 22:30:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Elaborating error message when incorrect port number is specified in impala-shell.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11368 ) Change subject: Elaborating error message when incorrect port number is specified in impala-shell. .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11368 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33 Gerrit-Change-Number: 11368 Gerrit-PatchSet: 2 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 05 Sep 2018 22:12:23 + Gerrit-HasComments: No
[Impala-ASF-CR] Elaborating error message when incorrect port number is specified in impala-shell.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11368 ) Change subject: Elaborating error message when incorrect port number is specified in impala-shell. .. Elaborating error message when incorrect port number is specified in impala-shell. I recently helped debug an issue where impala-shell was being given the hiveserver2 port rather than the beeswax port. I've updated the error message a little bit to indicate that this may be the issue. Here is the new message: $impala-shell.sh -i :21050 -q 'select 1' Starting Impala Shell without Kerberos authentication Error: Unable to communicate with impalad service. This service may not be an impalad instance. A common problem is that the port specified does not match the -beeswax_port flag on the underlying impalad. Check host:port and try again. Traceback (most recent call last): File "/home/philip/src/Impala/shell/impala_shell.py", line 1709, in execute_queries_non_interactive_mode(options, query_options) File "/home/philip/src/Impala/shell/impala_shell.py", line 1565, in execute_queries_non_interactive_mode shell = ImpalaShell(options, query_options) File "/home/philip/src/Impala/shell/impala_shell.py", line 232, in __init__ self.do_connect(options.impalad) File "/home/philip/src/Impala/shell/impala_shell.py", line 798, in do_connect self._connect() File "/home/philip/src/Impala/shell/impala_shell.py", line 842, in _connect result = self.imp_client.connect() File "/home/philip/src/Impala/shell/impala_client.py", line 257, in connect result = self.ping_impala_service() File "/home/philip/src/Impala/shell/impala_client.py", line 262, in ping_impala_service return self.imp_service.PingImpalaService() File "/home/philip/src/Impala/shell/gen-py/ImpalaService/ImpalaService.py", line 229, in PingImpalaService return self.recv_PingImpalaService() File "/home/philip/src/Impala/shell/gen-py/ImpalaService/ImpalaService.py", line 245, in recv_PingImpalaService raise x thrift.Thrift.TApplicationException: Invalid method name: 'PingImpalaService' Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33 Reviewed-on: http://gerrit.cloudera.org:8080/11368 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M shell/impala_shell.py 1 file changed, 5 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/11368 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33 Gerrit-Change-Number: 11368 Gerrit-PatchSet: 3 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-376: add built-in functions for parsing JSON
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/10950 ) Change subject: IMPALA-376: add built-in functions for parsing JSON .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py@514 PS13, Line 514: get_json_object > Is it worth doing both? I think portability from Hive is also important (no I just want to note that ANSI SQL also has a function called json_object() with quite different functionality (it creates a JSON object from a list of key-value pairs). So, it can be confusing for users if we want to be portable with both Hive and ANSI SQL. -- To view, visit http://gerrit.cloudera.org:8080/10950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f Gerrit-Change-Number: 10950 Gerrit-PatchSet: 13 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 05 Sep 2018 22:09:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11386 ) Change subject: [DOCS] Additional updates to the fine-grained priv .. Patch Set 2: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/67/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/11386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c Gerrit-Change-Number: 11386 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 05 Sep 2018 21:54:07 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/11386 ) Change subject: [DOCS] Additional updates to the fine-grained priv .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_grant.xml File docs/topics/impala_grant.xml: http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_grant.xml@93 PS1, Line 93: Impala does not currently support revoking only the WITH GRANT OPTION : from a privilege previously granted to a role. To remove the WITH GRANT : OPTION, revoke the privilege and grant it again without the WITH GRANT : OPTION flag. > This is not correct. You cannot revoke a privilege if it has the grant opt Added in impala_revoke.xml http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_revoke.xml File docs/topics/impala_revoke.xml: http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_revoke.xml@57 PS1, Line 57: privilege > The optional "GRANT OPTION FOR" is missing. Done -- To view, visit http://gerrit.cloudera.org:8080/11386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c Gerrit-Change-Number: 11386 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 05 Sep 2018 21:45:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11386 ) Change subject: [DOCS] Additional updates to the fine-grained priv .. Patch Set 2: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/67/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/11386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c Gerrit-Change-Number: 11386 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 05 Sep 2018 21:44:36 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv
Hello Impala Public Jenkins, Adam Holley, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11386 to look at the new patch set (#2). Change subject: [DOCS] Additional updates to the fine-grained priv .. [DOCS] Additional updates to the fine-grained priv Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c --- M docs/shared/impala_common.xml M docs/topics/impala_authorization.xml M docs/topics/impala_grant.xml M docs/topics/impala_revoke.xml 4 files changed, 567 insertions(+), 345 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/11386/2 -- To view, visit http://gerrit.cloudera.org:8080/11386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c Gerrit-Change-Number: 11386 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-7425: Change incremental stats to pull from catalogd.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11193 ) Change subject: IMPALA-7425: Change incremental stats to pull from catalogd. .. IMPALA-7425: Change incremental stats to pull from catalogd. Currently, incremental stats can consume a substantial amount of metadata memory (per table, partition, column). This metadata is transmitted from catalogd to all coordinators. As a result, memory is used for all loaded tables that use incremental stats all the time at all coordinators. A consequence is that coordinators and catalogd die from OOM more often when incremental stats are used and more network bandwidth is used. This change removes incremental stats from impalads. These stats are only needed when computing incremental statistics and merging new results with the existing results. They are not used by queries. As a result, the change requires that coordinators fetch incremental stats directly from catalogd when computing incremental stats. In addition, catalogd no longer sends incremental stats to coordinators via the statestore. The option is enabled by setting a new flag, --pull_incremental_statistics, on the catalogd and all impalad coordinators. Testing: - manual testing - added end-to-end tests with --pull_incremental_statistics enabled for the compute-stats-incremental.test - added fe CatalogTest for new catalogd service method - passes exhaustive tests when --pull_incremental_statistics is enabled and disabled Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e Reviewed-on: http://gerrit.cloudera.org:8080/11193 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-service-client-wrapper.h M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/common/global-flags.cc M be/src/exec/catalog-op-executor.cc M be/src/exec/catalog-op-executor.h M be/src/service/fe-support.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/FeCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/FeSupport.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/custom_cluster_test_suite.py M tests/conftest.py A tests/custom_cluster/test_pull_stats.py 27 files changed, 628 insertions(+), 54 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/11193 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e Gerrit-Change-Number: 11193 Gerrit-PatchSet: 16 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7425: Change incremental stats to pull from catalogd.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11193 ) Change subject: IMPALA-7425: Change incremental stats to pull from catalogd. .. Patch Set 15: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11193 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e Gerrit-Change-Number: 11193 Gerrit-PatchSet: 15 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 20:49:52 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-6568 add missing Query Compilation section to profiles.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11387 ) Change subject: WIP: IMPALA-6568 add missing Query Compilation section to profiles. .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1081 PS2, Line 1081: TQueryExecRequest queryExecRequest = getPlannedExecRequest(queryCtx, analysisResult, timeline, explainString); line too long (114 > 90) http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1130 PS2, Line 1130: private void setMtDopForCatalogOp(AnalysisResult analysisResult, TQueryOptions queryOptions) { line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1145 PS2, Line 1145: private TExecRequest createTExecRequest(TQueryCtx queryCtx, AnalysisResult analysisResult) { line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1157 PS2, Line 1157: private void addFinalizationParamsForInsert(TQueryCtx queryCtx, AnalysisResult analysisResult, line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1195 PS2, Line 1195: private TQueryExecRequest getPlannedExecRequest(TQueryCtx queryCtx, AnalysisResult analysisResult, line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1197 PS2, Line 1197: StringBuilder explainString) throws ImpalaException { line too long (103 > 90) -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 05 Sep 2018 20:12:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv
Adam Holley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11386 ) Change subject: [DOCS] Additional updates to the fine-grained priv .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_grant.xml File docs/topics/impala_grant.xml: http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_grant.xml@93 PS1, Line 93: Impala does not currently support revoking only the WITH GRANT OPTION : from a privilege previously granted to a role. To remove the WITH GRANT : OPTION, revoke the privilege and grant it again without the WITH GRANT : OPTION flag. This is not correct. You cannot revoke a privilege if it has the grant option set to true. If a privilege is granted with grant option, it takes two statements to revoke it. e.g. if: GRANT ALL ON SERVER TO ROLE foo_role; then to remove: REVOKE GRANT OPTION FOR ALL ON SERVER FROM ROLE foo_role; REVOKE ALL ON SERVER FROM ROLE foo_fole; http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_revoke.xml File docs/topics/impala_revoke.xml: http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_revoke.xml@57 PS1, Line 57: privilege The optional "GRANT OPTION FOR" is missing. -- To view, visit http://gerrit.cloudera.org:8080/11386 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c Gerrit-Change-Number: 11386 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 05 Sep 2018 20:00:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-6568 add missing Query Compilation section to profiles.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11387 ) Change subject: WIP: IMPALA-6568 add missing Query Compilation section to profiles. .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/591/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 05 Sep 2018 19:53:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11280 ) Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic updates .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3117/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 Gerrit-Change-Number: 11280 Gerrit-PatchSet: 11 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:40:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11280 ) Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic updates .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/11280/11/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/11280/11/tests/custom_cluster/test_local_catalog.py@108 PS11, Line 108: @ flake8: E303 too many blank lines (2) -- To view, visit http://gerrit.cloudera.org:8080/11280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 Gerrit-Change-Number: 11280 Gerrit-PatchSet: 11 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:39:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-6568 add missing Query Compilation section to profiles.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11387 ) Change subject: WIP: IMPALA-6568 add missing Query Compilation section to profiles. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11387/1/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/11387/1/fe/src/main/java/org/apache/impala/service/Frontend.java@1167 PS1, Line 1167: FeFsTable hdfsTable = (FeFsTable) insertStmt.getTargetTable(); > I did not consider this, but I will now. hmm, but even with the earlier 'planning finished', statements like LOAD DATA don't have the 'Planning finished' in the timeline. I was thinking of taking the entirety of the function body into a new 'doCreateExecStatement', and have the wrapper function create a timeline and be responsible for setting it into the result, sort of similar to the refactor I started here: https://gerrit.cloudera.org/c/11388/2/fe/src/main/java/org/apache/impala/service/Frontend.java#1024 Would that allow you to avoid the repeated calls to result.setTimeline everywhere? -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 05 Sep 2018 19:39:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-376: add built-in functions for parsing JSON
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10950 ) Change subject: IMPALA-376: add built-in functions for parsing JSON .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py@514 PS13, Line 514: get_json_object > I also left a comment on the Jira, but I'd recommend we name and implement Is it worth doing both? I think portability from Hive is also important (not just portability from other ANSI sql dbs) -- To view, visit http://gerrit.cloudera.org:8080/10950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f Gerrit-Change-Number: 10950 Gerrit-PatchSet: 13 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 05 Sep 2018 19:32:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc@950 PS2, Line 950: (*request_state)->query_events()->MarkEvent("Planning finished"); > Looks like both FE and BE set "Planning Finished" in the timeline. Is that It's certainly weird, but actually they are in two different timelines. I didn't want to change semantics with this patch, though this stuff is definitely due for a serious cleanup. -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:30:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-6568 add missing Query Compilation section to profiles.
Andrew Sherman has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/11387 ) Change subject: WIP: IMPALA-6568 add missing Query Compilation section to profiles. .. WIP: IMPALA-6568 add missing Query Compilation section to profiles. WIP: so Todd can look at refactor TODO: change tests to check that "Planning finished" is present for explain queries The profile command is used to display low-level information about the most recent query. When a client makes request for the Profile, it sends a GetRuntimeProfile request for the last queryId to to the server. The queryId is used to find the ClientRequestState, an object which tracks information about the execution, including Profile data which is stored in several RuntimeProfile objects. The reply to the GetRuntimeProfile message contains the Profile, pretty-printed as a string. When a query is sent to the Front End to be compiled, the TExecRequest that is returned from createExecRequest() in the JVM contains a Timeline, which is a named sequence of events with timing information. This Timeline is added to the Summary Profile in the ClientRequestState. In the following cases the Front End was not setting the Timeline in the TExecRequest: - All DDL operations - Load data statements - Set statements - Explain statements And this meant that the profile would not contain the "Query Compilation" timeline. ALTERNATIVES: To fix this I set the Timeline in TExecRequest before returning it. >result.setTimeline(timeline.toThrift()); >return result; I considered writing this as >return result.setTimeline(timeline.toThrift()); To try to make sure this code gets copied in any future return statements, but I didn’t think it was worth the cost to readability. REFACTOR I refactored the main createExecRequest() method to try to make the flow clearer. TESTING: Add a new test to test_observability.py which checks that the "Query Compilation" timeline appears in the profile for various queries designed to exercise the new code paths in createExecRequest. Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 --- M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/query_test/test_observability.py 2 files changed, 134 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/2 -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11280 ) Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic updates .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/590/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 Gerrit-Change-Number: 11280 Gerrit-PatchSet: 11 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:16:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11358 ) Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog .. Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/11358/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11358/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@738 PS3, Line 738: I read this as a bug originally until I saw that newCatalogVersion may be updated on L782. Perhaps use two variables for clarity, such as currentCatalogVersion and nextCatalogVersion? http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@235 PS6, Line 235: retried via nit: pushed from http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@750 PS6, Line 750: sequencer since its specific to auth, pls call this authSequencer or something like this. http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@862 PS6, Line 862: LOG.error("Error adding privilege: ", e); For the case that we got here via a direct ddl, wouldn't the caller want to know that the 'add' didn't succeed? Separate from this change, but error propagation is not the clearest with this path (same goes for existing ImpaladCatalog). http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java: http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@214 PS6, Line 214: // here. is isReady sufficient here? from catalogmetaprovider, its ready when its advanced past the initial version. -- To view, visit http://gerrit.cloudera.org:8080/11358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558 Gerrit-Change-Number: 11358 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:12:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 21: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/589/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 21 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:08:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 21: (1 comment) http://gerrit.cloudera.org:8080/#/c/11314/21/tests/authorization/test_owner_privileges.py File tests/authorization/test_owner_privileges.py: http://gerrit.cloudera.org:8080/#/c/11314/21/tests/authorization/test_owner_privileges.py@249 PS21, Line 249: " flake8: E501 line too long (92 > 90 characters) -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 21 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:06:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc@950 PS2, Line 950: (*request_state)->query_events()->MarkEvent("Planning finished"); Looks like both FE and BE set "Planning Finished" in the timeline. Is that wrong? -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:02:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11280 ) Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic updates .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 Gerrit-Change-Number: 11280 Gerrit-PatchSet: 11 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:00:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11280 ) Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic updates .. Patch Set 11: (2 comments) http://gerrit.cloudera.org:8080/#/c/11280/7/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/11280/7/tests/custom_cluster/test_local_catalog.py@25 PS7, Line 25: > Actually ended up just implementing these tests here instead of adding the even better, thx! http://gerrit.cloudera.org:8080/#/c/11280/11/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/11280/11/tests/custom_cluster/test_local_catalog.py@42 PS11, Line 42: --per_impalad_args= nice. -- To view, visit http://gerrit.cloudera.org:8080/11280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 Gerrit-Change-Number: 11280 Gerrit-PatchSet: 11 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:00:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11359 ) Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift File common/thrift/Types.thrift: http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift@205 PS6, Line 205: specifier > This is what I understood from the change here: now, TFunction can be used yep, that's right. Will amend the comments as you suggested. -- To view, visit http://gerrit.cloudera.org:8080/11359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Gerrit-Change-Number: 11359 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 18:55:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11359 ) Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift File common/thrift/Types.thrift: http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift@205 PS6, Line 205: specifier > hmm, by duplicate, do you mean making a new struct? or change the argument This is what I understood from the change here: now, TFunction can be used as a query. So for example, if you want a java udf's for function 'foo' that return strings, you would just specify binary_type and ret_type and the fetch would retrieve the subset that matches. After seeing how this is used in catalogmetaprovider, I see that its just the function name is set and the others are left unset. Perhaps this will clarify it: - change L198 to state that this struct is used to represent functions in the catalog, query plans, and as specifiers to fetch function overloads by name. - state that when used as a specifier, only the name is set (required) and the previously required fields are not used. -- To view, visit http://gerrit.cloudera.org:8080/11359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Gerrit-Change-Number: 11359 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 18:54:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Elaborating error message when incorrect port number is specified in impala-shell.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11368 ) Change subject: Elaborating error message when incorrect port number is specified in impala-shell. .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3116/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/11368 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33 Gerrit-Change-Number: 11368 Gerrit-PatchSet: 2 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 05 Sep 2018 18:51:04 + Gerrit-HasComments: No
[Impala-ASF-CR] Elaborating error message when incorrect port number is specified in impala-shell.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11368 ) Change subject: Elaborating error message when incorrect port number is specified in impala-shell. .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11368 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33 Gerrit-Change-Number: 11368 Gerrit-PatchSet: 2 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 05 Sep 2018 18:51:03 + Gerrit-HasComments: No
[Impala-ASF-CR] Elaborating error message when incorrect port number is specified in impala-shell.
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/11368 ) Change subject: Elaborating error message when incorrect port number is specified in impala-shell. .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11368 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33 Gerrit-Change-Number: 11368 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 05 Sep 2018 18:44:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
Hello Bharath Vissapragada, Tianyi Wang, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11280 to look at the new patch set (#11). Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic updates .. IMPALA-7469. Invalidate LocalCatalog cache based on topic updates This implements cache invalidation inside CatalogdMetaProvider. The design is as follows: - when the catalogd collects updates into the statestore topic, it now adds an additional entry for each table and database. These additional entries are minimal - they only include the object's name, but no metadata. This new behavior is conditional on a new flag --catalog_topic_mode. The default mode is to keep the old style, but it can be configured to mixed (support both v1 and v2) or v2-only. - the old-style topic entries are prefixed with a '1:' whereas the new minimal entries are prefixed with a '2:'. The impalad will subscribe to one or the other prefix depending on whether it is running with --use_local_catalog. Thus, old impalads will not be confused by the new entries and vice versa. - when the impalad gets these topic updates, it forwards them through to the catalog implementation. The LocalCatalog implementation forwards them to the CatalogdMetaProvider, which uses them to invalidate cached metadata as appropriate. This patch includes some basic unit tests. I also did some manual testing by connecting to different impalads and verifying that a session connected to impalad #1 saw the effects of DDLs made by impalad #2 within a short period of time (the statestore topic update frequency). Existing end-to-end tests cover these code paths pretty thoroughly: - if we didn't automatically invalidate the cache on a coordinator in response to DDL operations, then any test which expects to "read its own writes" (eg access a table after creating one) would fail - if we didn't propagate invalidations via the statestore, then all of the tests that use sync_ddl would fail. I verified the test coverage above using some of the tests in test_ddl.py -- I selectively commented out a few of the invalidation code paths in the new code and verified that tests failed until I re-introduced them. Along the way I also improved test_ddl so that, when this code is broken, it properly fails with a timeout. It also has a bit of expanded coverage for both the SYNC_DDL and non-SYNC cases. I also wrote a new custom-cluster test for LocalCatalog that verifies a few of the specific edge cases like detecting catalogd restart, SYNC_DDL behavior in mixed mode, etc. One notable exception here is the implementation of INVALIDATE METADATA This turned out to be complex to implement, so I left a lengthy TODO describing the issue and filed a JIRA. Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 --- M be/src/catalog/catalog-server.cc M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java M fe/src/main/java/org/apache/impala/common/Pair.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/FeCatalogManager.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java M tests/common/custom_cluster_test_suite.py A tests/custom_cluster/test_local_catalog.py M tests/metadata/test_ddl.py 20 files changed, 825 insertions(+), 105 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/11280/11 -- To view, visit http://gerrit.cloudera.org:8080/11280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 Gerrit-Change-Number: 11280 Gerrit-PatchSet: 11 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11359 ) Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift File common/thrift/Types.thrift: http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift@205 PS6, Line 205: specifier > This is clear in the context of this change (and elegant), but I think it w hmm, by duplicate, do you mean making a new struct? or change the argument of GetPartialCatalogObject so that it uses some union of TFunctionName, TTableName, database name, etc, instead of CatalogObject? http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift@206 PS6, Line 206: optional > just checking, but when changing this from required to optional, existing, Yep, it only changes the runtime validation to be less strict, not the wire/disk format. -- To view, visit http://gerrit.cloudera.org:8080/11359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Gerrit-Change-Number: 11359 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 18:40:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11280 ) Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic updates .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/11280/7/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/11280/7/tests/custom_cluster/test_local_catalog.py@25 PS7, Line 25: > k, I'll add a TODO see if I can follow that example. Actually ended up just implementing these tests here instead of adding the TODO. -- To view, visit http://gerrit.cloudera.org:8080/11280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 Gerrit-Change-Number: 11280 Gerrit-PatchSet: 10 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 18:37:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Adam Holley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 21: Code-Review+1 carry +1 -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 21 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 05 Sep 2018 18:33:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Adam Holley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. Patch Set 20: (3 comments) carry +1 http://gerrit.cloudera.org:8080/#/c/11314/20/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/11314/20/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@999 PS20, Line 999: // If object ownership is enabled in Sentry, we need to update the owner privilege : // in the catalog so that any subsequent statements that rely on that privilege, or : // the absence, will function correctly without waiting for the next refresh. : // e.g. For create, the privileges should be available to immediately create a table. : // Additionally, if the metadata operation is successful, but sentry fails to add : // the privilege, it will be removed on the next refresh. ALTER DATABASE SET OWNER : // can be used to try adding the owner privilege again. : if (isObjectOwnershipEnabled()) { : Preconditions.checkNotNull(params.server_name); : TPrivilege filter = createDatabaseOwnerPrivilege(newDb.getName(), : params.server_name); : addPrivilegeToCatalog(newDb.getMetaStoreDb().getOwnerName(), : newDb.getMetaStoreDb().getOwnerType(), filter, resp); : } > It would be nice to move similar blocks into separate functions. Done http://gerrit.cloudera.org:8080/#/c/11314/20/fe/src/test/resources/mysql-hive-site.xml.template File fe/src/test/resources/mysql-hive-site.xml.template: http://gerrit.cloudera.org:8080/#/c/11314/20/fe/src/test/resources/mysql-hive-site.xml.template@153 PS20, Line 153: : : hive.metastore.transactional.event.listeners : org.apache.hive.hcatalog.listener.DbNotificationListener : : : hive.metastore.event.listeners : org.apache.sentry.binding.metastore.SentrySyncHMSNotificationsPostEventListener : : : hcatalog.message.factory.impl.json : org.apache.sentry.binding.metastore.messaging.json.SentryJSONMessageFactory : > How is this change related to the patch? Maybe it could be mentioned in the Done http://gerrit.cloudera.org:8080/#/c/11314/20/tests/authorization/test_owner_privileges.py File tests/authorization/test_owner_privileges.py: http://gerrit.cloudera.org:8080/#/c/11314/20/tests/authorization/test_owner_privileges.py@250 PS20, Line 250: __root_query_fail > As I see this functions are always used in pair with __verify_exceptions(). Done -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 20 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 05 Sep 2018 18:33:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER
Adam Holley has uploaded a new patch set (#21). ( http://gerrit.cloudera.org:8080/11314 ) Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER .. IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER This patch adds calls to automatically create or remove owner privileges in the catalog based on the statement. This is similar to the existing pattern where after privileges are granted in Sentry, they are created in the catalog directly instead of pulled from Sentry. The sentry-site.xml and hive-site.xml template files have been updated to enable usage of object ownership by default for easier development with a secured impala. When object ownership is enabled: CREATE DATABASE will grant the user OWNER privileges to that database. ALTER DATABASE SET OWNER will transfer the OWNER privileges to the new owner. DROP DATABASE will revoke the OWNER privileges from the owner. This will apply to DATABASE, TABLE, and VIEW. Example: If ownership is enabled, when a table is created, the creator is the owner, and Sentry will create owner privileges for the created table so the user can continue working with it without waiting for Sentry refresh. Inserts will be available immediately. Testing: - Created new custom cluster tests for object ownership Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 --- M bin/create-test-configuration.sh M bin/impala-config.sh M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewSetOwnerStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java M fe/src/main/java/org/apache/impala/util/SentryProxy.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M fe/src/test/resources/mysql-hive-site.xml.template M fe/src/test/resources/postgresql-hive-site.xml.template M fe/src/test/resources/sentry-site.xml.template A fe/src/test/resources/sentry-site_no_oo.xml.template A fe/src/test/resources/sentry-site_oo_nogrant.xml.template M testdata/bin/run-sentry-service.sh A tests/authorization/test_owner_privileges.py M tests/common/custom_cluster_test_suite.py M tests/common/impala_test_suite.py 31 files changed, 1,043 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/11314/21 -- To view, visit http://gerrit.cloudera.org:8080/11314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0 Gerrit-Change-Number: 11314 Gerrit-PatchSet: 21 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 2: > Patch Set 2: > > Forgot to git add FrontendProfile? Indeed... I will never learn. -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 18:07:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11224 ) Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd .. Patch Set 4: (13 comments) http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/catalog/catalog-util.cc File be/src/catalog/catalog-util.cc: http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/catalog/catalog-util.cc@256 PS3, Line 256: bool TTableName::operator<(const impala::TTableName& that) const { > This is only for the following map usage in thrift. Renaming is not possibl Could we make that a list of pairs instead to get around this? This works fine for now, but if for example we at some point added a third field in TTableName (eg a "catalog" level to the naming) we'd have a really hard-to-find bug since it's unlikely anyone would remember to update this comparator. Furthermore, such a bug could cause bad crashes since operator== and operator< would end up inconsistent. http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@209 PS4, Line 209: invalidate_tables_timeout nit: this should have a suffix like "_sec" or "_s" so it's clear the units http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@218 PS4, Line 218: invalidate_tables_gc_ if the other two flags are hidden, maybe we shouldn't document them? It seems inconsistent to hide them and then mention them in a non-hidden flag. http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java: http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@86 PS4, Line 86: /** :* A callback called after a shrinking pass is triggered. Used for test purposes. :*/ wrong doc http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@112 PS4, Line 112: Preconditions.checkArgument(unusedTableTtlSec >= 0); add back the message so that if the user hits this, they know which configuration they got wrong http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@141 PS4, Line 141: if (poolName.contains("Old")) { maybe invert this and 'continue' so you can unnest the rest of the block? http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@148 PS4, Line 148: it's not a implementation of more concise: "it does not implement" http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@209 PS4, Line 209: if (table.isLoaded()) { style nit: I think easier to read this loop if we invert this: "if (!table.isLoaded()) continue;" (generally I find it easier to read code when there is less nesting going on, by handling error or "unusual" cases up front in the control flow) http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@211 PS4, Line 211: if (inactivityTime > retireAgeNano) { same here http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@255 PS4, Line 255: unusedTableTtlNano_ this should also be changed to be more often than the TTL for the same reason as below. http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@260 PS4, Line 260: unusedTableTtlSec needs update http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@266 PS4, Line 266: error this should probably be a WARNING -- I think ERROR should be reserved for unrecoverable user-impacting errors, whereas this is a recoverable error. I think we should also ensure that the error message explains the context and what will happen as a result. For example, something like: LOG.warning("Unexpected exception thrown while attempting to automatically invalidate tables. Will retry in {} seconds", sleepTime, e) that way if the user sees the message they will know that this doesn't mean that some user query was failed, or that data was lost, or whatever. Also, I think we should make sure that if an exception is thrown, we still sleep before retrying. Otherwise we might get into some state with a tight loop. Perhaps we should just add a fixed wait of 5 seconds or something here in the catch clause? http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java File
[Impala-ASF-CR] IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11357 ) Change subject: IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I47f5945fd64c0adf3fe7fd82076f64672358f962 Gerrit-Change-Number: 11357 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 17:43:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7425: Change incremental stats to pull from catalogd.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11193 ) Change subject: IMPALA-7425: Change incremental stats to pull from catalogd. .. Patch Set 15: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11193 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e Gerrit-Change-Number: 11193 Gerrit-PatchSet: 15 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 17:38:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7425: Change incremental stats to pull from catalogd.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11193 ) Change subject: IMPALA-7425: Change incremental stats to pull from catalogd. .. Patch Set 15: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3115/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/11193 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e Gerrit-Change-Number: 11193 Gerrit-PatchSet: 15 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 17:38:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11359 ) Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider .. Patch Set 6: (4 comments) several more nits http://gerrit.cloudera.org:8080/#/c/11359/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11359/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@761 PS6, Line 761: req, "missing expected function"); nit: single line http://gerrit.cloudera.org:8080/#/c/11359/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1154 PS6, Line 1154: an nit: a http://gerrit.cloudera.org:8080/#/c/11359/6/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java File fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java: http://gerrit.cloudera.org:8080/#/c/11359/6/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java@191 PS6, Line 191: functions nit: function names? http://gerrit.cloudera.org:8080/#/c/11359/6/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java@219 PS6, Line 219: from HMS omit... could this change from getting these functions from catalogd or directly from hms? seems like it depends on the provider. -- To view, visit http://gerrit.cloudera.org:8080/11359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53 Gerrit-Change-Number: 11359 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 17:36:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11183 ) Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/587/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 19 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 05 Sep 2018 17:29:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7425: Change incremental stats to pull from catalogd.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11193 ) Change subject: IMPALA-7425: Change incremental stats to pull from catalogd. .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/585/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11193 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e Gerrit-Change-Number: 11193 Gerrit-PatchSet: 14 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 17:25:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11183 ) Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. Patch Set 16: (20 comments) Thanks for the review and sorry for the large number of patches. Please look at patch set 17-20 as one commit. http://gerrit.cloudera.org:8080/#/c/11183/16//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11183/16//COMMIT_MSG@10 PS16, Line 10: do > typo: to Done http://gerrit.cloudera.org:8080/#/c/11183/16//COMMIT_MSG@51 PS16, Line 51: > Maybe you could mention your results about the speedup. I have added this information to line 26. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc File be/src/benchmarks/convert-timestamp-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@254 PS16, Line 254: v1 > nit: use better names, e.g.: Done http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@592 PS16, Line 592: (uint64_t)MICROS_PER_SEC > nit: please use static_cast Is there a reason to use static_cast for primitive numeric types? IMO C style casts add less noise and expresses the intent with the same clarity (pointers/references are totally different stories...). http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@604 PS16, Line 604: && nanos == other.nanos > If nanos are more variable in the benchmarks then maybe it'd be worth to ch I have just realized that this is not needed anymore. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@627 PS16, Line 627: / ONE_BILLIONTH > * NANOS_PER_SEC ? The results are actually not always the same due to rounding. I left it this way in TimestampValue to avoid changing external behavior, so I also left it like this in the benchmark. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@634 PS16, Line 634: / ONE_BILLIONTH > * NANOS_PER_SEC ? See last answer. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/exprs/expr-test.cc@6508 PS16, Line 6508: ScopedLocalUnixTimestampConversionOverride > I just noticed that this class has a bug. FLAGS_use_local_tz_for_unix_timestamp_conversions should be false by default in tests, but I agree that restoring the original value fits the goal of the class better. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc File be/src/runtime/timestamp-test.cc: http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@241 PS16, Line 241: (double) seconds + (double) > nit: static_cast See my comment in convert-timezone-benchmark.cc. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@241 PS16, Line 241: 1000 > nit: MILLIS_PER_SEC? Attila mentioned this too. I think that 1000 is easier to read than MILLIS_PER_SEC. (hmm, maybe if both of you have mentioned it than I am wrong here :) http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@250 PS16, Line 250: 8 > nit: please give it a name, e.g. MARGIN_OF_ERROR Done http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@261 PS16, Line 261: 1000 > nit: MILLIS_PER_SEC from gutil/walltime.h? Same as for line 241. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@266 PS16, Line 266: 1000 > nit: MICROS_PER_MILLI from walltime.h? Same as for line 241. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@280 PS16, Line 280: 60*60 > it appears twice Done http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@188 PS16, Line 188: (int64_t)1000*1000*1000 > nit: static_cast I was not happy to write this here, but walltime.h constants are not available in this file. I did not want to include walltime.h, because if timestamp-value.h is included (indirectly) from a very large number of files, and I did not want to make compilation slower. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@318 PS16, Line 318: int64_t SplitTime(int64_t* ticks) > Maybe it would be cleaner to change the signature to void SplitTime(const i I see the point, and I trust the compiler that it will eliminate the variabled, but I think that this would make most calling functions a bit more complicated. For example FromUnixTimeNanos would need +2 variables and for unix_time += SplitTime(); http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@317 PS16, Line 317: template
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11183 ) Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. Patch Set 18: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/586/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 18 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 05 Sep 2018 17:15:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-376: add built-in functions for parsing JSON
Greg Rahn has posted comments on this change. ( http://gerrit.cloudera.org:8080/10950 ) Change subject: IMPALA-376: add built-in functions for parsing JSON .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py@514 PS13, Line 514: get_json_object I also left a comment on the Jira, but I'd recommend we name and implement the ANSI SQL function json_value(). It will be more portable and is the ANSI SQL standard and appears to be very similar. -- To view, visit http://gerrit.cloudera.org:8080/10950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f Gerrit-Change-Number: 10950 Gerrit-PatchSet: 13 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 05 Sep 2018 17:10:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions
Hello Zoltan Borok-Nagy, Attila Jeges, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11183 to look at the new patch set (#20). Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions .. IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions Impala used to convert from sub-second unix time to TimestampValue (which is split to date_ and time_ similarly to boost::posix_time::ptime) by first splitting the input into seconds and sub-seconds part, converting the seconds part wit boost::posix_time::from_time_t(), and then adding the sub-seconds part to this timestamp. Different tricks are used to speed up different functions: - UTC functions that expect a single integer as input can split it into date_ and time_ directly. - Non-UTC functions need seconds for timezone conversion, because CCTZ expects time points as seconds. These were optimized by adding the subsecond part to time_ instead of adding it to a ptime. This can be done safely because the sub-second part is between [0, 1 sec), so it cannot overflow into a different day or timezone. Benchmarks show 2x - 6x speedup for the measured functions. The main motivation is IMPALA-5050: "Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet scanner" - reading these types will run micro/milli->TimestampValue conversion for every row. Other changes: - TimestampValue::UtcFromUnixTimeMillis was added - currently this is only used in tests but it will be useful for IMPALA-5050 - Some functions were moved from .h to .inline.h. - FromUnixTimeMicros was changed to do the utc->local conversion depending on flag use_local_tz_for_unix_timestamp_conversions to be consistent with other similar functions. This function was only used in tests until now but it will be useful for IMPALA-5050. - When a result mismatch is detected in convert-timestamp-benchmark.cc it now prints non-equal values. - Benchmarks were added for micro + nano conversions. Note that only single threaded benchmarks were added because I do not expect any difference in the multi threaded case. - DCHECKs were added to TimeStampValue::Validate to ensure that time_ is between [0, 24 hour). Testing: - timestamp-test.cc was extended to give better coverage for sub-second conversions. Edge cases were already covered pretty well. Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa --- M be/src/benchmarks/convert-timestamp-benchmark.cc M be/src/exec/data-source-scan-node.cc M be/src/exec/hdfs-orc-scanner.cc M be/src/exprs/decimal-operators-ir.cc M be/src/exprs/expr-test.cc M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.cc M be/src/runtime/timestamp-value.h M be/src/runtime/timestamp-value.inline.h 9 files changed, 378 insertions(+), 108 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/11183/20 -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 20 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy