[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11359 )

Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Gerrit-Change-Number: 11359
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 05:30:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7499: build against CDH Kudu

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11363 )

Change subject: IMPALA-7499: build against CDH Kudu
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6e1048438b6d09a1b38c58371d6212bb6dcc06c
Gerrit-Change-Number: 11363
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 06 Sep 2018 04:09:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-376: add built-in functions for parsing JSON

2018-09-05 Thread Greg Rahn (Code Review)
Greg Rahn has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10950 )

Change subject: IMPALA-376: add built-in functions for parsing JSON
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py@514
PS13, Line 514: get_json_object
> Ah, haven't considered the ANSI standards before. My initial motivation is
I'm not opposed to having both functions, Hive & ANSI, but I wanted to raise 
awareness to see if folks felt it was necessary and having knowledge that the 
ANSI JSON functions are likely to also be implemented and that hopefully it 
could be done so in a way that makes the most sense in terms of sharing code, 
etc.



--
To view, visit http://gerrit.cloudera.org:8080/10950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f
Gerrit-Change-Number: 10950
Gerrit-PatchSet: 13
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 06 Sep 2018 04:03:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv

2018-09-05 Thread Alex Rodoni (Code Review)
Alex Rodoni has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11386 )

Change subject: [DOCS] Additional updates to the fine-grained priv
..

[DOCS] Additional updates to the fine-grained priv

Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c
Reviewed-on: http://gerrit.cloudera.org:8080/11386
Tested-by: Impala Public Jenkins 
Reviewed-by: Adam Holley 
Reviewed-by: Alex Rodoni 
---
M docs/shared/impala_common.xml
M docs/topics/impala_authorization.xml
M docs/topics/impala_grant.xml
M docs/topics/impala_revoke.xml
4 files changed, 567 insertions(+), 345 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Adam Holley: Looks good to me, but someone else must approve
  Alex Rodoni: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/11386
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c
Gerrit-Change-Number: 11386
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv

2018-09-05 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11386 )

Change subject: [DOCS] Additional updates to the fine-grained priv
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11386
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c
Gerrit-Change-Number: 11386
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Sep 2018 03:28:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv

2018-09-05 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11386 )

Change subject: [DOCS] Additional updates to the fine-grained priv
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/11386
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c
Gerrit-Change-Number: 11386
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Sep 2018 03:26:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7541. Avoid initializing Metrics for IncompleteTables

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11393 )

Change subject: IMPALA-7541. Avoid initializing Metrics for IncompleteTables
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/602/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0bcaa9c45a8cf75266d4b7c16cc14f0fd669b92
Gerrit-Change-Number: 11393
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Sep 2018 03:21:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11358 )

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/601/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 9
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 03:12:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11358 )

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/600/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 8
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 03:07:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7540. Intern most repetitive strings and network addresses in catalog

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11158 )

Change subject: IMPALA-7540. Intern most repetitive strings and network 
addresses in catalog
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/599/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26
Gerrit-Change-Number: 11158
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Sep 2018 02:40:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11358 )

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..

IMPALA-7510. Support principals/privileges with LocalCatalog

This enables support for Sentry authorization when LocalCatalog is
enabled. The design is detailed in a change to the comment on
CatalogdMetaProvider, but to recap it briefly here:

At a high level, this patch takes the approach of duplicating the "v1"
catalog flow for PRINCIPAL and PRIVILEGE catalog objects. Namely, the catalog
daemon publishes complete objects into the statestore topic, and the
impalad fully replicates them locally.

I took this approach rather than trying to do fine-grained caching and
invalidation for the following reasons:

- The PRINCIPAL and PRIVILEGE metadata is typically many orders of magnitude
  smaller than table metadata. So, the benefit of fine-grained caching
  and eviction is not as great.

- The PRINCIPAL and PRIVILEGE catalog objects are fairly tightly intertwined
  with relationships between them and backwards mappings maintained from
  groups back to principals. This logic is implemented by the
  AuthorizationPolicy class. Implementing similar mapping in a
  fine-grained caching approach would be a reasonable amount of work.

- This bit of code is under some current flux as others are working on
  implementing more fine grained permissioning. Thus, trying to
  duplicate the logic in a "fetch-on-demand" implementation might turn
  out to be chasing somewhat of a moving target.

In order to take this approach, the patch is organized as follows:

- refactored some of the role/principal removal logic from ImpaladCatalog
  into AuthorizationPolicy. This makes it easier to perform the similar
  "subscribe" with less duplicate cdoe.

- changed catalogd to publish PRINCIPAL and PRIVILEGE objects to v2
  catalogs in addition to v1.

- passed through LocalCatalog.getAuthPolicy to CatalogdMetaProvider, and
  added an AuthorizationPolicy member there. This member is maintained
  when we see PRINCIPAL and PRIVILEGE objects come via the catalog
  updates.

- had to implement LocalCatalog.isReady() to ensure that we don't allow
  user access until the first topic update has been consumed.

- additionally had to copy some other code from ImpaladCatalog to
  protect against various races -- we need a CatalogDeltaLog as well as
  careful sequencing of the order in which the objects apply.

With this patch and the following one to enable UDF support, I was able
to run the tests in tests/authorization successfully with LocalCatalog
enabled.

Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Reviewed-on: http://gerrit.cloudera.org:8080/11358
Reviewed-by: Todd Lipcon 
Tested-by: Todd Lipcon 
---
M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M tests/common/custom_cluster_test_suite.py
9 files changed, 272 insertions(+), 67 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 10
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7541. Avoid initializing Metrics for IncompleteTables

2018-09-05 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/11393

to review the following change.


Change subject: IMPALA-7541. Avoid initializing Metrics for IncompleteTables
..

IMPALA-7541. Avoid initializing Metrics for IncompleteTables

Each 'Metrics' object is relatively large, and incomplete tables don't
support viewing metrics anyway. This makes the initialization of Metrics
lazy for IncompleteTable.

I compared a jmap -histo:live from the catalogd before and after this change,
after having loaded one table.

Object counts are reduced as follows:

  java.util.concurrent.atomic.LongAdder: 16760 -> 20 (-16740)
  com.codahale.metrics.LongAdderProxy$JdkProvider$1: 16760 -> 20 (-16740)
  java.util.concurrent.atomic.AtomicLong: 14686 -> 4641 (-10045)
  com.codahale.metrics.EWMA: 10056 -> 12 (-10044)
  java.util.concurrent.ConcurrentSkipListMap$Node: 3353 -> 5 (-3348)
  com.codahale.metrics.Meter: 3352 -> 4 (-3348)
  java.util.concurrent.locks.ReentrantReadWriteLock: 3353 -> 5 (-3348)
  com.codahale.metrics.ExponentiallyDecayingReservoir: 3352 -> 4 (-3348)
  
java.util.concurrent.locks.ReentrantReadWriteLock$Sync$ThreadLocalHoldCounter: 
3353 -> 5 (-3348)
  com.codahale.metrics.Histogram: 3352 -> 4 (-3348)
  java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync: 3352 -> 4 
(-3348)
  java.util.concurrent.ConcurrentSkipListMap: 3352 -> 4 (-3348)
  java.util.concurrent.ConcurrentSkipListMap$HeadIndex: 3352 -> 4 (-3348)
  com.codahale.metrics.Timer: 3352 -> 4 (-3348)
  java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock: 3353 -> 5 (-3348)
  java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock: 3353 -> 5 (-3348)
  java.util.concurrent.ConcurrentHashMap$Node: 38419 -> 35140 (-3279)
  java.util.concurrent.locks.ReentrantLock: 2277 -> 1161 (-1116)
  java.util.concurrent.locks.ReentrantLock$NonfairSync: 2309 -> 1193 (-1116)
  com.codahale.metrics.MetricRegistry: 1117 -> 1 (-1116)
  [Ljava.util.concurrent.ConcurrentHashMap$Node;: 1264 -> 148 (-1116)
  java.util.concurrent.ConcurrentHashMap: 1281 -> 165 (-1116)
  java.util.concurrent.CopyOnWriteArrayList: 1126 -> 10 (-1116)
  org.apache.impala.common.Metrics: 1117 -> 1 (-1116)
  [Ljava.lang.Object;: 4441 -> 3346 (-1095)

(94 objects per table)

With heap usage (bytes) reduced as follows:

  java.util.concurrent.atomic.LongAdder: 536320 -> 640 (-535680)
  com.codahale.metrics.EWMA: 482688 -> 576 (-482112)
  com.codahale.metrics.LongAdderProxy$JdkProvider$1: 402240 -> 480 (-401760)
  java.util.concurrent.atomic.AtomicLong: 352464 -> 111384 (-241080)
  com.codahale.metrics.ExponentiallyDecayingReservoir: 187712 -> 224 (-187488)
  com.codahale.metrics.Meter: 160896 -> 192 (-160704)
  java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync: 160896 -> 192 
(-160704)
  java.util.concurrent.ConcurrentSkipListMap: 160896 -> 192 (-160704)
  java.util.concurrent.ConcurrentSkipListMap$HeadIndex: 107264 -> 128 (-107136)
  java.util.concurrent.ConcurrentHashMap$Node: 1229408 -> 1124480 (-104928)
  [Ljava.util.concurrent.ConcurrentHashMap$Node;: 532192 -> 442912 (-89280)
  java.util.concurrent.ConcurrentSkipListMap$Node: 80472 -> 120 (-80352)
  java.util.concurrent.locks.ReentrantReadWriteLock: 80472 -> 120 (-80352)
  com.codahale.metrics.Histogram: 80448 -> 96 (-80352)
  com.codahale.metrics.Timer: 80448 -> 96 (-80352)
  java.util.concurrent.ConcurrentHashMap: 81984 -> 10560 (-71424)
  
java.util.concurrent.locks.ReentrantReadWriteLock$Sync$ThreadLocalHoldCounter: 
53648 -> 80 (-53568)
  java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock: 53648 -> 80 
(-53568)
  java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock: 53648 -> 80 
(-53568)
  java.util.concurrent.locks.ReentrantLock$NonfairSync: 73888 -> 38176 (-35712)
  com.codahale.metrics.MetricRegistry: 26808 -> 24 (-26784)
  java.util.concurrent.CopyOnWriteArrayList: 27024 -> 240 (-26784)
  java.util.concurrent.locks.ReentrantLock: 36432 -> 18576 (-17856)
  org.apache.impala.common.Metrics: 17872 -> 16 (-17856)
  [Ljava.lang.Object;: 260912 -> 243840 (-17072)

This works out to a total of 3249KB saved (2981 bytes per table). In a
large production environment with 100k tables this would save 284MB of
heap and 9.4M objects.

Change-Id: Id0bcaa9c45a8cf75266d4b7c16cc14f0fd669b92
---
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
4 files changed, 13 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/11393/1
--
To view, visit http://gerrit.cloudera.org:8080/11393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: 

[Impala-ASF-CR] IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11357 )

Change subject: IMPALA-7137 fixup: only recreate AuthorizationChecker on full 
delta updates
..

IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates

This is a follow-up to commit a90b42911f1753b14e4f9f581bad837e33dcf9bf
which had an unintended side-effect even when LocalCatalog is not
enabled: previously we would only re-instantiate AuthorizationChecker on
"full" catalog updates; after that patch, we would re-instantiate it on
every catalog update.

While it appears that this is relatively cheap and didn't cause any
obvious behavioral changes, it wasn't intended, and one of the goals of
the --use_local_catalog flag being optional is that the existing code
paths shouldn't be affected. So, this puts it back the way it was.

Thanks to Adam Holley for noticing this and alerting me to my mistake.

Change-Id: I47f5945fd64c0adf3fe7fd82076f64672358f962
Reviewed-on: http://gerrit.cloudera.org:8080/11357
Tested-by: Impala Public Jenkins 
Reviewed-by: Todd Lipcon 
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
1 file changed, 6 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Todd Lipcon: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/11357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I47f5945fd64c0adf3fe7fd82076f64672358f962
Gerrit-Change-Number: 11357
Gerrit-PatchSet: 8
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11358 )

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..


Patch Set 9:

Skipping a jenkins re-verify since it was only whitespace/comment change 
edited. Forwarding +2.


--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 9
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 02:39:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11358 )

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..


Patch Set 9: Verified+1 Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11358/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11358/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@270
PS7, Line 270:   @Override
> nit: extra line here and below
Done


http://gerrit.cloudera.org:8080/#/c/11358/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@783
PS7, Line 783: ype.PR
> nit: order
Done



--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 9
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 02:38:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11358

to look at the new patch set (#9).

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..

IMPALA-7510. Support principals/privileges with LocalCatalog

This enables support for Sentry authorization when LocalCatalog is
enabled. The design is detailed in a change to the comment on
CatalogdMetaProvider, but to recap it briefly here:

At a high level, this patch takes the approach of duplicating the "v1"
catalog flow for PRINCIPAL and PRIVILEGE catalog objects. Namely, the catalog
daemon publishes complete objects into the statestore topic, and the
impalad fully replicates them locally.

I took this approach rather than trying to do fine-grained caching and
invalidation for the following reasons:

- The PRINCIPAL and PRIVILEGE metadata is typically many orders of magnitude
  smaller than table metadata. So, the benefit of fine-grained caching
  and eviction is not as great.

- The PRINCIPAL and PRIVILEGE catalog objects are fairly tightly intertwined
  with relationships between them and backwards mappings maintained from
  groups back to principals. This logic is implemented by the
  AuthorizationPolicy class. Implementing similar mapping in a
  fine-grained caching approach would be a reasonable amount of work.

- This bit of code is under some current flux as others are working on
  implementing more fine grained permissioning. Thus, trying to
  duplicate the logic in a "fetch-on-demand" implementation might turn
  out to be chasing somewhat of a moving target.

In order to take this approach, the patch is organized as follows:

- refactored some of the role/principal removal logic from ImpaladCatalog
  into AuthorizationPolicy. This makes it easier to perform the similar
  "subscribe" with less duplicate cdoe.

- changed catalogd to publish PRINCIPAL and PRIVILEGE objects to v2
  catalogs in addition to v1.

- passed through LocalCatalog.getAuthPolicy to CatalogdMetaProvider, and
  added an AuthorizationPolicy member there. This member is maintained
  when we see PRINCIPAL and PRIVILEGE objects come via the catalog
  updates.

- had to implement LocalCatalog.isReady() to ensure that we don't allow
  user access until the first topic update has been consumed.

- additionally had to copy some other code from ImpaladCatalog to
  protect against various races -- we need a CatalogDeltaLog as well as
  careful sequencing of the order in which the objects apply.

With this patch and the following one to enable UDF support, I was able
to run the tests in tests/authorization successfully with LocalCatalog
enabled.

Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
---
M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M tests/common/custom_cluster_test_suite.py
9 files changed, 272 insertions(+), 67 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/11358/9
--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 9
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11358

to look at the new patch set (#8).

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..

IMPALA-7510. Support principals/privileges with LocalCatalog

This enables support for Sentry authorization when LocalCatalog is
enabled. The design is detailed in a change to the comment on
CatalogdMetaProvider, but to recap it briefly here:

At a high level, this patch takes the approach of duplicating the "v1"
catalog flow for PRINCIPAL and PRIVILEGE catalog objects. Namely, the catalog
daemon publishes complete objects into the statestore topic, and the
impalad fully replicates them locally.

I took this approach rather than trying to do fine-grained caching and
invalidation for the following reasons:

- The PRINCIPAL and PRIVILEGE metadata is typically many orders of magnitude
  smaller than table metadata. So, the benefit of fine-grained caching
  and eviction is not as great.

- The PRINCIPAL and PRIVILEGE catalog objects are fairly tightly intertwined
  with relationships between them and backwards mappings maintained from
  groups back to principals. This logic is implemented by the
  AuthorizationPolicy class. Implementing similar mapping in a
  fine-grained caching approach would be a reasonable amount of work.

- This bit of code is under some current flux as others are working on
  implementing more fine grained permissioning. Thus, trying to
  duplicate the logic in a "fetch-on-demand" implementation might turn
  out to be chasing somewhat of a moving target.

In order to take this approach, the patch is organized as follows:

- refactored some of the role/principal removal logic from ImpaladCatalog
  into AuthorizationPolicy. This makes it easier to perform the similar
  "subscribe" with less duplicate cdoe.

- changed catalogd to publish PRINCIPAL and PRIVILEGE objects to v2
  catalogs in addition to v1.

- passed through LocalCatalog.getAuthPolicy to CatalogdMetaProvider, and
  added an AuthorizationPolicy member there. This member is maintained
  when we see PRINCIPAL and PRIVILEGE objects come via the catalog
  updates.

- had to implement LocalCatalog.isReady() to ensure that we don't allow
  user access until the first topic update has been consumed.

- additionally had to copy some other code from ImpaladCatalog to
  protect against various races -- we need a CatalogDeltaLog as well as
  careful sequencing of the order in which the objects apply.

With this patch and the following one to enable UDF support, I was able
to run the tests in tests/authorization successfully with LocalCatalog
enabled.

Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
---
M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M tests/common/custom_cluster_test_suite.py
9 files changed, 274 insertions(+), 67 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/11358/8
--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 8
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11357 )

Change subject: IMPALA-7137 fixup: only recreate AuthorizationChecker on full 
delta updates
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I47f5945fd64c0adf3fe7fd82076f64672358f962
Gerrit-Change-Number: 11357
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 02:36:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
..


Patch Set 6:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/598/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/11224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89
Gerrit-Change-Number: 11224
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 02:31:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11358 )

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 02:26:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
..


Patch Set 5:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/597/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/11224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89
Gerrit-Change-Number: 11224
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 02:22:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11357 )

Change subject: IMPALA-7137 fixup: only recreate AuthorizationChecker on full 
delta updates
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I47f5945fd64c0adf3fe7fd82076f64672358f962
Gerrit-Change-Number: 11357
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 02:20:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11359 )

Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3121/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Gerrit-Change-Number: 11359
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 02:20:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd

2018-09-05 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
..


Patch Set 6:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/11224/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11224/4//COMMIT_MSG@19
PS4, Line 19: is
nit: are


http://gerrit.cloudera.org:8080/#/c/11224/4//COMMIT_MSG@20
PS4, Line 20: randomly stuffed into catalogd
what does this mean?


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@210
PS4, Line 210: query
only queries?


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@211
PS4, Line 211: evict
just clarifying that this knob and the memory ones are independent. in other 
words, if you specify a timeout, tables will still be evicted regardless of how 
close you get to the memory threshold set below.


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@212
PS4, Line 212: unused
nit: remove


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@225
PS4, Line 225:
I understand this knob to establish a target lower bound for an eviction event. 
Why is this in terms of table fraction as opposed to a target based on memory 
fraction (mb or fraction)?


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@226
PS4, Line 226: ++
remove


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/exec/catalog-op-executor.h
File be/src/exec/catalog-op-executor.h:

http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/exec/catalog-op-executor.h@72
PS4, Line 72: table names
nit: simplify to just tables


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2000
PS4, Line 2000:   /**
Because of the thrift spec, I was expecting to see usage count being used 
somewhere. Add a todo to make use of it.


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java
File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java:

http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@46
PS4, Line 46: ize.
stale?


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@62
PS4, Line 62:
eviction


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@89
PS4, Line 89:*/
add an "_".


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@116
PS4, Line 116: rn new CatalogdTableInvalidator(catalog, 
invalidateTablesTimeoutSec,
 :   i
are these two validated anywhere (e.g., [0,1])?


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@140
PS4, Line 140: ldPool =
is there somewhat stable site to link to where these names are given, perhaps 
for different jdk's?


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@146
PS4, Line 146: lastObservedGcCount_ = gcbean.getCollectionCount();
why not do this check before setting various class members above?


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@175
PS4, Line 175: getLastGcInfo()
just starting here with the doc for this method, I see that null can be 
returned. I have not checked the rest of the chain following this invocation, 
but do we want to guard this?


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java:

http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@94
PS4, Line 94: ile (true) {
put this outside of the while loop.


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@656
PS4, Line 656: load
what rule are you using to pair the load with refreshing the last-used-time? 
for example, why isn't the load in the true branch on L653 also refreshed? 
generalizing further, should the refresh be part of a Table 

[Impala-ASF-CR] IMPALA-7540. Intern most repetitive strings and network addresses in catalog

2018-09-05 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11158

to look at the new patch set (#2).

Change subject: IMPALA-7540. Intern most repetitive strings and network 
addresses in catalog
..

IMPALA-7540. Intern most repetitive strings and network addresses in catalog

This adds interning to a bunch of repeated strings in catalog objects,
including:
- table name
- DB name
- owner
- column names
- input/output formats
- parameter keys
- common parameter values ("true", "false", etc)
- HBase column family names

Additionally, it interns TNetworkAddresses, so that each datanode host
is only stored once rather than having its own copy in each table.

I verified this patch using jxray on the development catalogd and
impalad. The following lines are removed entirely from the "duplicate
strings" report:

 Overhead   # char[]s # objects  Value
 164K (0.3%) 2,635   2,635  "127.0.0.1"
 97K (0.2%)  1,038   1,038  "__HIVE_DEFAULT_PARTITION__"
 95K (0.2%)  1,111   1,111  "transient_lastDdlTime"
 92K (0.1%)  1,975   1,975  "d"
 70K (0.1%)  997 997"EXTERNAL_TABLE"
 56K (< 0.1%)1,201   1,201  "todd"
 54K (< 0.1%)998 998"EXTERNAL"
 46K (< 0.1%)998 998"TRUE"
 44K (< 0.1%)567 567"numFilesErasureCoded"
 38K (< 0.1%)612 612"totalSize"
 30K (< 0.1%)567 567"numFiles"

The following are reduced substantially:

Before: 72K (0.1%)  1,543   1,543  "1"
After:  47K (< 0.1%)1,009   1,009  "1"

A few large strings remain in the report that may be worth addressing, depending
on whether we think production catalogs exhibit the same repetitions:

1) Avro schemas, eg:
 204K (0.3%) 3   3  "{"fields": [{"type": ["boolean", "null"], 
"name": "bool_col1"}, {"type": ["int", "null"], "name": "tinyint_col1"}, 
{"type": ...[length 52429]"

(in the development catalog there are multiple tables with the same Avro
schema)

2) Partition location suffixes, eg:
 144K (0.2%) 1,234   1,234  "many_blocks_num_blocks_per_partition_1"
 17K (< 0.1%)230 230"year=2009/month=2"
 17K (< 0.1%)230 230"year=2009/month=3"
 17K (< 0.1%)230 230"year=2009/month=1"

(in the development catalog lots of tables have the same partitioning
layout)

3) Unsure (jxray isn't reporting the reference chain, but seems likely
   to be partition values):
 49K (< 0.1%)1,058   1,058  "2010"
 28K (< 0.1%)612 612"2009"
 27K (< 0.1%)585 585"0"
 22K (< 0.1%)71  899""

Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26
---
A fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java
M fe/src/main/java/org/apache/impala/catalog/HBaseColumn.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
5 files changed, 240 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/11158/2
--
To view, visit http://gerrit.cloudera.org:8080/11158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26
Gerrit-Change-Number: 11158
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd

2018-09-05 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
..

IMPALA-7448: Invalidate recently unused tables from catalogd

This patch implements an automatic invalidation mechanism in catalogd.
There are two invalidation strategies:
1. Periodically the HDFS tables that are not used in a configured
   period "invalidate_tables_timeout" is invalidated from catalogd.
2. If the old GC generation is almost full, a certain percentage of LRU
   tables are invalidated. This can be enabled by backend flag
   "invalidate_tables_on_memory_pressure".

The table usage is reported by impalad to catalogd when the tables are
used during planning.
Tests on time-based invalidation is added. It is manually verified that
the GC callback is called if strings are randomly stuffed into catalogd.

Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-service-client-wrapper.h
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/common/global-flags.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/catalog-op-executor.h
M be/src/service/fe-support.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
A fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/FeSupport.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
A fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java
A tests/custom_cluster/test_automatic_invalidation.py
25 files changed, 788 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/11224/6
--
To view, visit http://gerrit.cloudera.org:8080/11224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89
Gerrit-Change-Number: 11224
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd

2018-09-05 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
..

IMPALA-7448: Invalidate recently unused tables from catalogd

This patch implements an automatic invalidation mechanism in catalogd.
There are two invalidation strategies:
1. Periodically the HDFS tables that are not used in a configured
   period "invalidate_tables_timeout" is invalidated from catalogd.
2. If the old GC generation is almost full, a certain percentage of LRU
   tables are invalidated. This can be enabled by backend flag
   "invalidate_tables_on_memory_pressure".

The table usage is reported by impalad to catalogd when the tables are
used during planning.
Tests on time-based invalidation is added. It is manually verified that
the GC callback is called if strings are randomly stuffed into catalogd.

Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-service-client-wrapper.h
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/common/global-flags.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/catalog-op-executor.h
M be/src/service/fe-support.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
A fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/FeSupport.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
A fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java
A tests/custom_cluster/test_automatic_invalidation.py
25 files changed, 788 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/11224/5
--
To view, visit http://gerrit.cloudera.org:8080/11224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89
Gerrit-Change-Number: 11224
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] Replace some TODOs with specific JIRA references

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11392 )

Change subject: Replace some TODOs with specific JIRA references
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11392/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11392/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1322
PS1, Line 1322:* TODO(IMPALA-7533): currently this inherits from 
VersionedTableCacheKey. This means that, if
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/11392
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d008b62e11b8057cade503721046cd696c3ca77
Gerrit-Change-Number: 11392
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 01:23:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7499: build against CDH Kudu

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11363 )

Change subject: IMPALA-7499: build against CDH Kudu
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3120/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6e1048438b6d09a1b38c58371d6212bb6dcc06c
Gerrit-Change-Number: 11363
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 06 Sep 2018 00:46:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-376: add built-in functions for parsing JSON

2018-09-05 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10950 )

Change subject: IMPALA-376: add built-in functions for parsing JSON
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py@514
PS13, Line 514: get_json_object
> I'd prefer to make Hive compatibility be explicity brought up before decidi
Ah, haven't considered the ANSI standards before. My initial motivation is to 
replace the Hive UDF get_json_object written in Java since it's out of control 
in memory management.

There's a slight difference between the Hive get_json_object() and ANSI 
json_value(). json_value only returns scalar values, while get_json_object can 
return JSON string for a selected JSON object. Maybe the ANSI json_query() 
function is more similar.

In Hulu, most of our Impala users are already using the get_json_object 
function. Some upper systems may also use this function. Since it's already a 
Hive convention, I think it worths to keep this name.

My principle is "start from the user". Rename the function may cause trouble 
for them. However, the ANSI standards have more flexible syntax. What about 
creating a new JIRA for implementing the "returning " and "value on 
error" clauses?



--
To view, visit http://gerrit.cloudera.org:8080/10950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f
Gerrit-Change-Number: 10950
Gerrit-PatchSet: 13
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 06 Sep 2018 00:14:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5937: [DOCS] PARQUET READ STATISTICS and PARQUET DICTIONARY FILTERING

2018-09-05 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11355 )

Change subject: IMPALA-5937: [DOCS] PARQUET_READ_STATISTICS and 
PARQUET_DICTIONARY_FILTERING
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_dictionary_filtering.xml
File docs/topics/impala_parquet_dictionary_filtering.xml:

http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_dictionary_filtering.xml@87
PS4, Line 87:   The NumDictFilteredRowGroups field in the 
query profile output shows the
We should mention somewhere that row groups can be filtered out by parquet 
stats, and in such cases won't even be considered for dictionary filtering.


http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_read_statistics.xml
File docs/topics/impala_parquet_read_statistics.xml:

http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_read_statistics.xml@49
PS4, Line 49:   Parquet stores min/max stats which can be used to skip 
reading blocks if they don't
The stats are (currently) used to skip row groups. "qualify a predicate" sounds 
unfamiliar to me. This blog post has a more detailed description: 
https://blog.cloudera.com/blog/2017/12/faster-performance-for-selective-queries/


http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_read_statistics.xml@61
PS4, Line 61: Of the numerical types: bool, integer, floating point
Impala also supports other data types, but there are restrictions around the 
library and version that wrote them. I'll share some more docs with you 
momentarily.



--
To view, visit http://gerrit.cloudera.org:8080/11355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88fa8c4a64560711251076c50e1695f7f032f9c0
Gerrit-Change-Number: 11355
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 06 Sep 2018 00:11:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11359 )

Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/594/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Gerrit-Change-Number: 11359
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 00:04:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/595/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 00:00:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11358 )

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..


Patch Set 7: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11358/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11358/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@270
PS7, Line 270:
nit: extra line here and below


http://gerrit.cloudera.org:8080/#/c/11358/7/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@783
PS7, Line 783: oreder
nit: order


http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@214
PS6, Line 214: // here.
> It might be sufficient (or at least better) to do a sleeping loop checking
makes sense to defer it



--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 23:55:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Replace some TODOs with specific JIRA references

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11392 )

Change subject: Replace some TODOs with specific JIRA references
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/596/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11392
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d008b62e11b8057cade503721046cd696c3ca77
Gerrit-Change-Number: 11392
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 23:50:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider

2018-09-05 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11359 )

Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Gerrit-Change-Number: 11359
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 23:44:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-376: add built-in functions for parsing JSON

2018-09-05 Thread Greg Rahn (Code Review)
Greg Rahn has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10950 )

Change subject: IMPALA-376: add built-in functions for parsing JSON
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py@514
PS13, Line 514: get_json_object
> I just want to note that ANSI SQL also has a function called json_object()
I'd prefer to make Hive compatibility be explicity brought up before deciding 
on implementing it hence me favoring ANSI standards first.



--
To view, visit http://gerrit.cloudera.org:8080/10950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f
Gerrit-Change-Number: 10950
Gerrit-PatchSet: 13
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 05 Sep 2018 23:41:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Replace some TODOs with specific JIRA references

2018-09-05 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11392 )

Change subject: Replace some TODOs with specific JIRA references
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11392/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11392/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@343
PS1, Line 343: // TODO(IMPALA-7534): there a race here if an invalidation 
comes in while we are
nit: there is



--
To view, visit http://gerrit.cloudera.org:8080/11392
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d008b62e11b8057cade503721046cd696c3ca77
Gerrit-Change-Number: 11392
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 23:39:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11358 )

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/593/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 23:28:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider

2018-09-05 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11359

to look at the new patch set (#7).

Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider
..

IMPALA-7203. Support UDFs in CatalogdMetaProvider

This adds support to fetch the list of function names within a database,
and then to fetch the list of overloads for a given function name. These
items are cached on the coordinator and invalidated when minimal
function objects are seen on the minimal catalog topic stream.

Aside from the straight-forward plumbing of this new RPC, it's worth
noting that this patch changes the MetaProvider interface to provide
Impala Functions directly instead of HMS Function objects. This means
that we will now fully support all the types of functions supported in
legacy catalogs. As such, test_udfs now passes.

Making this change simplified the code in LocalDb but also means that
DirectMetaProvider no longer supports fetching functions. When we move
to trying to eliminate catalogd altogether at some point, we can revive
this code. For now, I just throw an exception for the function-related
code in DirectMetaProvider, as it's unused.

The one other notable thing here is that TFunction now has optional
fields where it used to have required ones. This was necessary in order
to send invalidations as TCatalogObjects. Given that TFunction is an
internal-only construct, this shouldn't raise compatibility issues.

Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
---
M common/thrift/CatalogService.thrift
M common/thrift/Types.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
9 files changed, 232 insertions(+), 170 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/11359/7
--
To view, visit http://gerrit.cloudera.org:8080/11359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Gerrit-Change-Number: 11359
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-05 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Impala Public Jenkins, Vuk 
Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11388

to look at the new patch set (#3).

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..

WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

This patch adds a Java wrapper for a RuntimeProfile object. The wrapper
supports some basic operations like non-hierarchical counters and
informational strings.

During planning, a profile is created, and passed back to the backend as
part of the ExecRequest. The backend then updates the query profile
based on the info emitted from the frontend.

This patch also adds the first use case for this profile information:
the CatalogdMetaProvider emits counters for cache hits, misses, and
fetch times, broken down by metadata category.

The emitted profile is a bit of a superset of the existing 'timeline'
functionality. However, it seems that some tools may parse the timeline
in its current location in the profile, so moving it might be
incompatible. I elected to leave that alone for now and just emit
counters in the new profile.

WIP for a few reasons:
- needs tests
- maybe the amount of information is too verbose to include in every
  profile? Should we only include such info on misses, since we assume
  hits don't contribute to the timing much? Should we not bother
  breaking down by category?
- worth trying to include the byte sizes of metadata fetched?

Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/util/runtime-profile.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A fe/src/main/java/org/apache/impala/service/FrontendProfile.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
9 files changed, 294 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/11388/3
--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] Replace some TODOs with specific JIRA references

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11392


Change subject: Replace some TODOs with specific JIRA references
..

Replace some TODOs with specific JIRA references

Change-Id: I8d008b62e11b8057cade503721046cd696c3ca77
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
8 files changed, 30 insertions(+), 28 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/11392/1
--
To view, visit http://gerrit.cloudera.org:8080/11392
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8d008b62e11b8057cade503721046cd696c3ca77
Gerrit-Change-Number: 11392
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 


[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: IMPALA-6568 add missing Query Compilation section to profiles.
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/592/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 05 Sep 2018 23:07:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11358 )

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3119/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 22:53:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11357 )

Change subject: IMPALA-7137 fixup: only recreate AuthorizationChecker on full 
delta updates
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3118/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I47f5945fd64c0adf3fe7fd82076f64672358f962
Gerrit-Change-Number: 11357
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 22:52:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11358

to look at the new patch set (#7).

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..

IMPALA-7510. Support principals/privileges with LocalCatalog

This enables support for Sentry authorization when LocalCatalog is
enabled. The design is detailed in a change to the comment on
CatalogdMetaProvider, but to recap it briefly here:

At a high level, this patch takes the approach of duplicating the "v1"
catalog flow for PRINCIPAL and PRIVILEGE catalog objects. Namely, the catalog
daemon publishes complete objects into the statestore topic, and the
impalad fully replicates them locally.

I took this approach rather than trying to do fine-grained caching and
invalidation for the following reasons:

- The PRINCIPAL and PRIVILEGE metadata is typically many orders of magnitude
  smaller than table metadata. So, the benefit of fine-grained caching
  and eviction is not as great.

- The PRINCIPAL and PRIVILEGE catalog objects are fairly tightly intertwined
  with relationships between them and backwards mappings maintained from
  groups back to principals. This logic is implemented by the
  AuthorizationPolicy class. Implementing similar mapping in a
  fine-grained caching approach would be a reasonable amount of work.

- This bit of code is under some current flux as others are working on
  implementing more fine grained permissioning. Thus, trying to
  duplicate the logic in a "fetch-on-demand" implementation might turn
  out to be chasing somewhat of a moving target.

In order to take this approach, the patch is organized as follows:

- refactored some of the role/principal removal logic from ImpaladCatalog
  into AuthorizationPolicy. This makes it easier to perform the similar
  "subscribe" with less duplicate cdoe.

- changed catalogd to publish PRINCIPAL and PRIVILEGE objects to v2
  catalogs in addition to v1.

- passed through LocalCatalog.getAuthPolicy to CatalogdMetaProvider, and
  added an AuthorizationPolicy member there. This member is maintained
  when we see PRINCIPAL and PRIVILEGE objects come via the catalog
  updates.

- had to implement LocalCatalog.isReady() to ensure that we don't allow
  user access until the first topic update has been consumed.

- additionally had to copy some other code from ImpaladCatalog to
  protect against various races -- we need a CatalogDeltaLog as well as
  careful sequencing of the order in which the objects apply.

With this patch and the following one to enable UDF support, I was able
to run the tests in tests/authorization successfully with LocalCatalog
enabled.

Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
---
M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M tests/common/custom_cluster_test_suite.py
9 files changed, 274 insertions(+), 67 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/11358/7
--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11280 )

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
..


Patch Set 12: Verified+1 Code-Review+2

Going to forward the +1 instead of reverifying the whole thing and waiting more 
hours. I differed the verified revision with the new rebased revision and the 
only change was to shell/impala_shell.py which is unrelated to the changes in 
this patch.


--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 12
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 22:51:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11280 )

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
..

IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

This implements cache invalidation inside CatalogdMetaProvider. The
design is as follows:

- when the catalogd collects updates into the statestore topic, it now
  adds an additional entry for each table and database. These additional
  entries are minimal - they only include the object's name, but no
  metadata. This new behavior is conditional on a new flag
  --catalog_topic_mode. The default mode is to keep the old style, but
  it can be configured to mixed (support both v1 and v2) or v2-only.

- the old-style topic entries are prefixed with a '1:' whereas the new
  minimal entries are prefixed with a '2:'. The impalad will subscribe
  to one or the other prefix depending on whether it is running with
  --use_local_catalog. Thus, old impalads will not be confused by the
  new entries and vice versa.

- when the impalad gets these topic updates, it forwards them through to
  the catalog implementation. The LocalCatalog implementation forwards
  them to the CatalogdMetaProvider, which uses them to invalidate
  cached metadata as appropriate.

This patch includes some basic unit tests. I also did some manual
testing by connecting to different impalads and verifying that a session
connected to impalad #1 saw the effects of DDLs made by impalad #2
within a short period of time (the statestore topic update frequency).

Existing end-to-end tests cover these code paths pretty thoroughly:

- if we didn't automatically invalidate the cache on a coordinator
  in response to DDL operations, then any test which expects to
  "read its own writes" (eg access a table after creating one)
  would fail
- if we didn't propagate invalidations via the statestore, then
  all of the tests that use sync_ddl would fail.

I verified the test coverage above using some of the tests in
test_ddl.py -- I selectively commented out a few of the invalidation
code paths in the new code and verified that tests failed until I
re-introduced them. Along the way I also improved test_ddl so that, when
this code is broken, it properly fails with a timeout. It also has a bit
of expanded coverage for both the SYNC_DDL and non-SYNC cases.

I also wrote a new custom-cluster test for LocalCatalog that verifies
a few of the specific edge cases like detecting catalogd restart,
SYNC_DDL behavior in mixed mode, etc.

One notable exception here is the implementation of INVALIDATE METADATA
This turned out to be complex to implement, so I left a lengthy TODO
describing the issue and filed a JIRA.

Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Reviewed-on: http://gerrit.cloudera.org:8080/11280
Reviewed-by: Todd Lipcon 
Tested-by: Todd Lipcon 
---
M be/src/catalog/catalog-server.cc
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/common/Pair.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/FeCatalogManager.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
M tests/common/custom_cluster_test_suite.py
A tests/custom_cluster/test_local_catalog.py
M tests/metadata/test_ddl.py
20 files changed, 825 insertions(+), 105 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 13
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11280 )

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
..


Patch Set 11: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 11
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 22:42:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-05 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: IMPALA-6568 add missing Query Compilation section to profiles.
..

IMPALA-6568 add missing Query Compilation section to profiles.

The profile command is used to display low-level information about the
most recent query. When a client makes request for the Profile, it
sends a GetRuntimeProfile request for the last queryId to to the server.
The queryId is used to find the ClientRequestState, an object which tracks
information about the execution, including Profile data which is stored in
several RuntimeProfile objects. The reply to the GetRuntimeProfile message
contains the Profile, pretty-printed as a string.

When a query is sent to the Front End to be compiled, the TExecRequest
that is returned from createExecRequest() in the JVM contains a Timeline,
which is a named sequence of events with timing information. This Timeline
is added to the Summary Profile in the ClientRequestState.

In the following cases the Front End was not setting the Timeline in the 
TExecRequest:
 - All DDL operations
 - Load data statements
 - Set statements
 - Explain statements
And this meant that the profile would not contain the "Query Compilation" 
timeline.

I refactored the main createExecRequest() method to
- try to make the flow clearer,
- allow the timeline to be set in the TExecRequest in only one place.
- to set "Planning finished" in all timelines

TESTING:

Add a new test to test_observability.py which checks that the "Query
Compilation" and "Planning finished" timelines appear in the profile for
various queries designed to exercise the new code paths in createExecRequest.

Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/query_test/test_observability.py
2 files changed, 147 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/3
--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-05 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: IMPALA-6568 add missing Query Compilation section to profiles.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11387/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/11387/1/fe/src/main/java/org/apache/impala/service/Frontend.java@1167
PS1, Line 1167: timeline.markEvent("Planning finished");
> hmm, but even with the earlier 'planning finished', statements like LOAD DA
OK, thanks, I think I see what you mean now.



--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 05 Sep 2018 22:30:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11358 )

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11358/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11358/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@738
PS3, Line 738:
> I read this as a bug originally until I saw that newCatalogVersion may be u
changed to using a boxed Long which starts as null, and added a comment. lmk if 
it's better


http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@235
PS6, Line 235: retried via
> nit: pushed from
I wonder where my wires got crossed there...


http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@750
PS6, Line 750: sequencer
> since its specific to auth, pls call this authSequencer or something like t
Done


http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@862
PS6, Line 862:   LOG.error("Error adding privilege: ", e);
> For the case that we got here via a direct ddl, wouldn't the caller want to
yea I found this surprising as well. The weird thing is that in that case it 
would have succeeded but not be reflected in the local impalad. I'll add a TODO 
here that this error handling is suspicious, but not sure how to fix it (and 
it's pre-existing)


http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@214
PS6, Line 214: // here.
> is isReady sufficient here? from catalogmetaprovider, its ready when its ad
It might be sufficient (or at least better) to do a sleeping loop checking 
isReady but would rather defer to a separate patch since it's unrelated (I have 
a JIRA for waiting properly at startup)



--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 22:30:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Elaborating error message when incorrect port number is specified in impala-shell.

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11368 )

Change subject: Elaborating error message when incorrect port number is 
specified in impala-shell.
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33
Gerrit-Change-Number: 11368
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 05 Sep 2018 22:12:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] Elaborating error message when incorrect port number is specified in impala-shell.

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11368 )

Change subject: Elaborating error message when incorrect port number is 
specified in impala-shell.
..

Elaborating error message when incorrect port number is specified in 
impala-shell.

I recently helped debug an issue where impala-shell was being given
the hiveserver2 port rather than the beeswax port. I've updated
the error message a little bit to indicate that this may be the issue.
Here is the new message:

  $impala-shell.sh -i :21050 -q 'select 1'
  Starting Impala Shell without Kerberos authentication
  Error: Unable to communicate with impalad service. This service may
  not be an impalad instance. A common problem is that the port
  specified does not match the -beeswax_port flag on the underlying
  impalad. Check host:port and try again.
  Traceback (most recent call last):
File "/home/philip/src/Impala/shell/impala_shell.py", line 1709, in 
  execute_queries_non_interactive_mode(options, query_options)
File "/home/philip/src/Impala/shell/impala_shell.py", line 1565, in 
execute_queries_non_interactive_mode
  shell = ImpalaShell(options, query_options)
File "/home/philip/src/Impala/shell/impala_shell.py", line 232, in __init__
  self.do_connect(options.impalad)
File "/home/philip/src/Impala/shell/impala_shell.py", line 798, in 
do_connect
  self._connect()
File "/home/philip/src/Impala/shell/impala_shell.py", line 842, in _connect
  result = self.imp_client.connect()
File "/home/philip/src/Impala/shell/impala_client.py", line 257, in connect
  result = self.ping_impala_service()
File "/home/philip/src/Impala/shell/impala_client.py", line 262, in 
ping_impala_service
  return self.imp_service.PingImpalaService()
File "/home/philip/src/Impala/shell/gen-py/ImpalaService/ImpalaService.py", 
line 229, in PingImpalaService
  return self.recv_PingImpalaService()
File "/home/philip/src/Impala/shell/gen-py/ImpalaService/ImpalaService.py", 
line 245, in recv_PingImpalaService
  raise x
  thrift.Thrift.TApplicationException: Invalid method name: 'PingImpalaService'

Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33
Reviewed-on: http://gerrit.cloudera.org:8080/11368
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M shell/impala_shell.py
1 file changed, 5 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/11368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33
Gerrit-Change-Number: 11368
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-376: add built-in functions for parsing JSON

2018-09-05 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10950 )

Change subject: IMPALA-376: add built-in functions for parsing JSON
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py@514
PS13, Line 514: get_json_object
> Is it worth doing both? I think portability from Hive is also important (no
I just want to note that ANSI SQL also has a function called json_object() with 
quite different functionality (it creates a JSON object from a list of 
key-value pairs).

So, it can be confusing for users if we want to be portable with both Hive and 
ANSI SQL.



--
To view, visit http://gerrit.cloudera.org:8080/10950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f
Gerrit-Change-Number: 10950
Gerrit-PatchSet: 13
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 05 Sep 2018 22:09:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11386 )

Change subject: [DOCS] Additional updates to the fine-grained priv
..


Patch Set 2: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/67/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/11386
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c
Gerrit-Change-Number: 11386
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 05 Sep 2018 21:54:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv

2018-09-05 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11386 )

Change subject: [DOCS] Additional updates to the fine-grained priv
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_grant.xml
File docs/topics/impala_grant.xml:

http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_grant.xml@93
PS1, Line 93: Impala does not currently support revoking only the WITH 
GRANT OPTION
:   from a privilege previously granted to a role. To remove 
the WITH GRANT
:   OPTION, revoke the privilege and grant it again 
without the WITH GRANT
:   OPTION flag.
> This is not correct.  You cannot revoke a privilege if it has the grant opt
Added in impala_revoke.xml


http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_revoke.xml
File docs/topics/impala_revoke.xml:

http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_revoke.xml@57
PS1, Line 57: privilege
> The optional "GRANT OPTION FOR" is missing.
Done



--
To view, visit http://gerrit.cloudera.org:8080/11386
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c
Gerrit-Change-Number: 11386
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 05 Sep 2018 21:45:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11386 )

Change subject: [DOCS] Additional updates to the fine-grained priv
..


Patch Set 2:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/67/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11386
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c
Gerrit-Change-Number: 11386
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 05 Sep 2018 21:44:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv

2018-09-05 Thread Alex Rodoni (Code Review)
Hello Impala Public Jenkins, Adam Holley,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11386

to look at the new patch set (#2).

Change subject: [DOCS] Additional updates to the fine-grained priv
..

[DOCS] Additional updates to the fine-grained priv

Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c
---
M docs/shared/impala_common.xml
M docs/topics/impala_authorization.xml
M docs/topics/impala_grant.xml
M docs/topics/impala_revoke.xml
4 files changed, 567 insertions(+), 345 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/11386/2
--
To view, visit http://gerrit.cloudera.org:8080/11386
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c
Gerrit-Change-Number: 11386
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7425: Change incremental stats to pull from catalogd.

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11193 )

Change subject: IMPALA-7425: Change incremental stats to pull from catalogd.
..

IMPALA-7425: Change incremental stats to pull from catalogd.

Currently, incremental stats can consume a substantial
amount of metadata memory (per table, partition, column).
This metadata is transmitted from catalogd to all coordinators.
As a result, memory is used for all loaded tables that use
incremental stats all the time at all coordinators. A consequence
is that coordinators and catalogd die from OOM more often
when incremental stats are used and more network bandwidth is used.

This change removes incremental stats from impalads. These stats
are only needed when computing incremental statistics and merging
new results with the existing results. They are not used by queries.
As a result, the change requires that coordinators fetch
incremental stats directly from catalogd when computing incremental stats.
In addition, catalogd no longer sends incremental stats to coordinators
via the statestore.

The option is enabled by setting a new flag, --pull_incremental_statistics,
on the catalogd and all impalad coordinators.

Testing:
  - manual testing
  - added end-to-end tests with --pull_incremental_statistics enabled
for the compute-stats-incremental.test
  - added fe CatalogTest for new catalogd service method
  - passes exhaustive tests when --pull_incremental_statistics is enabled
and disabled

Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e
Reviewed-on: http://gerrit.cloudera.org:8080/11193
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-service-client-wrapper.h
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/common/global-flags.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/catalog-op-executor.h
M be/src/service/fe-support.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/FeSupport.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/common/custom_cluster_test_suite.py
M tests/conftest.py
A tests/custom_cluster/test_pull_stats.py
27 files changed, 628 insertions(+), 54 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/11193
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e
Gerrit-Change-Number: 11193
Gerrit-PatchSet: 16
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7425: Change incremental stats to pull from catalogd.

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11193 )

Change subject: IMPALA-7425: Change incremental stats to pull from catalogd.
..


Patch Set 15: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11193
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e
Gerrit-Change-Number: 11193
Gerrit-PatchSet: 15
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 20:49:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: WIP: IMPALA-6568 add missing Query Compilation section to 
profiles.
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1081
PS2, Line 1081: TQueryExecRequest queryExecRequest = 
getPlannedExecRequest(queryCtx, analysisResult, timeline, explainString);
line too long (114 > 90)


http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1130
PS2, Line 1130:   private void setMtDopForCatalogOp(AnalysisResult 
analysisResult, TQueryOptions queryOptions) {
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1145
PS2, Line 1145:   private TExecRequest createTExecRequest(TQueryCtx queryCtx, 
AnalysisResult analysisResult) {
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1157
PS2, Line 1157:   private void addFinalizationParamsForInsert(TQueryCtx 
queryCtx, AnalysisResult analysisResult,
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1195
PS2, Line 1195:   private TQueryExecRequest getPlannedExecRequest(TQueryCtx 
queryCtx, AnalysisResult analysisResult,
line too long (100 > 90)


http://gerrit.cloudera.org:8080/#/c/11387/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1197
PS2, Line 1197:   StringBuilder 
explainString) throws ImpalaException {
line too long (103 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 05 Sep 2018 20:12:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Additional updates to the fine-grained priv

2018-09-05 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11386 )

Change subject: [DOCS] Additional updates to the fine-grained priv
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_grant.xml
File docs/topics/impala_grant.xml:

http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_grant.xml@93
PS1, Line 93: Impala does not currently support revoking only the WITH 
GRANT OPTION
:   from a privilege previously granted to a role. To remove 
the WITH GRANT
:   OPTION, revoke the privilege and grant it again 
without the WITH GRANT
:   OPTION flag.
This is not correct.  You cannot revoke a privilege if it has the grant option 
set to true.  If a privilege is granted with grant option, it takes two 
statements to revoke it.
e.g.
if:
GRANT ALL ON SERVER TO ROLE foo_role;
then to remove:
REVOKE GRANT OPTION FOR ALL ON SERVER FROM ROLE foo_role;
REVOKE ALL ON SERVER FROM ROLE foo_fole;


http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_revoke.xml
File docs/topics/impala_revoke.xml:

http://gerrit.cloudera.org:8080/#/c/11386/1/docs/topics/impala_revoke.xml@57
PS1, Line 57: privilege
The optional "GRANT OPTION FOR" is missing.



--
To view, visit http://gerrit.cloudera.org:8080/11386
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I964a34a4a5a94d88bb09f66e7b0d25fe5b4d6d7c
Gerrit-Change-Number: 11386
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 05 Sep 2018 20:00:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: WIP: IMPALA-6568 add missing Query Compilation section to 
profiles.
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/591/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:53:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11280 )

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3117/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 11
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:40:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11280 )

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11280/11/tests/custom_cluster/test_local_catalog.py
File tests/custom_cluster/test_local_catalog.py:

http://gerrit.cloudera.org:8080/#/c/11280/11/tests/custom_cluster/test_local_catalog.py@108
PS11, Line 108: @
flake8: E303 too many blank lines (2)



--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 11
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:39:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: WIP: IMPALA-6568 add missing Query Compilation section to 
profiles.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11387/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/11387/1/fe/src/main/java/org/apache/impala/service/Frontend.java@1167
PS1, Line 1167:   FeFsTable hdfsTable = (FeFsTable) 
insertStmt.getTargetTable();
> I did not consider this, but I will now.
hmm, but even with the earlier 'planning finished', statements like LOAD DATA 
don't have the 'Planning finished' in the timeline. I was thinking of taking 
the entirety of the function body into a new 'doCreateExecStatement', and have 
the wrapper function create a timeline and be responsible for setting it into 
the result, sort of similar to the refactor I started here: 
https://gerrit.cloudera.org/c/11388/2/fe/src/main/java/org/apache/impala/service/Frontend.java#1024

Would that allow you to avoid the repeated calls to result.setTimeline 
everywhere?



--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:39:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-376: add built-in functions for parsing JSON

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10950 )

Change subject: IMPALA-376: add built-in functions for parsing JSON
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py@514
PS13, Line 514: get_json_object
> I also left a comment on the Jira, but I'd recommend we name and implement
Is it worth doing both? I think portability from Hive is also important (not 
just portability from other ANSI sql dbs)



--
To view, visit http://gerrit.cloudera.org:8080/10950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f
Gerrit-Change-Number: 10950
Gerrit-PatchSet: 13
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:32:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc@950
PS2, Line 950: (*request_state)->query_events()->MarkEvent("Planning 
finished");
> Looks like both FE and BE set "Planning Finished" in the timeline. Is that
It's certainly weird, but actually they are in two different timelines. I 
didn't want to change semantics with this patch, though this stuff is 
definitely due for a serious cleanup.



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:30:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-05 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: WIP: IMPALA-6568 add missing Query Compilation section to 
profiles.
..

WIP: IMPALA-6568 add missing Query Compilation section to profiles.

WIP: so Todd can look at refactor
TODO: change tests to check that "Planning finished" is present for explain 
queries

The profile command is used to display low-level information about the
most recent query. When a client makes request for the Profile, it
sends a GetRuntimeProfile request for the last queryId to to the server.
The queryId is used to find the ClientRequestState, an object which tracks
information about the execution, including Profile data which is stored in
several RuntimeProfile objects. The reply to the GetRuntimeProfile message
contains the Profile, pretty-printed as a string.

When a query is sent to the Front End to be compiled, the TExecRequest
that is returned from createExecRequest() in the JVM contains a Timeline,
which is a named sequence of events with timing information. This Timeline
is added to the Summary Profile in the ClientRequestState.

In the following cases the Front End was not setting the Timeline in the 
TExecRequest:
 - All DDL operations
 - Load data statements
 - Set statements
 - Explain statements
And this meant that the profile would not contain the "Query Compilation" 
timeline.

ALTERNATIVES:

To fix this I set the Timeline in TExecRequest before returning it.

>result.setTimeline(timeline.toThrift());
>return result;

I considered writing this as

>return result.setTimeline(timeline.toThrift());

To try to make sure this code gets copied in any future return statements,
but I didn’t think it was worth the cost to readability.

REFACTOR

I refactored the main createExecRequest() method to try to make the flow 
clearer.

TESTING:

Add a new test to test_observability.py which checks that the "Query
Compilation" timeline appears in the profile for various queries designed
to exercise the new code paths in createExecRequest.

Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/query_test/test_observability.py
2 files changed, 134 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/2
--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11280 )

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/590/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 11
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:16:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7510. Support principals/privileges with LocalCatalog

2018-09-05 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11358 )

Change subject: IMPALA-7510. Support principals/privileges with LocalCatalog
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11358/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11358/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@738
PS3, Line 738:
I read this as a bug originally until I saw that newCatalogVersion may be 
updated on L782. Perhaps use two variables for clarity, such as 
currentCatalogVersion and nextCatalogVersion?


http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@235
PS6, Line 235: retried via
nit: pushed from


http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@750
PS6, Line 750: sequencer
since its specific to auth, pls call this authSequencer or something like this.


http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@862
PS6, Line 862:   LOG.error("Error adding privilege: ", e);
For the case that we got here via a direct ddl, wouldn't the caller want to 
know that the 'add' didn't succeed? Separate from this change, but error 
propagation is not the clearest with this path (same goes for existing 
ImpaladCatalog).


http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11358/6/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@214
PS6, Line 214: // here.
is isReady sufficient here? from catalogmetaprovider, its ready when its 
advanced past the initial version.



--
To view, visit http://gerrit.cloudera.org:8080/11358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccce5aabdb6afe466fdaeae0fb3700c66e658558
Gerrit-Change-Number: 11358
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:12:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 21:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/589/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 21
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:08:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 21:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11314/21/tests/authorization/test_owner_privileges.py
File tests/authorization/test_owner_privileges.py:

http://gerrit.cloudera.org:8080/#/c/11314/21/tests/authorization/test_owner_privileges.py@249
PS21, Line 249: "
flake8: E501 line too long (92 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 21
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:06:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-05 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc@950
PS2, Line 950: (*request_state)->query_events()->MarkEvent("Planning 
finished");
Looks like both FE and BE set "Planning Finished" in the timeline. Is that 
wrong?



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:02:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

2018-09-05 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11280 )

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
..


Patch Set 11: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 11
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:00:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

2018-09-05 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11280 )

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
..


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11280/7/tests/custom_cluster/test_local_catalog.py
File tests/custom_cluster/test_local_catalog.py:

http://gerrit.cloudera.org:8080/#/c/11280/7/tests/custom_cluster/test_local_catalog.py@25
PS7, Line 25:
> Actually ended up just implementing these tests here instead of adding the
even better, thx!


http://gerrit.cloudera.org:8080/#/c/11280/11/tests/custom_cluster/test_local_catalog.py
File tests/custom_cluster/test_local_catalog.py:

http://gerrit.cloudera.org:8080/#/c/11280/11/tests/custom_cluster/test_local_catalog.py@42
PS11, Line 42: --per_impalad_args=
nice.



--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 11
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:00:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11359 )

Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift
File common/thrift/Types.thrift:

http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift@205
PS6, Line 205: specifier
> This is what I understood from the change here: now, TFunction can be used
yep, that's right. Will amend the comments as you suggested.



--
To view, visit http://gerrit.cloudera.org:8080/11359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Gerrit-Change-Number: 11359
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 18:55:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider

2018-09-05 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11359 )

Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift
File common/thrift/Types.thrift:

http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift@205
PS6, Line 205: specifier
> hmm, by duplicate, do you mean making a new struct? or change the argument
This is what I understood from the change here: now, TFunction can be used as a 
query. So for example, if you want a java udf's for function 'foo' that return 
strings, you would just specify binary_type and ret_type and the fetch would 
retrieve the subset that matches.
After seeing how this is used in catalogmetaprovider, I see that its just the 
function name is set and the others are left unset. Perhaps this will clarify 
it:
- change L198 to state that this struct is used to represent functions in the 
catalog, query plans, and as specifiers to fetch function overloads by name.
- state that when used as a specifier, only the name is set (required) and the 
previously required fields are not used.



--
To view, visit http://gerrit.cloudera.org:8080/11359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Gerrit-Change-Number: 11359
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 18:54:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Elaborating error message when incorrect port number is specified in impala-shell.

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11368 )

Change subject: Elaborating error message when incorrect port number is 
specified in impala-shell.
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3116/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/11368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33
Gerrit-Change-Number: 11368
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 05 Sep 2018 18:51:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] Elaborating error message when incorrect port number is specified in impala-shell.

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11368 )

Change subject: Elaborating error message when incorrect port number is 
specified in impala-shell.
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33
Gerrit-Change-Number: 11368
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 05 Sep 2018 18:51:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] Elaborating error message when incorrect port number is specified in impala-shell.

2018-09-05 Thread David Knupp (Code Review)
David Knupp has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11368 )

Change subject: Elaborating error message when incorrect port number is 
specified in impala-shell.
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33
Gerrit-Change-Number: 11368
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 05 Sep 2018 18:44:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

2018-09-05 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Tianyi Wang, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11280

to look at the new patch set (#11).

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
..

IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

This implements cache invalidation inside CatalogdMetaProvider. The
design is as follows:

- when the catalogd collects updates into the statestore topic, it now
  adds an additional entry for each table and database. These additional
  entries are minimal - they only include the object's name, but no
  metadata. This new behavior is conditional on a new flag
  --catalog_topic_mode. The default mode is to keep the old style, but
  it can be configured to mixed (support both v1 and v2) or v2-only.

- the old-style topic entries are prefixed with a '1:' whereas the new
  minimal entries are prefixed with a '2:'. The impalad will subscribe
  to one or the other prefix depending on whether it is running with
  --use_local_catalog. Thus, old impalads will not be confused by the
  new entries and vice versa.

- when the impalad gets these topic updates, it forwards them through to
  the catalog implementation. The LocalCatalog implementation forwards
  them to the CatalogdMetaProvider, which uses them to invalidate
  cached metadata as appropriate.

This patch includes some basic unit tests. I also did some manual
testing by connecting to different impalads and verifying that a session
connected to impalad #1 saw the effects of DDLs made by impalad #2
within a short period of time (the statestore topic update frequency).

Existing end-to-end tests cover these code paths pretty thoroughly:

- if we didn't automatically invalidate the cache on a coordinator
  in response to DDL operations, then any test which expects to
  "read its own writes" (eg access a table after creating one)
  would fail
- if we didn't propagate invalidations via the statestore, then
  all of the tests that use sync_ddl would fail.

I verified the test coverage above using some of the tests in
test_ddl.py -- I selectively commented out a few of the invalidation
code paths in the new code and verified that tests failed until I
re-introduced them. Along the way I also improved test_ddl so that, when
this code is broken, it properly fails with a timeout. It also has a bit
of expanded coverage for both the SYNC_DDL and non-SYNC cases.

I also wrote a new custom-cluster test for LocalCatalog that verifies
a few of the specific edge cases like detecting catalogd restart,
SYNC_DDL behavior in mixed mode, etc.

One notable exception here is the implementation of INVALIDATE METADATA
This turned out to be complex to implement, so I left a lengthy TODO
describing the issue and filed a JIRA.

Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
---
M be/src/catalog/catalog-server.cc
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/common/Pair.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/FeCatalogManager.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
M tests/common/custom_cluster_test_suite.py
A tests/custom_cluster/test_local_catalog.py
M tests/metadata/test_ddl.py
20 files changed, 825 insertions(+), 105 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/11280/11
--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 11
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11359 )

Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift
File common/thrift/Types.thrift:

http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift@205
PS6, Line 205: specifier
> This is clear in the context of this change (and elegant), but I think it w
hmm, by duplicate, do you mean making a new struct? or change the argument of 
GetPartialCatalogObject so that it uses some union of TFunctionName, 
TTableName, database name, etc, instead of CatalogObject?


http://gerrit.cloudera.org:8080/#/c/11359/6/common/thrift/Types.thrift@206
PS6, Line 206: optional
> just checking, but when changing this from required to optional, existing,
Yep, it only changes the runtime validation to be less strict, not the 
wire/disk format.



--
To view, visit http://gerrit.cloudera.org:8080/11359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Gerrit-Change-Number: 11359
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 18:40:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11280 )

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11280/7/tests/custom_cluster/test_local_catalog.py
File tests/custom_cluster/test_local_catalog.py:

http://gerrit.cloudera.org:8080/#/c/11280/7/tests/custom_cluster/test_local_catalog.py@25
PS7, Line 25:
> k, I'll add a TODO see if I can follow that example.
Actually ended up just implementing these tests here instead of adding the TODO.



--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 10
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 18:37:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-05 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 21: Code-Review+1

carry +1


--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 21
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 05 Sep 2018 18:33:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-05 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 20:

(3 comments)

carry +1

http://gerrit.cloudera.org:8080/#/c/11314/20/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/11314/20/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@999
PS20, Line 999: // If object ownership is enabled in Sentry, we need to 
update the owner privilege
  : // in the catalog so that any subsequent statements that 
rely on that privilege, or
  : // the absence, will function correctly without waiting for 
the next refresh.
  : // e.g. For create, the privileges should be available to 
immediately create a table.
  : // Additionally, if the metadata operation is successful, 
but sentry fails to add
  : // the privilege, it will be removed on the next refresh. 
ALTER DATABASE SET OWNER
  : // can be used to try adding the owner privilege again.
  : if (isObjectOwnershipEnabled()) {
  :   Preconditions.checkNotNull(params.server_name);
  :   TPrivilege filter = 
createDatabaseOwnerPrivilege(newDb.getName(),
  :   params.server_name);
  :   
addPrivilegeToCatalog(newDb.getMetaStoreDb().getOwnerName(),
  :   newDb.getMetaStoreDb().getOwnerType(), filter, resp);
  : }
> It would be nice to move similar blocks into separate functions.
Done


http://gerrit.cloudera.org:8080/#/c/11314/20/fe/src/test/resources/mysql-hive-site.xml.template
File fe/src/test/resources/mysql-hive-site.xml.template:

http://gerrit.cloudera.org:8080/#/c/11314/20/fe/src/test/resources/mysql-hive-site.xml.template@153
PS20, Line 153: 
  : 
  :   hive.metastore.transactional.event.listeners
  :   
org.apache.hive.hcatalog.listener.DbNotificationListener
  : 
  : 
  :   hive.metastore.event.listeners
  :   
org.apache.sentry.binding.metastore.SentrySyncHMSNotificationsPostEventListener
  : 
  : 
  :   hcatalog.message.factory.impl.json
  :   
org.apache.sentry.binding.metastore.messaging.json.SentryJSONMessageFactory
  : 
> How is this change related to the patch? Maybe it could be mentioned in the
Done


http://gerrit.cloudera.org:8080/#/c/11314/20/tests/authorization/test_owner_privileges.py
File tests/authorization/test_owner_privileges.py:

http://gerrit.cloudera.org:8080/#/c/11314/20/tests/authorization/test_owner_privileges.py@250
PS20, Line 250: __root_query_fail
> As I see this functions are always used in pair with __verify_exceptions().
Done



--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 20
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 05 Sep 2018 18:33:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-05 Thread Adam Holley (Code Review)
Adam Holley has uploaded a new patch set (#21). ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..

IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

This patch adds calls to automatically create or remove owner
privileges in the catalog based on the statement.  This is similar to
the existing pattern where after privileges are granted in Sentry,
they are created in the catalog directly instead of pulled from
Sentry.

The sentry-site.xml and hive-site.xml template files have been updated
to enable usage of object ownership by default for easier development
with a secured impala.

When object ownership is enabled:
CREATE DATABASE will grant the user OWNER privileges to that database.
ALTER DATABASE SET OWNER will transfer the OWNER privileges to the
new owner.
DROP DATABASE will revoke the OWNER privileges from the owner.
This will apply to DATABASE, TABLE, and VIEW.

Example:
If ownership is enabled, when a table is created, the creator is the
owner, and Sentry will create owner privileges for the created table so
the user can continue working with it without waiting for Sentry
refresh.  Inserts will be available immediately.

Testing:
- Created new custom cluster tests for object ownership

Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
---
M bin/create-test-configuration.sh
M bin/impala-config.sh
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewSetOwnerStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java
M fe/src/main/java/org/apache/impala/util/SentryProxy.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M fe/src/test/resources/mysql-hive-site.xml.template
M fe/src/test/resources/postgresql-hive-site.xml.template
M fe/src/test/resources/sentry-site.xml.template
A fe/src/test/resources/sentry-site_no_oo.xml.template
A fe/src/test/resources/sentry-site_oo_nogrant.xml.template
M testdata/bin/run-sentry-service.sh
A tests/authorization/test_owner_privileges.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_test_suite.py
31 files changed, 1,043 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/11314/21
--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 21
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 2:

> Patch Set 2:
>
> Forgot to git add FrontendProfile?

Indeed... I will never learn.


--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 18:07:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
..


Patch Set 4:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/catalog/catalog-util.cc
File be/src/catalog/catalog-util.cc:

http://gerrit.cloudera.org:8080/#/c/11224/3/be/src/catalog/catalog-util.cc@256
PS3, Line 256: bool TTableName::operator<(const impala::TTableName& that) const 
{
> This is only for the following map usage in thrift. Renaming is not possibl
Could we make that a list of pairs instead to get around this? This works fine 
for now, but if for example we at some point added a third field in TTableName 
(eg a "catalog" level to the naming) we'd have a really hard-to-find bug since 
it's unlikely anyone would remember to update this comparator. Furthermore, 
such a bug could cause bad crashes since operator== and operator< would end up 
inconsistent.


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@209
PS4, Line 209: invalidate_tables_timeout
nit: this should have a suffix like "_sec" or "_s" so it's clear the units


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@218
PS4, Line 218: invalidate_tables_gc_
if the other two flags are hidden, maybe we shouldn't document them? It seems 
inconsistent to hide them and then mention them in a non-hidden flag.


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java
File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java:

http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@86
PS4, Line 86: /**
:* A callback called after a shrinking pass is triggered. Used 
for test purposes.
:*/
wrong doc


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@112
PS4, Line 112: Preconditions.checkArgument(unusedTableTtlSec >= 0);
add back the message so that if the user hits this, they know which 
configuration they got wrong


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@141
PS4, Line 141: if (poolName.contains("Old")) {
maybe invert this and 'continue' so you can unnest the rest of the block?


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@148
PS4, Line 148: it's not a implementation of
more concise: "it does not implement"


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@209
PS4, Line 209: if (table.isLoaded()) {
style nit: I think easier to read this loop if we invert this: "if 
(!table.isLoaded()) continue;"
(generally I find it easier to read code when there is less nesting going on, 
by handling error or "unusual" cases up front in the control flow)


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@211
PS4, Line 211:   if (inactivityTime > retireAgeNano) {
same here


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@255
PS4, Line 255: unusedTableTtlNano_
this should also be changed to be more often than the TTL for the same reason 
as below.


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@260
PS4, Line 260: unusedTableTtlSec
needs update


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@266
PS4, Line 266: error
this should probably be a WARNING -- I think ERROR should be reserved for 
unrecoverable user-impacting errors, whereas this is a recoverable error. I 
think we should also ensure that the error message explains the context and 
what will happen as a result. For example, something like:

LOG.warning("Unexpected exception thrown while attempting to automatically 
invalidate tables. Will retry in {} seconds", sleepTime, e)

that way if the user sees the message they will know that this doesn't mean 
that some user query was failed, or that data was lost, or whatever.

Also, I think we should make sure that if an exception is thrown, we still 
sleep before retrying. Otherwise we might get into some state with a tight 
loop. Perhaps we should just add a fixed wait of 5 seconds or something here in 
the catch clause?


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java
File 

[Impala-ASF-CR] IMPALA-7137 fixup: only recreate AuthorizationChecker on full delta updates

2018-09-05 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11357 )

Change subject: IMPALA-7137 fixup: only recreate AuthorizationChecker on full 
delta updates
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I47f5945fd64c0adf3fe7fd82076f64672358f962
Gerrit-Change-Number: 11357
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 17:43:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7425: Change incremental stats to pull from catalogd.

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11193 )

Change subject: IMPALA-7425: Change incremental stats to pull from catalogd.
..


Patch Set 15: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11193
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e
Gerrit-Change-Number: 11193
Gerrit-PatchSet: 15
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 17:38:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7425: Change incremental stats to pull from catalogd.

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11193 )

Change subject: IMPALA-7425: Change incremental stats to pull from catalogd.
..


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3115/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/11193
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e
Gerrit-Change-Number: 11193
Gerrit-PatchSet: 15
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 17:38:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider

2018-09-05 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11359 )

Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider
..


Patch Set 6:

(4 comments)

several more nits

http://gerrit.cloudera.org:8080/#/c/11359/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11359/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@761
PS6, Line 761: req, "missing expected function");
nit: single line


http://gerrit.cloudera.org:8080/#/c/11359/6/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1154
PS6, Line 1154: an
nit: a


http://gerrit.cloudera.org:8080/#/c/11359/6/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java:

http://gerrit.cloudera.org:8080/#/c/11359/6/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java@191
PS6, Line 191: functions
nit: function names?


http://gerrit.cloudera.org:8080/#/c/11359/6/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java@219
PS6, Line 219:  from HMS
omit... could this change from getting these functions from catalogd or 
directly from hms? seems like it depends on the provider.



--
To view, visit http://gerrit.cloudera.org:8080/11359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Gerrit-Change-Number: 11359
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 17:36:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11183 )

Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue 
conversions
..


Patch Set 19:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/587/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11183
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa
Gerrit-Change-Number: 11183
Gerrit-PatchSet: 19
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 05 Sep 2018 17:29:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7425: Change incremental stats to pull from catalogd.

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11193 )

Change subject: IMPALA-7425: Change incremental stats to pull from catalogd.
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/585/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11193
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e
Gerrit-Change-Number: 11193
Gerrit-PatchSet: 14
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 17:25:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions

2018-09-05 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11183 )

Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue 
conversions
..


Patch Set 16:

(20 comments)

Thanks for the review and sorry for the large number of patches. Please look at 
patch set 17-20 as one commit.

http://gerrit.cloudera.org:8080/#/c/11183/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11183/16//COMMIT_MSG@10
PS16, Line 10: do
> typo: to
Done


http://gerrit.cloudera.org:8080/#/c/11183/16//COMMIT_MSG@51
PS16, Line 51:
> Maybe you could mention your results about the speedup.
I have added this information to line 26.


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc
File be/src/benchmarks/convert-timestamp-benchmark.cc:

http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@254
PS16, Line 254: v1
> nit: use better names, e.g.:
Done


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@592
PS16, Line 592: (uint64_t)MICROS_PER_SEC
> nit: please use static_cast
Is there a reason to use static_cast for primitive numeric types? IMO C style 
casts add less noise and expresses the intent with the same clarity 
(pointers/references are totally different stories...).


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@604
PS16, Line 604: && nanos == other.nanos
> If nanos are more variable in the benchmarks then maybe it'd be worth to ch
I have just realized that this is not needed anymore.


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@627
PS16, Line 627: / ONE_BILLIONTH
> * NANOS_PER_SEC ?
The results are actually not always the same due to rounding. I left it this 
way in TimestampValue to avoid changing external behavior, so I also left it 
like this in the benchmark.


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@634
PS16, Line 634: / ONE_BILLIONTH
> * NANOS_PER_SEC ?
See last answer.


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/exprs/expr-test.cc@6508
PS16, Line 6508: ScopedLocalUnixTimestampConversionOverride
> I just noticed that this class has a bug.
FLAGS_use_local_tz_for_unix_timestamp_conversions should be false by default in 
tests, but I agree that restoring the original value fits the goal of the class 
better.


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@241
PS16, Line 241: (double) seconds + (double)
> nit: static_cast
See my comment in convert-timezone-benchmark.cc.


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@241
PS16, Line 241: 1000
> nit: MILLIS_PER_SEC?
Attila mentioned this too. I think that 1000 is easier to read than 
MILLIS_PER_SEC. (hmm, maybe if both of you have mentioned it than I am wrong 
here :)


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@250
PS16, Line 250: 8
> nit: please give it a name, e.g. MARGIN_OF_ERROR
Done


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@261
PS16, Line 261: 1000
> nit: MILLIS_PER_SEC from gutil/walltime.h?
Same as for line 241.


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@266
PS16, Line 266: 1000
> nit: MICROS_PER_MILLI from walltime.h?
Same as for line 241.


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@280
PS16, Line 280: 60*60
> it appears twice
Done


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@188
PS16, Line 188: (int64_t)1000*1000*1000
> nit: static_cast
I was not happy to write this here, but walltime.h constants are not available 
in this file. I did not want to include walltime.h, because if 
timestamp-value.h is included (indirectly) from a very large number of files, 
and I did not want to make compilation slower.


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@318
PS16, Line 318: int64_t SplitTime(int64_t* ticks)
> Maybe it would be cleaner to change the signature to void SplitTime(const i
I see the point, and I trust the compiler that it will eliminate the variabled, 
but I think that this would make most calling functions a bit more complicated. 
For example FromUnixTimeNanos would need +2 variables and for

  unix_time += SplitTime();


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@317
PS16, Line 317:   template 

[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11183 )

Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue 
conversions
..


Patch Set 18:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/586/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11183
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa
Gerrit-Change-Number: 11183
Gerrit-PatchSet: 18
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 05 Sep 2018 17:15:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-376: add built-in functions for parsing JSON

2018-09-05 Thread Greg Rahn (Code Review)
Greg Rahn has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10950 )

Change subject: IMPALA-376: add built-in functions for parsing JSON
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/10950/13/common/function-registry/impala_functions.py@514
PS13, Line 514: get_json_object
I also left a comment on the Jira, but I'd recommend we name and implement the 
ANSI SQL function json_value(). It will be more portable and is the ANSI SQL 
standard and appears to be very similar.



--
To view, visit http://gerrit.cloudera.org:8080/10950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f
Gerrit-Change-Number: 10950
Gerrit-PatchSet: 13
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 05 Sep 2018 17:10:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions

2018-09-05 Thread Csaba Ringhofer (Code Review)
Hello Zoltan Borok-Nagy, Attila Jeges, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11183

to look at the new patch set (#20).

Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue 
conversions
..

IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions

Impala used to convert from sub-second unix time to TimestampValue
(which is split to date_ and time_ similarly to
boost::posix_time::ptime) by first splitting the input into seconds
and sub-seconds part, converting the seconds part wit
boost::posix_time::from_time_t(), and then adding the sub-seconds
part to this timestamp.

Different tricks are used to speed up different functions:
- UTC functions that expect a single integer as input can
  split it into date_ and time_ directly.
- Non-UTC functions need seconds for timezone conversion,
  because CCTZ expects time points as seconds. These
  were optimized by adding the subsecond part to time_
  instead of adding it to a ptime. This can be done safely
  because the sub-second part is between [0, 1 sec), so
  it cannot overflow into a different day or timezone.

Benchmarks show 2x - 6x speedup for the measured functions.

The main motivation is IMPALA-5050: "Add support to read
TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet
scanner" - reading these types will run
micro/milli->TimestampValue conversion for every row.

Other changes:
- TimestampValue::UtcFromUnixTimeMillis was added - currently this
  is only used in tests but it will be useful for IMPALA-5050
- Some functions were moved from .h to .inline.h.
- FromUnixTimeMicros was changed to do the utc->local conversion
  depending on flag use_local_tz_for_unix_timestamp_conversions
  to be consistent with other similar functions. This function was
  only used in tests until now but it will be useful for IMPALA-5050.
- When a result mismatch is detected in
  convert-timestamp-benchmark.cc it now prints non-equal values.
- Benchmarks were added for micro + nano conversions.
  Note that only single threaded benchmarks were added because I
  do not expect any difference in the multi threaded case.
- DCHECKs were added to TimeStampValue::Validate to
  ensure that time_ is between [0, 24 hour).

Testing:
- timestamp-test.cc was extended to give better coverage
  for sub-second conversions. Edge cases were already
  covered pretty well.

Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exprs/decimal-operators-ir.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M be/src/runtime/timestamp-value.inline.h
9 files changed, 378 insertions(+), 108 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/11183/20
--
To view, visit http://gerrit.cloudera.org:8080/11183
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa
Gerrit-Change-Number: 11183
Gerrit-PatchSet: 20
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


  1   2   >