[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5406/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 16
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 23:58:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9154: Make runtime filter propagation asynchronous

2020-01-10 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14975 )

Change subject: IMPALA-9154: Make runtime filter propagation asynchronous
..


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-backend-state.h@21
PS2, Line 21: #include 
I don't think you need this here


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-backend-state.cc@593
PS2, Line 593: if (state->num_inflight_rpcs() == 0) {
It's a little subtle why this is correct. Since we hold FilterState::lock_ 
while we issue all of the PublishFilter rpcs, being able to get the lock 
again here on line 591 means that all PublishFilters that will be issued for 
this FilterState have already been issued and this filter has been disabled, 
so if we reach 0 we know that there won't be any more PublishFilters issued.

Can you add a DCHECK(state->disabled()) and a brief comment mentioning this?
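
The reasoning above can be illustrated with a small sketch. It is written in Python rather than the C++ of the patch, and every name in it (FilterState, publish_filter, publish_filter_complete, released) is a hypothetical stand-in, not the coordinator's real API. It assumes, as the comment describes, that all PublishFilter rpcs are issued and the filter is disabled under one hold of the lock.

```python
import threading


class FilterState:
    """Toy stand-in for the per-filter state (hypothetical, not Impala's class)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.disabled = False
        self.num_inflight_rpcs = 0
        self.released = False


def publish_filter(state, num_targets):
    # All PublishFilter rpcs for this filter are issued while holding the lock,
    # and the filter is disabled before the lock is dropped.
    with state.lock:
        for _ in range(num_targets):
            state.num_inflight_rpcs += 1  # stands in for issuing one async rpc
        state.disabled = True


def publish_filter_complete(state):
    # Completion callback. Being able to take the lock here implies that the
    # issuing loop above has finished, so the counter can only go down.
    with state.lock:
        state.num_inflight_rpcs -= 1
        if state.num_inflight_rpcs == 0:
            assert state.disabled  # mirrors the suggested DCHECK
            state.released = True
```

If the assertion ever fired, it would mean a completion reached zero before the issuing side had finished, which is exactly the case the comment argues cannot happen.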


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-filter-state.h
File be/src/runtime/coordinator-filter-state.h:

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-filter-state.h@58
PS2, Line 58: /// Once a filter is disabled, subsequent updates for that filter 
are ignored.
We should document the thread safety of this class now, e.g. if you take my 
advice from some of my other comments it will be something like "This class is 
not thread safe. Callers must always take 'lock()' themselves when calling any 
FilterState functions if thread safety is needed"


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-filter-state.h@92
PS2, Line 92: get_lock()
just 'lock()'


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-filter-state.h@161
PS2, Line 161: update_filter_done_cv_
publish_filter_done_cv_


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-filter-state.h@178
PS2, Line 178:   SpinLock update_lock;
This isn't used anywhere anymore, right?


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator.h@100
PS2, Line 100: /// Lock ordering: (lower-numbered acquired before 
higher-numbered)
Please update this comment to reflect the new protocol - what's listed here as 
filter_lock_ was replaced by FilterRoutingTable::lock in a previous patch so 
you can go ahead and update that too, and filter_update_lock_ has been replaced 
with FilterState::lock_ (and of course check that we are in fact following the 
ordering shown here for those locks)


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator.cc@1050
PS2, Line 1050:   for (auto& filter : filter_routing_table_->id_to_filter) {
  : filter.second.WaitForPublishFilter();
  :   }
  :
  :   for (auto& filter : filter_routing_table_->id_to_filter) {
  : FilterState* state = 
  : state->DisableAndRelease(filter_mem_tracker_);
  :   }
> I have noticed that this patch failed the core tests in the ASAN build due
Yes - you definitely have to do WaitForPublishFilter() and DisableAndRelease() 
atomically with respect to FilterState::lock_ somehow, since otherwise after 
calling WaitForPublishFilter() you could get another call to UpdateFilter() 
which will attempt to make PublishFilter rpcs even though we think all the 
PublishFilter rpcs for this FilterState are done.

Possibly the most straightforward way of doing that (if you follow my other 
comments) is to not have WaitForPublishFilter() take FilterState::lock_, but 
instead take FilterState::get_lock() ourselves here in the 'for' loop and hold 
it while calling both WaitForPublishFilter() and DisableAndRelease()
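
A minimal sketch of that suggestion follows, in Python rather than the patch's C++. The class and method names are hypothetical analogues of the functions discussed above, and the sketch assumes the "callers always take the lock themselves" convention proposed in the other comments.

```python
import threading


class FilterState:
    """Hypothetical analogue of FilterState; callers must hold `lock`
    around every method call, as proposed in the review."""

    def __init__(self):
        self.lock = threading.Lock()
        self.publish_filter_done_cv = threading.Condition(self.lock)
        self.num_inflight_rpcs = 0
        self.disabled = False

    def wait_for_publish_filter(self):
        # Condition.wait() drops the lock while blocked and retakes it on wakeup.
        while self.num_inflight_rpcs > 0:
            self.publish_filter_done_cv.wait()

    def disable_and_release(self):
        self.disabled = True

    def try_publish(self):
        # Stands in for UpdateFilter() deciding whether to issue an rpc.
        if self.disabled:
            return False
        self.num_inflight_rpcs += 1
        return True

    def publish_done(self):
        self.num_inflight_rpcs -= 1
        if self.num_inflight_rpcs == 0:
            self.publish_filter_done_cv.notify_all()


def release_filters(states):
    for state in states:
        # One hold of the lock covers both steps, so no new publish can be
        # admitted between the wait returning and the filter being disabled.
        with state.lock:
            state.wait_for_publish_filter()
            state.disable_and_release()
```

Because the wait only returns with the lock held and the counter at zero, a concurrent try_publish() either runs before the wait finishes (and is waited for) or runs after the disable (and is rejected).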


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator.cc@1315
PS2, Line 1315:   unique_lock l(lock_);
I think that this is the only place where you take FilterState::lock_ from 
within a FilterState function instead of using FilterState::get_lock(). Of 
course, usually that sort of thing is better encapsulation, but there isn't 
really a good way to avoid having a FilterState::get_lock() function, e.g. 
because it's needed in Coordinator::UpdateFilter, so I think it might be better 
to just be consistent and always require that callers take 
FilterState::get_lock() themselves before calling any FilterState functions.



--
To view, visit 

[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14867 )

Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to 
parent db/server
..


Patch Set 12: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14867

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
Gerrit-Change-Number: 14867
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 22:48:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14867 )

Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to 
parent db/server
..

IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

Currently we always do the auth check for tables/dbs individually.
If the user has privileges higher in the hierarchy then it is not
necessary to do these checks as they will all succeed.

This change adds a higher level check before the individual checks.
The optimization is only enabled for Sentry, as Ranger has deny
policies, so server/db level access does not guarantee db/table level
access.
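
The idea can be sketched as follows. This is an illustrative Python model, not the Java added to Frontend.java; the function name, the callback parameters and the supports_deny_policies flag are all assumptions made for the example.

```python
def visible_tables(user, db, tables, has_db_access, has_table_access,
                   supports_deny_policies):
    """Return the tables in `db` that `user` may see.

    has_db_access(user, db) and has_table_access(user, db, table) are
    hypothetical callbacks standing in for the authorization provider.
    """
    # Without deny policies, access granted on the parent implies access to
    # every child, so a single higher-level check replaces N individual ones.
    if not supports_deny_policies and has_db_access(user, db):
        return list(tables)
    # Otherwise (e.g. a provider with deny policies) check each table.
    return [t for t in tables if has_table_access(user, db, t)]
```

With a provider that has deny policies, the shortcut is skipped entirely, which matches the restriction described above.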

Testing:
- the existing auth related test coverage seems enough
- there are no tests for deny policies yet - adding them seems a bigger
  task so I created a follow up jira: IMPALA-9252

Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
Reviewed-on: http://gerrit.cloudera.org:8080/14867
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
1 file changed, 53 insertions(+), 6 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/14867

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
Gerrit-Change-Number: 14867
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 16:

(1 comment)

Thanks to Csaba for the continuous reviews!

Changes:
* Refactored FromClause#reset()
* Fixed test failure in AnalyzeStmtsTest#TestClone()

http://gerrit.cloudera.org:8080/#/c/14894/15/fe/src/main/java/org/apache/impala/analysis/FromClause.java
File fe/src/main/java/org/apache/impala/analysis/FromClause.java:

http://gerrit.cloudera.org:8080/#/c/14894/15/fe/src/main/java/org/apache/impala/analysis/FromClause.java@126
PS15, Line 126: ++i) {
> beautification: I would prefer to move this to the end of the loop, as "dir
Good point!



--
To view, visit http://gerrit.cloudera.org:8080/14894

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 16
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 23:31:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 16:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5399/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/14894

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 16
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 23:31:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition 
events
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5405/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14799

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:40:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition 
events
..

IMPALA-9101: Add support for detecting self-events on partition events

This commit redoes some of the self-event detection logic, specifically
for the partition events. Before the patch, the self-event identifiers
for a partition were stored at a table level when generating the
partition events. This was problematic since unlike the ADD_PARTITION and
DROP_PARTITION events, an ALTER_PARTITION event is generated once per
partition. Because of this, if multiple ALTER_PARTITION events are
generated, only the first event is identified as a self-event and the
rest of the events are processed. This patch fixes this by adding the
self-event identifiers to each partition so that when the event is later
received, each ALTER_PARTITION uses the state stored in HdfsPartition to
evaluate the self-events. The patch makes sure that the event processor
takes a table lock during self-event evaluation to avoid races with
other parts of the code which try to modify the table at the same time.
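
The difference between the two bookkeeping schemes can be shown with a deliberately simplified Python model. It is not the Java in HdfsPartition or the events processor; the Partition class, the consume() helper and the version number 7 are all hypothetical, and the model only assumes what the paragraph above states: an identifier is consumed by the first event that matches it.

```python
class Partition:
    """Toy partition holding its own in-flight self-event identifiers."""

    def __init__(self, name):
        self.name = name
        self.in_flight_versions = []


def consume(versions, event_version):
    """Return True and drop the identifier if the event matches one we recorded."""
    if event_version in versions:
        versions.remove(event_version)
        return True
    return False


# Table-level bookkeeping: one identifier is recorded for the whole table,
# but one ALTER_PARTITION event arrives per partition.
table_versions = [7]
table_level = [consume(table_versions, 7) for _ in ("p1", "p2", "p3")]

# Per-partition bookkeeping: each partition records the identifier itself.
partitions = [Partition(n) for n in ("p1", "p2", "p3")]
for p in partitions:
    p.in_flight_versions.append(7)
per_partition = [consume(p.in_flight_versions, 7) for p in partitions]
```

In this model only the first event is recognised under the table-level scheme, while all three are recognised once the identifiers live on the partitions.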

Additionally, this patch also changes the event processor to refresh a
loaded table (incomplete tables are not refreshed) when an ALTER_TABLE
event is received instead of invalidating the table. This makes the
events processor consistent with respect to all the other event types.
In future, we should add a flag to choose the behavior preference
(prefer invalidate or refresh).

Also, this patch fixes the following related issues:
1. Self-event logic was not triggered for alter database events when
user modifies the comment on the database.
2. In case of queries like "alter table add if not exists partition...",
the partition is not added since it already exists. The self-event
identifiers should not be added in such cases since no event is expected
from such queries.
3. Changed wait_for_event_processing test util method in
EventProcessorUtils to use a more deterministic way to determine if the
catalog updates have propagated to impalad instead of waiting for a
random duration of time.  This also speeds up the event processing tests
significantly.

Testing Done:
1. Added an e2e self-events test which runs multiple impala
queries and makes sure that the resulting events are skipped.
2. Ran MetastoreEventsProcessorTest
3. Ran core tests on CDH and CDP builds.

Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Reviewed-on: http://gerrit.cloudera.org:8080/14799
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/TableNotLoadedException.java
A fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_event_processing.py
M tests/util/event_processor_utils.py
14 files changed, 1,091 insertions(+), 666 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/14799

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 10
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition 
events
..


Patch Set 9: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14799

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 9
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 22:45:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Quanlong Huang (Code Review)
Hello Fang-Yu Rao, Vihang Karajgaonkar, Kurt Deschler, Csaba Ringhofer, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14894

to look at the new patch set (#16).

Change subject: IMPALA-9009: Core support for Ranger column masking
..

IMPALA-9009: Core support for Ranger column masking

Ranger provides column masking policies about how to show masked values
to specific users when reading specific columns. This patch adds support
to rewrite the query AST based on column masking policies.

We perform the column masking policies by replacing the TableRef with a
subquery doing the masking. For instance, the following query
  select c_id, c_name from customer c join orders on c_id = o_cid
will be transformed into
  select c_id, c_name from (
    select mask1(c_id) as c_id, mask2(c_name) as c_name from customer
  ) c
  join orders
  on c_id = o_cid

The transformation is done in AST resolution. Just like view resolution,
if the table needs masking we replace it with a subquery (InlineViewRef)
containing the masking expressions.
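
As a rough illustration of the rewrite, the sketch below builds the replacement subquery as a string. The patch itself works on the query AST in Java, so this Python is only a model; mask_table_ref, the masks mapping and its "{col}" template convention are assumptions made for the example.

```python
def mask_table_ref(table, alias, columns, masks):
    """Build the inline-view text that would stand in for `table`.

    masks maps a column name to a mask expression template, where "{col}"
    is replaced by the column name. Unmasked columns pass through as-is.
    """
    select_list = ", ".join(
        f"{masks[c].format(col=c)} as {c}" if c in masks else c
        for c in columns)
    return f"(select {select_list} from {table}) {alias}"


# Mirrors the example query above, using the same placeholder mask functions.
masked_ref = mask_table_ref(
    "customer", "c", ["c_id", "c_name"],
    {"c_id": "mask1({col})", "c_name": "mask2({col})"})
```

Because every masked expression keeps the original column name as its alias, the outer query's references such as c_id and c_name resolve unchanged against the subquery.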

This patch only adds support for mask types that don't require builtin
mask functions. So currently supported masking types are MASK_NULL and
CUSTOM.

Current Limitations:
 - Users are required to have privileges on all columns of a masked
   table (IMPALA-9223), since the table mask subquery contains all the
   columns.

Tests:
 - Add e2e tests for masked results
 - Run core tests

Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
---
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
A fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationFactory.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
M tests/authorization/test_ranger.py
M tests/common/impala_test_suite.py
26 files changed, 968 insertions(+), 188 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/14894/16
--
To view, visit http://gerrit.cloudera.org:8080/14894

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 16
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 16: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14894

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 16
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 11 Jan 2020 04:05:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP]IMPALA-8778: Support Apache Hudi Read Optimized Table

2020-01-10 Thread Yanjia Gary Li (Code Review)
Yanjia Gary Li has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14711 )

Change subject: [WIP]IMPALA-8778: Support Apache Hudi Read Optimized Table
..


Patch Set 7:

> Patch Set 7:
>
> (3 comments)
>
> Hi! Thanks for working on this, left some comments there.

Thanks for reviewing. I now realize this change will be far more complicated 
than the current commit, but the good thing is that I can get some insights from 
https://jira.apache.org/jira/browse/IMPALA-5717
I will let you know when this is ready for review.


--
To view, visit http://gerrit.cloudera.org:8080/14711

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf
Gerrit-Change-Number: 14711
Gerrit-PatchSet: 7
Gerrit-Owner: Yanjia Gary Li 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Yanjia Gary Li 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Sat, 11 Jan 2020 06:04:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5400/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14894

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 14
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 11:59:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 15:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5401/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14894

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 15
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 11:59:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Quanlong Huang (Code Review)
Hello Fang-Yu Rao, Vihang Karajgaonkar, Kurt Deschler, Csaba Ringhofer, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14894

to look at the new patch set (#14).

Change subject: IMPALA-9009: Core support for Ranger column masking
..

IMPALA-9009: Core support for Ranger column masking

Ranger provides column masking policies about how to show masked values
to specific users when reading specific columns. This patch adds support
to rewrite the query AST based on column masking policies.

We perform the column masking policies by replacing the TableRef with a
subquery doing the masking. For instance, the following query
  select c_id, c_name from customer c join orders on c_id = o_cid
will be transformed into
  select c_id, c_name from (
    select mask1(c_id) as c_id, mask2(c_name) as c_name from customer
  ) c
  join orders
  on c_id = o_cid

The transformation is done in AST resolution. Just like view resolution,
if the table needs masking we replace it with a subquery (InlineViewRef)
containing the masking expressions.

This patch only adds support for mask types that don't require builtin
mask functions. So currently supported masking types are MASK_NULL and
CUSTOM.

Current Limitations:
 - Users are required to have privileges on all columns of a masked
   table (IMPALA-9223), since the table mask subquery contains all the
   columns.

Tests:
 - Add e2e tests for masked results
 - Run core tests

Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
---
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
A fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationFactory.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
M tests/authorization/test_ranger.py
M tests/common/impala_test_suite.py
25 files changed, 949 insertions(+), 178 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/14894/14
--
To view, visit http://gerrit.cloudera.org:8080/14894

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 14
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 15:

(2 comments)

Changes: Fixed a bug in FromClause#reset() where I forgot to migrate the join 
properties back to the unmasked table.

http://gerrit.cloudera.org:8080/#/c/14894/13/fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java:

http://gerrit.cloudera.org:8080/#/c/14894/13/fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java@49
PS13, Line 49: // Disable table masking since we don't actually read the 
data.
> nit: missing .
Done


http://gerrit.cloudera.org:8080/#/c/14894/13/fe/src/main/java/org/apache/impala/analysis/FromClause.java
File fe/src/main/java/org/apache/impala/analysis/FromClause.java:

http://gerrit.cloudera.org:8080/#/c/14894/13/fe/src/main/java/org/apache/impala/analysis/FromClause.java@116
PS13, Line 116:    * sure we get the same results in later resolution, the unresolved tableRefs should
              :    * use fully qualified paths. Otherwise, non-fully qualified paths might incorrectly
              :    * match a local view.
              :    * However, we don't un-resolve views because local views don't have fully qualified
              :    * paths. Due to this we don't unmask a TableMasking view if the underlying
> The new origTblRef will only overwrite the old one in line 130, so it needs
origTblRef can be an InlineViewRef after unmasking if it's a TableRef for a view 
(either a local view in a WITH clause or a catalog view). If it's a view after 
unmasking, we don't replace it with an unresolved one. I spent a lot of time 
understanding this part (why views are excluded) and updated the comments. I also 
refactored this function into two clear parts: unmasking and 'un-resolving'.

There was also a bug here: if we unmask the table, we forget to migrate the 
join properties back. Added test coverage for this.



--
To view, visit http://gerrit.cloudera.org:8080/14894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 15
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 11:40:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Quanlong Huang (Code Review)
Hello Fang-Yu Rao, Vihang Karajgaonkar, Kurt Deschler, Csaba Ringhofer, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14894

to look at the new patch set (#15).

Change subject: IMPALA-9009: Core support for Ranger column masking
..

IMPALA-9009: Core support for Ranger column masking

Ranger provides column masking policies that define how masked values are
shown to specific users when they read specific columns. This patch adds
support for rewriting the query AST based on column masking policies.

We apply the column masking policies by replacing the TableRef with a
subquery that does the masking. For instance, the following query
  select c_id, c_name from customer c join orders on c_id = o_cid
will be transformed into
  select c_id, c_name from (
    select mask1(c_id) as c_id, mask2(c_name) as c_name from customer
  ) c
  join orders
  on c_id = o_cid

The transformation is done during AST resolution. Just like view resolution,
if the table needs masking we replace it with a subquery (InlineViewRef)
containing the masking expressions.
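
The rewrite described above can be illustrated with a small string-level
sketch. This is only an illustration under stated assumptions: the class
ColumnMaskingSketch and the method maskedTableRef are hypothetical names, and
the sketch builds SQL text instead of using Impala's TableRef, InlineViewRef,
or TableMask classes. Columns without a masking expression are passed through
unchanged, so the subquery still lists all the columns.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Impala's TableMask or InlineViewRef API.
final class ColumnMaskingSketch {
  /**
   * Builds the text of a masking subquery that stands in for a table reference.
   * columnMasks maps every column of the table to its masking expression;
   * a null expression means the column is selected unmasked.
   */
  static String maskedTableRef(
      String table, String alias, Map<String, String> columnMasks) {
    List<String> selectItems = new ArrayList<>();
    for (Map.Entry<String, String> entry : columnMasks.entrySet()) {
      String column = entry.getKey();
      String maskExpr = entry.getValue();
      // Keep the original column name as the alias so outer references still bind.
      selectItems.add(maskExpr == null ? column : maskExpr + " as " + column);
    }
    return "(select " + String.join(", ", selectItems) + " from " + table + ") " + alias;
  }

  public static void main(String[] args) {
    Map<String, String> masks = new LinkedHashMap<>();
    masks.put("c_id", "mask1(c_id)");
    masks.put("c_name", "mask2(c_name)");
    System.out.println("select c_id, c_name from "
        + maskedTableRef("customer", "c", masks)
        + " join orders on c_id = o_cid");
  }
}
```

Keeping the table alias and the column names unchanged is what lets the rest
of the query (the select list and the join condition) stay untouched.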

This patch only adds support for mask types that don't require builtin
mask functions, so the currently supported masking types are MASK_NULL
and CUSTOM.

Current Limitations:
 - Users are required to have privileges on all columns of a masked
   table (IMPALA-9223), since the table mask subquery contains all the
   columns.

Tests:
 - Add e2e tests for masked results
 - Run core tests

Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
---
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
A fe/src/main/java/org/apache/impala/authorization/TableMask.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationChecker.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationFactory.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M 
fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
A 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
M tests/authorization/test_ranger.py
M tests/common/impala_test_suite.py
25 files changed, 949 insertions(+), 178 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/14894/15
--
To view, visit http://gerrit.cloudera.org:8080/14894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 15
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9277: Catch exception thrown from orc::ColumnSelector::updateSelectedByTypeId

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14994 )

Change subject: IMPALA-9277: Catch exception thrown from 
orc::ColumnSelector::updateSelectedByTypeId
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f706bc832298cb5089e539b7a818cb86d02199f
Gerrit-Change-Number: 14994
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 10 Jan 2020 13:06:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9277: Catch exception thrown from orc::ColumnSelector::updateSelectedByTypeId

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14994 )

Change subject: IMPALA-9277: Catch exception thrown from 
orc::ColumnSelector::updateSelectedByTypeId
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5396/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f706bc832298cb5089e539b7a818cb86d02199f
Gerrit-Change-Number: 14994
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 10 Jan 2020 13:06:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5395/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/14894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 15
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 12:33:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8046: Support CREATE TABLE from an ORC file

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14811 )

Change subject: IMPALA-8046: Support CREATE TABLE from an ORC file
..


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14811/11/fe/src/main/java/org/apache/impala/analysis/ParquetSchemaExtractor.java
File fe/src/main/java/org/apache/impala/analysis/ParquetSchemaExtractor.java:

http://gerrit.cloudera.org:8080/#/c/14811/11/fe/src/main/java/org/apache/impala/analysis/ParquetSchemaExtractor.java@113
PS11, Line 113:* 
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#backward-compatibility-rules-1
line too long (104 > 90)


http://gerrit.cloudera.org:8080/#/c/14811/11/fe/src/main/java/org/apache/impala/analysis/ParquetSchemaExtractor.java@186
PS11, Line 186:* 
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#backward-compatibility-rules
line too long (102 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/14811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77cd84cda2ed86516937a67eb320fd41e3f1cf2d
Gerrit-Change-Number: 14811
Gerrit-PatchSet: 11
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 10 Jan 2020 12:49:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8046: Support CREATE TABLE from an ORC file

2020-01-10 Thread Norbert Luksa (Code Review)
Norbert Luksa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14811 )

Change subject: IMPALA-8046: Support CREATE TABLE from an ORC file
..


Patch Set 11:

Rebased.


--
To view, visit http://gerrit.cloudera.org:8080/14811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77cd84cda2ed86516937a67eb320fd41e3f1cf2d
Gerrit-Change-Number: 14811
Gerrit-PatchSet: 11
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 10 Jan 2020 12:48:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8046: Support CREATE TABLE from an ORC file

2020-01-10 Thread Norbert Luksa (Code Review)
Norbert Luksa has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/14811 )

Change subject: IMPALA-8046: Support CREATE TABLE from an ORC file
..

IMPALA-8046: Support CREATE TABLE from an ORC file

Impala supports creating a table using the schema of a file.
However, only Parquet is supported currently. This commit adds
support for creating tables from ORC files.

The change relies on the ORC Java API with version 1.5 or
greater, because of a bug in earlier versions. Therefore, ORC is
listed as an external dependency, instead of relying on Hive's
ORC version (from Hive3, Hive also lists it as a dependency).

Also, the commit performs a little clean-up on the ParquetHelper
class, renaming it to ParquetSchemaExtractor and removing outdated
comments.

To create a table from an ORC file, run:
CREATE TABLE tablename LIKE ORC '/path/to/file'

Tests:
 * Added analysis tests for primitive and complex types.
 * Added e2e tests for creating tables from ORC files.

Change-Id: I77cd84cda2ed86516937a67eb320fd41e3f1cf2d
---
M bin/impala-config.sh
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
A fe/src/main/java/org/apache/impala/analysis/OrcSchemaExtractor.java
R fe/src/main/java/org/apache/impala/analysis/ParquetSchemaExtractor.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
A fe/src/main/java/org/apache/impala/util/FileAnalysisUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M impala-parent/pom.xml
M shaded-deps/pom.xml
A 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-file-orc.test
M 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test
M 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-table.test
M tests/common/skip.py
M tests/metadata/test_ddl.py
15 files changed, 495 insertions(+), 81 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14811/11
-- 
To view, visit http://gerrit.cloudera.org:8080/14811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77cd84cda2ed86516937a67eb320fd41e3f1cf2d
Gerrit-Change-Number: 14811
Gerrit-PatchSet: 11
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8046: Support CREATE TABLE from an ORC file

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14811 )

Change subject: IMPALA-8046: Support CREATE TABLE from an ORC file
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5402/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77cd84cda2ed86516937a67eb320fd41e3f1cf2d
Gerrit-Change-Number: 14811
Gerrit-PatchSet: 11
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 10 Jan 2020 13:19:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

2020-01-10 Thread Csaba Ringhofer (Code Review)
Hello Quanlong Huang, Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14867

to look at the new patch set (#11).

Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to 
parent db/server
..

IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

Currently we always do the auth check for tables/dbs individually.
If the user has privileges higher in the hierarchy then it is not
necessary to do these checks as they will all succeed.

This change adds a higher level check before the individual checks.
The optimization is only enabled for Sentry, as Ranger has deny
policies, so server/db level access does not guarantee db/table level
access.
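
A minimal sketch of the short-circuit described above, assuming hypothetical
names throughout (ShowTablesFilterSketch, filterVisibleTables, and the boolean
parameters are not taken from Frontend.java). The per-table check is skipped
only when the authorization provider has no deny policies and the user already
has access at the parent level.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; names do not correspond to Impala's Frontend.java.
final class ShowTablesFilterSketch {
  static List<String> filterVisibleTables(
      List<String> tables,
      boolean providerHasDenyPolicies,
      boolean userHasDbLevelAccess,
      Predicate<String> userCanSeeTable) {
    // Fast path: parent-level access implies access to every table, but only
    // when no deny policy can revoke it further down the hierarchy.
    if (!providerHasDenyPolicies && userHasDbLevelAccess) {
      return new ArrayList<>(tables);
    }
    // Slow path: check each table individually.
    List<String> visible = new ArrayList<>();
    for (String table : tables) {
      if (userCanSeeTable.test(table)) visible.add(table);
    }
    return visible;
  }
}
```

With deny policies present, the fast path is never taken, which matches the
reasoning for leaving the optimization off for Ranger.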

Testing:
- the existing auth related test coverage seems enough
- there are no tests for deny policies yet - adding them seems a bigger
  task so I created a follow up jira: IMPALA-9252

Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
1 file changed, 53 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/14867/11
--
To view, visit http://gerrit.cloudera.org:8080/14867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
Gerrit-Change-Number: 14867
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

2020-01-10 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14867 )

Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to 
parent db/server
..


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14867/10/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/14867/10/fe/src/main/java/org/apache/impala/service/Frontend.java@1029
PS10, Line 1029:
> missed changing this to getProviderName()?
Done


http://gerrit.cloudera.org:8080/#/c/14867/10/fe/src/main/java/org/apache/impala/service/Frontend.java@1084
PS10, Line 1084: ) {
> do you think we should have preconditions in this method and in userHasDbLe
I reorganized the code a bit to avoid similar issues.



--
To view, visit http://gerrit.cloudera.org:8080/14867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
Gerrit-Change-Number: 14867
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 15:12:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP]IMPALA-8778: Support Apache Hudi Read Optimized Table

2020-01-10 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14711 )

Change subject: [WIP]IMPALA-8778: Support Apache Hudi Read Optimized Table
..


Patch Set 7:

(1 comment)

Thanks for applying the changes. You could also add some scanner tests that 
create a Hudi table and issue a few queries against it.

To add those tests please look at the 'test_scanners.py' file and the .test 
files in the QueryTest directory.

http://gerrit.cloudera.org:8080/#/c/14711/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14711/7//COMMIT_MSG@12
PS7, Line 12: Filte
nit: Filter



--
To view, visit http://gerrit.cloudera.org:8080/14711
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf
Gerrit-Change-Number: 14711
Gerrit-PatchSet: 7
Gerrit-Owner: Yanjia Gary Li 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Yanjia Gary Li 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 10 Jan 2020 15:17:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14867 )

Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to 
parent db/server
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5403/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
Gerrit-Change-Number: 14867
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 15:41:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 15: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5395/


--
To view, visit http://gerrit.cloudera.org:8080/14894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 15
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 17:04:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9277: Catch exception thrown from orc::ColumnSelector::updateSelectedByTypeId

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14994 )

Change subject: IMPALA-9277: Catch exception thrown from 
orc::ColumnSelector::updateSelectedByTypeId
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f706bc832298cb5089e539b7a818cb86d02199f
Gerrit-Change-Number: 14994
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 10 Jan 2020 17:35:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9277: Catch exception thrown from orc::ColumnSelector::updateSelectedByTypeId

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14994 )

Change subject: IMPALA-9277: Catch exception thrown from 
orc::ColumnSelector::updateSelectedByTypeId
..

IMPALA-9277: Catch exception thrown from 
orc::ColumnSelector::updateSelectedByTypeId

orc::ColumnSelector::updateSelectedByTypeId can throw an exception on
malformed ORC files. Impala didn't catch the exception, so it
terminated the process.

The fix is to catch the exception and return a parse error instead.

Testing:
* added corrupt ORC file and e2e test

Change-Id: I2f706bc832298cb5089e539b7a818cb86d02199f
Reviewed-on: http://gerrit.cloudera.org:8080/14994
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/hdfs-orc-scanner.cc
M testdata/data/README
A testdata/data/corrupt_schema.orc
M tests/query_test/test_scanners.py
4 files changed, 26 insertions(+), 6 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/14994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2f706bc832298cb5089e539b7a818cb86d02199f
Gerrit-Change-Number: 14994
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-9154: Make runtime filter propagation asynchronous

2020-01-10 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/14975 )

Change subject: IMPALA-9154: Make runtime filter propagation asynchronous
..

IMPALA-9154: Make runtime filter propagation asynchronous

This patch fixes a bug introduced by IMPALA-7984, which ported the
functions implementing the aggregation and propagation of runtime
filters from Thrift RPC to KRPC.

Specifically, in IMPALA-7984, the propagation of an aggregated
runtime filter was implemented using synchronous KRPC. Hence, when
there is a very limited number of KRPC threads for Impala's data stream
service, e.g., 1, there will be a deadlock if the node running the
Coordinator tries to propagate the aggregated filter to itself, since
there is no available thread to receive the aggregated filter.
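
The deadlock can be reproduced in miniature with a single-threaded executor
standing in for the service thread pool. This is a Java analogy under stated
assumptions, not the KRPC code: SelfRpcSketch and propagate are hypothetical
names, and a timeout is used to observe the blocked case instead of hanging.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical analogy: one executor thread plays the only service thread.
final class SelfRpcSketch {
  /** Returns true if the "filter" was delivered within timeoutMs. */
  static boolean propagate(boolean blockOnResult, long timeoutMs) {
    ExecutorService serviceThreads = Executors.newFixedThreadPool(1);
    CompletableFuture<String> delivered = new CompletableFuture<>();
    try {
      // The sender runs on the only service thread...
      serviceThreads.submit(() -> {
        // ...and sends to the same node, so the receiver needs that thread too.
        CompletableFuture<String> rpc =
            CompletableFuture.supplyAsync(() -> "filter", serviceThreads);
        if (blockOnResult) {
          try {
            // Synchronous style: waits for work that can only run on this thread.
            delivered.complete(rpc.get());
          } catch (InterruptedException | ExecutionException e) {
            Thread.currentThread().interrupt();
          }
        } else {
          // Asynchronous style: register a callback and free the thread.
          rpc.thenAccept(delivered::complete);
        }
      });
      return "filter".equals(delivered.get(timeoutMs, TimeUnit.MILLISECONDS));
    } catch (Exception e) {
      return false;  // timed out or failed
    } finally {
      serviceThreads.shutdownNow();
    }
  }
}
```

Calling propagate(true, ...) times out because the sender occupies the thread
the receiver needs, while propagate(false, ...) delivers the value.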

This patch makes the propagation of an aggregated runtime filter
asynchronous to address the issue described above. To prevent the
memory consumed by the aggregated filter from being reclaimed when the
aggregated filter is still referenced by some inflight KRPC's, we add an
additional field in the class Coordinator::FilterState to keep track of
the number of inflight KRPC's for the propagation of this aggregated
filter to make sure that we will reclaim the memory only when all the
associated KRPC's have completed. Moreover, when ReleaseExecResources()
is invoked by the Coordinator to release all the resources associated
with query execution, including the memory consumed by the aggregated
runtime filters, we make sure the consumed memory by the aggregated
filters is released only when the inflight KRPC's associated with each
aggregated filter have finished.
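
The bookkeeping described above amounts to a counter guarded by a lock plus a
wait for it to reach zero. A minimal Java analogue follows, assuming
hypothetical names (InflightRpcCounterSketch and its methods); the actual
change is C++ in be/src/runtime/ and is not reproduced here.

```java
// Hypothetical Java analogue of the counting idea; not the C++ FilterState.
final class InflightRpcCounterSketch {
  private int numInflightRpcs = 0;
  private byte[] aggregatedFilter;

  InflightRpcCounterSketch(byte[] aggregatedFilter) {
    this.aggregatedFilter = aggregatedFilter;
  }

  // Called right before an asynchronous RPC that references the filter is sent.
  synchronized void rpcStarted() {
    numInflightRpcs++;
  }

  // Called from the RPC completion callback.
  synchronized void rpcCompleted() {
    numInflightRpcs--;
    if (numInflightRpcs == 0) notifyAll();
  }

  // Blocks until no RPC references the filter, then drops the memory.
  synchronized void releaseWhenIdle() {
    boolean interrupted = false;
    while (numInflightRpcs > 0) {
      try {
        wait();
      } catch (InterruptedException e) {
        interrupted = true;  // keep waiting; restore the flag afterwards
      }
    }
    aggregatedFilter = null;
    if (interrupted) Thread.currentThread().interrupt();
  }

  synchronized boolean isReleased() {
    return aggregatedFilter == null;
  }
}
```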

Testing:
- Passed primitive_many_fragments.test with the database tpch30 in an
  Impala minicluster started with the parameter
  --impalad_args=--datastream_service_num_svc_threads=1.
- Passed the exhaustive tests in the DEBUG build.

Change-Id: Ifb6726d349be701f3a0602b2ad5a934082f188a0
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
5 files changed, 150 insertions(+), 71 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/14975/2
--
To view, visit http://gerrit.cloudera.org:8080/14975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb6726d349be701f3a0602b2ad5a934082f188a0
Gerrit-Change-Number: 14975
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2020-01-10 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 15: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14894/15/fe/src/main/java/org/apache/impala/analysis/FromClause.java
File fe/src/main/java/org/apache/impala/analysis/FromClause.java:

http://gerrit.cloudera.org:8080/#/c/14894/15/fe/src/main/java/org/apache/impala/analysis/FromClause.java@126
PS15, Line 126: get(i++).reset()
beautification: I would prefer to move this to the end of the loop, as "dirty 
for expressions" do not seem to be common in Impala. The continues could be 
replaced with returns by creating a function like 
unmaskAndUnresolveTableRef(int i).

It could also be mentioned in a comment that the recursion happens here 
for views.
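
One possible shape for this suggestion, as a sketch only: Ref and
ResetLoopSketch are hypothetical stand-ins rather than Impala's TableRef and
FromClause, and the un-resolving step is reduced to clearing a flag. The loop
increments in its header, the helper uses early returns instead of continues,
and reset() is called at the end of each iteration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; not Impala's TableRef/FromClause classes.
final class ResetLoopSketch {
  static final class Ref {
    final String name;
    final boolean isView;
    final Ref unmasked;       // non-null if this ref wraps a masked table
    boolean resolved = true;
    int resetCalls = 0;

    Ref(String name, boolean isView, Ref unmasked) {
      this.name = name;
      this.isView = isView;
      this.unmasked = unmasked;
    }

    void reset() { resetCalls++; }
  }

  final List<Ref> refs = new ArrayList<>();

  void reset() {
    for (int i = 0; i < refs.size(); ++i) {
      unmaskAndUnresolveTableRef(i);
      // For views, the recursion into the view's statement would happen here.
      refs.get(i).reset();
    }
  }

  private void unmaskAndUnresolveTableRef(int i) {
    Ref ref = refs.get(i);
    // Part 1: unmasking. Put the original table back in place of the mask.
    if (ref.unmasked != null) {
      ref = ref.unmasked;
      refs.set(i, ref);
    }
    // Part 2: un-resolving. Views are left resolved.
    if (ref.isView) return;
    ref.resolved = false;
  }
}
```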



--
To view, visit http://gerrit.cloudera.org:8080/14894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 15
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 17:50:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

2020-01-10 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14867 )

Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to 
parent db/server
..


Patch Set 11: Code-Review+2

Thanks for making the changes. Looks good to me.


--
To view, visit http://gerrit.cloudera.org:8080/14867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
Gerrit-Change-Number: 14867
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 17:59:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9154: Make runtime filter propagation asynchronous

2020-01-10 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14975 )

Change subject: IMPALA-9154: Make runtime filter propagation asynchronous
..


Patch Set 2:

(1 comment)

> Patch Set 1:
>
> (9 comments)
>
> You've got several formatting issues. Instead of me pointing all of them out, 
> please run clang-format

Hi all, I have some thoughts on the locking protocol when 
Coordinator::ReleaseExecResources() is called. Please also let me know if you 
have other ideas. Thanks!

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator.cc@1050
PS2, Line 1050:   for (auto& filter : filter_routing_table_->id_to_filter) {
  : filter.second.WaitForPublishFilter();
  :   }
  :
  :   for (auto& filter : filter_routing_table_->id_to_filter) {
  : FilterState* state = 
  : state->DisableAndRelease(filter_mem_tracker_);
  :   }
I have noticed that this patch failed the core tests in the ASAN build due to 
a heap-use-after-free error.

After some initial thought, I think these two for-loops should not be 
separated. Instead, inside WaitForPublishFilter(), after acquiring the lock for 
a FilterState 'state', we should immediately call state->DisableAndRelease() to 
prevent the related memory from being accessed later in UpdateFilter() or 
ApplyUpdate().

The scenario above seems possible when there is more than one instance of 
FilterState in 'filter_routing_table_': 'num_inflight_rpcs_' for some 
FilterState could be 0 when its WaitForPublishFilter() is called, but later, 
while we are checking the next FilterState, another thread could update 
'num_inflight_rpcs_' of the previous FilterState and try to access memory 
associated with it.

I do not have concrete proof of this theory since there is not enough 
information in the log. But I will first try to combine these two for-loops in 
a proper way and run the core tests in the ASAN build again.
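
The combined approach could look roughly like the Java analogue below. It is
a sketch under assumptions, not the C++ in coordinator.cc:
FilterStateRaceSketch, tryStartRpc, and waitAndDisable are hypothetical names.
The point is that waiting and disabling happen under one lock acquisition, so
no other thread can start a new RPC on a state between the two steps.

```java
import java.util.List;

// Hypothetical Java analogue of waiting and disabling under a single lock.
final class FilterStateRaceSketch {
  private int numInflightRpcs = 0;
  private boolean disabled = false;

  // Returns false once the state is disabled, so a late update touches nothing.
  synchronized boolean tryStartRpc() {
    if (disabled) return false;
    numInflightRpcs++;
    return true;
  }

  synchronized void rpcCompleted() {
    numInflightRpcs--;
    if (numInflightRpcs == 0) notifyAll();
  }

  // Waits for inflight RPCs and disables the state without releasing the
  // monitor in between (wait() gives it up only while RPCs are still pending).
  synchronized void waitAndDisable() {
    boolean interrupted = false;
    while (numInflightRpcs > 0) {
      try {
        wait();
      } catch (InterruptedException e) {
        interrupted = true;
      }
    }
    disabled = true;
    if (interrupted) Thread.currentThread().interrupt();
  }

  // One pass over all states instead of a wait loop followed by a release loop.
  static void releaseAll(List<FilterStateRaceSketch> states) {
    for (FilterStateRaceSketch state : states) state.waitAndDisable();
  }
}
```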



--
To view, visit http://gerrit.cloudera.org:8080/14975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb6726d349be701f3a0602b2ad5a934082f188a0
Gerrit-Change-Number: 14975
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:06:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14867 )

Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to 
parent db/server
..


Patch Set 12: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
Gerrit-Change-Number: 14867
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:10:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

2020-01-10 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition 
events
..

IMPALA-9101: Add support for detecting self-events on partition events

This commit redoes some of the self-event detection logic, specifically
for the partition events. Before the patch, the self-event identifiers
for a partition were stored at the table level when generating the
partition events. This was problematic since, unlike ADD_PARTITION and
DROP_PARTITION events, an ALTER_PARTITION event is generated once per
partition. As a result, if multiple ALTER_PARTITION events are
generated, only the first event is identified as a self-event and the
rest of the events are processed. This patch fixes this by adding the
self-event identifiers to each partition so that when the event is later
received, each ALTER_PARTITION uses the state stored in HdfsPartition to
evaluate the self-events. The patch makes sure that the event processor
takes a table lock during self-event evaluation to avoid races with
other parts of the code which try to modify the table at the same time.
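
The per-partition bookkeeping described above can be sketched roughly as
follows. This is an illustrative Python model under stated assumptions,
not the Java code in the patch: Partition, Table, record_alter_partition,
is_self_event and the integer op_id are all hypothetical names.

```python
import threading


class Partition:
    """Hypothetical stand-in for HdfsPartition."""

    def __init__(self, name):
        self.name = name
        # Identifiers of catalog operations whose events are still expected.
        self.inflight_ids = set()


class Table:
    """Hypothetical table holding partitions behind a single table lock."""

    def __init__(self, partition_names):
        self.lock = threading.Lock()
        self.partitions = {n: Partition(n) for n in partition_names}

    def record_alter_partition(self, name, op_id):
        """Called by the DDL path before the metastore event arrives."""
        with self.lock:
            self.partitions[name].inflight_ids.add(op_id)

    def is_self_event(self, name, op_id):
        """Called by the event processor for each ALTER_PARTITION event."""
        with self.lock:
            part = self.partitions.get(name)
            if part is None or op_id not in part.inflight_ids:
                return False
            # Consume the identifier so a later, unrelated event with the
            # same identifier is not skipped by mistake.
            part.inflight_ids.remove(op_id)
            return True
```

Because each partition keeps its own set, every ALTER_PARTITION event
from one operation can be matched, not only the first one.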

Additionally, this patch changes the event processor to refresh a
loaded table (incomplete tables are not refreshed) when an ALTER_TABLE
event is received instead of invalidating the table. This makes the
events processor consistent with respect to all the other event types.
In the future, we should add a flag to choose the behavior preference
(prefer invalidate or refresh).

Also, this patch fixes the following related issues:
1. Self-event logic was not triggered for alter database events when
a user modifies the comment on the database.
2. In case of queries like "alter table add if not exists partition...",
the partition is not added since it already exists. The self-event
identifiers should not be added in such cases since no event is expected
from such queries.
3. Changed wait_for_event_processing test util method in
EventProcessorUtils to use a more deterministic way to determine if the
catalog updates have propagated to impalad instead of waiting for a
random duration of time.  This also speeds up the event processing tests
significantly.
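
As an illustration of item 3, a deterministic wait usually polls a
condition until it holds or a deadline passes, rather than sleeping for
a fixed or random time. The sketch below is a generic Python helper and
not the actual wait_for_event_processing implementation; wait_until and
all of its parameters are hypothetical.

```python
import time


def wait_until(predicate, timeout_s=10.0, interval_s=0.1,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or timeout_s elapses.

    Returns True if the predicate became true, False on timeout. The
    clock and sleep arguments are injectable so tests can run instantly.
    """
    deadline = clock() + timeout_s
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A test could pass a predicate that compares the last event id seen by
the events processor (and by the impalad's catalog) against the id it
expects, so the wait ends as soon as the update is visible.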

Testing Done:
1. Added an e2e self-events test which runs multiple impala
queries and makes sure that event processing is skipped.
2. Ran MetastoreEventsProcessorTest
3. Ran core tests on CDH and CDP builds.

Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/TableNotLoadedException.java
A fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_event_processing.py
M tests/util/event_processor_utils.py
14 files changed, 1,091 insertions(+), 666 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/14799/8
--
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

2020-01-10 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition 
events
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14799/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/14799/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2265
PS7, Line 2265: table == null ||
> nit: don't need it. already checked above
Thanks for catching that. Fixed.


http://gerrit.cloudera.org:8080/#/c/14799/7/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/7/tests/custom_cluster/test_event_processing.py@245
PS7, Line 245: [
> nit: Not sure if moving the "[" above to be right after "True: " could fix
Done



--
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:09:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14867 )

Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to 
parent db/server
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5397/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
Gerrit-Change-Number: 14867
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:10:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

2020-01-10 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition 
events
..


Patch Set 8: Code-Review+2

Carrying code-review votes from Quanlong and Anurag.


--
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:10:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition 
events
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14799/8/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/8/tests/custom_cluster/test_event_processing.py@283
PS8, Line 283: "
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/8/tests/util/event_processor_utils.py
File tests/util/event_processor_utils.py:

http://gerrit.cloudera.org:8080/#/c/14799/8/tests/util/event_processor_utils.py@42
PS8, Line 42: )
flake8: E501 line too long (91 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:10:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition 
events
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 9
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:11:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition 
events
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5398/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 9
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:11:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9154: Make runtime filter propagation asynchronous

2020-01-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14975 )

Change subject: IMPALA-9154: Make runtime filter propagation asynchronous
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5404/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb6726d349be701f3a0602b2ad5a934082f188a0
Gerrit-Change-Number: 14975
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:15:04 +
Gerrit-HasComments: No