[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 24: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 24
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 29 Apr 2019 21:24:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Reviewed-on: http://gerrit.cloudera.org:8080/12889
Tested-by: Impala Public Jenkins 
Reviewed-by: Bharath Vissapragada 
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/resources/hive-site.xml.py
A tests/custom_cluster/test_event_processing.py
10 files changed, 592 insertions(+), 13 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Bharath Vissapragada: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 25
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 24: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 24
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 29 Apr 2019 20:59:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 24:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2964/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 24
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 29 Apr 2019 16:12:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 24:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4101/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 24
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 29 Apr 2019 15:26:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-29 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 24:

Rebased after CDH change.


-- 
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 24
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 29 Apr 2019 15:26:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-29 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#24). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/resources/hive-site.xml.py
A tests/custom_cluster/test_event_processing.py
10 files changed, 592 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/24
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 24
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 23:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4100/


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 23
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sun, 28 Apr 2019 01:55:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 23:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4100/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 23
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sun, 28 Apr 2019 00:06:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 23: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4047/


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 23
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 19 Apr 2019 18:07:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 23:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4047/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 23
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 19 Apr 2019 16:24:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 23: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 23
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 19 Apr 2019 16:24:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 22: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4044/


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 22
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 19 Apr 2019 00:45:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 22: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 22
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 18 Apr 2019 23:06:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-18 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 21: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 21
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 18 Apr 2019 23:06:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 22:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4044/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 22
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 18 Apr 2019 23:06:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-18 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 21:

This python file generates the templates for hive-site.xml - Every mini-cluster 
will have this config set. This is just like the event processing flags are set.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 21
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 18 Apr 2019 23:03:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 21:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2838/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 21
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 18 Apr 2019 22:56:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-18 Thread Bharath Krishna (Code Review)
Bharath Krishna has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 21:

How does this work for the tests in the MetastoreEventsProcessorTest , does it 
need dml.events = true as well?


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 21
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 18 Apr 2019 22:41:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-18 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 21:

Build failed as the HMS flag that enables insert notification was not set by 
default. Added hive.metastore.dml.events=true config in hive-site.xml.py


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 21
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 18 Apr 2019 22:13:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-18 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#21). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/resources/hive-site.xml.py
A tests/custom_cluster/test_event_processing.py
10 files changed, 576 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/21
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 21
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 20: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4040/


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 20
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 18 Apr 2019 07:19:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 20: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 20
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 18 Apr 2019 01:59:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-17 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 19: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 19
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 18 Apr 2019 01:59:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 19:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2832/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 19
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 18 Apr 2019 01:38:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-17 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 19:

(2 comments)

Thanks Bharath. Changed the tests to account for slower builds.

http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py@122
PS18, Line 122: time.sleep(bu
> Lets use a longer timeout for slower builds like ASAN. How about 2 for regu
Done


http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py@133
PS18, Line 133:
> Breaks if someone updates the page. Parse it into keyvalue pairs and pick u
Done



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 19
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 18 Apr 2019 00:57:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-17 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#19). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 575 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/19
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 19
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-17 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 18: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py@109
PS18, Line 109: :
nit: formatting


http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py@122
PS18, Line 122: time.sleep(2)
Lets use a longer timeout for slower builds like ASAN. How about 2 for regular 
and 4s for slower? (Look for the usages of build_flavor_timeout)


http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py@133
PS18, Line 133: 31
Breaks if someone updates the page. Parse it into keyvalue pairs and pick up 
the value mapping to the key?



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 18
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 17 Apr 2019 20:50:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 18:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2805/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 18
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 16 Apr 2019 21:43:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-16 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 18:

Thanks for the pointers Bharath. I tried to make the wait for event processing 
more deterministic by scraping the /events page for last_sync_event_id to know 
if the events have been processed.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 18
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 16 Apr 2019 20:59:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-16 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#18). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 569 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/18
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 18
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12889/16/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/12889/16/tests/custom_cluster/test_event_processing.py@59
PS16, Line 59:  time.sleep(10)
> Hmm no, I can't think of any. This custom cluster starts with an event proc
Spoke with Anurag offline. We discussed a way to scrape the /events page to get 
the last synced event ID. We can poll that to make sure events are processed.



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 16
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 16 Apr 2019 00:33:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 17:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2787/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 17
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 15 Apr 2019 21:28:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-15 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#17). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 525 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/17
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 17
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 16:

(5 comments)

lgtm, can +2 once the remaining nits are fixed.

http://gerrit.cloudera.org:8080/#/c/12889/16/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/16/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3587
PS16, Line 3587: Map> filesBeforeInsert = new HashMap<>();
nit: Move closer to it's usage.


http://gerrit.cloudera.org:8080/#/c/12889/16/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3723
PS16, Line 3723:   // Get the files before insert. This is used to compute the 
delta of files to
   :   // fire insert events.
   :   if (catalog_.isExternalEventProcessingEnabled()) {
   :  if (!update.is_overwrite) {
   :for (FeFsPartition partition : 
affectedExistingPartitions) {
   :  filesBeforeInsert.put(partition.getId(),
   :  ((HdfsPartition) partition).getFileNames());
   :   }
   : }
   :   }
nit: move this inside createInsertEvents()?


http://gerrit.cloudera.org:8080/#/c/12889/16/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3773
PS16, Line 3773: This list is empty if this is an insert overwrite operation.
nit: May be rephrase to what HMS expects? That way semantics are more clear.


http://gerrit.cloudera.org:8080/#/c/12889/16/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/12889/16/tests/custom_cluster/test_event_processing.py@59
PS16, Line 59:  time.sleep(10)
Is there a more deterministic way of doing this? Might turn out to be a flaky 
test in builds like ASAN.


http://gerrit.cloudera.org:8080/#/c/12889/16/tests/custom_cluster/test_event_processing.py@79
PS16, Line 79:  time.sleep(10)
same.



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 16
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 15 Apr 2019 17:30:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2777/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 16
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 15 Apr 2019 10:19:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-15 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#16). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 528 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/16
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 16
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-14 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 14:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3608
PS14, Line 3608:  if (catalog_.isExternalEventProcessingEnabled()) {
> We don't wish to do this if event processing is disabled right?
we can just collect it and discard if event processing is disabled. I think the 
extra branch affects readability, especially in this part of the code which is 
already too complex.


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3704
PS14, Line 3704: if (catalog_.isExternalEventProcessingEnabled()) {
> We don't wish this path be taken if event processor is disabled. Right?
same as above.


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3743
PS14, Line 3743:   // After loading metadata, fire insert event if external 
event processing is
   :   // enabled.
   :   if (table.getNumClusteringCols() > 0) {
   : createInsertEventsForPartitions(table, 
filesBeforeInsertForPartitions,
   : update.is_overwrite);
   :   } else  {
   : createInsertEventForTable(table, 
filesBeforeInsertForTable, update.is_overwrite);
   :   }
> As mentioned above this is not possible because of two different data struc
Right. Why do we need to care about 'id' if the table is not partitioned. We 
will only have one partition (which is the only partition).  That is what I 
meant when I said "using table.getNumClusteringCols() you'd know if 
filesBeforeInsert corresponds to a partitioned or a non partitioned table ".


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3787
PS14, Line 3787: deltaFiles = ((HdfsPartition)part).getFileNames();
> This looks right to me. We want deltaFiles to be empty when it is an overwr
"We want deltaFiles to be empty when it is an overwrite". Ok this is not 
obvious. Document this then? I thought that HMS expected that the delta should 
contain the new set of files even incase of an overwrite.



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 14
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sun, 14 Apr 2019 21:40:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 15:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2772/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 15
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 13 Apr 2019 09:00:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-13 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#15). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 548 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/15
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 15
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-13 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 15:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3587
PS14, Line 3587:  Map> filesBeforeInsertForPartitions = new 
HashMap<>();
   :   Set filesBeforeInsertForTable = new HashSet<>();
> Both of them can be handled in a single map? Also, rename to filesBeforeIns
Here we track fileNames of all affectedPartitions using a Map. Later, this map is used to retrieve and compare fileNames to 
calculate deltas using partitionIds. However, partitionId cannot be used for a 
non-partitioned table because they change during load().

One solution was to use partitionNames instead of partitionIds. We decided to 
go with Ids because using partitionNames was too error prone.


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3589
PS14, Line 3589: affectedExistingPartitions = new
> just say affectedPartitions ?
Changed it to affectedExistingPartitions. Want to stress on 'existing' as these 
are existing partitions that are involved in this insert event. There could be 
some new partitions created in the insert which are not stored here.


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3608
PS14, Line 3608:  if (catalog_.isExternalEventProcessingEnabled()) {
> not needed.
We don't wish to do this if event processing is disabled right?


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3704
PS14, Line 3704: if (catalog_.isExternalEventProcessingEnabled()) {
> not needed.
We don't wish this path be taken if event processor is disabled. Right?


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3736
PS14, Line 3736: filesBeforeInsertForTable = ((HdfsPartition) 
Iterables
> Like I mentioned above, I don't see why we should use two different datastr
As mentioned before, we cannot keep track of filenames using partitionIds for 
non-partitioned tables as we drop and create the single partition during load. 
(partitionId changes.). Hence two datastructures.


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3743
PS14, Line 3743:   // After loading metadata, fire insert event if external 
event processing is
   :   // enabled.
   :   if (table.getNumClusteringCols() > 0) {
   : createInsertEventsForPartitions(table, 
filesBeforeInsertForPartitions,
   : update.is_overwrite);
   :   } else  {
   : createInsertEventForTable(table, 
filesBeforeInsertForTable, update.is_overwrite);
   :   }
> How about changing the signature to
As mentioned above this is not possible because of two different data 
structures for filesBeforeInsert for partition case and non-partitioned case


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3787
PS14, Line 3787: deltaFiles = ((HdfsPartition)part).getFileNames();
> shouldn't this go before if? tests didn't catch this?
This looks right to me. We want deltaFiles to be empty when it is an overwrite. 
We only calculate the new files if its not an overwrite.


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3791
PS14, Line 3791: deltaFiles.size
> need this?
Removed.


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3813
PS14, Line 3813: deltaFiles = ((HdfsPartition)singlePart).getFileNames();
> same question, shouldn't deltaFiles go before if?
We want delta files to be empty if it is an overwrite.



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 15
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 13 Apr 2019 08:19:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2768/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 14
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 13 Apr 2019 02:56:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-12 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 14:

(9 comments)

Some more suggestions. I think it is getting closer.

http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3587
PS14, Line 3587:  Map> filesBeforeInsertForPartitions = new 
HashMap<>();
   :   Set filesBeforeInsertForTable = new HashSet<>();
Both of them can be handled in a single map? Also, rename to filesBeforeInsert?


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3589
PS14, Line 3589: existingPartitionsTouchedByInsert
just say affectedPartitions ?


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3608
PS14, Line 3608:  if (catalog_.isExternalEventProcessingEnabled()) {
not needed.


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3704
PS14, Line 3704: if (catalog_.isExternalEventProcessingEnabled()) {
not needed.


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3736
PS14, Line 3736: filesBeforeInsertForTable = ((HdfsPartition) 
Iterables
Like I mentioned above, I don't see why we should use two different 
datastructures here. Even an unparitioned table is implemented as a single 
partitioned table. I suggest to refactor like this.

if (eventProcessingEnabled && !overwrite) .{
  for (part: affectedPartitions) {
filesBeforeInsert.put(part.id, part.getFiles());
  }
}


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3743
PS14, Line 3743:   // After loading metadata, fire insert event if external 
event processing is
   :   // enabled.
   :   if (table.getNumClusteringCols() > 0) {
   : createInsertEventsForPartitions(table, 
filesBeforeInsertForPartitions,
   : update.is_overwrite);
   :   } else  {
   : createInsertEventForTable(table, 
filesBeforeInsertForTable, update.is_overwrite);
   :   }
How about changing the signature to

fireInsertEvents(table, filsBeforeInsert, is_overwrite)

using table.getNumClusteringCols() you'd know if filesBeforeInsert corresponds 
to a partitioned or a non partitioned table (you could create your event 
accordingly)


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3787
PS14, Line 3787: deltaFiles = ((HdfsPartition)part).getFileNames();
shouldn't this go before if? tests didn't catch this?


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3791
PS14, Line 3791: String.valueOf(
need this?


http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3813
PS14, Line 3813: deltaFiles = ((HdfsPartition)singlePart).getFileNames();
same question, shouldn't deltaFiles go before if?

deltaFiles = allCurrentFiles;
if (!overrwrite) {
  deltaFiles.removeAll(oldFiles)
}



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 14
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 13 Apr 2019 03:08:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-12 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#14). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 548 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/14
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 14
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-12 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 14:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/12889/13/common/thrift/CatalogService.thrift
File common/thrift/CatalogService.thrift:

http://gerrit.cloudera.org:8080/#/c/12889/13/common/thrift/CatalogService.thrift@193
PS13, Line 193:   // True if the update corresponds to an "insert overwrite" 
operation
> nit: I think we should say "True if this update corresponds to an 'insert o
Done


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@396
PS13, Line 396:  /**
  :  * Util method to issue invalidate on a given table on the 
catalog. This method
  :  * atomically invalidates the table if it exists in the 
catalog. No-op if the table
  :  * does not exist
  :  */
  : p
> don't think this needs a separate method. Inline it at the caller?
Done


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@582
PS13, Line 582:
> ..handler..Also, add some more color to it? Like it handles the inserts at
Done


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@586
PS13, Line 586:
  : InsertEven
> instead say. Null if the table is unprartitioned...or something like that?
Done


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@614
PS13, Line 614:
> this is obvious, remove?
Done


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@619
PS13, Line 619: **
  :  * Process partition inserts
  :  */
  : private void processPartit
> braces
Done


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@630
PS13, Line 630:   Preconditions.checkState(fsList.size() == 
partVals.size());
> Preconditions.checkNotNull(insertPartition_);
Done


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@664
PS13, Line 664: d
> unpartitioned ..
Done


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@667
PS13, Line 667:   try {
> Preconditions.checkArgument(partition == null)
Done


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3583
PS13, Line 3583: Collection parts =
   :   FeCatalogUtils.loadAllParti
> Remove, this is obv?
Done


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3589
PS13, Line 3589:   List existingPartitionsTouchedByInsert = 
new ArrayList<>();
> I suggest to refactor the code like this. Lemme know what you think. I thin
Thanks for the suggestion. Sounds good. Refactored code according to your 
suggestion. However, tracking files for partitions is slightly different from 
that with table inserts. non-partitioned tables change ids after load, so we 
cannot track using a map. Hence the if...else for calculating files.


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3767
PS13, Line 3767: * fireInsertEvent() if external event processing is enabled. 
This is no-op otherwise.
   :*
> shouldn't this be done only for the affected partitions?
The filesBeforeInsertForPartitions map contains only the affected partitions.


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3823
PS13, Line 3823: *  insert. In case of partitioned table, this event also 
contains the partition
   :*  values of existing partitions which were touched by the 
insert.
   :*/
   :   private void fireInsertEvent(Table tbl, List 
partVals,
   :
> insertData.setReplace(isOverwrite)
Done



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: 

[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-12 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12889/13/common/thrift/CatalogService.thrift
File common/thrift/CatalogService.thrift:

http://gerrit.cloudera.org:8080/#/c/12889/13/common/thrift/CatalogService.thrift@193
PS13, Line 193:   // True if an insert was overwrite operation
nit: I think we should say "True if this update corresponds to an 'insert 
overwrite' operation".

Looks like TUpdateCatalogRequest can be more generic and not aways for an 
insert operation.



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 13
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 13 Apr 2019 01:46:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-12 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 13:

(12 comments)

I think we could still refactor the code better. I've added some suggestions, 
lemme know. I'll take another pass after the refactor.

http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@396
PS13, Line 396:  /**
  :  * Util to get partition key values of a given map of 
partition keys and values
  :  */
  : protected String getPartitionKeyValuesAsString(Map partSpec) {
  :   return 
Joiner.on(",").withKeyValueSeparator("=").join(partSpec);
  : }
don't think this needs a separate method. Inline it at the caller?


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@582
PS13, Line 582:
..handler..Also, add some more color to it? Like it handles the inserts at the 
table and partition scope.


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@586
PS13, Line 586: this is not a partition
  : // insert.
instead say. Null if the table is unprartitioned...or something like that?


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@614
PS13, Line 614: This means insert events generated by impala will also be 
processed
this is obvious, remove?


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@619
PS13, Line 619:  if (insertPartition_ != null)
  : processPartitionInserts();
  :   else
  : processTableInserts();
braces


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@630
PS13, Line 630:   Map partSpec = new HashMap<>();
Preconditions.checkNotNull(insertPartition_);


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@664
PS13, Line 664:
unpartitioned ..


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@667
PS13, Line 667:   // For non-partitioned tables, refresh the whole table.
Preconditions.checkArgument(partition == null)


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3583
PS13, Line 3583: // For non-partitioned tables, parts below will contain a 
single value. The
   :   // partition key will be empty.
Remove, this is obv?


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3589
PS13, Line 3589:   Map> filesBeforeInsertForPartitions = 
new HashMap<>();
I suggest to refactor the code like this. Lemme know what you think. I think 
that will be much cleaner and keeps the event code together than mixing it with 
the insert code.

Also, I think for "overwrite" you don't need to compute a delta, which means 
you don't need to compute the fileListing "before" the load.

dirtyParts = new List<>;

if (partitioned) {
   for each partition affected in the logic below;
   dirtyParts.append(affectedPart)
} else {
   dirtyParts.append(onlyPartition)
}


if (eventProcessingEnabled || !overwrite) {
   currentFileListing = getFilesForDirtyParts();
}

loadMetadata();

if (eventProcessingEnabled) {
   fireEvents(fileListingBeforeLoad, dirtyParts());
}

fireEvents() {
   computeDelta();
   fireEvents();

}


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3767
PS13, Line 3767:  Collection partsPostInsert =
   : 
((FeFsTable)table).loadPartitions(filesBeforeInsertForPartitions.keySet());
shouldn't this be done only for the affected partitions?


http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3823
PS13, Line 3823:  if (isOverwrite) {
   :   insertData.setReplace(true);
   : } else {
   :   insertData.setReplace(false);
   : }
insertData.setReplace(isOverwrite)



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit 

[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-12 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 13: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 13
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 12 Apr 2019 17:51:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-12 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12889/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12889/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@516
PS12, Line 516: false
> Yes. Corrected.
thanks



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 12
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 12 Apr 2019 17:51:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2751/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 13
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 12 Apr 2019 07:00:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-12 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12889/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12889/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@516
PS12, Line 516: true,
> you probably meant to use true here?
Yes. Corrected.



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 13
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 12 Apr 2019 06:14:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-12 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#13). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 540 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/13
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 13
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-12 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 12: Code-Review+1

(2 comments)

Thanks for making the suggested changes. Patch looks good me. Left one minor 
fix in the test below.

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@463
PS6, Line 463: String parentPathString =
 : "/test-warehouse/" + dbName +".db/" + tblName;
 : String filePathString = isPartitionInsert ? 
"/p1=testPartVal/testFile.0" :
 : "/testFile.0";
> Thanks for this suggestion. The load table at L489 is after a partition has
sounds good


http://gerrit.cloudera.org:8080/#/c/12889/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12889/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@516
PS12, Line 516: false
you probably meant to use true here?



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 12
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 12 Apr 2019 06:03:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2749/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 11
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 12 Apr 2019 05:42:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2750/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 12
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 12 Apr 2019 05:42:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12889/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12889/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@525
PS11, Line 525:   assertTrue("Partition not found after insert.", 
partsAfterInsertOverwrite.size() > 0);
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 11
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 12 Apr 2019 04:58:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-11 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 540 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/12
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 12
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-11 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 539 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/11
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 11
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2748/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 10
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 12 Apr 2019 04:33:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12889/10/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12889/10/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@524
PS10, Line 524:   assertTrue("Partition not found after insert.", 
partsAfterInsertOverwrite.size() > 0);
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 10
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 12 Apr 2019 03:52:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-11 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 538 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/10
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 10
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-11 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 10:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12889/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3823
PS9, Line 3823:
> This is interesting because based on the thrift definition of InsertEventRe
Done


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@450
PS6, Line 450: testInsertEvents
> Does this test inserOverwrite case as well? Seems important enough not to b
>From Impala side, we are not using the overwrite flag so I thought it was too 
>trivial to test. But I see your point. Adding a test that calls a process on 
>overwrite event.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@463
PS6, Line 463: String parentPathString =
 : "/test-warehouse/" + dbName +".db/" + tblName;
 : String filePathString = isPartitionInsert ? 
"/p1=testPartVal/testFile.0" :
 : "/testFile.0";
> I am not a big fan of hardcoding paths in the tests esp. if the root locati
Thanks for this suggestion. The load table at L489 is after a partition has 
been added. catalog_.getTable().getMetastoreTable().getSd().getLocation() is 
null for some reason. I will investigate more.  I'm trying to avoid 
getOrLoad(). For now let's keep this.



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 10
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 12 Apr 2019 03:51:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-11 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@450
PS6, Line 450: testInsertEvents
Does this test inserOverwrite case as well? Seems important enough not to be 
missed.



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 6
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 11 Apr 2019 19:58:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-11 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 6:

(5 comments)

patch looks good to me. Just wanted to make sure that we are handling the 
replace flag correctly below when generating the insert events.

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@423
PS6, Line 423: refreshCatalogTable
> I left this here because I wanted to add metrics of "Tables Refreshed" in a
Sounds good if you plan to use it later.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@431
PS6, Line 431: throws CatalogException {
> Same as earlier reply. Wanted to have metric for partitionsRefreshed here.
sounds good.


http://gerrit.cloudera.org:8080/#/c/12889/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3823
PS9, Line 3823:
This is interesting because based on the thrift definition of 
InsertEventRequestData the default value of replace flag is unset (which means 
false since its a boolean). Perhaps we should be explicitly set it to true here 
as well to make sure. Suggest you to do the following.

isOverwrite ? insertData.setReplace(true) : insertData.setReplace(false);


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@463
PS6, Line 463: String parentPathString =
 : "/test-warehouse/" + dbName +".db/" + tblName;
 : String filePathString = isPartitionInsert ? 
"/p1=testPartVal/testFile.0" :
 : "/testFile.0";
> You are right. However, almost all the FE tests have hardcoded this locatio
I am not a big fan of hardcoding paths in the tests esp. if the root location 
can be trivially found by catalog_.getTable().getMetastoreTable(). You are 
getting the tbl object anyways down below at line 489. If you move that line 
here you don't need any extra calls.

Not a blocker comment, can be done later if needed.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@476
PS6, Line 476:   out.close();
> I moved it to finally. Do you know a less messy way to do this?
Looks like FSDataOutputStream does not implement Autocloseable, hence 
try-with-resources could not be used. Guess we will have to live with this for 
now.



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 6
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 11 Apr 2019 19:57:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2726/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 9
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 10 Apr 2019 21:53:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2725/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 8
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 10 Apr 2019 21:39:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12889/8/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/8/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3789
PS8, Line 3789:*
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 8
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 10 Apr 2019 21:16:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-10 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 514 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/9
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 9
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-10 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 514 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/8
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 8
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2723/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 7
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 10 Apr 2019 21:12:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-10 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 7:

(31 comments)

Thanks Vihang and Bharath for the comments. Is there a better way to handle 
finally block in MetastoreEventsProcessorTest?

http://gerrit.cloudera.org:8080/#/c/12889/6/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/12889/6/be/src/service/client-request-state.cc@1106
PS6, Line 1106:
> nit: I think this is kind of obv (and also there is a comment in the thrift
Removed.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2063
PS6, Line 2063: erwise. Throws CatalogE
> Looks like it throws DbNotFoundEx if the db doesn't exist.
Done


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2065
PS6, Line 2065:  */
  :   public boolean refreshTableIfExists(String dbName
> nit: Just say, returns true if reload is successful, false otherwise? Remov
Done


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2077
PS6, Line 2077:
> nit, use "after insert events" to be more specific
Done


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2077
PS6, Line 2077: Db doesn't exist.
> I feel the entire statement can be omitted. Callers may change that makes t
Done


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2082
PS6, Line 2082: if (table == null || table instanceof IncompleteTable) 
return false;
> same comments as above javadoc.
Done


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@835
PS6, Line 835: s
> nit: Returns..
Done


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@398
PS6, Line 398: */
 : protected String getPartitionKeyValuesAsString(Map partSpec) {
 :   return 
Joiner.on(",").withKeyValueSeparator("=").join(partSpec);
 : }
 :
 : /**
 :
> Use Joiner from guava
Done


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@423
PS6, Line 423:  value of the table
> since this method is not doing anything else apart from calling catalog_ref
I left this here because I wanted to add metrics of "Tables Refreshed" in a 
separate patch. Not very sure if this is any useful to users. Would like to 
know your thoughts. For now, I have removed it.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@431
PS6, Line 431:   Boolean tblProperty = getHmsSyncProperty(msTbl_);
> same as above, The method is doing nothing other than calling another metho
Same as earlier reply. Wanted to have metric for partitionsRefreshed here. 
Removed for now.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@630
PS6, Line 630:   Map partSpec = new HashMap<>();
> Any reason why?
Added the reason for this


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@635
PS6, Line 635:   Preconditions.checkState(fsList.size() == partVals.size());
> can we break this into two helpers? processPartitionedInsert() and processN
Done


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@641
PS6, Line 641: // refresh fails.
> probably worth while to check the following conditions:
Done


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@649
PS6, Line 649:
> nit, remove Automatic. kind of redundant since debugLog prints the event in
Done


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@673
PS6, Line 673: nt in the
> nit, remove
Done



[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-10 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 512 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/7
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 7
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-09 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 6:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/12889/6/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/12889/6/be/src/service/client-request-state.cc@1106
PS6, Line 1106: // is_overwrite is used to know the type of insert in FE.
nit: I think this is kind of obv (and also there is a comment in the thrift 
def). Consider removing.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2063
PS6, Line 2063: Db/table doesn't exist.
Looks like it throws DbNotFoundEx if the db doesn't exist.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2065
PS6, Line 2065:  * - Throws CatalogException if reloadTable() is unsuccessful.
  :* - Returns true only if reloadTable() succeeds.
nit: Just say, returns true if reload is successful, false otherwise? Remove 
hyphens if not needed.
(same below)


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2077
PS6, Line 2077: Useful to refresh the partition after inserts
I feel the entire statement can be omitted. Callers may change that makes the 
doc stale.?


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2082
PS6, Line 2082:* - Returns true only if reloadPartition() succeeds.
same comments as above javadoc.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@835
PS6, Line 835:
nit: Returns..


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@398
PS6, Line 398:protected String getPartitionKeyValuesAsString(Map partSpec) {
 :   StringBuilder str = new StringBuilder();
 :   for (Map.Entry entry : 
partSpec.entrySet()) {
 : str.append(entry.getKey() + "=" + entry.getValue() + " 
");
 :   }
 :   return str.toString();
 : }
Use Joiner from guava


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@630
PS6, Line 630:  * Currently we do not check for self-events in Inserts. 
This means insert events
Any reason why?


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@635
PS6, Line 635: public void process() throws MetastoreNotificationException {
can we break this into two helpers? processPartitionedInsert() and 
processNonPartitionedInsert()? (for readability, since there are two many 
nested branches.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3558
PS6, Line 3558:  HiveConf hiveConf = new HiveConf(this.getClass());
any reason to move this to here from where it was before


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3578
PS6, Line 3578: 
filesBeforeInsertForPartitions.put(partition.getId(),
> Can we do this conditionally when event processing is enabled?
+1, this is a very commonly used codepth and seems wasteful to unpack the file 
descriptor info if event processing is not configured.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3674
PS6, Line 3674: Preconditions.checkState(parts.size() == 1);
This is taken care of by Iterabls.getOnlyElement()


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3721
PS6, Line 3721:*/
document method args? Map doesn't seem obvious.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3729
PS6, Line 3729:  if 

[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-09 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 6:

(15 comments)

Thanks for making the changes. The patch looks great. I have left some minor 
suggestions and then its good from my side.

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2077
PS6, Line 2077: after inserts
nit, use "after insert events" to be more specific


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@423
PS6, Line 423: refreshCatalogTable
since this method is not doing anything else apart from calling 
catalog_refreshTableIfExists .. may be we can get rid of it.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@431
PS6, Line 431: throws CatalogException {
same as above, The method is doing nothing other than calling another method. 
So may as well remove it and call the underlying method directly.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@641
PS6, Line 641: List partVals = insertPartition_.getValues();
probably worth while to check the following conditions:

Preconditions.checkNotNull(partVals)
Preconditions.checkState (fsList.size() == partVals.size())


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@649
PS6, Line 649: Automatic
nit, remove Automatic. kind of redundant since debugLog prints the event info 
as well.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@673
PS6, Line 673: Automatic
nit, remove


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3580
PS5, Line 3580: rtition.isMarkedCached())
> This code path is taken by only HDFSTables as you can see on L3522.
Got it. Thanks for the clarification.


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3578
PS6, Line 3578: 
filesBeforeInsertForPartitions.put(partition.getId(),
Can we do this conditionally when event processing is enabled?


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3722
PS6, Line 3722: createInsertEventForPartition
nit, Adding plural to the name makes it more readable since it is firing 
multiple insert events. may be, createInsertEventsForPartitions?


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3741
PS6, Line 3741: for (String fileName : deltaFiles) {
  : newFiles.add(fileName);
  :   }
Looks like you are making this copy because the InsertEventRequestData in 
fireInsertEvent method needs a list. In such a case, perhaps a cleaner way is 
to just pass deltaFiles from here and make the copy in that method simply using 
rqst.setFilesAdded(new ArrayList<>(deltaFiles))


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3767
PS6, Line 3767: for (String fileName : deltaFiles) {
  : newFiles.add(fileName);
  :   }
same as previous comment, you can just pass the deltaFiles to the 
fireInsertEvent and reuse the logic


http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@432
PS6, Line 432:   public void testInsertEvents() throws TException, 
ImpalaException {
thanks for adding this test!



[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2690/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 6
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 08 Apr 2019 23:22:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-08 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 6:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/12889/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12889/5//COMMIT_MSG@14
PS5, Line 14: partitions.
:
: Known Issues:
> This may not be applicable anymore
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@413
PS5, Line 413: ts(dbName_, tb
> I think this call is not correct since this will be a no-op if the table is
Thanks for this catch. Used reloadTable() instead which forces a reload every 
time.

reloadPartition()


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@601
PS5, Line 601:*  Metastore event for INSERT events.
> These following two lines can be
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@204
PS5, Line 204:
> Typo sofar
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@38
PS5, Line 38: import org.apache.hadoop.hive.metastore.PartitionDropOptions;
> unused?
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3551
PS5, Line 3551:   // partition key will be empty.
> nit : comma after tables.
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3557
PS5, Line 3557: InsertForTable = new HashSet<>();
> may be a better name would be to suggest that this contains files before in
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3558
PS5, Line 3558: eConf = new HiveConf(this.
> is this unused?
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3580
PS5, Line 3580: rtition.isMarkedCached())
> Is there a concern here of running into CastException? I see that FeFsParti
This code path is taken by only HDFSTables as you can see on L3522.


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3673
PS5, Line 3673: // For
> may be do a else if(catalog.isExternalEventProcessingEnabled()) here so tha
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3674
PS5, Line 3674: Preconditions.checkState(parts.size() == 1);
> May be add a Preconditions.checkState(parts.size == 1); here to make sure t
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3676
PS5, Line 3676: BeforeInsertForTable = (((
> same as above, Do we need to handle LocalFsPartition as well?
The code path is taken by HdfsTable, so we do not need to handle 
LocalFsPartition.


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3699
PS5, Line 3699:   createInsertEve
> May be you can create 2 methods, one for partitioned case and another for n
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3716
PS5, Line 3716:   }
> Add to the description that this method is a no-op if event processing is d
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3719
PS5, Line 3719: nsert and ca
> nit, change the name to isInsertOverwrite
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3736
PS5, Line 3736: List newFiles = new ArrayList<>();
> I think it would be helpful to add info log here which says how many new fi
Done


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3750
PS5, Line 3750:   }
> add a info level log here which prints how many new files were added into t
Done



[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-08 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
Using this event, event-processor will refresh table/partition.
Both insert into and overwrite are supported for tables/partitions.

Known Issues:
1. Inserts into tables from Hive are ignored by the event processor
   as these inserts create an ALTER event first followed by an
   INSERT event. The alter will invalidate table making the refresh
   a no-op. Insert into partitions from hive will create an INSERT
   event first followed by an ALTER event. In this case, there is
   an unnecessary table invalidate after a refresh.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
1. Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

2. Added a test in MetastoreEventsProcessor for testing insert events.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A tests/custom_cluster/test_event_processing.py
9 files changed, 518 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/6
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 6
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-03 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 5:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/12889/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12889/5//COMMIT_MSG@14
PS5, Line 14: Also, renamed
: TableInvalidatingEvent class to TableInvalidatingOrRefreshingEvent
: to reflect new behaviour.
This may not be applicable anymore


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@413
PS5, Line 413: getOrLoadTable
I think this call is not correct since this will be a no-op if the table is 
loaded already. You should probably use reloadTable() method here to force the 
refresh.

I also see that there is a reloadPartition() method in CatalogServiceCatalog. 
Can we use that method to refresh only the Partition when the insert_event is 
for a given partition in case of partitioned tables.


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3575
PS4, Line 3575:   // Attempt to remove this partition name from from 
partsToCreate. If remove
> This is a good suggestion for partitioned tables. However, for unpartitione
I see. Thanks for that info. using nulls are keys is error-prone since the 
methods where this collection is passed need to be aware of this to avoid NPE. 
Its better to just use a separate collection like you seemed to have started 
doing that by having filesForUnpartitionedTable before.


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@38
PS5, Line 38: import org.apache.hadoop.hive.conf.HiveConf.ConfVars;
unused?


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3557
PS5, Line 3557: existingFilesForInsertedPartitions
may be a better name would be to suggest that this contains files before 
insert.. so may something like filesBeforeInsertForPartitions?


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3558
PS5, Line 3558: filesForUnpartitionedTable
is this unused?


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3580
PS5, Line 3580: HdfsPartition) partition)
Is there a concern here of running into CastException? I see that FeFsPartition 
has two implementations HdfsPartition and LocalFsPartition. Do we need to 
handle LocalFsPartition as well?

Also, can we do this only when event processing is enabled?


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3673
PS5, Line 3673: else {
may be do a else if(catalog.isExternalEventProcessingEnabled()) here so that 
this code is only triggered when event processing is enabled.


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3674
PS5, Line 3674: FeFsPartition singlePart = 
Iterables.getOnlyElement((List) parts);
May be add a Preconditions.checkState(parts.size == 1); here to make sure that 
the assumption in the code below are always true


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3676
PS5, Line 3676: (HdfsPartition) singlePart
same as above, Do we need to handle LocalFsPartition as well?


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3699
PS5, Line 3699: createInsertEvent
May be you can create 2 methods, one for partitioned case and another for 
non-partitioned case. The former will take in Map> as an 
argument for filesBeforeInsertInPartitions and the later will take Set 
as an argument for filesBeforeInsertInTable

you can then call them using the

if (table is partitioned)
  createInsertEventForTable()
else
  createInsertEventsForPartitions()


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3716
PS5, Line 3716:* fireInsertEvent() if external event processing is enabled.
Add to the description that this method is a no-op if event processing is 

[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-02 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 5: Code-Review+1

The code here is OK. Discussed in person the issue of cross-system race 
conditions which we agreed to address another time. Would like Vihang to take a 
close look at the notification code itself since he is most knowledgeable.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 5
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 02 Apr 2019 22:24:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-02 Thread Bharath Krishna (Code Review)
Bharath Krishna has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@601
PS5, Line 601: Preconditions.checkNotNull(insertMessage.getTableObj());
These following two lines can be

Preconditions.checkNotNull(insertMessage.getTableObj());
msTbl_ = insertMessage.getTableObj();
written as :
msTbl_ = Preconditions.checkNotNull(insertMessage.getTableObj());


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@204
PS5, Line 204:   // Metric name for number of refreshes by event processor sofar
Typo sofar


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3551
PS5, Line 3551:   // For non-partitioned tables parts below will contain a 
single value. The
nit : comma after tables.


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3766
PS5, Line 3766: LOG.debug("Firing insert event... for %s", tbl.getName());
LOG.debug("Firing insert event for {}", tbl.getName());


http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3780
PS5, Line 3780:   LOG.error("Failed to fire insert event. This may cause 
some table refreshes to be"
I think it's better to have:
Some Tables might not be refreshed on other Impala clusters for this event.



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 5
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 02 Apr 2019 21:55:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2615/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 5
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 02 Apr 2019 21:42:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-02 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 5:

(24 comments)

Thanks for the comments Vihang. I cleaned up code in CatalogOpExecutor. Still 
figuring out a way to write tests to verify if tables are refreshed because of 
insert events.

http://gerrit.cloudera.org:8080/#/c/12889/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12889/4//COMMIT_MSG@24
PS4, Line 24: Existing self-events logic cannot be used for insert events since
:firing insert event does not allow us to modify
> I think it is more appropriate to say existing self-events logic cannot be
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/12889/4/be/src/service/client-request-state.cc@1106
PS4, Line 1106:   // is_overwrite is used to know the type of insert in FE.
> add a comment here explaining why this is needed.
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@839
PS4, Line 839: HashSet<>(f
> would be appropriate to intialize the set with the capacity fdList.size() s
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@842
PS4, Line 842:
> suggest you to use Path.SEPARATOR instead of "/"
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@159
PS4, Line 159: refresh
> refresh on a table/partition
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@587
PS4, Line 587: public static class InsertEvent extends MetastoreTableEvent {
> IIUC, the reason you are extending TableInvalidatingEvent is because you wa
You are right. Changing it back to old TableInvalidatingEvent.


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@588
PS4, Line 588:
> nit, add new line above the constructor. Add a javadoc
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@606
PS4, Line 606:   }
> add a // TODO : to handle self-events for insert case
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@613
PS4, Line 613:
> nit, just saying refresh is good enough. No need to say reload here.
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@616
PS4, Line 616:
> same as above. Just say refresh since I don't think reload means anything e
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@623
PS4, Line 623:   }
> Add a // TODO : One way to do this would be to change hive source code to r
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@965
PS4, Line 965: t object parameters used for self-
> Why do we need to rename? Currently, all the implementations of this sub-cl
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@118
PS4, Line 118:
> Ignore to keep it consistent with other entries of this table
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@118
PS4, Line 118:   |||
 :  * | INSERT EVENT| Refres
> Just use Refresh unless Reload means something else.
Done


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3554
PS4, Line 3554: FeCatalogUtils.loadAllPartitions((HdfsTable) table);
  :   // Map of partition ids to file names of all existing 
partitions touched by the
  :   // insert
> 

[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-02 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
The event-processor will refresh tables using this event. Both insert
into and overwrite are supported for tables/partitions. Also, renamed
TableInvalidatingEvent class to TableInvalidatingOrRefreshingEvent
to reflect new behaviour.

Known Issues:
1. There is an unnecessary table invalidate when insert is done in Hive
   as the insert operation creates an ALTER and an INSERT notification
   event. Currently there is no way for the Event Processor to identify
   and prevent the unnecessary invalidate. IMPALA-7973 may potentially
   solve this issue.
2. Existing self-events logic cannot be used for insert events since
   firing insert event does not allow us to modify table parameters in
   HMS. This means we cannot get the CatalogServiceIdentifiers in insert
   events. Therefore, the event-processor will also refresh the tables
   for which insert operation is performed through Impala.

Testing:
Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A tests/custom_cluster/test_event_processing.py
7 files changed, 282 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/5
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 5
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-01 Thread Bharath Krishna (Code Review)
Bharath Krishna has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3778
PS2, Line 3778: rqst.setPartitionVals(partVals);
> Should this condition be reversed, because it will be false by default?
Done



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 4
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 01 Apr 2019 20:46:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-03-29 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 4:

(24 comments)

Overall, the approach looks reasonable to me. The CatalogOpExecutor code change 
can be cleaned up and fine-tuned. I have left some suggestions. Hope they make 
sense.

http://gerrit.cloudera.org:8080/#/c/12889/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12889/4//COMMIT_MSG@24
PS4, Line 24: Detection of self-events does not work for inserts currently 
because
:of the way the self-event checks are implemented
I think it is more appropriate to say existing self-events logic cannot be used 
for insert events since firing insert_event does not allow us to modify table 
parameters in HMS. This means that we cannot get the CatalogServiceIdentifiers 
in the insert_event.


http://gerrit.cloudera.org:8080/#/c/12889/4/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/12889/4/be/src/service/client-request-state.cc@1106
PS4, Line 1106:   catalog_update.is_overwrite = 
finalize_params.is_overwrite;
add a comment here explaining why this is needed.


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@839
PS4, Line 839: HashSet<>()
would be appropriate to intialize the set with the capacity fdList.size() since 
its known before-hand

Set fileNames = new HashSet<>(fdList.size());


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@842
PS4, Line 842: "/"
suggest you to use Path.SEPARATOR instead of "/"
also, this.getLocation() can be outside for loop and can be reused.

nit, don't think you need to explicitly do this.


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@159
PS4, Line 159: refresh
refresh on a table/partition


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@587
PS4, Line 587: public static class InsertEvent extends 
TableInvalidatingOrRefreshingEvent
IIUC, the reason you are extending TableInvalidatingEvent is because you want 
to re-use the self-event logic. But since InsertEvent doesn't handle 
self-event, it doesn't need to extend TableInvalidatingEvent. It could just 
extend MetastoreTableEvent.

That would mean that you don't need to rename TableInvalidatingEvent to 
TableInvalidatingOrRefreshingEvent as well.

Right?


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@588
PS4, Line 588: @VisibleForTesting
nit, add new line above the constructor. Add a javadoc


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@606
PS4, Line 606:  * Currently we do not check for self-events in Inserts. 
This means insert events
add a // TODO : to handle self-events for insert case


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@613
PS4, Line 613: reload/refres
nit, just saying refresh is good enough. No need to say reload here.


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@616
PS4, Line 616: refreshed/reloaded
same as above. Just say refresh since I don't think reload means anything 
external. Refresh is a key word in Impala ql so having refresh in the logs 
makes more sense.


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@623
PS4, Line 623:   // No-op: Currently we cannot detect self-events for 
inserts because of the way the
Add a // TODO : One way to do this would be to change hive source code to 
return the event_id in the response when the fire_listener_event is called and 
we keep track of it to ignore the event.


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@965
PS4, Line 965: TableInvalidatingOrRefreshingEvent
Why do we need to rename? Currently, all the implementations of this sub-class 
only issue invalidate right?


http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 

[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-03-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2588/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 4
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 29 Mar 2019 06:48:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-03-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2587/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 3
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 29 Mar 2019 06:42:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-03-29 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
The event-processor will refresh tables using this event. Both insert
into and overwrite are supported for tables/partitions. Also, renamed
TableInvalidatingEvent class to TableInvalidatingOrRefreshingEvent
to reflect new behaviour.

Known Issues:
1. There is an unnecessary table invalidate when insert is done in Hive
   as the insert operation creates an ALTER and an INSERT notification
   event. Currently there is no way for the Event Processor to identify
   and prevent the unnecessary invalidate. IMPALA-7973 may potentially
   solve this issue.
2. Detection of self-events does not work for inserts currently because
   of the way the self-event checks are implemented. The flags added to
   test for self events have no way to persist in HMS with just an
   insert operation. Therefore, the event-processor will also refresh
   the tables for which insert operation is performed through Impala.

Testing:
Wrote new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A tests/custom_cluster/test_event_processing.py
7 files changed, 302 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/4
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 4
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-03-29 Thread Bharath Krishna (Code Review)
Bharath Krishna has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@597
PS2, Line 597: msTbl_ = insertMessage.getTableObj();
We can use Preconditions.checkNotNull(insertMessage.getTableObj())


http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@606
PS2, Line 606:   // Currently we do not check for self-events in Inserts. 
This means insert events
I think we should add this as a javadoc comment for the process method.


http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3716
PS2, Line 3716:   }
nit : typo in variable name existigPartNames


http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3770
PS2, Line 3770: FireEventRequestData data = new FireEventRequestData();
We can add Table name to the debug log.


http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3778
PS2, Line 3778: rqst.setPartitionVals(partVals);
Should this condition be reversed, because it will be false by default?
 if (isOverwrite) insertData.setReplace(true);



--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 3
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 29 Mar 2019 06:00:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-03-29 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
The event-processor will refresh tables using this event. Both insert
into and overwrite are supported for tables/partitions. Also, renamed
TableInvalidatingEvent class to TableInvalidatingOrRefreshingEvent
to reflect new behaviour.

Known Issues:
1. There is an unnecessary table invalidate when insert is done in Hive
   as the insert operation creates an ALTER and an INSERT notification
   event. Currently there is no way for the Event Processor to identify
   and prevent the unnecessary invalidate. IMPALA-7973 may potentially
   solve this issue.
2. Detection of self-events does not work for inserts currently because
   of the way the self-event checks are implemented. The flags added to
   test for self events have no way to persist in HMS with just an
   insert operation. Therefore, the event-processor will also refresh
   the tables for which insert operation is performed through Impala.

Testing:
Wrote new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A tests/custom_cluster/test_event_processing.py
7 files changed, 299 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/3
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 3
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-03-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2586/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 1
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 29 Mar 2019 05:31:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-03-28 Thread Anurag Mantripragada (Code Review)
Hello Bharath Vissapragada, Paul Rogers, Vihang Karajgaonkar, Bharath Krishna, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12889

to look at the new patch set (#2).

Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
The event-processor will refresh tables using this event. Both insert
into and overwrite are supported for tables/partitions. Also, renamed
TableInvalidatingEvent class to TableInvalidatingOrRegreshingEvent
to reflect new behaviour.

Known Issues:
1. There is an unnecessary table invalidate when insert is done in Hive
   as the insert operation creates an ALTER and an INSERT notification
   event. Currently there is no way for the Event Processor to identify
   and prevent the unnecessary invalidate. IMPALA-7973 may potentially
   solve this issue.
2. Detection of self-events does not work for inserts currently because
   of the way the self-event checks are implemented. The flags added to
   test for self events have no way to persist in HMS with just an
   insert operation. Therefore, the event-processor will also refresh
   the tables for which insert operation is performed through Impala.

Testing:
Added new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A tests/custom_cluster/test_event_processing.py
7 files changed, 301 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/2
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 2
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-03-28 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12889


Change subject: IMPALA-7971: Add support for insert events in event processor.
..

IMPALA-7971: Add support for insert events in event processor.

This patch adds support for detecting and processing insert events
triggered by impala as well as external engines (eg.Hive).

Inserts from Impala will fire an insert event notification.
The event-processor will refresh tables using this event. Both insert
into and overwrite are supported for tables/partitions. Also, renamed
TableInvalidatingEvent class to TableInvalidatingOrRegreshingEvent
to reflect new behaviour.

Known Issues:
1. There is an unnecessary table invalidate when insert is done in Hive
   as the insert operation creates an ALTER and an INSERT notification
   event. Currently there is no way for the Event Processor to identify
   and prevent the unnecessary invalidate. IMPALA-7973 may potentially
   solve this issue.
2. Detection of self-events does not work for inserts currently because
   of the way the self-event checks are implemented. The flags added to
   test for self events have no way to persist in HMS with just an
   insert operation. Therefore, the event-processor will also refresh
   the tables for which insert operation is performed through Impala.

Testing:
Wrote new custom cluster tests to run different insert commands from
hive and verified new data is available in Impala without invalidate
metadata.

Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A tests/custom_cluster/test_event_processing.py
7 files changed, 301 insertions(+), 14 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/1
--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 1
Gerrit-Owner: Anurag Mantripragada