[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 24: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 24 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 29 Apr 2019 21:24:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Vissapragada has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Reviewed-on: http://gerrit.cloudera.org:8080/12889 Tested-by: Impala Public Jenkins Reviewed-by: Bharath Vissapragada --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/resources/hive-site.xml.py A tests/custom_cluster/test_event_processing.py 10 files changed, 592 insertions(+), 13 deletions(-) Approvals: Impala Public Jenkins: Verified Bharath Vissapragada: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 25 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 24: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 24 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 29 Apr 2019 20:59:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 24: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2964/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 24 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 29 Apr 2019 16:12:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 24: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4101/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 24 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 29 Apr 2019 15:26:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 24: Rebased after CDH change. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 24 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 29 Apr 2019 15:26:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#24). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/resources/hive-site.xml.py A tests/custom_cluster/test_event_processing.py 10 files changed, 592 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/24 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 24 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 23: Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4100/ -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 23 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sun, 28 Apr 2019 01:55:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 23: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4100/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 23 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sun, 28 Apr 2019 00:06:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 23: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4047/ -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 23 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 19 Apr 2019 18:07:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 23: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4047/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 23 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 19 Apr 2019 16:24:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 23: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 23 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 19 Apr 2019 16:24:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 22: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4044/ -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 22 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 19 Apr 2019 00:45:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 22: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 22 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 23:06:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 21: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 21 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 23:06:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 22: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4044/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 22 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 23:06:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 21: This python file generates the templates for hive-site.xml - Every mini-cluster will have this config set. This is just like the event processing flags are set. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 21 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 23:03:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 21: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2838/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 21 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 22:56:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Krishna has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 21: How does this work for the tests in the MetastoreEventsProcessorTest , does it need dml.events = true as well? -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 21 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 22:41:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 21: Build failed as the HMS flag that enables insert notification was not set by default. Added hive.metastore.dml.events=true config in hive-site.xml.py -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 21 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 22:13:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#21). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/resources/hive-site.xml.py A tests/custom_cluster/test_event_processing.py 10 files changed, 576 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/21 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 21 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 20: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4040/ -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 20 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 07:19:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 20: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4040/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 20 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 01:59:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 20: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 20 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 01:59:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 19: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 19 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 01:59:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2832/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 19 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 01:38:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 19: (2 comments) Thanks Bharath. Changed the tests to account for slower builds. http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py File tests/custom_cluster/test_event_processing.py: http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py@122 PS18, Line 122: time.sleep(bu > Lets use a longer timeout for slower builds like ASAN. How about 2 for regu Done http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py@133 PS18, Line 133: > Breaks if someone updates the page. Parse it into keyvalue pairs and pick u Done -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 19 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 18 Apr 2019 00:57:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#19). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 575 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/19 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 19 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 18: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py File tests/custom_cluster/test_event_processing.py: http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py@109 PS18, Line 109: : nit: formatting http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py@122 PS18, Line 122: time.sleep(2) Lets use a longer timeout for slower builds like ASAN. How about 2 for regular and 4s for slower? (Look for the usages of build_flavor_timeout) http://gerrit.cloudera.org:8080/#/c/12889/18/tests/custom_cluster/test_event_processing.py@133 PS18, Line 133: 31 Breaks if someone updates the page. Parse it into keyvalue pairs and pick up the value mapping to the key? -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 18 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 17 Apr 2019 20:50:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 18: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2805/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 18 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 16 Apr 2019 21:43:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 18: Thanks for the pointers Bharath. I tried to make the wait for event processing more deterministic by scraping the /events page for last_sync_event_id to know if the events have been processed. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 18 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 16 Apr 2019 20:59:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#18). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 569 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/18 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 18 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 16: (1 comment) http://gerrit.cloudera.org:8080/#/c/12889/16/tests/custom_cluster/test_event_processing.py File tests/custom_cluster/test_event_processing.py: http://gerrit.cloudera.org:8080/#/c/12889/16/tests/custom_cluster/test_event_processing.py@59 PS16, Line 59: time.sleep(10) > Hmm no, I can't think of any. This custom cluster starts with an event proc Spoke with Anurag offline. We discussed a way to scrape the /events page to get the last synced event ID. We can poll that to make sure events are processed. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 16 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 16 Apr 2019 00:33:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 17: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2787/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 17 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 15 Apr 2019 21:28:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#17). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 525 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/17 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 17 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 16: (5 comments) lgtm, can +2 once the remaining nits are fixed. http://gerrit.cloudera.org:8080/#/c/12889/16/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/16/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3587 PS16, Line 3587: Map> filesBeforeInsert = new HashMap<>(); nit: Move closer to it's usage. http://gerrit.cloudera.org:8080/#/c/12889/16/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3723 PS16, Line 3723: // Get the files before insert. This is used to compute the delta of files to : // fire insert events. : if (catalog_.isExternalEventProcessingEnabled()) { : if (!update.is_overwrite) { :for (FeFsPartition partition : affectedExistingPartitions) { : filesBeforeInsert.put(partition.getId(), : ((HdfsPartition) partition).getFileNames()); : } : } : } nit: move this inside createInsertEvents()? http://gerrit.cloudera.org:8080/#/c/12889/16/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3773 PS16, Line 3773: This list is empty if this is an insert overwrite operation. nit: May be rephrase to what HMS expects? That way semantics are more clear. http://gerrit.cloudera.org:8080/#/c/12889/16/tests/custom_cluster/test_event_processing.py File tests/custom_cluster/test_event_processing.py: http://gerrit.cloudera.org:8080/#/c/12889/16/tests/custom_cluster/test_event_processing.py@59 PS16, Line 59: time.sleep(10) Is there a more deterministic way of doing this? Might turn out to be a flaky test in builds like ASAN. http://gerrit.cloudera.org:8080/#/c/12889/16/tests/custom_cluster/test_event_processing.py@79 PS16, Line 79: time.sleep(10) same. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 16 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 15 Apr 2019 17:30:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2777/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 16 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 15 Apr 2019 10:19:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#16). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 528 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/16 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 16 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 14: (4 comments) http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3608 PS14, Line 3608: if (catalog_.isExternalEventProcessingEnabled()) { > We don't wish to do this if event processing is disabled right? we can just collect it and discard if event processing is disabled. I think the extra branch affects readability, especially in this part of the code which is already too complex. http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3704 PS14, Line 3704: if (catalog_.isExternalEventProcessingEnabled()) { > We don't wish this path be taken if event processor is disabled. Right? same as above. http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3743 PS14, Line 3743: // After loading metadata, fire insert event if external event processing is : // enabled. : if (table.getNumClusteringCols() > 0) { : createInsertEventsForPartitions(table, filesBeforeInsertForPartitions, : update.is_overwrite); : } else { : createInsertEventForTable(table, filesBeforeInsertForTable, update.is_overwrite); : } > As mentioned above this is not possible because of two different data struc Right. Why do we need to care about 'id' if the table is not partitioned. We will only have one partition (which is the only partition). That is what I meant when I said "using table.getNumClusteringCols() you'd know if filesBeforeInsert corresponds to a partitioned or a non partitioned table ". http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3787 PS14, Line 3787: deltaFiles = ((HdfsPartition)part).getFileNames(); > This looks right to me. We want deltaFiles to be empty when it is an overwr "We want deltaFiles to be empty when it is an overwrite". Ok this is not obvious. Document this then? I thought that HMS expected that the delta should contain the new set of files even incase of an overwrite. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 14 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sun, 14 Apr 2019 21:40:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 15: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2772/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 15 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sat, 13 Apr 2019 09:00:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#15). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 548 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/15 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 15 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 15: (9 comments) http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3587 PS14, Line 3587: Map> filesBeforeInsertForPartitions = new HashMap<>(); : Set filesBeforeInsertForTable = new HashSet<>(); > Both of them can be handled in a single map? Also, rename to filesBeforeIns Here we track fileNames of all affectedPartitions using a Map. Later, this map is used to retrieve and compare fileNames to calculate deltas using partitionIds. However, partitionId cannot be used for a non-partitioned table because they change during load(). One solution was to use partitionNames instead of partitionIds. We decided to go with Ids because using partitionNames was too error prone. http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3589 PS14, Line 3589: affectedExistingPartitions = new > just say affectedPartitions ? Changed it to affectedExistingPartitions. Want to stress on 'existing' as these are existing partitions that are involved in this insert event. There could be some new partitions created in the insert which are not stored here. http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3608 PS14, Line 3608: if (catalog_.isExternalEventProcessingEnabled()) { > not needed. We don't wish to do this if event processing is disabled right? http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3704 PS14, Line 3704: if (catalog_.isExternalEventProcessingEnabled()) { > not needed. We don't wish this path be taken if event processor is disabled. Right? http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3736 PS14, Line 3736: filesBeforeInsertForTable = ((HdfsPartition) Iterables > Like I mentioned above, I don't see why we should use two different datastr As mentioned before, we cannot keep track of filenames using partitionIds for non-partitioned tables as we drop and create the single partition during load. (partitionId changes.). Hence two datastructures. http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3743 PS14, Line 3743: // After loading metadata, fire insert event if external event processing is : // enabled. : if (table.getNumClusteringCols() > 0) { : createInsertEventsForPartitions(table, filesBeforeInsertForPartitions, : update.is_overwrite); : } else { : createInsertEventForTable(table, filesBeforeInsertForTable, update.is_overwrite); : } > How about changing the signature to As mentioned above this is not possible because of two different data structures for filesBeforeInsert for partition case and non-partitioned case http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3787 PS14, Line 3787: deltaFiles = ((HdfsPartition)part).getFileNames(); > shouldn't this go before if? tests didn't catch this? This looks right to me. We want deltaFiles to be empty when it is an overwrite. We only calculate the new files if its not an overwrite. http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3791 PS14, Line 3791: deltaFiles.size > need this? Removed. http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3813 PS14, Line 3813: deltaFiles = ((HdfsPartition)singlePart).getFileNames(); > same question, shouldn't deltaFiles go before if? We want delta files to be empty if it is an overwrite. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 15 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sat, 13 Apr 2019 08:19:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2768/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 14 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sat, 13 Apr 2019 02:56:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 14: (9 comments) Some more suggestions. I think it is getting closer. http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3587 PS14, Line 3587: Map> filesBeforeInsertForPartitions = new HashMap<>(); : Set filesBeforeInsertForTable = new HashSet<>(); Both of them can be handled in a single map? Also, rename to filesBeforeInsert? http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3589 PS14, Line 3589: existingPartitionsTouchedByInsert just say affectedPartitions ? http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3608 PS14, Line 3608: if (catalog_.isExternalEventProcessingEnabled()) { not needed. http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3704 PS14, Line 3704: if (catalog_.isExternalEventProcessingEnabled()) { not needed. http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3736 PS14, Line 3736: filesBeforeInsertForTable = ((HdfsPartition) Iterables Like I mentioned above, I don't see why we should use two different datastructures here. Even an unparitioned table is implemented as a single partitioned table. I suggest to refactor like this. if (eventProcessingEnabled && !overwrite) .{ for (part: affectedPartitions) { filesBeforeInsert.put(part.id, part.getFiles()); } } http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3743 PS14, Line 3743: // After loading metadata, fire insert event if external event processing is : // enabled. : if (table.getNumClusteringCols() > 0) { : createInsertEventsForPartitions(table, filesBeforeInsertForPartitions, : update.is_overwrite); : } else { : createInsertEventForTable(table, filesBeforeInsertForTable, update.is_overwrite); : } How about changing the signature to fireInsertEvents(table, filsBeforeInsert, is_overwrite) using table.getNumClusteringCols() you'd know if filesBeforeInsert corresponds to a partitioned or a non partitioned table (you could create your event accordingly) http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3787 PS14, Line 3787: deltaFiles = ((HdfsPartition)part).getFileNames(); shouldn't this go before if? tests didn't catch this? http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3791 PS14, Line 3791: String.valueOf( need this? http://gerrit.cloudera.org:8080/#/c/12889/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3813 PS14, Line 3813: deltaFiles = ((HdfsPartition)singlePart).getFileNames(); same question, shouldn't deltaFiles go before if? deltaFiles = allCurrentFiles; if (!overrwrite) { deltaFiles.removeAll(oldFiles) } -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 14 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sat, 13 Apr 2019 03:08:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 548 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/14 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 14 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 14: (13 comments) http://gerrit.cloudera.org:8080/#/c/12889/13/common/thrift/CatalogService.thrift File common/thrift/CatalogService.thrift: http://gerrit.cloudera.org:8080/#/c/12889/13/common/thrift/CatalogService.thrift@193 PS13, Line 193: // True if the update corresponds to an "insert overwrite" operation > nit: I think we should say "True if this update corresponds to an 'insert o Done http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@396 PS13, Line 396: /** : * Util method to issue invalidate on a given table on the catalog. This method : * atomically invalidates the table if it exists in the catalog. No-op if the table : * does not exist : */ : p > don't think this needs a separate method. Inline it at the caller? Done http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@582 PS13, Line 582: > ..handler..Also, add some more color to it? Like it handles the inserts at Done http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@586 PS13, Line 586: : InsertEven > instead say. Null if the table is unprartitioned...or something like that? Done http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@614 PS13, Line 614: > this is obvious, remove? Done http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@619 PS13, Line 619: ** : * Process partition inserts : */ : private void processPartit > braces Done http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@630 PS13, Line 630: Preconditions.checkState(fsList.size() == partVals.size()); > Preconditions.checkNotNull(insertPartition_); Done http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@664 PS13, Line 664: d > unpartitioned .. Done http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@667 PS13, Line 667: try { > Preconditions.checkArgument(partition == null) Done http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3583 PS13, Line 3583: Collection parts = : FeCatalogUtils.loadAllParti > Remove, this is obv? Done http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3589 PS13, Line 3589: List existingPartitionsTouchedByInsert = new ArrayList<>(); > I suggest to refactor the code like this. Lemme know what you think. I thin Thanks for the suggestion. Sounds good. Refactored code according to your suggestion. However, tracking files for partitions is slightly different from that with table inserts. non-partitioned tables change ids after load, so we cannot track using a map. Hence the if...else for calculating files. http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3767 PS13, Line 3767: * fireInsertEvent() if external event processing is enabled. This is no-op otherwise. :* > shouldn't this be done only for the affected partitions? The filesBeforeInsertForPartitions map contains only the affected partitions. http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3823 PS13, Line 3823: * insert. In case of partitioned table, this event also contains the partition :* values of existing partitions which were touched by the insert. :*/ : private void fireInsertEvent(Table tbl, List partVals, : > insertData.setReplace(isOverwrite) Done -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/12889/13/common/thrift/CatalogService.thrift File common/thrift/CatalogService.thrift: http://gerrit.cloudera.org:8080/#/c/12889/13/common/thrift/CatalogService.thrift@193 PS13, Line 193: // True if an insert was overwrite operation nit: I think we should say "True if this update corresponds to an 'insert overwrite' operation". Looks like TUpdateCatalogRequest can be more generic and not aways for an insert operation. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 13 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sat, 13 Apr 2019 01:46:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 13: (12 comments) I think we could still refactor the code better. I've added some suggestions, lemme know. I'll take another pass after the refactor. http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@396 PS13, Line 396: /** : * Util to get partition key values of a given map of partition keys and values : */ : protected String getPartitionKeyValuesAsString(Map partSpec) { : return Joiner.on(",").withKeyValueSeparator("=").join(partSpec); : } don't think this needs a separate method. Inline it at the caller? http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@582 PS13, Line 582: ..handler..Also, add some more color to it? Like it handles the inserts at the table and partition scope. http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@586 PS13, Line 586: this is not a partition : // insert. instead say. Null if the table is unprartitioned...or something like that? http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@614 PS13, Line 614: This means insert events generated by impala will also be processed this is obvious, remove? http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@619 PS13, Line 619: if (insertPartition_ != null) : processPartitionInserts(); : else : processTableInserts(); braces http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@630 PS13, Line 630: Map partSpec = new HashMap<>(); Preconditions.checkNotNull(insertPartition_); http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@664 PS13, Line 664: unpartitioned .. http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@667 PS13, Line 667: // For non-partitioned tables, refresh the whole table. Preconditions.checkArgument(partition == null) http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3583 PS13, Line 3583: // For non-partitioned tables, parts below will contain a single value. The : // partition key will be empty. Remove, this is obv? http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3589 PS13, Line 3589: Map> filesBeforeInsertForPartitions = new HashMap<>(); I suggest to refactor the code like this. Lemme know what you think. I think that will be much cleaner and keeps the event code together than mixing it with the insert code. Also, I think for "overwrite" you don't need to compute a delta, which means you don't need to compute the fileListing "before" the load. dirtyParts = new List<>; if (partitioned) { for each partition affected in the logic below; dirtyParts.append(affectedPart) } else { dirtyParts.append(onlyPartition) } if (eventProcessingEnabled || !overwrite) { currentFileListing = getFilesForDirtyParts(); } loadMetadata(); if (eventProcessingEnabled) { fireEvents(fileListingBeforeLoad, dirtyParts()); } fireEvents() { computeDelta(); fireEvents(); } http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3767 PS13, Line 3767: Collection partsPostInsert = : ((FeFsTable)table).loadPartitions(filesBeforeInsertForPartitions.keySet()); shouldn't this be done only for the affected partitions? http://gerrit.cloudera.org:8080/#/c/12889/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3823 PS13, Line 3823: if (isOverwrite) { : insertData.setReplace(true); : } else { : insertData.setReplace(false); : } insertData.setReplace(isOverwrite) -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://g
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 13: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 13 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 12 Apr 2019 17:51:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/12889/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12889/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@516 PS12, Line 516: false > Yes. Corrected. thanks -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 12 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 12 Apr 2019 17:51:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2751/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 13 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 12 Apr 2019 07:00:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/12889/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12889/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@516 PS12, Line 516: true, > you probably meant to use true here? Yes. Corrected. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 13 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 12 Apr 2019 06:14:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 540 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/13 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 13 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 12: Code-Review+1 (2 comments) Thanks for making the suggested changes. Patch looks good me. Left one minor fix in the test below. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@463 PS6, Line 463: String parentPathString = : "/test-warehouse/" + dbName +".db/" + tblName; : String filePathString = isPartitionInsert ? "/p1=testPartVal/testFile.0" : : "/testFile.0"; > Thanks for this suggestion. The load table at L489 is after a partition has sounds good http://gerrit.cloudera.org:8080/#/c/12889/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12889/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@516 PS12, Line 516: false you probably meant to use true here? -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 12 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 12 Apr 2019 06:03:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2749/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 11 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 12 Apr 2019 05:42:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2750/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 12 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 12 Apr 2019 05:42:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/12889/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12889/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@525 PS11, Line 525: assertTrue("Partition not found after insert.", partsAfterInsertOverwrite.size() > 0); line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 11 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 12 Apr 2019 04:58:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 540 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/12 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 12 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 539 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/11 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 11 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2748/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 10 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 12 Apr 2019 04:33:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/12889/10/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12889/10/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@524 PS10, Line 524: assertTrue("Partition not found after insert.", partsAfterInsertOverwrite.size() > 0); line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 10 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 12 Apr 2019 03:52:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 538 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/10 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 10 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 10: (3 comments) http://gerrit.cloudera.org:8080/#/c/12889/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3823 PS9, Line 3823: > This is interesting because based on the thrift definition of InsertEventRe Done http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@450 PS6, Line 450: testInsertEvents > Does this test inserOverwrite case as well? Seems important enough not to b >From Impala side, we are not using the overwrite flag so I thought it was too >trivial to test. But I see your point. Adding a test that calls a process on >overwrite event. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@463 PS6, Line 463: String parentPathString = : "/test-warehouse/" + dbName +".db/" + tblName; : String filePathString = isPartitionInsert ? "/p1=testPartVal/testFile.0" : : "/testFile.0"; > I am not a big fan of hardcoding paths in the tests esp. if the root locati Thanks for this suggestion. The load table at L489 is after a partition has been added. catalog_.getTable().getMetastoreTable().getSd().getLocation() is null for some reason. I will investigate more. I'm trying to avoid getOrLoad(). For now let's keep this. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 10 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 12 Apr 2019 03:51:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@450 PS6, Line 450: testInsertEvents Does this test inserOverwrite case as well? Seems important enough not to be missed. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 6 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 11 Apr 2019 19:58:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 6: (5 comments) patch looks good to me. Just wanted to make sure that we are handling the replace flag correctly below when generating the insert events. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@423 PS6, Line 423: refreshCatalogTable > I left this here because I wanted to add metrics of "Tables Refreshed" in a Sounds good if you plan to use it later. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@431 PS6, Line 431: throws CatalogException { > Same as earlier reply. Wanted to have metric for partitionsRefreshed here. sounds good. http://gerrit.cloudera.org:8080/#/c/12889/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3823 PS9, Line 3823: This is interesting because based on the thrift definition of InsertEventRequestData the default value of replace flag is unset (which means false since its a boolean). Perhaps we should be explicitly set it to true here as well to make sure. Suggest you to do the following. isOverwrite ? insertData.setReplace(true) : insertData.setReplace(false); http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@463 PS6, Line 463: String parentPathString = : "/test-warehouse/" + dbName +".db/" + tblName; : String filePathString = isPartitionInsert ? "/p1=testPartVal/testFile.0" : : "/testFile.0"; > You are right. However, almost all the FE tests have hardcoded this locatio I am not a big fan of hardcoding paths in the tests esp. if the root location can be trivially found by catalog_.getTable().getMetastoreTable(). You are getting the tbl object anyways down below at line 489. If you move that line here you don't need any extra calls. Not a blocker comment, can be done later if needed. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@476 PS6, Line 476: out.close(); > I moved it to finally. Do you know a less messy way to do this? Looks like FSDataOutputStream does not implement Autocloseable, hence try-with-resources could not be used. Guess we will have to live with this for now. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 6 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 11 Apr 2019 19:57:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2726/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 9 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 10 Apr 2019 21:53:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2725/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 8 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 10 Apr 2019 21:39:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/12889/8/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/8/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3789 PS8, Line 3789:* line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 8 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 10 Apr 2019 21:16:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 514 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/9 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 9 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 514 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/8 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 8 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2723/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 7 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 10 Apr 2019 21:12:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 7: (31 comments) Thanks Vihang and Bharath for the comments. Is there a better way to handle finally block in MetastoreEventsProcessorTest? http://gerrit.cloudera.org:8080/#/c/12889/6/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/12889/6/be/src/service/client-request-state.cc@1106 PS6, Line 1106: > nit: I think this is kind of obv (and also there is a comment in the thrift Removed. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2063 PS6, Line 2063: erwise. Throws CatalogE > Looks like it throws DbNotFoundEx if the db doesn't exist. Done http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2065 PS6, Line 2065: */ : public boolean refreshTableIfExists(String dbName > nit: Just say, returns true if reload is successful, false otherwise? Remov Done http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2077 PS6, Line 2077: > nit, use "after insert events" to be more specific Done http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2077 PS6, Line 2077: Db doesn't exist. > I feel the entire statement can be omitted. Callers may change that makes t Done http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2082 PS6, Line 2082: if (table == null || table instanceof IncompleteTable) return false; > same comments as above javadoc. Done http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@835 PS6, Line 835: s > nit: Returns.. Done http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@398 PS6, Line 398: */ : protected String getPartitionKeyValuesAsString(Map partSpec) { : return Joiner.on(",").withKeyValueSeparator("=").join(partSpec); : } : : /** : > Use Joiner from guava Done http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@423 PS6, Line 423: value of the table > since this method is not doing anything else apart from calling catalog_ref I left this here because I wanted to add metrics of "Tables Refreshed" in a separate patch. Not very sure if this is any useful to users. Would like to know your thoughts. For now, I have removed it. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@431 PS6, Line 431: Boolean tblProperty = getHmsSyncProperty(msTbl_); > same as above, The method is doing nothing other than calling another metho Same as earlier reply. Wanted to have metric for partitionsRefreshed here. Removed for now. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@630 PS6, Line 630: Map partSpec = new HashMap<>(); > Any reason why? Added the reason for this http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@635 PS6, Line 635: Preconditions.checkState(fsList.size() == partVals.size()); > can we break this into two helpers? processPartitionedInsert() and processN Done http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@641 PS6, Line 641: // refresh fails. > probably worth while to check the following conditions: Done http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@649 PS6, Line 649: > nit, remove Automatic. kind of redundant since debugLog prints the event in Done http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@673 PS6, Line 673: nt in the > nit, remove Done http://gerrit.cloudera.org:8080/
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 512 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/7 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 7 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 6: (18 comments) http://gerrit.cloudera.org:8080/#/c/12889/6/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/12889/6/be/src/service/client-request-state.cc@1106 PS6, Line 1106: // is_overwrite is used to know the type of insert in FE. nit: I think this is kind of obv (and also there is a comment in the thrift def). Consider removing. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2063 PS6, Line 2063: Db/table doesn't exist. Looks like it throws DbNotFoundEx if the db doesn't exist. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2065 PS6, Line 2065: * - Throws CatalogException if reloadTable() is unsuccessful. :* - Returns true only if reloadTable() succeeds. nit: Just say, returns true if reload is successful, false otherwise? Remove hyphens if not needed. (same below) http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2077 PS6, Line 2077: Useful to refresh the partition after inserts I feel the entire statement can be omitted. Callers may change that makes the doc stale.? http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2082 PS6, Line 2082:* - Returns true only if reloadPartition() succeeds. same comments as above javadoc. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@835 PS6, Line 835: nit: Returns.. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@398 PS6, Line 398:protected String getPartitionKeyValuesAsString(Map partSpec) { : StringBuilder str = new StringBuilder(); : for (Map.Entry entry : partSpec.entrySet()) { : str.append(entry.getKey() + "=" + entry.getValue() + " "); : } : return str.toString(); : } Use Joiner from guava http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@630 PS6, Line 630: * Currently we do not check for self-events in Inserts. This means insert events Any reason why? http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@635 PS6, Line 635: public void process() throws MetastoreNotificationException { can we break this into two helpers? processPartitionedInsert() and processNonPartitionedInsert()? (for readability, since there are two many nested branches. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3558 PS6, Line 3558: HiveConf hiveConf = new HiveConf(this.getClass()); any reason to move this to here from where it was before http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3578 PS6, Line 3578: filesBeforeInsertForPartitions.put(partition.getId(), > Can we do this conditionally when event processing is enabled? +1, this is a very commonly used codepth and seems wasteful to unpack the file descriptor info if event processing is not configured. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3674 PS6, Line 3674: Preconditions.checkState(parts.size() == 1); This is taken care of by Iterabls.getOnlyElement() http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3721 PS6, Line 3721:*/ document method args? Map doesn't seem obvious. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3729 PS6, Line 3729: if (table.getNumClust
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 6: (15 comments) Thanks for making the changes. The patch looks great. I have left some minor suggestions and then its good from my side. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2077 PS6, Line 2077: after inserts nit, use "after insert events" to be more specific http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@423 PS6, Line 423: refreshCatalogTable since this method is not doing anything else apart from calling catalog_refreshTableIfExists .. may be we can get rid of it. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@431 PS6, Line 431: throws CatalogException { same as above, The method is doing nothing other than calling another method. So may as well remove it and call the underlying method directly. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@641 PS6, Line 641: List partVals = insertPartition_.getValues(); probably worth while to check the following conditions: Preconditions.checkNotNull(partVals) Preconditions.checkState (fsList.size() == partVals.size()) http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@649 PS6, Line 649: Automatic nit, remove Automatic. kind of redundant since debugLog prints the event info as well. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@673 PS6, Line 673: Automatic nit, remove http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3580 PS5, Line 3580: rtition.isMarkedCached()) > This code path is taken by only HDFSTables as you can see on L3522. Got it. Thanks for the clarification. http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3578 PS6, Line 3578: filesBeforeInsertForPartitions.put(partition.getId(), Can we do this conditionally when event processing is enabled? http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3722 PS6, Line 3722: createInsertEventForPartition nit, Adding plural to the name makes it more readable since it is firing multiple insert events. may be, createInsertEventsForPartitions? http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3741 PS6, Line 3741: for (String fileName : deltaFiles) { : newFiles.add(fileName); : } Looks like you are making this copy because the InsertEventRequestData in fireInsertEvent method needs a list. In such a case, perhaps a cleaner way is to just pass deltaFiles from here and make the copy in that method simply using rqst.setFilesAdded(new ArrayList<>(deltaFiles)) http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3767 PS6, Line 3767: for (String fileName : deltaFiles) { : newFiles.add(fileName); : } same as previous comment, you can just pass the deltaFiles to the fireInsertEvent and reuse the logic http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@432 PS6, Line 432: public void testInsertEvents() throws TException, ImpalaException { thanks for adding this test! http://gerrit.cloudera.org:8080/#/c/12889/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@463
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2690/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 6 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 08 Apr 2019 23:22:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 6: (19 comments) http://gerrit.cloudera.org:8080/#/c/12889/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12889/5//COMMIT_MSG@14 PS5, Line 14: partitions. : : Known Issues: > This may not be applicable anymore Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@413 PS5, Line 413: ts(dbName_, tb > I think this call is not correct since this will be a no-op if the table is Thanks for this catch. Used reloadTable() instead which forces a reload every time. reloadPartition() http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@601 PS5, Line 601:* Metastore event for INSERT events. > These following two lines can be Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@204 PS5, Line 204: > Typo sofar Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@38 PS5, Line 38: import org.apache.hadoop.hive.metastore.PartitionDropOptions; > unused? Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3551 PS5, Line 3551: // partition key will be empty. > nit : comma after tables. Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3557 PS5, Line 3557: InsertForTable = new HashSet<>(); > may be a better name would be to suggest that this contains files before in Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3558 PS5, Line 3558: eConf = new HiveConf(this. > is this unused? Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3580 PS5, Line 3580: rtition.isMarkedCached()) > Is there a concern here of running into CastException? I see that FeFsParti This code path is taken by only HDFSTables as you can see on L3522. http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3673 PS5, Line 3673: // For > may be do a else if(catalog.isExternalEventProcessingEnabled()) here so tha Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3674 PS5, Line 3674: Preconditions.checkState(parts.size() == 1); > May be add a Preconditions.checkState(parts.size == 1); here to make sure t Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3676 PS5, Line 3676: BeforeInsertForTable = ((( > same as above, Do we need to handle LocalFsPartition as well? The code path is taken by HdfsTable, so we do not need to handle LocalFsPartition. http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3699 PS5, Line 3699: createInsertEve > May be you can create 2 methods, one for partitioned case and another for n Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3716 PS5, Line 3716: } > Add to the description that this method is a no-op if event processing is d Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3719 PS5, Line 3719: nsert and ca > nit, change the name to isInsertOverwrite Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3736 PS5, Line 3736: List newFiles = new ArrayList<>(); > I think it would be helpful to add info log here which says how many new fi Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3750 PS5, Line 3750: } > add a info level log here which prints how many new files were added into t Done http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/Catal
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. Using this event, event-processor will refresh table/partition. Both insert into and overwrite are supported for tables/partitions. Known Issues: 1. Inserts into tables from Hive are ignored by the event processor as these inserts create an ALTER event first followed by an INSERT event. The alter will invalidate table making the refresh a no-op. Insert into partitions from hive will create an INSERT event first followed by an ALTER event. In this case, there is an unnecessary table invalidate after a refresh. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: 1. Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. 2. Added a test in MetastoreEventsProcessor for testing insert events. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A tests/custom_cluster/test_event_processing.py 9 files changed, 518 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/6 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 6 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 5: (15 comments) http://gerrit.cloudera.org:8080/#/c/12889/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12889/5//COMMIT_MSG@14 PS5, Line 14: Also, renamed : TableInvalidatingEvent class to TableInvalidatingOrRefreshingEvent : to reflect new behaviour. This may not be applicable anymore http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@413 PS5, Line 413: getOrLoadTable I think this call is not correct since this will be a no-op if the table is loaded already. You should probably use reloadTable() method here to force the refresh. I also see that there is a reloadPartition() method in CatalogServiceCatalog. Can we use that method to refresh only the Partition when the insert_event is for a given partition in case of partitioned tables. http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3575 PS4, Line 3575: // Attempt to remove this partition name from from partsToCreate. If remove > This is a good suggestion for partitioned tables. However, for unpartitione I see. Thanks for that info. using nulls are keys is error-prone since the methods where this collection is passed need to be aware of this to avoid NPE. Its better to just use a separate collection like you seemed to have started doing that by having filesForUnpartitionedTable before. http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@38 PS5, Line 38: import org.apache.hadoop.hive.conf.HiveConf.ConfVars; unused? http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3557 PS5, Line 3557: existingFilesForInsertedPartitions may be a better name would be to suggest that this contains files before insert.. so may something like filesBeforeInsertForPartitions? http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3558 PS5, Line 3558: filesForUnpartitionedTable is this unused? http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3580 PS5, Line 3580: HdfsPartition) partition) Is there a concern here of running into CastException? I see that FeFsPartition has two implementations HdfsPartition and LocalFsPartition. Do we need to handle LocalFsPartition as well? Also, can we do this only when event processing is enabled? http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3673 PS5, Line 3673: else { may be do a else if(catalog.isExternalEventProcessingEnabled()) here so that this code is only triggered when event processing is enabled. http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3674 PS5, Line 3674: FeFsPartition singlePart = Iterables.getOnlyElement((List) parts); May be add a Preconditions.checkState(parts.size == 1); here to make sure that the assumption in the code below are always true http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3676 PS5, Line 3676: (HdfsPartition) singlePart same as above, Do we need to handle LocalFsPartition as well? http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3699 PS5, Line 3699: createInsertEvent May be you can create 2 methods, one for partitioned case and another for non-partitioned case. The former will take in Map> as an argument for filesBeforeInsertInPartitions and the later will take Set as an argument for filesBeforeInsertInTable you can then call them using the if (table is partitioned) createInsertEventForTable() else createInsertEventsForPartitions() http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3716 PS5, Line 3716:* fireInsertEvent() if external event processing is enabled. Add to the description that this method is a no-op if event processing is d
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 5: Code-Review+1 The code here is OK. Discussed in person the issue of cross-system race conditions which we agreed to address another time. Would like Vihang to take a close look at the notification code itself since he is most knowledgeable. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 5 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 02 Apr 2019 22:24:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Krishna has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 5: (5 comments) http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@601 PS5, Line 601: Preconditions.checkNotNull(insertMessage.getTableObj()); These following two lines can be Preconditions.checkNotNull(insertMessage.getTableObj()); msTbl_ = insertMessage.getTableObj(); written as : msTbl_ = Preconditions.checkNotNull(insertMessage.getTableObj()); http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@204 PS5, Line 204: // Metric name for number of refreshes by event processor sofar Typo sofar http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3551 PS5, Line 3551: // For non-partitioned tables parts below will contain a single value. The nit : comma after tables. http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3766 PS5, Line 3766: LOG.debug("Firing insert event... for %s", tbl.getName()); LOG.debug("Firing insert event for {}", tbl.getName()); http://gerrit.cloudera.org:8080/#/c/12889/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3780 PS5, Line 3780: LOG.error("Failed to fire insert event. This may cause some table refreshes to be" I think it's better to have: Some Tables might not be refreshed on other Impala clusters for this event. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 5 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 02 Apr 2019 21:55:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2615/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 5 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 02 Apr 2019 21:42:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 5: (24 comments) Thanks for the comments Vihang. I cleaned up code in CatalogOpExecutor. Still figuring out a way to write tests to verify if tables are refreshed because of insert events. http://gerrit.cloudera.org:8080/#/c/12889/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12889/4//COMMIT_MSG@24 PS4, Line 24: Existing self-events logic cannot be used for insert events since :firing insert event does not allow us to modify > I think it is more appropriate to say existing self-events logic cannot be Done http://gerrit.cloudera.org:8080/#/c/12889/4/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/12889/4/be/src/service/client-request-state.cc@1106 PS4, Line 1106: // is_overwrite is used to know the type of insert in FE. > add a comment here explaining why this is needed. Done http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@839 PS4, Line 839: HashSet<>(f > would be appropriate to intialize the set with the capacity fdList.size() s Done http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@842 PS4, Line 842: > suggest you to use Path.SEPARATOR instead of "/" Done http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@159 PS4, Line 159: refresh > refresh on a table/partition Done http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@587 PS4, Line 587: public static class InsertEvent extends MetastoreTableEvent { > IIUC, the reason you are extending TableInvalidatingEvent is because you wa You are right. Changing it back to old TableInvalidatingEvent. http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@588 PS4, Line 588: > nit, add new line above the constructor. Add a javadoc Done http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@606 PS4, Line 606: } > add a // TODO : to handle self-events for insert case Done http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@613 PS4, Line 613: > nit, just saying refresh is good enough. No need to say reload here. Done http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@616 PS4, Line 616: > same as above. Just say refresh since I don't think reload means anything e Done http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@623 PS4, Line 623: } > Add a // TODO : One way to do this would be to change hive source code to r Done http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@965 PS4, Line 965: t object parameters used for self- > Why do we need to rename? Currently, all the implementations of this sub-cl Done http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@118 PS4, Line 118: > Ignore to keep it consistent with other entries of this table Done http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@118 PS4, Line 118: ||| : * | INSERT EVENT| Refres > Just use Refresh unless Reload means something else. Done http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3554 PS4, Line 3554: FeCatalogUtils.loadAllPartitions((HdfsTable) table); : // Map of partition ids to file names of all existing partitions touched by the : // insert > thes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. The event-processor will refresh tables using this event. Both insert into and overwrite are supported for tables/partitions. Also, renamed TableInvalidatingEvent class to TableInvalidatingOrRefreshingEvent to reflect new behaviour. Known Issues: 1. There is an unnecessary table invalidate when insert is done in Hive as the insert operation creates an ALTER and an INSERT notification event. Currently there is no way for the Event Processor to identify and prevent the unnecessary invalidate. IMPALA-7973 may potentially solve this issue. 2. Existing self-events logic cannot be used for insert events since firing insert event does not allow us to modify table parameters in HMS. This means we cannot get the CatalogServiceIdentifiers in insert events. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java A tests/custom_cluster/test_event_processing.py 7 files changed, 282 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/5 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 5 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Krishna has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3778 PS2, Line 3778: rqst.setPartitionVals(partVals); > Should this condition be reversed, because it will be false by default? Done -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 4 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 01 Apr 2019 20:46:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 4: (24 comments) Overall, the approach looks reasonable to me. The CatalogOpExecutor code change can be cleaned up and fine-tuned. I have left some suggestions. Hope they make sense. http://gerrit.cloudera.org:8080/#/c/12889/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12889/4//COMMIT_MSG@24 PS4, Line 24: Detection of self-events does not work for inserts currently because :of the way the self-event checks are implemented I think it is more appropriate to say existing self-events logic cannot be used for insert events since firing insert_event does not allow us to modify table parameters in HMS. This means that we cannot get the CatalogServiceIdentifiers in the insert_event. http://gerrit.cloudera.org:8080/#/c/12889/4/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/12889/4/be/src/service/client-request-state.cc@1106 PS4, Line 1106: catalog_update.is_overwrite = finalize_params.is_overwrite; add a comment here explaining why this is needed. http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@839 PS4, Line 839: HashSet<>() would be appropriate to intialize the set with the capacity fdList.size() since its known before-hand Set fileNames = new HashSet<>(fdList.size()); http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@842 PS4, Line 842: "/" suggest you to use Path.SEPARATOR instead of "/" also, this.getLocation() can be outside for loop and can be reused. nit, don't think you need to explicitly do this. http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@159 PS4, Line 159: refresh refresh on a table/partition http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@587 PS4, Line 587: public static class InsertEvent extends TableInvalidatingOrRefreshingEvent IIUC, the reason you are extending TableInvalidatingEvent is because you want to re-use the self-event logic. But since InsertEvent doesn't handle self-event, it doesn't need to extend TableInvalidatingEvent. It could just extend MetastoreTableEvent. That would mean that you don't need to rename TableInvalidatingEvent to TableInvalidatingOrRefreshingEvent as well. Right? http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@588 PS4, Line 588: @VisibleForTesting nit, add new line above the constructor. Add a javadoc http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@606 PS4, Line 606: * Currently we do not check for self-events in Inserts. This means insert events add a // TODO : to handle self-events for insert case http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@613 PS4, Line 613: reload/refres nit, just saying refresh is good enough. No need to say reload here. http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@616 PS4, Line 616: refreshed/reloaded same as above. Just say refresh since I don't think reload means anything external. Refresh is a key word in Impala ql so having refresh in the logs makes more sense. http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@623 PS4, Line 623: // No-op: Currently we cannot detect self-events for inserts because of the way the Add a // TODO : One way to do this would be to change hive source code to return the event_id in the response when the fire_listener_event is called and we keep track of it to ignore the event. http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@965 PS4, Line 965: TableInvalidatingOrRefreshingEvent Why do we need to rename? Currently, all the implementations of this sub-class only issue invalidate right? http://gerrit.cloudera.org:8080/#/c/12889/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2588/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 4 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 29 Mar 2019 06:48:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2587/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 3 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 29 Mar 2019 06:42:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. The event-processor will refresh tables using this event. Both insert into and overwrite are supported for tables/partitions. Also, renamed TableInvalidatingEvent class to TableInvalidatingOrRefreshingEvent to reflect new behaviour. Known Issues: 1. There is an unnecessary table invalidate when insert is done in Hive as the insert operation creates an ALTER and an INSERT notification event. Currently there is no way for the Event Processor to identify and prevent the unnecessary invalidate. IMPALA-7973 may potentially solve this issue. 2. Detection of self-events does not work for inserts currently because of the way the self-event checks are implemented. The flags added to test for self events have no way to persist in HMS with just an insert operation. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: Wrote new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java A tests/custom_cluster/test_event_processing.py 7 files changed, 302 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/4 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 4 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Bharath Krishna has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@597 PS2, Line 597: msTbl_ = insertMessage.getTableObj(); We can use Preconditions.checkNotNull(insertMessage.getTableObj()) http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@606 PS2, Line 606: // Currently we do not check for self-events in Inserts. This means insert events I think we should add this as a javadoc comment for the process method. http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3716 PS2, Line 3716: } nit : typo in variable name existigPartNames http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3770 PS2, Line 3770: FireEventRequestData data = new FireEventRequestData(); We can add Table name to the debug log. http://gerrit.cloudera.org:8080/#/c/12889/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3778 PS2, Line 3778: rqst.setPartitionVals(partVals); Should this condition be reversed, because it will be false by default? if (isOverwrite) insertData.setReplace(true); -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 3 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 29 Mar 2019 06:00:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. The event-processor will refresh tables using this event. Both insert into and overwrite are supported for tables/partitions. Also, renamed TableInvalidatingEvent class to TableInvalidatingOrRefreshingEvent to reflect new behaviour. Known Issues: 1. There is an unnecessary table invalidate when insert is done in Hive as the insert operation creates an ALTER and an INSERT notification event. Currently there is no way for the Event Processor to identify and prevent the unnecessary invalidate. IMPALA-7973 may potentially solve this issue. 2. Detection of self-events does not work for inserts currently because of the way the self-event checks are implemented. The flags added to test for self events have no way to persist in HMS with just an insert operation. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: Wrote new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java A tests/custom_cluster/test_event_processing.py 7 files changed, 299 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/3 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 3 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12889 ) Change subject: IMPALA-7971: Add support for insert events in event processor. .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2586/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 1 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 29 Mar 2019 05:31:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Hello Bharath Vissapragada, Paul Rogers, Vihang Karajgaonkar, Bharath Krishna, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12889 to look at the new patch set (#2). Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. The event-processor will refresh tables using this event. Both insert into and overwrite are supported for tables/partitions. Also, renamed TableInvalidatingEvent class to TableInvalidatingOrRegreshingEvent to reflect new behaviour. Known Issues: 1. There is an unnecessary table invalidate when insert is done in Hive as the insert operation creates an ALTER and an INSERT notification event. Currently there is no way for the Event Processor to identify and prevent the unnecessary invalidate. IMPALA-7973 may potentially solve this issue. 2. Detection of self-events does not work for inserts currently because of the way the self-event checks are implemented. The flags added to test for self events have no way to persist in HMS with just an insert operation. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: Added new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java A tests/custom_cluster/test_event_processing.py 7 files changed, 301 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/2 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 2 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Bharath Krishna Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.
Anurag Mantripragada has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12889 Change subject: IMPALA-7971: Add support for insert events in event processor. .. IMPALA-7971: Add support for insert events in event processor. This patch adds support for detecting and processing insert events triggered by impala as well as external engines (eg.Hive). Inserts from Impala will fire an insert event notification. The event-processor will refresh tables using this event. Both insert into and overwrite are supported for tables/partitions. Also, renamed TableInvalidatingEvent class to TableInvalidatingOrRegreshingEvent to reflect new behaviour. Known Issues: 1. There is an unnecessary table invalidate when insert is done in Hive as the insert operation creates an ALTER and an INSERT notification event. Currently there is no way for the Event Processor to identify and prevent the unnecessary invalidate. IMPALA-7973 may potentially solve this issue. 2. Detection of self-events does not work for inserts currently because of the way the self-event checks are implemented. The flags added to test for self events have no way to persist in HMS with just an insert operation. Therefore, the event-processor will also refresh the tables for which insert operation is performed through Impala. Testing: Wrote new custom cluster tests to run different insert commands from hive and verified new data is available in Impala without invalidate metadata. Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java A tests/custom_cluster/test_event_processing.py 7 files changed, 301 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12889/1 -- To view, visit http://gerrit.cloudera.org:8080/12889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52 Gerrit-Change-Number: 12889 Gerrit-PatchSet: 1 Gerrit-Owner: Anurag Mantripragada