[
https://issues.apache.org/jira/browse/IMPALA-10949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817186#comment-17817186
]
Sai Hemanth Gantasala commented on IMPALA-10949:
------------------------------------------------
The test introduced in this patch fails intermittently.
The logs show that the HMS events had already been synced before the
refresh was triggered from Impala:
{code:java}
W0208 13:52:16.556460 23760 MetastoreEventsProcessor.java:1026] Lag: 12s. 25 events pending to be processed.
I0208 13:52:16.561388 23759 MetastoreEvents.java:292] Total number of events received: 25 Total number of events filtered out: 0
I0208 13:52:16.561439 23759 MetastoreEvents.java:296] Incremented skipped metric to 27
I0208 13:52:16.561625 23759 MetastoreEvents.java:637] EventId: 94055 EventType: ALTER_PARTITIONS Created a batch event for 24 events from 94032 to 94055
I0208 13:52:16.563655 23759 MetastoreEvents.java:2466] Ignoring events from event id 94032 to 94055 since they modify parameters which can be ignored
I0208 13:52:16.565222 23759 MetastoreEventsProcessor.java:1189] Time elapsed in processing event batch: 11.538ms
I0208 13:52:16.641959 25356 JniUtil.java:166] 8744ff74b87a5d72:a129d71800000000] resetMetadata request: REFRESH TABLE test_skipping_batching_events_b5611edb.test_batch_table issued by jenkins
I0208 13:52:16.643860 25356 CatalogServiceCatalog.java:2642] 8744ff74b87a5d72:a129d71800000000] Refreshing table metadata: test_skipping_batching_events_b5611edb.test_batch_table
{code}
To avoid this flakiness, the addendum patch bumps up the event polling interval.
> Improve batching logic of events
> --------------------------------
>
> Key: IMPALA-10949
> URL: https://issues.apache.org/jira/browse/IMPALA-10949
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Reporter: Vihang Karajgaonkar
> Assignee: Sai Hemanth Gantasala
> Priority: Major
> Fix For: Impala 4.4.0
>
>
> This is a follow-up based on the review comment
> https://gerrit.cloudera.org/#/c/17848/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1641
> The current batching approach groups together the events from a single
> operation so that the self-event check is done per batch. However, there is
> considerable scope for improving the batching logic by clubbing together
> events across the various sources of events on a table once
> IMPALA-10926 is merged. After IMPALA-10926, each table will track its
> last_synced_event, and the events processor can then simply ignore any event
> whose id is <= last_synced_event. This simplification of the self-events
> logic will enable easier batching of all the events of a type on a table.
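
As a rough illustration of the idea in the description above, here is a minimal Java sketch. It is not Impala's implementation: the class LastSyncedEventFilter, the Event and Table stand-ins, and the methods pendingEvents and batchByType are all hypothetical names made up for this example. It assumes each table keeps a single last-synced event id, drops events at or below that id, and then batches consecutive remaining events of the same type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; these are not Impala's actual classes. */
public class LastSyncedEventFilter {

  /** Minimal stand-in for an HMS notification event on a single table. */
  static final class Event {
    final long id;
    final String type;
    Event(long id, String type) { this.id = id; this.type = type; }
  }

  /** Minimal stand-in for a table that tracks the last event it is synced to. */
  static final class Table {
    final long lastSyncedEventId;
    Table(long lastSyncedEventId) { this.lastSyncedEventId = lastSyncedEventId; }
  }

  /** Keeps only events newer than the table's last synced event id. */
  static List<Event> pendingEvents(Table table, List<Event> events) {
    List<Event> pending = new ArrayList<>();
    for (Event e : events) {
      // An event with id <= lastSyncedEventId is already reflected in the table.
      if (e.id <= table.lastSyncedEventId) continue;
      pending.add(e);
    }
    return pending;
  }

  /** Groups consecutive events of the same type into batches, preserving order. */
  static List<List<Event>> batchByType(List<Event> events) {
    List<List<Event>> batches = new ArrayList<>();
    List<Event> current = new ArrayList<>();
    for (Event e : events) {
      // Start a new batch whenever the event type changes.
      if (!current.isEmpty() && !current.get(0).type.equals(e.type)) {
        batches.add(current);
        current = new ArrayList<>();
      }
      current.add(e);
    }
    if (!current.isEmpty()) batches.add(current);
    return batches;
  }

  public static void main(String[] args) {
    Table table = new Table(2L);
    List<Event> events = Arrays.asList(
        new Event(1L, "ALTER_PARTITION"),
        new Event(2L, "ALTER_PARTITION"),
        new Event(3L, "ALTER_PARTITION"),
        new Event(4L, "ALTER_PARTITION"),
        new Event(5L, "INSERT"));
    List<Event> pending = pendingEvents(table, events);
    System.out.println("pending=" + pending.size()
        + " batches=" + batchByType(pending).size());
  }
}
```

With a single id comparison in place of a per-operation self-event check, the batching step no longer needs to know which operation produced an event, which is the simplification the description anticipates.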