Quanlong Huang created IMPALA-12577:
---------------------------------------
Summary: last-synced-event-time is not updated when events are
filtered out
Key: IMPALA-12577
URL: https://issues.apache.org/jira/browse/IMPALA-12577
Project: IMPALA
Issue Type: Bug
Components: Catalog
Reporter: Quanlong Huang
Assignee: Quanlong Huang
Attachments: Selection_099.png
In a quiet CDP cluster, I see the event-processing lag metric at 8h.
However, the number of pending events is 0.
After some debugging, I realized that some canary tests keep creating
dbs/tables. They are not in the default catalog of Hive, so their events are
skipped. EventProcessor updates the last synced event id, but doesn't update
the last synced event time correspondingly:
{code:java}
if (filteredEvents.isEmpty()) {
  lastSyncedEventId_.set(events.get(events.size() - 1).getEventId());
  // Should update lastSyncedEventTimeSecs_ here
  return;
}
{code}
[https://github.com/apache/impala/blob/d01d028b0727fc36e66709e754cadbf8d89c6a21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L1103]
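A minimal sketch of the change suggested in the comment above, not the actual Impala patch. SketchEvent and SyncedEventTracker are hypothetical stand-ins for the HMS notification event and the relevant MetastoreEventsProcessor state. The sketch assumes each event carries its creation time in seconds and that the batch is ordered by event id:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for an HMS notification event (only the fields used here).
final class SketchEvent {
  final long eventId;
  final long eventTimeSecs;
  final String catalogName;

  SketchEvent(long eventId, long eventTimeSecs, String catalogName) {
    this.eventId = eventId;
    this.eventTimeSecs = eventTimeSecs;
    this.catalogName = catalogName;
  }
}

// Hypothetical holder for the two "last synced" values discussed in this issue.
class SyncedEventTracker {
  final AtomicLong lastSyncedEventId = new AtomicLong(-1);
  final AtomicLong lastSyncedEventTimeSecs = new AtomicLong(0);

  // Returns the number of events that survived filtering.
  int processBatch(List<SketchEvent> events, String wantedCatalog) {
    if (events.isEmpty()) return 0;
    List<SketchEvent> filteredEvents = new ArrayList<>();
    for (SketchEvent e : events) {
      if (wantedCatalog.equals(e.catalogName)) filteredEvents.add(e);
    }
    if (filteredEvents.isEmpty()) {
      // Everything was skipped: advance the id AND the time together,
      // so the lag metric does not keep growing on a quiet cluster.
      SketchEvent last = events.get(events.size() - 1);
      lastSyncedEventId.set(last.eventId);
      lastSyncedEventTimeSecs.set(last.eventTimeSecs);
      return 0;
    }
    for (SketchEvent e : filteredEvents) {
      // Real event processing would happen here.
      lastSyncedEventId.set(e.eventId);
      lastSyncedEventTimeSecs.set(e.eventTimeSecs);
    }
    return filteredEvents.size();
  }

  public static void main(String[] args) {
    SyncedEventTracker tracker = new SyncedEventTracker();
    tracker.processBatch(Arrays.asList(
        new SketchEvent(10, 1000, "other"),
        new SketchEvent(11, 1060, "other")), "hive");
    // prints "11 @ 1060"
    System.out.println(tracker.lastSyncedEventId.get() + " @ "
        + tracker.lastSyncedEventTimeSecs.get());
  }
}
```

Setting both values from the same event keeps the id and the time consistent with each other, which is what the lag calculation relies on.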
There are also other places where we update lastSyncedEventId_ without
lastSyncedEventTimeSecs_:
* At startup or global INVALIDATE METADATA, we just get the latest event id so
can't update the lastSyncedEventTimeSecs_
* When events are all filtered out on the HMS side due to the eventTypeSkipList.
It'd be nice to fetch the event time as well to keep lastSyncedEventTimeSecs_
correct. It's used to calculate the lag and might trigger alerts.
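To illustrate how the symptom arises, here is a rough sketch. It assumes the lag is derived as the latest event time minus the last synced event time, and the pending count as the difference of the event ids. LagSketch and its method names are hypothetical, and the exact formula in Impala may differ:

```java
// Hypothetical helper showing how a stale last-synced time inflates the lag
// even though no events are pending.
class LagSketch {
  // Assumed lag formula: time of the latest HMS event minus the last synced time.
  static long lagSecs(long latestEventTimeSecs, long lastSyncedEventTimeSecs) {
    return Math.max(0L, latestEventTimeSecs - lastSyncedEventTimeSecs);
  }

  // Assumed pending count: how far the synced id trails the latest id.
  static long pendingEvents(long latestEventId, long lastSyncedEventId) {
    return Math.max(0L, latestEventId - lastSyncedEventId);
  }

  public static void main(String[] args) {
    long latestEventId = 500L;
    long lastSyncedEventId = 500L;             // id was advanced for skipped events
    long latestEventTimeSecs = 1_700_028_800L;
    long staleSyncedTimeSecs = 1_700_000_000L; // time was not advanced (8h behind)

    // prints "pending=0 lagSecs=28800"
    System.out.println("pending=" + pendingEvents(latestEventId, lastSyncedEventId)
        + " lagSecs=" + lagSecs(latestEventTimeSecs, staleSyncedTimeSecs));
  }
}
```

With the id current and the time stale, the pending count is 0 while the lag reads 8h, matching what the cluster showed.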