Quanlong Huang created IMPALA-12577:
---------------------------------------

             Summary: last-synced-event-time is not updated when events are 
filtered out
                 Key: IMPALA-12577
                 URL: https://issues.apache.org/jira/browse/IMPALA-12577
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
            Reporter: Quanlong Huang
            Assignee: Quanlong Huang
         Attachments: Selection_099.png

In a quiet CDP cluster, the event-processing lag metric shows 8h even though 
the number of pending events is 0.

After some debugging, I realized that some canary tests keep creating 
db/tables. These are not in the default catalog of Hive, so their events are 
skipped. EventProcessor updates the last synced event id, but doesn't update 
the last synced event time correspondingly:
{code:java}
if (filteredEvents.isEmpty()) {
  lastSyncedEventId_.set(events.get(events.size() - 1).getEventId());
  // Should update lastSyncedEventTimeSecs_ here
  return;
}{code}
[https://github.com/apache/impala/blob/d01d028b0727fc36e66709e754cadbf8d89c6a21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L1103]
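A minimal, self-contained sketch of the intended behavior follows. It is not the actual MetastoreEventsProcessor code: the class EventSyncSketch, the Event type, the "hive" catalog filter and the lagSecs() helper are all hypothetical stand-ins, and the lag formula is an assumption. It only illustrates advancing the id and the time together when a whole batch is filtered out.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;

public class EventSyncSketch {
  /** Hypothetical stand-in for an HMS notification event. */
  public static final class Event {
    final long id;
    final long timeSecs;
    final String catalog;

    public Event(long id, long timeSecs, String catalog) {
      this.id = id;
      this.timeSecs = timeSecs;
      this.catalog = catalog;
    }
  }

  private final AtomicLong lastSyncedEventId = new AtomicLong(-1);
  private final AtomicLong lastSyncedEventTimeSecs = new AtomicLong(0);

  /** Processes one batch; events outside the assumed default catalog are skipped. */
  public void processBatch(List<Event> events) {
    if (events.isEmpty()) return;
    List<Event> filteredEvents = events.stream()
        .filter(e -> "hive".equals(e.catalog))  // assumed default catalog name
        .collect(Collectors.toList());
    if (filteredEvents.isEmpty()) {
      // Everything was skipped: advance the id AND the time of the last event.
      Event last = events.get(events.size() - 1);
      lastSyncedEventId.set(last.id);
      lastSyncedEventTimeSecs.set(last.timeSecs);
      return;
    }
    for (Event e : filteredEvents) {
      // Real processing of the event would happen here.
      lastSyncedEventId.set(e.id);
      lastSyncedEventTimeSecs.set(e.timeSecs);
    }
  }

  public long getLastSyncedEventId() { return lastSyncedEventId.get(); }

  public long getLastSyncedEventTimeSecs() { return lastSyncedEventTimeSecs.get(); }

  /** Assumed lag formula: reference time minus last synced event time, floored at 0. */
  public long lagSecs(long referenceTimeSecs) {
    return Math.max(0, referenceTimeSecs - lastSyncedEventTimeSecs.get());
  }
}
```

With this shape, a batch that contains only skipped canary events still moves the synced time forward, so the lag computed from it no longer grows while the cluster is otherwise idle.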

There are also other places where we update lastSyncedEventId_ without 
updating lastSyncedEventTimeSecs_:
 * At startup or after a global INVALIDATE METADATA, we only fetch the latest 
event id, so we can't update lastSyncedEventTimeSecs_.
 * When all events are filtered out on the HMS side due to the 
eventTypeSkipList.

It'd be nice to fetch the event time as well to keep lastSyncedEventTimeSecs_ 
correct. It's used to calculate the lag, so a stale value might trigger alerts.


