[ 
https://issues.apache.org/jira/browse/IMPALA-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-12577:
------------------------------------
    Description: 
In a quiet CDP cluster, the event-processing lag metric shows 8h. 
However, the number of pending events is 0.

!Selection_099.png|width=479,height=201!

After some debugging, I realized that some canary tests keep creating 
dbs/tables. They are not in the default catalog of Hive, so their events are 
skipped. EventProcessor updates the last synced event id, but doesn't update 
the last synced event time correspondingly:
{code:java}
if (filteredEvents.isEmpty()) {
  lastSyncedEventId_.set(events.get(events.size() - 1).getEventId());
  // Should update lastSyncedEventTimeSecs_ here
  return;
}{code}
[https://github.com/apache/impala/blob/d01d028b0727fc36e66709e754cadbf8d89c6a21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L1103]
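
For illustration, a minimal sketch of what advancing both fields together in that branch could look like. This is not the Impala code: SyncedEventTracker, its Event class, and the skipped flag are hypothetical stand-ins, and real event processing is omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of the event processor; not the Impala class.
class SyncedEventTracker {
  // Stand-in for an HMS notification event: id, creation time in seconds, and
  // a flag saying whether the processor would filter it out.
  static final class Event {
    final long eventId;
    final long eventTimeSecs;
    final boolean skipped;

    Event(long eventId, long eventTimeSecs, boolean skipped) {
      this.eventId = eventId;
      this.eventTimeSecs = eventTimeSecs;
      this.skipped = skipped;
    }
  }

  final AtomicLong lastSyncedEventId = new AtomicLong(-1);
  final AtomicLong lastSyncedEventTimeSecs = new AtomicLong(0);

  void processEvents(List<Event> events) {
    if (events.isEmpty()) return;
    List<Event> filteredEvents = new ArrayList<>();
    for (Event e : events) {
      if (!e.skipped) filteredEvents.add(e);
    }
    Event last = events.get(events.size() - 1);
    if (filteredEvents.isEmpty()) {
      // Advance the id and the time together so the lag doesn't go stale.
      lastSyncedEventId.set(last.eventId);
      lastSyncedEventTimeSecs.set(last.eventTimeSecs);
      return;
    }
    // Real processing of filteredEvents is omitted in this sketch.
    lastSyncedEventId.set(last.eventId);
    lastSyncedEventTimeSecs.set(last.eventTimeSecs);
  }
}
```

The point of the sketch is only that the id and the time are set from the same event, so they can't drift apart when a whole batch is skipped.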

There are also other places where we just update lastSyncedEventId_ without 
lastSyncedEventTimeSecs_:
 * At startup or global INVALIDATE METADATA, we just get the latest event id, so 
we can't update lastSyncedEventTimeSecs_.
 * When all events are filtered out on the HMS side due to the eventTypeSkipList.

It'd be nice to fetch the event time as well to keep lastSyncedEventTimeSecs_ 
correct. It's used to calculate the lag and might trigger alerts.
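
To show why a stale time matters, here is a small sketch of the symptom. It assumes the lag is the latest event time minus the last synced event time, and that pending events are the difference of the event ids. The exact formulas in Impala may differ, and EventLagExample and its methods are made-up names.

```java
// Hypothetical helper illustrating the symptom; the formulas are assumptions.
class EventLagExample {
  // Assumed lag: latest event time minus last synced event time, in seconds.
  static long lagSecs(long latestEventTimeSecs, long lastSyncedEventTimeSecs) {
    return Math.max(0L, latestEventTimeSecs - lastSyncedEventTimeSecs);
  }

  // Assumed pending count: latest event id minus last synced event id.
  static long pendingEvents(long latestEventId, long lastSyncedEventId) {
    return Math.max(0L, latestEventId - lastSyncedEventId);
  }

  public static void main(String[] args) {
    long latestEventId = 5000L;
    long latestEventTimeSecs = 1700028800L;
    // The id was advanced past the skipped events, but the time was last
    // updated 8 hours (8 * 3600 seconds) earlier.
    long lastSyncedEventId = 5000L;
    long lastSyncedEventTimeSecs = latestEventTimeSecs - 8L * 3600L;

    System.out.println("pending = "
        + pendingEvents(latestEventId, lastSyncedEventId));
    System.out.println("lag secs = "
        + lagSecs(latestEventTimeSecs, lastSyncedEventTimeSecs));
  }
}
```

Under these assumptions the pending count is 0 while the lag is 28800 seconds, which matches what the metric shows.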


> last-synced-event-time is not updated when events are filtered out
> ------------------------------------------------------------------
>
>                 Key: IMPALA-12577
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12577
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>         Attachments: Selection_099.png
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
