[jira] [Commented] (IMPALA-11490) More metrics to debug event processing lagging behind

ASF subversion and git services (Jira) Mon, 10 Oct 2022 19:43:05 -0700


    [ 
https://issues.apache.org/jira/browse/IMPALA-11490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17615422#comment-17615422
 ]


ASF subversion and git services commented on IMPALA-11490:
----------------------------------------------------------

Commit a52721e62761957753f482124c1e76f453d22b1d in impala's branch 
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a52721e62 ]

IMPALA-11644: updateLatestEventId should handle cases of empty events

IMPALA-11490 adds a catalog metric for the latest event id in HMS. The
fetched events can be empty if no events were generated in the past 24
hours, since the retention duration for notification events in HMS is 24
hours by default. This case is not handled, so the thread that updates
the latestEventId metric keeps throwing a NoSuchElementException until
new events are generated.

This patch handles the empty case to avoid the exception. It also sets
the initial value of latestEventId to 0, which is the value returned by
getCurrentNotificationEventId() when there are no events in HMS.
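
The guard described above can be sketched roughly as follows. This is an
illustration, not the actual patch: the class LatestEventIdSketch, the use
of plain Long ids in place of HMS notification events, and the choice to
keep the previous value when nothing is fetched are all assumptions.

```java
import java.util.List;

// Hypothetical sketch; names and types are illustrative assumptions.
class LatestEventIdSketch {
  // Starts at 0, the value the commit message says
  // getCurrentNotificationEventId() returns when HMS has no events.
  private long latestEventId = 0;

  // fetchedEventIds stands in for the events fetched from HMS,
  // ordered by event id; only the ids are modeled here.
  void updateLatestEventId(List<Long> fetchedEventIds) {
    if (fetchedEventIds.isEmpty()) {
      // Nothing was fetched (for example, all events have expired).
      // Assumption: keep the previous value instead of reading the
      // last element, which is what would otherwise fail.
      return;
    }
    latestEventId = fetchedEventIds.get(fetchedEventIds.size() - 1);
  }

  long getLatestEventId() {
    return latestEventId;
  }
}
```

With this shape, an empty fetch leaves the metric unchanged and no
exception propagates out of the updating thread.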

Tests
- Clean up the notification events in HMS by truncating the
  NOTIFICATION_SEQUENCE and NOTIFICATION_LOG tables in the underlying
  PostgreSQL. Then launch HMS and Impala. Verified the exception
  disappears.
- Add a breakpoint in updateLatestEventId() to stop before fetching the
  events and after getting the current event id. Clean up the
  notification events in HMS. Then resume catalogd. Verified no
  exceptions are thrown.

Change-Id: I0f207fff1ff59376e30afdc3cd074c950a1c3ddb
Reviewed-on: http://gerrit.cloudera.org:8080/19112
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> More metrics to debug event processing lagging behind
> -----------------------------------------------------
>
>                 Key: IMPALA-11490
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11490
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>              Labels: supportability
>             Fix For: Impala 4.2.0
>
>
> Event processor could lag behind in many cases, e.g. processing lots of events 
> on large tables, waiting for table locks held by manual refresh or other 
> metadata operations, etc.
> Currently we have a metric on the last synced event id. We should also add 
> a metric on the latest event id in HMS. Users can compare them to know whether 
> event processing is lagging behind.
> We should also add logs/metrics on tables that take a long time in event 
> processing, especially those that take longer than the event polling interval, 
> so users can decide whether to disable event processing on them or reduce the 
> concurrency of metadata operations on them.
> Some metrics, like the average event processing duration in the last 5min, 
> 30min or 1h, will also be helpful.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
