[
https://issues.apache.org/jira/browse/IMPALA-12460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773328#comment-17773328
]
ASF subversion and git services commented on IMPALA-12460:
----------------------------------------------------------
Commit 68c831176b82612f0146887e571554f20ce29c1e in impala's branch
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=68c831176 ]
IMPALA-12460: Add lag and histogram of event processing in the log
This patch logs the lag of the event processing which is
(latestEventTime - lastSyncedEventTime) at the end of processing an
event batch. If the batch is slow to process, we also log the top-10
expensive events and the top-10 targets that contribute to this. Admins
can decide whether to disable event processing on some tables.
For table and partition level events, the target name is the table name.
For db level events, the target name is the db name. For events that
don't have db/table names, e.g. COMMIT_TXN, the target name is a
constant string - "CLUSTER_WIDE".
Log parsing example for expensive events:
-- In shell:
$ grep 'expensive events' catalogd.INFO > expensive-events.log
-- In Python:
import re
EVENT_PATTERN = re.compile(r"\(type=(\w+), id=(\d+), target=([\w\.]+),
duration_ms=(\d+)\)")
re.findall(EVENT_PATTERN, log_line)
Log parsing example for top targets in event processing:
-- In shell:
$ grep 'targets in event processing' catalogd.INFO > slow-targets.log
-- In Python:
import re
re.findall(r"\(target=([\w\.]+), duration_ms=(\d+)\)", log_line)
This patch also adds a thread annotation on the event-processor thread
so we know what event is under processing using the jstacks.
The unit of the event time is second. This patch fixes some variable
names that show them with ms.
Tests:
- Manually add some delay in event processing, verify the logs and
jstacks.
Change-Id: Ib9421b5e26bfa2324217ec9695fbd81636727d22
Reviewed-on: http://gerrit.cloudera.org:8080/20507
Reviewed-by: Daniel Becker <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Add lag and histogram of event processing in the log
> ----------------------------------------------------
>
> Key: IMPALA-12460
> URL: https://issues.apache.org/jira/browse/IMPALA-12460
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
>
> At the end of processing each event batch, we log the processing time of that
> batch and the latest event info inside HMS, e.g.
> {code:java}
> I0923 04:02:05.530905 12886 MetastoreEventsProcessor.java:859] Time elapsed
> in processing event batch: 28m1s
> I0923 04:02:05.539295 12886 MetastoreEventsProcessor.java:730] Latest event
> in HMS: id=56918528, time=1695459725{code}
> We should also log the lag of the event processing which is (latestEventTime
> - lastSyncedEventTime). For each event batch that is slow to synced, we
> should also print a histogram to show what are the top10 expensive events to
> sync, and what are the top10 tables that take longest time to sync. So admins
> can decide which tables to disable event processing on them.
> CC [~hemanth619] , [~csringhofer] , [~wzhou] , [~kdeschle]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]