[jira] [Commented] (IMPALA-13884) Improves logging of slow event processing

ASF subversion and git services (Jira) Fri, 28 Mar 2025 18:15:18 -0700


    [ 
https://issues.apache.org/jira/browse/IMPALA-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17939358#comment-17939358
 ]


ASF subversion and git services commented on IMPALA-13884:
----------------------------------------------------------

Commit 8a3642864edbec450661a1aabd21dab0fc574c27 in impala's branch 
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=8a3642864 ]

IMPALA-13884: Add more details in metadata loading logs

Adds the target of the event in HMS event processing logs. For
AlterPartitionEvent, the partition name will also be printed. Note that
Add/DropPartitionEvent could have multiple partitions so ignore them for
now. E.g.
EventId: 5271 EventType: ALTER_PARTITION Target: db1.tbl year=2009/month=1
EventId: 28781 EventType: ALTER_TABLE Target: tpcds.store_sales

In the reason of loading table metadata, adds the event id if the
request comes from the EventProcessor. E.g.
Reloading metadata for table definition and all partition(s) of db1.tbl (INSERT 
event 15765)
Reloading partition metadata: db1.tbl part=1 (INSERT event 15685)
Reloading partition metadata: cachedb.cached_tbl_part j=4,j=1,j=2... and 2 
others (batch of 5 ALTER_PARTITIONS events from 38063 to 38067)

For logs of execDdl requests on AlterTable operations, add the
AlterType. E.g.
execDdl request: ALTER_TABLE db1.tbl UPDATE_STATS issued by user1
execDdl request: ALTER_TABLE db1.tbl RECOVER_PARTITIONS issued by user1

For the warning of waiting for table write lock too long in
CatalogServiceCatalog.tryLock(), also logs the stacktrace. E.g.

Write lock for table db1.tbl was acquired in 3329 msec. Caller stacktrace:
org.apache.impala.catalog.CatalogServiceCatalog.tryLock(CatalogServiceCatalog.java:635)
        at 
org.apache.impala.catalog.CatalogServiceCatalog.tryWriteLock(CatalogServiceCatalog.java:537)
        at 
org.apache.impala.catalog.CatalogServiceCatalog.evaluateSelfEvent(CatalogServiceCatalog.java:1250)
        at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.isSelfEvent(MetastoreEvents.java:928)
        at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.isSelfEvent(MetastoreEvents.java:1259)
        at 
org.apache.impala.catalog.events.MetastoreEvents$BatchPartitionEvent.processTableEvent(MetastoreEvents.java:2819)
        at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.process(MetastoreEvents.java:1348)
        at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:703)
        at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1354)
        at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1097)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)

For metadata loading that got ignored due to the table being updated,
e.g. invalidated concurrently, log the reason. E.g.
Not updating table db1.tbl since it has been modified. Current catalog version: 
30229. Expected catalog version: 25877

Adds a log when start recovering partitions. E.g.
Recovering 9 partitions for db1.tbl

Also adds an overload of AcidUtils.isTransactionalTable() for FeTable to
simplify some codes.

Fix a wrong errorMessageTemplate of Preconditions.checkArgument() in
FeCatalogUtils.parsePartitionKeyValues(). Only "%s" can be used as
placeholders. Arguments will be converted to strings using
String.valueOf(Object) so don't need "%d".

Tests:
 - Passed exhaustive tests

Change-Id: I204d5922e055fd8501b5573e3b913f8874d891d6
Reviewed-on: http://gerrit.cloudera.org:8080/22653
Reviewed-by: Riza Suminto <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Improves logging of slow event processing
> -----------------------------------------
>
>                 Key: IMPALA-13884
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13884
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>
> I saw a slow event processing log like this:
> {code:java}
> W0306 20:54:02.202912   279 MetastoreEventsProcessor.java:1417] Top 10 
> expensive events: (type=INSERT, id=4485886, 
> target=highpartition_small_impala.stocks_highpartition_small_file, 
> duration_ms=40529)...{code}
> However, I can't find any logs of this event directly. Instead, I have to 
> find logs of the previous event 4485885
> {code:java}
> I0306 20:52:29.213896   279 MetastoreEvents.java:838] EventId: 4485885 
> EventType: ADD_PARTITION Incremented skipped metric to 1101327 since no 
> partitions were added.{code}
> and the next event 4485887
> {code:java}
> I0306 20:53:09.742729   279 MetastoreEvents.java:827] EventId: 4485887 
> EventType: ALTER_PARTITION Not processing this event as it only modifies some 
> partition parameters which can be ignored.{code}
> Then all the logs of thread id 279 between these two are the logs of event 
> 4485886. Note that currently event processing is executed in a dedicated 
> thread (single thread).
> {noformat}
> W0306 20:53:03.690521   279 CatalogServiceCatalog.java:631] Write lock for 
> table highpartition_small_impala.stocks_highpartition_small_file was acquired 
> in 34477 msec
> W0306 20:53:09.326007   279 CatalogServiceCatalog.java:631] Write lock for 
> table highpartition_small_impala.stocks_highpartition_small_file was acquired 
> in 5635 msec
> I0306 20:53:09.326260   279 HdfsTable.java:2914] Reloading partition 
> metadata: highpartition_small_impala.stocks_highpartition_small_file 
> year=2022/month=Feb/day=10 (INSERT event)
> I0306 20:53:09.326365   279 MetaStoreUtil.java:191] Fetching 1 partitions 
> for: highpartition_small_impala.stocks_highpartition_small_file using 
> partition batch size: 1000
> I0306 20:53:09.339805   279 MetaStoreUtil.java:223] Fetched 1/1 partitions 
> for table highpartition_small_impala.stocks_highpartition_small_file
> I0306 20:53:09.342584   279 HdfsTable.java:3072] Setting the latest refresh 
> event id to 5016235 for the reloaded 1 partitions
> I0306 20:53:09.342660   279 HdfsTable.java:3075] for table 
> highpartition_small_impala.stocks_highpartition_small_file, file metadataOps: 
> FORCE_LOAD, refreshing file metadata for 1 out of 1 partitions to reload in 
> reloadPartitions()
> I0306 20:53:09.742367   279 HdfsTable.java:847] Loaded file and block 
> metadata for highpartition_small_impala.stocks_highpartition_small_file 
> partitions: year=2022/month=Feb/day=10. Time taken: 399.520ms{noformat}
> We might not be always lucky to have logs of the previous and next events. 
> It'd be helpful to show the event id somewhere, e.g. in the reason of 
> reloading "(INSERT event)".
> For the warning logs of waiting for write locks too long, we should log the 
> stacktrace so we know what's waiting for it.
> For the skipped events, it'd be helpful to log the table name and partition 
> name.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (IMPALA-13884) Improves logging of slow event processing

Reply via email to