Quanlong Huang created IMPALA-12488:
---------------------------------------
Summary: Don't go into NEEDS_INVALIDATE state for trivial
ALTER_TABLE event issues
Key: IMPALA-12488
URL: https://issues.apache.org/jira/browse/IMPALA-12488
Project: IMPALA
Issue Type: Improvement
Components: Catalog
Reporter: Quanlong Huang
I can easily stop the event-processor with two Hive commands:
{code:sql}
create table mytbl(i int) tblproperties('impala.disableHmsSync'='true');
alter table mytbl set tblproperties('impala.disableHmsSync'='false'); {code}
The CREATE_TABLE event is skipped because 'impala.disableHmsSync' is set to
'true'. The follow-up ALTER_TABLE event then puts the event-processor into the
NEEDS_INVALIDATE state, which requires a global INVALIDATE METADATA to recover.
That is expensive and impacts all users.
Event processor logs:
{noformat}
I1009 10:28:19.685743 14387 MetastoreEvents.java:289] Total number of events
received: 2 Total number of events filtered out: 0
I1009 10:28:19.685840 14387 MetastoreEvents.java:293] Incremented skipped
metric to 0
I1009 10:28:19.686319 14387 MetastoreEvents.java:628] EventId: 8369359
EventType: CREATE_TABLE Found table level flag impala.disableHmsSync is set to
true for table default.mytbl
I1009 10:28:19.686364 14387 MetastoreEvents.java:628] EventId: 8369359
EventType: CREATE_TABLE Skipping this event because of flag evaluation
I1009 10:28:19.686443 14387 MetastoreEvents.java:639] EventId: 8369359
EventType: CREATE_TABLE Incremented skipped metric to 1
I1009 10:28:19.691318 14387 MetastoreEventsProcessor.java:1068] Time elapsed in
processing event batch: 122.248ms
I1009 10:28:19.693006 14387 MetastoreEventsProcessor.java:936] Latest event in
HMS: id=8369360, time=1696818498
I1009 10:28:21.697371 14387 MetastoreEventsProcessor.java:838] Received 2
events. Start event id : 8369360
I1009 10:28:21.703583 14387 MetastoreEvents.java:289] Total number of events
received: 2 Total number of events filtered out: 0
I1009 10:28:21.703678 14387 MetastoreEvents.java:293] Incremented skipped
metric to 1
I1009 10:28:21.703763 14387 MetastoreEvents.java:628] EventId: 8369362
EventType: ALTER_TABLE Before flag value true after flag value false changed
for table default.mytbl
I1009 10:28:21.705121 14387 CatalogServiceCatalog.java:1032] Not a self-event
since the given version is -1 and service id is empty
I1009 10:28:21.706120 14387 MetastoreEvents.java:639] EventId: 8369362
EventType: ALTER_TABLE Automatic refresh on table default.mytbl failed as the
table either does not exist anymore or is not in loaded state.
I1009 10:28:21.706358 14387 MetastoreEventsProcessor.java:1068] Time elapsed in
processing event batch: 8.797ms
E1009 10:28:21.706552 14387 MetastoreEventsProcessor.java:893] Event processing
needs a invalidate command to resolve the state
Java exception follows:
org.apache.impala.catalog.events.MetastoreNotificationNeedsInvalidateException:
EventId: 8369362 EventType: ALTER_TABLE Detected that event sync was turned on
for the table default.mytbl and the table does not exist. Event processing
cannot be continued further. Issue a invalidate metadata command to reset the
event processing state
at
org.apache.impala.catalog.events.MetastoreEvents$AlterTableEvent.process(MetastoreEvents.java:1533)
at
org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:522)
at
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1055)
at
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:884)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
W1009 10:28:22.721551 14387 MetastoreEventsProcessor.java:877] Event processing
is skipped since status is NEEDS_INVALIDATE. Last synced event id is 8369360
W1009 10:28:23.722357 14387 MetastoreEventsProcessor.java:877] Event processing
is skipped since status is NEEDS_INVALIDATE. Last synced event id is
8369360{noformat}
The cause is that reloadTableFromCatalog() returns false:
[https://github.com/apache/impala/blob/782cda449/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1524]
More specifically, the catalog_.reloadTableIfExists() call inside it returns false:
[https://github.com/apache/impala/blob/782cda449/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2819]
There are several possible causes. We can handle each of them differently and
avoid going into the NEEDS_INVALIDATE state:
# If the table doesn't exist in the catalog cache (including the
DatabaseNotFoundException case), check whether it exists in HMS. Yes => add it
in the unloaded state (IncompleteTable). No => ignore the event.
# If the table exists in the catalog cache and is in the unloaded state, ignore
the event. We already do so in many other places.
# If the table is loaded in catalog cache and somehow eventId <=
table.getCreateEventId(), just ignore the event since it's stale.
# If the table is loaded in the catalog cache and {{reloadTable()}} on it fails
with TableLoadingException, invalidate only that table since its metadata is
probably messed up.
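The four cases above can be sketched as a single decision function. This is only an illustration of the proposed handling, not existing Impala code: the class {{AlterTableRecoverySketch}}, the enums {{CacheState}} and {{Action}}, and the method {{decide()}} are all hypothetical names. The HMS lookup and the {{TableLoadingException}} are modeled as plain booleans so that the sketch runs on its own.

```java
// Hypothetical sketch of the proposed handling; none of these types exist in Impala.
final class AlterTableRecoverySketch {
  // State of the table in the catalog cache when the ALTER_TABLE event arrives.
  enum CacheState { MISSING, UNLOADED, LOADED }

  // What the event processor would do instead of going into NEEDS_INVALIDATE.
  enum Action { ADD_INCOMPLETE_TABLE, IGNORE_EVENT, INVALIDATE_TABLE, RELOADED }

  /**
   * @param cacheState    table state in the catalog cache (MISSING also covers
   *                      the case where the database itself is not found)
   * @param existsInHms   assumed result of an HMS lookup; only used when MISSING
   * @param eventId       id of the ALTER_TABLE event being processed
   * @param createEventId create event id recorded on the cached table
   * @param reloadFailed  stands in for reloadTable() throwing TableLoadingException
   */
  static Action decide(CacheState cacheState, boolean existsInHms,
      long eventId, long createEventId, boolean reloadFailed) {
    switch (cacheState) {
      case MISSING:
        // Case 1: add the table as unloaded if HMS has it, otherwise ignore.
        return existsInHms ? Action.ADD_INCOMPLETE_TABLE : Action.IGNORE_EVENT;
      case UNLOADED:
        // Case 2: the table will be loaded fresh on first access.
        return Action.IGNORE_EVENT;
      case LOADED:
        // Case 3: the event predates the cached table, so it is stale.
        if (eventId <= createEventId) return Action.IGNORE_EVENT;
        // Case 4: a failed reload invalidates only this table.
        return reloadFailed ? Action.INVALIDATE_TABLE : Action.RELOADED;
      default:
        throw new IllegalArgumentException("Unknown state: " + cacheState);
    }
  }

  public static void main(String[] args) {
    // Scenario from this report: CREATE_TABLE was skipped, so the table is
    // missing from the cache but exists in HMS. Prints ADD_INCOMPLETE_TABLE.
    System.out.println(decide(CacheState.MISSING, true, 8369362L, -1L, false));
  }
}
```

In every branch the outcome is scoped to one table, so none of them requires a global INVALIDATE METADATA.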