Michael Smith created IMPALA-13989:
--------------------------------------
Summary: ALTER TABLE RENAME can fail with concurrent INVALIDATE
METADATA
Key: IMPALA-13989
URL: https://issues.apache.org/jira/browse/IMPALA-13989
Project: IMPALA
Issue Type: Bug
Components: Catalog
Affects Versions: Impala 5.0.0
Reporter: Michael Smith
IMPALA-13631 no longer holds the catalog's versionLock_ writeLock for the
whole operation (including the HMS RPC). That introduces a possible failure mode
where {{ALTER TABLE RENAME}} fails with
{quote}Table/view rename succeeded in the Hive Metastore, but failed in
Impala's Catalog Server.{quote}
when {{INVALIDATE METADATA}} is run concurrently. This shows up in the new
statements added to test_concurrent_ddls.py.
I can reproduce this error by adding a delay after the HMS {{alter_table}} RPC
completes (and before we call {{getNextMetastoreEventsForTableIfEnabled}}) and
running {{INVALIDATE METADATA}} from another session. I think that suggests the
following scenario:
# The {{alter_table}} RPC completes
# Impala {{invalidate metadata}} executes and processes the {{alter_table}} event
# {{alterTableOrViewRename}} runs {{catalog_.alterTable}}, but the old table has
already been removed from the catalog, so it fails
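
The interleaving above can be replayed deterministically with a toy model. This is only a sketch: {{ToyCatalog}}, {{run_interleaving}} and the table names are made up for illustration and are not Impala classes; the only thing taken from the report is the order of the three steps and the error text.

```python
RENAME_ERROR = (
    "Table/view rename succeeded in the Hive Metastore, "
    "but failed in Impala's Catalog Server."
)


class ToyCatalog:
    """Toy stand-in for catalogd's table cache (hypothetical, not Impala code)."""

    def __init__(self, tables):
        self.tables = set(tables)

    def invalidate_metadata(self, hms_tables):
        # Assumption: a global invalidate rebuilds the cache from what HMS has now.
        self.tables = set(hms_tables)

    def rename(self, old, new):
        # Assumption: the catalog-side rename needs the old entry to still exist.
        if old not in self.tables:
            raise RuntimeError(RENAME_ERROR)
        self.tables.remove(old)
        self.tables.add(new)


def run_interleaving():
    hms = {"db.t_old"}
    catalog = ToyCatalog(hms)

    # Step 1: the alter_table RPC completes, so HMS already has the new name.
    hms.remove("db.t_old")
    hms.add("db.t_new")

    # Step 2: a concurrent global invalidate metadata refreshes the catalog.
    catalog.invalidate_metadata(hms)

    # Step 3: the catalog-side rename runs and no longer finds the old table.
    try:
        catalog.rename("db.t_old", "db.t_new")
    except RuntimeError as e:
        return str(e)
    return "ok"
```

Calling {{run_interleaving()}} returns the error message, because step 2 has already replaced the old name by the time step 3 looks for it.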
This should be pretty rare. Running a global {{invalidate metadata}} is a bad
idea in a production environment as it's akin to restarting catalogd. However, I
think we can address this with better error handling in
{{alterTableOrViewRename}}.
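
One possible shape for that error handling, as a hedged sketch only: the issue does not specify the fix, and {{finish_rename}} below is a hypothetical helper operating on a plain set of table names rather than on Impala's catalog. The idea is to treat "old name missing but new name already present" as a rename that another code path has already applied, and to fail only when neither name is known.

```python
def finish_rename(catalog_tables, old, new):
    """Apply an HMS-completed rename to a set of cached table names.

    Hypothetical helper for illustration; not Impala's alterTableOrViewRename.
    """
    if old in catalog_tables:
        # Normal path: the catalog still has the old entry, so swap the names.
        catalog_tables.remove(old)
        catalog_tables.add(new)
        return "renamed"
    if new in catalog_tables:
        # Assumed race: a concurrent invalidate already picked up the new name,
        # so there is nothing left to do and the rename can report success.
        return "already_applied"
    # Neither name is cached: keep reporting the failure as before.
    raise RuntimeError(
        "Table/view rename succeeded in the Hive Metastore, "
        "but failed in Impala's Catalog Server."
    )
```

For example, {{finish_rename({"db.t_new"}, "db.t_old", "db.t_new")}} returns {{"already_applied"}} instead of raising.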