Quanlong Huang created IMPALA-9214:
--------------------------------------

             Summary: REFRESH with sync_ddl may fail with concurrent INVALIDATE METADATA
                 Key: IMPALA-9214
                 URL: https://issues.apache.org/jira/browse/IMPALA-9214
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
            Reporter: Quanlong Huang


The call trace for executing a REFRESH statement in Catalogd is:
{code:java}
JniCatalog#resetMetadata
  CatalogOpExecutor#execResetMetadata
    CatalogServiceCatalog#reloadTable
    CatalogServiceCatalog#waitForSyncDdlVersion
{code}
In {{CatalogServiceCatalog#reloadTable}}, the {{Tbl}} object may be stale if a 
concurrent reset (i.e. INVALIDATE METADATA) is running. In that case 
{{CatalogServiceCatalog#reloadTable}} returns the thrift object of a stale 
table. That object can be found neither in the catalog cache nor in 
{{topicUpdateLog_}}, so {{waitForSyncDdlVersion}} eventually hangs or runs out 
of attempts.
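To make the "run out of attempts" outcome concrete, here is a minimal sketch of a bounded wait on a sent-version log. It is not the actual {{waitForSyncDdlVersion}} code: the class {{SyncDdlWaitSketch}}, the map {{sentVersions}} and the method {{waitForVersion}} are hypothetical names, and the real implementation blocks between attempts rather than looping immediately.

```java
import java.util.HashMap;
import java.util.Map;

public class SyncDdlWaitSketch {
  // Hypothetical stand-in for a topic update log: table name -> last version sent.
  static final Map<String, Long> sentVersions = new HashMap<>();

  /**
   * Returns true if an update for the table with a version >= expectedVersion
   * is observed within maxAttempts checks, false otherwise.
   */
  static boolean waitForVersion(String table, long expectedVersion, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      Long sent = sentVersions.get(table);
      if (sent != null && sent >= expectedVersion) {
        return true;
      }
      // A real implementation would block here until the next topic update.
    }
    return false;
  }

  public static void main(String[] args) {
    // The cache only ever sends version 90 for table1.
    sentVersions.put("table1", 90L);
    // The caller holds a stale object bumped to 100, so the wait cannot succeed.
    System.out.println(waitForVersion("table1", 100L, 5));
    // A caller waiting on the version that is actually in the cache succeeds.
    System.out.println(waitForVersion("table1", 90L, 5));
  }
}
```

The first call returns false because no update with version >= 100 is ever recorded for table1, while the second returns true.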

Here is an example. Let's say table1 is an unpartitioned table and is loaded. 
Two queries, "Refresh table1" and "Invalidate metadata" are running 
concurrently.

Thread-1 (Refresh):
 * Gets the {{Table}} object in CatalogOpExecutor#execResetMetadata and goes 
into {{reloadTable}}. The catalog version of table1 is 50.
 * Waits for both the version lock and the table lock here: 
[https://github.com/apache/impala/blob/a1588e44980c648cb7f9263cbd0409abfbaeacf7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2023]

Thread-2 (Invalidate Metadata):
 * Holds the version lock and replaces the whole catalog cache with a new one, 
which makes all existing catalog objects stale. Now the catalog version of 
table1 is 90.
 * Releases the version lock.

Thread-1 (Refresh):
 * Gets the version lock and the table lock.
 * Gets a new catalog version, let's say 100, then releases the version lock.
 * Loads the metadata into the stale Table object, bumping its catalog version 
from 50 to 100.
 * Returns the thrift object of the updated stale object from {{reloadTable}}.
 * Goes into {{waitForSyncDdlVersion}} and waits until an update of table1 is 
sent with a version >= 100.

However, table1 in the catalog cache has version 90. Unless there's another 
update on this table, Thread-1 will hang or run out of attempts while waiting 
for the expected update.
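The interleaving above can be replayed deterministically in a single thread. The sketch below is only an illustration under simplifying assumptions: {{StaleRefreshReplay}}, its nested {{Table}} class, {{invalidateAll}} and {{reloadInto}} are hypothetical names, locking is omitted, and the versions 50, 90 and 100 are passed in explicitly to match the example rather than drawn from a real version counter.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleRefreshReplay {
  // Simplified stand-in for a catalog table object.
  static final class Table {
    final String name;
    long version;
    Table(String name, long version) {
      this.name = name;
      this.version = version;
    }
  }

  // Simplified stand-in for the catalog cache.
  static Map<String, Table> cache = new HashMap<>();

  // INVALIDATE METADATA: replace the whole cache with fresh objects.
  static void invalidateAll(long newVersion) {
    Map<String, Table> fresh = new HashMap<>();
    for (String name : cache.keySet()) {
      fresh.put(name, new Table(name, newVersion));
    }
    cache = fresh;
  }

  // REFRESH after acquiring the locks: reload into the reference already held.
  static Table reloadInto(Table held, long newVersion) {
    held.version = newVersion;
    return held;
  }

  // Returns {returned version, cached version, 1 if same object else 0}.
  static long[] replay() {
    cache = new HashMap<>();
    cache.put("table1", new Table("table1", 50));

    Table held = cache.get("table1");        // Thread-1 gets the Table object
    invalidateAll(90);                       // Thread-2 replaces the cache
    Table returned = reloadInto(held, 100);  // Thread-1 resumes on the stale object

    Table cached = cache.get("table1");
    return new long[] {returned.version, cached.version, returned == cached ? 1 : 0};
  }

  public static void main(String[] args) {
    long[] r = replay();
    System.out.println("returned version = " + r[0]);
    System.out.println("cached version   = " + r[1]);
    System.out.println("same object      = " + (r[2] == 1));
  }
}
```

In this replay the object returned by the refresh carries version 100, while the object actually in the cache carries version 90 and is a different instance, which is the mismatch the wait step cannot resolve.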



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
