[ 
https://issues.apache.org/jira/browse/IMPALA-14176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18007942#comment-18007942
 ] 

Michael Smith commented on IMPALA-14176:
----------------------------------------

Let's use this to track the actual bug we're observing: a probable race between 
"invalidate metadata" and event processing.

[~rizaon]'s theory on this is:
# The event processor (EP) is lagging behind.
# An INVALIDATE ALL query runs, replacing the contents of dbCache_ with 
Incomplete/Unloaded databases.
# TableLoadingMgr is in the middle of populating the table list of 
test_local_catalog_ddls_with_invalidate_metadata_sync_ddl_b87f02d6, but has not 
yet loaded test_1_part.
# REFRESH 
test_local_catalog_ddls_with_invalidate_metadata_sync_ddl_b87f02d6.test_1_part 
in local catalog mode goes through doGetPartialCatalogObject(). It finds the 
DB, but not the table.
# The coordinator reports that the table is not found.
# EP catches up faster than TableLoadingMgr and sets an IncompleteTable with a 
smaller EventId.
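
To make the hypothesized interleaving easier to reason about, here is a toy 
Python model of steps 2 through 5. It is not Impala code: ToyCatalog, 
add_table_name, get_partial_object, and the database name "db1" are all 
hypothetical, and the model assumes (as the theory does) that table names 
become visible one at a time after the cache is wiped.

```python
class ToyCatalog:
    """Toy stand-in for the catalog's db cache. Hypothetical, not Impala code."""

    def __init__(self):
        self.db_cache = {}  # db name -> {table name -> state}

    def invalidate_all(self, db_names):
        # Step 2: replace the cache with databases that have no tables yet.
        self.db_cache = {db: {} for db in db_names}

    def add_table_name(self, db, table):
        # Step 3 (as assumed by the theory): names appear one at a time.
        self.db_cache[db][table] = "incomplete"

    def get_partial_object(self, db, table):
        # Step 4: the lookup made on behalf of REFRESH.
        if db not in self.db_cache:
            return "db not found"
        if table not in self.db_cache[db]:
            return "table not found"
        return "ok"


catalog = ToyCatalog()
catalog.invalidate_all(["db1"])
catalog.add_table_name("db1", "test_0")

# The lookup lands before test_1_part has been added to the table list.
early = catalog.get_partial_object("db1", "test_1_part")

catalog.add_table_name("db1", "test_1_part")
late = catalog.get_partial_object("db1", "test_1_part")

print(early, "/", late)
```

In this model the early lookup finds the database but not the table, which 
matches steps 4 and 5. Whether the real table list is ever visible in a 
partially populated state is exactly the assumption in question.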

However, [~stigahuang] points out that the table list is loaded by INVALIDATE 
ALL itself, here: 
https://github.com/apache/impala/blob/5db760662f7cfc060ff9fda78676759b369523b0/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2183.
 TableLoadingMgr just replaces the IncompleteTables with HdfsTables, etc. (So 
this sequence is probably not quite right.)
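
A toy Python sketch of that counterpoint, under the assumption that the reset 
installs every table name as a placeholder before the new cache becomes 
visible. The function names and the "db1" database are hypothetical; this only 
illustrates the behavior described in this comment and is not the logic in 
CatalogServiceCatalog.java.

```python
def invalidate_all(metastore_tables):
    # Assumption: the whole replacement cache, including every table name,
    # is built first and only then handed back to be swapped in.
    return {
        db: {table: "IncompleteTable" for table in tables}
        for db, tables in metastore_tables.items()
    }


def load_table(cache, db, table):
    # The loader only upgrades an existing placeholder; it never adds names.
    if table in cache[db]:
        cache[db][table] = "HdfsTable"


cache = invalidate_all({"db1": ["test_0", "test_1_part"]})

# The name is already present before any loading happens.
before = cache["db1"].get("test_1_part")

load_table(cache, "db1", "test_1_part")
after = cache["db1"].get("test_1_part")

print(before, "->", after)
```

If the real code behaves like this, a lookup after the reset would always find 
the table name, so the "table not found" result would need a different 
explanation.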

> test_ddls_with_invalidate_metadata seems to be flaky
> ----------------------------------------------------
>
>                 Key: IMPALA-14176
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14176
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Fang-Yu Rao
>            Assignee: Michael Smith
>            Priority: Major
>              Labels: broken-build
>
> custom_cluster.test_concurrent_ddls.TestConcurrentDdls.test_ddls_with_invalidate_metadata()
>  in 
> [https://github.com/apache/impala/blame/master/tests/custom_cluster/test_concurrent_ddls.py]
>  seems to be flaky. The test could fail with the following error message.
> {code:java}
> Stacktrace
> conftest.py:407: in cleanup
>     cleanup_database(client, db_name, True)
> conftest.py:393: in cleanup_database
>     "" if must_exist else "IF EXISTS", db_name))
> common/impala_connection.py:686: in execute
>     cursor.execute(sql_stmt, configuration=self.__query_options)
> ../infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:394:
>  in execute
>     self._wait_to_finish()  # make execute synchronous
> ../infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:484:
>  in _wait_to_finish
>     raise OperationalError(resp.errorMessage)
> E   OperationalError: Query 02494739f9b0db33:4172daba00000000 failed:
> E   ImpalaRuntimeException: Error making 'dropDatabase' RPC to Hive 
> Metastore: 
> E   CAUSED BY: NoSuchObjectException: 
> hive.test_ddls_with_invalidate_metadata_9525e717.test_14 table not found
> {code}
>  
> Sometimes I could also see the following error message.
> {code:java}
> custom_cluster/test_concurrent_ddls.py:78: in 
> test_local_catalog_ddls_with_invalidate_metadata
>     self._run_ddls_with_invalidation(unique_database, sync_ddl=False)
> custom_cluster/test_concurrent_ddls.py:169: in _run_ddls_with_invalidation
>     worker[i].get(timeout=100)
> /data/jenkins/workspace/impala-asf-master-core-ozone-erasure-coding/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572:
>  in get
>     raise self._value
> E   AssertionError: Query 434d35fc7094287b:c3809b7b00000000 failed:
> E     AnalysisException: Table already exists: 
> test_local_catalog_ddls_with_invalidate_metadata_e78f2324.test_16_part
> E     
> E     
> E   assert <bound method type.is_acceptable_error of <class 
> 'test_concurrent_ddls.TestConcurrentDdls'>>('Query 
> 434d35fc7094287b:c3809b7b00000000 failed:\nAnalysisException: Table already 
> exists: 
> test_local_catalog_ddls_with_invalidate_metadata_e78f2324.test_16_part\n\n', 
> False)
> E    +  where <bound method type.is_acceptable_error of <class 
> 'test_concurrent_ddls.TestConcurrentDdls'>> = 
> <test_concurrent_ddls.TestConcurrentDdls object at 
> 0x7fa3e9d82d50>.is_acceptable_error
> {code}
> In the latter case, the Impala coordinator threw an AnalysisException during the 
> analysis of the query "{{alter table 
> test_local_catalog_ddls_with_invalidate_metadata_e78f2324.test_16_part2 
> rename to 
> test_local_catalog_ddls_with_invalidate_metadata_e78f2324.test_16_part}}".


