[ 
https://issues.apache.org/jira/browse/IMPALA-14307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18013673#comment-18013673
 ] 

Quanlong Huang commented on IMPALA-14307:
-----------------------------------------

Sorry, the above analysis has a mistake that if renameTable() sees the new Db 
object (so returns <null, null>), the follow-up fallback of invalidateTable() 
should also see the new Db object. So the new table is impossible to be added 
in the old Db object.

I then looked into the catalog versions and realized the the table is not 
deleted by reset():
{noformat}
CatalogVersion -> catalog operation
8210: add DATABASE test_ddls_with_invalidate_metadata_9525e717
8211: add test_ddls_with_invalidate_metadata_9525e717.test_5
8212: add test_ddls_with_invalidate_metadata_9525e717.test_1
8213: unknown (not collected)
8214: add test_ddls_with_invalidate_metadata_9525e717.test_5_part
8215: unknown (not collected)
8216: unknown (not collected)
8217: add test_ddls_with_invalidate_metadata_9525e717.test_7_part
8218: add test_ddls_with_invalidate_metadata_9525e717.test_0_part
8219: add test_ddls_with_invalidate_metadata_9525e717.test_0
8220: unknown (not collected)
8221: add test_ddls_with_invalidate_metadata_9525e717.test_6_part2
8222: add test_ddls_with_invalidate_metadata_9525e717.test_2_part
8223: delete test_ddls_with_invalidate_metadata_9525e717.test_6_2   <--- These 
are done in reset() which adds a db and then tables under it
8224: add DATABASE tpcds
...
8509: add DATABASE tpch_text_gzip
8510: add tpch_text_gzip.lineitem
8511: add tpch_text_gzip.part
8512: add tpch_text_gzip.partsupp
8513: add tpch_text_gzip.supplier
8514: add tpch_text_gzip.nation
8515: add tpch_text_gzip.region
8516: add tpch_text_gzip.orders
8517: add tpch_text_gzip.customer  <--- These are done in reset()
8518-8520: unknown (not collected)
8521: delete test_ddls_with_invalidate_metadata_9525e717.test_1_part2
8522: add test_ddls_with_invalidate_metadata_9525e717.test_1_part
8523: delete test_ddls_with_invalidate_metadata_9525e717.test_6
8524: add test_ddls_with_invalidate_metadata_9525e717.test_6_2  <--- 8523 and 
8524 seems not done in reset()
8525-8528: unknown (not collected)
8529: delete test_ddls_with_invalidate_metadata_9525e717.test_2
8530: add test_ddls_with_invalidate_metadata_9525e717.test_2_2{noformat}
The above catalog version list indicates the issue (delete "test_6" + add 
"test_6_2") happens after reset() updates the catalog.

What could revert the rename operation (rename "test_6_2" to "test_6") and make 
such a catalog change (delete "test_6" + add "test_6_2")?
{noformat}
I20250811 16:22:58.250062 3608258 catalog-server.cc:1615] Collected update: 
1:TABLE:test_ddls_with_invalidate_metadata_9525e717.test_6_2, version=8524, 
original size=106, compressed size=105
I20250811 16:22:58.250487 3608258 catalog-server.cc:1615] Collected deletion: 
1:TABLE:test_ddls_with_invalidate_metadata_9525e717.test_6, version=8523, 
original size=97, compressed size=95{noformat}
I then realized EventProcessor is restarted at the end reset() and there are 
actually events applied!
{noformat}
I20250811 16:22:58.178097 3608935 MetastoreEventsProcessor.java:1062] 
1d4918bff853c0ec:c5646e4a00000000] Metastore event processing restarted. Last 
synced event id was updated from 67940 to 67956
I20250811 16:22:58.182371 3608256 MetastoreEvents.java:304] Total number of 
events received: 16 Total number of events filtered out: 0
...
I20250811 16:22:58.184755 3608256 MetastoreEvents.java:871] EventId: 67947 
EventType: CREATE_TABLE Target: 
test_ddls_with_invalidate_metadata_9525e717.test_7_part. Incremented skipped 
metric to 21
I20250811 16:22:58.184978 3608256 Table.java:246] createEventId_ for table: 
test_ddls_with_invalidate_metadata_9525e717.test_6_2 set to: 67948
I20250811 16:22:58.185040 3608256 Table.java:261] lastSyncedEventId_ for table: 
test_ddls_with_invalidate_metadata_9525e717.test_6_2 set from -1 to 67948
I20250811 16:22:58.185232 3608256 CatalogOpExecutor.java:4841] EventId: 67949 
Table test_ddls_with_invalidate_metadata_9525e717.test_2_part is not loaded. 
Skipping add partitions
...{noformat}
The log of "createEventId_ for table: xxx.test_6_2" indicates applying an event 
adds this table. Unfortunately we don't log the event type. But only the 
ALTER_TABLE RENAME event could remove "test_6" and add "test_6_2". It should be 
skipped since it's an event triggered by a previous rename operation. However, 
the createEventId of "test_6" is not correctly set (as mentioned in the 
previous comment). The deleteEventLog is also not updated due to event 
processing is paused during reset(). We should allow DDLs checking events even 
if EventProcessor is paused or in error state. Since event processing can be 
recovered (restarted) later.

> invalidateTable could add table to old Db object while concurrent reset is 
> running
> ----------------------------------------------------------------------------------
>
>                 Key: IMPALA-14307
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14307
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>
> Saw a failure in TestConcurrentDdls.test_ddls_with_invalidate_metadata:
> {code:python}
> custom_cluster/test_concurrent_ddls.py:50: in 
> test_ddls_with_invalidate_metadata
>     self._run_ddls_with_invalidation(unique_database, sync_ddl=False)
> custom_cluster/test_concurrent_ddls.py:179: in _run_ddls_with_invalidation
>     worker[i].get(timeout=100)
> /data/jenkins/workspace/impala-asf-master-core-ubsan-arm/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572:
>  in get
>     raise self._value
> E   AssertionError: Query b148f272b5821772:ace4369900000000 failed:
> E     AnalysisException: Table does not exist: 
> test_ddls_with_invalidate_metadata_9525e717.test_6
> E     
> E     
> E   assert <bound method type.is_acceptable_error of <class 
> 'test_concurrent_ddls.TestConcurrentDdls'>>('Query 
> b148f272b5821772:ace4369900000000 failed:\nAnalysisException: Table does not 
> exist: test_ddls_with_invalidate_metadata_9525e717.test_6\n\n', False)
> E    +  where <bound method type.is_acceptable_error of <class 
> 'test_concurrent_ddls.TestConcurrentDdls'>> = 
> <test_concurrent_ddls.TestConcurrentDdls object at 
> 0xffff1ff7d810>.is_acceptable_error{code}
> The test runs in legacy catalog mode. Coordinator logs show that it receives 
> a deletion update on the table before analying the REFRESH statement on it:
> {noformat}
> I20250811 16:22:58.429529 3608823 ImpaladCatalog.java:227] Deleting: 
> TABLE:test_ddls_with_invalidate_metadata_9525e717.test_6 version: 8523 size: 
> 97
> ...
> I20250811 16:22:58.540377 3609121 Frontend.java:2398] 
> b148f272b5821772:ace4369900000000] Analyzing query: refresh 
> test_ddls_with_invalidate_metadata_9525e717.test_6 db: default
> ...
> I20250811 16:22:58.542122 3609121 jni-util.cc:321] 
> b148f272b5821772:ace4369900000000] 
> org.apache.impala.common.AnalysisException: Table does not exist: 
> test_ddls_with_invalidate_metadata_9525e717.test_6
>         at 
> org.apache.impala.analysis.ResetMetadataStmt.analyze(ResetMetadataStmt.java:173)
>         at 
> org.apache.impala.analysis.AnalysisContext$AnalysisDriverImpl.analyze(AnalysisContext.java:576)
>         at 
> org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:496)
>         at 
> org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:2938)
>         at 
> org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:2505)
>         at 
> org.apache.impala.service.Frontend.getTExecRequestWithFallback(Frontend.java:2374)
>         at 
> org.apache.impala.service.Frontend.createExecRequest(Frontend.java:2063)
>         at 
> org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:176){noformat}
> Catalogd logs show that invalidateTable triggered by alterTableOrViewRename() 
> is running concurrently with a catalog reset (i.e. global INVALIDATE 
> METADATA):
> {noformat}
> I20250811 16:22:58.129398 3609170 JniUtil.java:167] 
> f548d67b72570890:626780e200000000] execDdl request: ALTER_TABLE 
> test_ddls_with_invalidate_metadata_9525e717.test_6_2 RENAME_TABLE issued by 
> jenkins
> I20250811 16:22:58.144670 3608935 JniUtil.java:167] 
> 1d4918bff853c0ec:c5646e4a00000000] resetMetadata request: INVALIDATE ALL 
> issued by jenkins
> I20250811 16:22:58.173722 3608935 CatalogServiceCatalog.java:2339] 
> 1d4918bff853c0ec:c5646e4a00000000] Loaded table list (12) and functions (0) 
> for database: test_ddls_with_invalidate_metadata_9525e717. Fetch duration: 2 
> ms.
> I20250811 16:22:58.177870 3609170 CatalogServiceCatalog.java:3283] 
> f548d67b72570890:626780e200000000] Invalidating table metadata: 
> test_ddls_with_invalidate_metadata_9525e717.test_6
> I20250811 16:22:58.178402 3608935 JniUtil.java:178] 
> 1d4918bff853c0ec:c5646e4a00000000] Finished resetMetadata request: INVALIDATE 
> ALL issued by jenkins. Time spent: 34ms
> I20250811 16:22:58.181602 3609170 Table.java:246] 
> f548d67b72570890:626780e200000000] createEventId_ for table: 
> test_ddls_with_invalidate_metadata_9525e717.test_6 set to: -1
> I20250811 16:22:58.181922 3609170 JniUtil.java:178] 
> f548d67b72570890:626780e200000000] Finished execDdl request: ALTER_TABLE 
> test_ddls_with_invalidate_metadata_9525e717.test_6_2 RENAME_TABLE issued by 
> jenkins. Time spent: 52ms
> {noformat}
> Then the catalog update collector thread saw the table disappears and sent 
> the deletion update:
> {noformat}
> I20250811 16:22:58.247798 3608258 JniUtil.java:167] getCatalogDelta request: 
> Getting catalog delta from version 8517
> I20250811 16:22:58.250062 3608258 catalog-server.cc:1615] Collected update: 
> 1:TABLE:test_ddls_with_invalidate_metadata_9525e717.test_6_2, version=8524, 
> original size=106, compressed size=105
> I20250811 16:22:58.250487 3608258 catalog-server.cc:1615] Collected deletion: 
> 1:TABLE:test_ddls_with_invalidate_metadata_9525e717.test_6, version=8523, 
> original size=97, compressed size=95
> I20250811 16:22:58.250890 3608258 catalog-server.cc:1615] Collected update: 
> 1:CATALOG_SERVICE_ID, version=8532, original size=60, compressed size=58
> I20250811 16:22:58.250959 3608258 JniUtil.java:178] Finished getCatalogDelta 
> request: Getting catalog delta from version 8517. Time spent: 3ms{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to