[ 
https://issues.apache.org/jira/browse/IMPALA-9135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang reopened IMPALA-9135:
------------------------------------

Reopening this since I can still reproduce it. For example, for INSERT, add a sleep 
before getting the catalogVersion lock in 
[CatalogOpExecutor#updateCatalogImpl()|https://github.com/apache/impala/blob/ab92a300fc9363895418690b87b4a73df7a7202d/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L7290]:
{code:java}
diff --git a/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java b/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
index 565ee58b2..b2c82ef88 100644
--- a/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
+++ b/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
@@ -7286,7 +7286,7 @@ public class CatalogOpExecutor {
       throw new InternalException("Unexpected table type: " +
           update.getTarget_table());
     }
-
+    try { Thread.sleep(10000L); } catch (Exception e) {}
     tryWriteLock(table, "updating the catalog", catalogTimeline);
     final Timer.Context context
         = table.getMetrics().getTimer(HdfsTable.CATALOG_UPDATE_DURATION_METRIC).time();{code}
Create a table and load its metadata:
{code:sql}
create table tbl (i int);
describe tbl;{code}
Run an INSERT on it with sync_ddl=1:
{code}
impala-shell.sh -q "set sync_ddl=1; insert into tbl values (0)" &{code}
Run a global INVALIDATE METADATA concurrently:
{code}
impala-shell.sh -q "invalidate metadata"{code}
The INVALIDATE METADATA finishes, but the INSERT keeps hanging until there is 
another update on the table.
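
The steps above could be wrapped in one script, roughly as follows. This is only a sketch: it assumes the sleep patch is already applied to the running catalogd, and the IMPALA_SHELL and GRACE_SECS variables are hypothetical knobs introduced here for convenience, not impala-shell options.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction; assumes catalogd runs with the sleep patch applied.
# IMPALA_SHELL and GRACE_SECS are hypothetical overrides, not Impala settings.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell.sh}"

repro() {
  # Create the table and make sure its metadata is loaded.
  "$IMPALA_SHELL" -q "create table tbl (i int); describe tbl"

  # Start the sync_ddl INSERT in the background.
  "$IMPALA_SHELL" -q "set sync_ddl=1; insert into tbl values (0)" &
  local insert_pid=$!

  # Run a global INVALIDATE METADATA while the INSERT is still in flight.
  "$IMPALA_SHELL" -q "invalidate metadata"

  # Give the INSERT some time, then report whether it is still running.
  sleep "${GRACE_SECS:-30}"
  if kill -0 "$insert_pid" 2>/dev/null; then
    echo "INSERT still hanging"
  else
    echo "INSERT finished"
  fi
}
```

After sourcing the script, call repro. The grace period should be longer than the injected 10 second sleep, otherwise a healthy INSERT would also be reported as hanging.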

> DDLs with sync_ddl may fail with concurrent INVALIDATE METADATA
> ---------------------------------------------------------------
>
>                 Key: IMPALA-9135
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9135
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Priority: Major
>              Labels: concurrency
>
> This can be revealed by tests/custom_cluster/test_concurrent_ddls.py added in 
> [https://gerrit.cloudera.org/c/14307]
> When INVALIDATE METADATA runs concurrently, the DDLs may run out of 
> attempts in CatalogServiceCatalog.waitForSyncDdlVersion() while waiting for the 
> target update to be sent, no matter how much we increase maxNumAttempts.
> The error logs:
> {code:java}
> E1107 17:34:25.092439  7353 CatalogServiceCatalog.java:2626] Couldn't 
> retrieve the covering topic version for catalog objects. Updated objects: 
> [TABLE:test_ddls_with_invalidate_metadata_sync_ddl_f41e97e6.test_9_part 
> version: 349], deleted objects: []
> I1107 17:34:25.093451  7353 jni-util.cc:288] 
> org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog 
> topic version for the SYNC_DDL operation after 5 attempts.The operation has 
> been successfully executed but its effects may have not been broadcast to all 
> the coordinators.
>         at 
> org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:2630)
>         at 
> org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:414)
>         at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:167)
> I1107 17:34:25.142006  6389 catalog-server.cc:337] A catalog update with 2 
> entries is assembled. Catalog version: 356 Last sent catalog version: 355
> I1107 17:34:25.142168  6381 catalog-server.cc:641] Collected update: 
> 1:TABLE:test_ddls_with_invalidate_metadata_sync_ddl_f41e97e6.test_15_part, 
> version=357, original size=101, compressed size=98
> I1107 17:34:25.142215  6381 catalog-server.cc:641] Collected update: 
> 1:CATALOG_SERVICE_ID, version=357, original size=49, compressed size=52
> I1107 17:34:25.142287  7356 CatalogServiceCatalog.java:2642] Operation using 
> SYNC_DDL is waiting for catalog topic version: 357. Time to identify topic 
> version (msec): 19
> I1107 17:34:25.192239  6389 catalog-server.cc:337] A catalog update with 2 
> entries is assembled. Catalog version: 357 Last sent catalog version: 356
> I1107 17:34:25.192428  6381 catalog-server.cc:641] Collected update: 
> 1:TABLE:test_ddls_with_invalidate_metadata_sync_ddl_f41e97e6.test_16_part, 
> version=358, original size=101, compressed size=98
> I1107 17:34:25.192462  6381 catalog-server.cc:641] Collected update: 
> 1:TABLE:test_ddls_with_invalidate_metadata_sync_ddl_f41e97e6.test_11_part, 
> version=359, original size=101, compressed size=98
> I1107 17:34:25.192484  6381 catalog-server.cc:641] Collected update: 
> 1:TABLE:test_ddls_with_invalidate_metadata_sync_ddl_f41e97e6.test_12_part, 
> version=360, original size=101, compressed size=98
> I1107 17:34:25.192535  6381 catalog-server.cc:641] Collected update: 
> 1:CATALOG_SERVICE_ID, version=360, original size=49, compressed size=52
> I1107 17:34:25.192613  7355 CatalogServiceCatalog.java:2642] Operation using 
> SYNC_DDL is waiting for catalog topic version: 360. Time to identify topic 
> version (msec): 13
> I1107 17:34:25.192695  7351 CatalogServiceCatalog.java:2642] Operation using 
> SYNC_DDL is waiting for catalog topic version: 360. Time to identify topic 
> version (msec): 45
> I1107 17:34:25.192734  7350 CatalogServiceCatalog.java:2642] Operation using 
> SYNC_DDL is waiting for catalog topic version: 360. Time to identify topic 
> version (msec): 29
> I1107 17:34:25.222911  7353 status.cc:126] CatalogException: Couldn't 
> retrieve the catalog topic version for the SYNC_DDL operation after 5 
> attempts.The operation has been successfully executed but its effects may 
> have not been broadcast to all the coordinators.
>     @          0x1c5ae50  impala::Status::Status()
>     @          0x24f7ad2  impala::JniUtil::GetJniExceptionMsg()
>     @          0x1c41987  impala::JniCall::Call<>()
>     @          0x1c3fec9  impala::JniUtil::CallJniMethod<>()
>     @          0x1c3e0e6  impala::Catalog::ExecDdl()
>     @          0x1c1ed17  CatalogServiceThriftIf::ExecDdl()
>     @          0x1cb3047  impala::CatalogServiceProcessor::process_ExecDdl()
>     @          0x1cb2d95  impala::CatalogServiceProcessor::dispatchCall()
>     @          0x1c08d65  apache::thrift::TDispatchProcessor::process()
>     @          0x20e8a0d  
> apache::thrift::server::TAcceptQueueServer::Task::run()
>     @          0x20de040  impala::ThriftThread::RunRunnable()
>     @          0x20df766  boost::_mfi::mf2<>::operator()()
>     @          0x20df5fc  boost::_bi::list3<>::operator()<>()
>     @          0x20df348  boost::_bi::bind_t<>::operator()()
>     @          0x20df25b  
> boost::detail::function::void_function_obj_invoker0<>::invoke()
>     @          0x1ffb6e9  boost::function0<>::operator()()
>     @          0x2573dea  impala::Thread::SuperviseThread()
>     @          0x257c16e  boost::_bi::list5<>::operator()<>()
>     @          0x257c092  boost::_bi::bind_t<>::operator()()
>     @          0x257c055  boost::detail::thread_data<>::run()
>     @          0x3d61599  thread_proxy
>     @     0x7f1ce6ca46b9  start_thread
>     @     0x7f1ce343f41c  clone
> E1107 17:34:25.222932  7353 catalog-server.cc:112] CatalogException: Couldn't 
> retrieve the catalog topic version for the SYNC_DDL operation after 5 
> attempts.The operation has been successfully executed but its effects may 
> have not been broadcast to all the coordinators.
> {code}


