Riza Suminto created IMPALA-13620:
-------------------------------------

             Summary: Lower default parallelism of compute_table_stats.py
                 Key: IMPALA-13620
                 URL: https://issues.apache.org/jira/browse/IMPALA-13620
             Project: IMPALA
          Issue Type: Improvement
          Components: Infrastructure
    Affects Versions: Impala 4.4.0
            Reporter: Riza Suminto
            Assignee: Riza Suminto


compute_table_stats.py may over-parallelize on machines with many cores if 
--parallelism is not set. This excess parallelism appears to overload HMS and 
causes some DDL operations to fail, as in the following example.
{noformat}
2024-12-15 07:28:08,946 Thread-2:  Failed on table tpch.customer
Traceback (most recent call last):
  File "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/tests/util/compute_table_stats.py", line 41, in compute_stats_table
    result = impala_client.execute(statement)
  File "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/tests/beeswax/impala_beeswax.py", line 188, in execute
    handle = self.__execute_query(query_string.strip(), user=user)
  File "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/tests/beeswax/impala_beeswax.py", line 284, in __execute_query
    self.wait_for_finished(handle)
  File "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/tests/beeswax/impala_beeswax.py", line 314, in wait_for_finished
    raise ImpalaBeeswaxException(error_log, None)
ImpalaBeeswaxException: Query e248e193f7cd1d0d:3a2eee1d00000000 failed:
ImpalaRuntimeException: Error making 'alter_table' RPC to Hive Metastore:
CAUSED BY: InvalidOperationException: Alter table in REMOTE database is not allowed{noformat}
The stack trace in CatalogD is as follows:
{noformat}
E1215 07:28:08.935083 23281 JniUtil.java:184] e248e193f7cd1d0d:3a2eee1d00000000] Error in ALTER_TABLE tpch.customer issued by jenkins. Time spent: 1m
I1215 07:28:08.935757 23281 jni-util.cc:321] e248e193f7cd1d0d:3a2eee1d00000000] org.apache.impala.common.ImpalaRuntimeException: Error making 'alter_table' RPC to Hive Metastore:
        at org.apache.impala.service.CatalogOpExecutor.applyAlterTable(CatalogOpExecutor.java:6675)
        at org.apache.impala.service.CatalogOpExecutor.applyAlterTable(CatalogOpExecutor.java:6633)
        at org.apache.impala.service.CatalogOpExecutor.alterTableUpdateStatsInner(CatalogOpExecutor.java:2004)
        at org.apache.impala.service.CatalogOpExecutor.alterTableUpdateStats(CatalogOpExecutor.java:1932)
        at org.apache.impala.service.CatalogOpExecutor.alterTable(CatalogOpExecutor.java:1398)
        at org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:463)
        at org.apache.impala.service.JniCatalog.lambda$execDdl$3(JniCatalog.java:316)
        at org.apache.impala.service.JniCatalogOp.lambda$execAndSerialize$1(JniCatalogOp.java:90)
        at org.apache.impala.service.JniCatalogOp.execOp(JniCatalogOp.java:58)
        at org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:89)
        at org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:100)
        at org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:245)
        at org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:259)
        at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:315)
Caused by: InvalidOperationException(message:Alter table in REMOTE database is not allowed)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result$alter_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result$alter_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result.read(ThriftHiveMetastore.java)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:88)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table_req(ThriftHiveMetastore.java:3002)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table_req(ThriftHiveMetastore.java:2989)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:489)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:460)
        at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
        at com.sun.proxy.$Proxy11.alter_table(Unknown Source)
        at org.apache.impala.service.CatalogOpExecutor.applyAlterTable(CatalogOpExecutor.java:6671)
        ... 13 more
CAUSED BY: InvalidOperationException: Alter table in REMOTE database is not allowed
    @          0x10e88c4
    @          0x1cb723a
    @          0x1099748
    @          0x102e5a7
    @           0xfe16b8
    @           0xfc1393
    @           0xfc8e5b
    @          0x15355a2
    @          0x1d9ccd9
    @          0x26f5d47
    @     0x7faf6d8b2ea5
    @     0x7faf6a7adb0d
E1215 07:28:08.936064 23281 catalog-server.cc:292] e248e193f7cd1d0d:3a2eee1d00000000] ImpalaRuntimeException: Error making 'alter_table' RPC to Hive Metastore:
CAUSED BY: InvalidOperationException: Alter table in REMOTE database is not allowed{noformat}
In my ad-hoc experiment, capping parallelism at 16 allowed 
compute-table-stats.sh to pass without any errors.
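
One possible way to lower the default is to cap it, so that small machines keep using all their cores while large machines stop at a fixed ceiling. The sketch below is only an illustration, not the actual patch: it assumes the current default is derived from the machine's core count, and the names default_parallelism and MAX_DEFAULT_PARALLELISM are hypothetical. The cap of 16 comes from the ad-hoc experiment above. An explicitly set --parallelism value is passed through unchanged.

```python
import multiprocessing

# Assumption: 16 is the ceiling that worked in the ad-hoc experiment.
MAX_DEFAULT_PARALLELISM = 16


def default_parallelism(requested=None, cpu_count=None,
                        cap=MAX_DEFAULT_PARALLELISM):
    """Return the number of worker threads to use (hypothetical helper).

    requested: value of --parallelism if the user set it, else None.
    cpu_count: core count override, mainly for testing; detected if None.
    cap: upper bound applied only when the user did not set a value.
    """
    if requested is not None:
        if requested < 1:
            raise ValueError("parallelism must be at least 1")
        # Respect an explicit user choice, even above the cap.
        return requested
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    # Use all cores on small machines, but never more than the cap.
    return max(1, min(cpu_count, cap))
```

With this sketch, a 96-core machine would default to 16 threads, a 4-core machine would still use 4, and passing an explicit value of 32 would still yield 32.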



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
