[
https://issues.apache.org/jira/browse/IMPALA-13620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Riza Suminto resolved IMPALA-13620.
-----------------------------------
Fix Version/s: Impala 4.5.0
Resolution: Fixed
> Lower default parallelism of compute_table_stats.py
> ---------------------------------------------------
>
> Key: IMPALA-13620
> URL: https://issues.apache.org/jira/browse/IMPALA-13620
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Affects Versions: Impala 4.4.0
> Reporter: Riza Suminto
> Assignee: Riza Suminto
> Priority: Major
> Fix For: Impala 4.5.0
>
>
> compute_table_stats.py might overparallelize on a machine with many cores if
> --parallelism is not set. This overparallelism seems to overload HMS and
> cause failures in some DDL operations, such as the following.
> {noformat}
> 2024-12-15 07:28:08,946 Thread-2: Failed on table tpch.customer
> Traceback (most recent call last):
> File
> "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/tests/util/compute_table_stats.py",
> line 41, in compute_stats_table
> result = impala_client.execute(statement)
> File
> "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/tests/beeswax/impala_beeswax.py",
> line 188, in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> File
> "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/tests/beeswax/impala_beeswax.py",
> line 284, in __execute_query
> self.wait_for_finished(handle)
> File
> "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/tests/beeswax/impala_beeswax.py",
> line 314, in wait_for_finished
> raise ImpalaBeeswaxException(error_log, None)
> ImpalaBeeswaxException: Query e248e193f7cd1d0d:3a2eee1d00000000 failed:
> ImpalaRuntimeException: Error making 'alter_table' RPC to Hive Metastore:
> CAUSED BY: InvalidOperationException: Alter table in REMOTE database is not
> allowed{noformat}
> The stack trace in CatalogD is as follows:
> {noformat}
> E1215 07:28:08.935083 23281 JniUtil.java:184]
> e248e193f7cd1d0d:3a2eee1d00000000] Error in ALTER_TABLE tpch.customer issued
> by jenkins. Time spent: 1m
> I1215 07:28:08.935757 23281 jni-util.cc:321]
> e248e193f7cd1d0d:3a2eee1d00000000]
> org.apache.impala.common.ImpalaRuntimeException: Error making 'alter_table'
> RPC to Hive Metastore:
> at
> org.apache.impala.service.CatalogOpExecutor.applyAlterTable(CatalogOpExecutor.java:6675)
> at
> org.apache.impala.service.CatalogOpExecutor.applyAlterTable(CatalogOpExecutor.java:6633)
> at
> org.apache.impala.service.CatalogOpExecutor.alterTableUpdateStatsInner(CatalogOpExecutor.java:2004)
> at
> org.apache.impala.service.CatalogOpExecutor.alterTableUpdateStats(CatalogOpExecutor.java:1932)
> at
> org.apache.impala.service.CatalogOpExecutor.alterTable(CatalogOpExecutor.java:1398)
> at
> org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:463)
> at
> org.apache.impala.service.JniCatalog.lambda$execDdl$3(JniCatalog.java:316)
> at
> org.apache.impala.service.JniCatalogOp.lambda$execAndSerialize$1(JniCatalogOp.java:90)
> at org.apache.impala.service.JniCatalogOp.execOp(JniCatalogOp.java:58)
> at
> org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:89)
> at
> org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:100)
> at
> org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:245)
> at
> org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:259)
> at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:315)
> Caused by: InvalidOperationException(message:Alter table in REMOTE database
> is not allowed)
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result$alter_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result$alter_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result.read(ThriftHiveMetastore.java)
> at
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:88)
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table_req(ThriftHiveMetastore.java:3002)
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table_req(ThriftHiveMetastore.java:2989)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:489)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:460)
> at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
> at com.sun.proxy.$Proxy11.alter_table(Unknown Source)
> at
> org.apache.impala.service.CatalogOpExecutor.applyAlterTable(CatalogOpExecutor.java:6671)
> ... 13 more
> CAUSED BY: InvalidOperationException: Alter table in REMOTE database is not
> allowed
> @ 0x10e88c4
> @ 0x1cb723a
> @ 0x1099748
> @ 0x102e5a7
> @ 0xfe16b8
> @ 0xfc1393
> @ 0xfc8e5b
> @ 0x15355a2
> @ 0x1d9ccd9
> @ 0x26f5d47
> @ 0x7faf6d8b2ea5
> @ 0x7faf6a7adb0d
> E1215 07:28:08.936064 23281 catalog-server.cc:292]
> e248e193f7cd1d0d:3a2eee1d00000000] ImpalaRuntimeException: Error making
> 'alter_table' RPC to Hive Metastore:
> CAUSED BY: InvalidOperationException: Alter table in REMOTE database is not
> allowed{noformat}
> In my ad-hoc experiment, capping parallelism at 16 let
> compute-table-stats.sh pass without any error.
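
The cap described above could be sketched roughly as follows. This is only an illustration, not the actual change made to compute_table_stats.py: the function name default_parallelism and the constant MAX_DEFAULT_PARALLELISM are hypothetical, and the value 16 is taken from the ad-hoc experiment rather than from the committed fix.

```python
import multiprocessing

# Hypothetical cap, borrowed from the ad-hoc experiment in the description.
# The value chosen by the real fix may differ.
MAX_DEFAULT_PARALLELISM = 16


def default_parallelism(cpu_count=None, cap=MAX_DEFAULT_PARALLELISM):
    """Return the worker count to use when --parallelism is not set.

    Uses the machine's core count, but never more than `cap`, so that a
    machine with many cores does not flood HMS with concurrent DDL requests.
    """
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    # Always keep at least one worker, even for a degenerate core count.
    return max(1, min(cpu_count, cap))
```

Under this sketch, an explicit --parallelism value would still take precedence; the cap only applies to the default.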