[ 
https://issues.apache.org/jira/browse/IMPALA-13620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto resolved IMPALA-13620.
-----------------------------------
    Fix Version/s: Impala 4.5.0
       Resolution: Fixed

> Lower default parallelism of compute_table_stats.py
> ---------------------------------------------------
>
>                 Key: IMPALA-13620
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13620
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Infrastructure
>    Affects Versions: Impala 4.4.0
>            Reporter: Riza Suminto
>            Assignee: Riza Suminto
>            Priority: Major
>             Fix For: Impala 4.5.0
>
>
> compute_table_stats.py might overparallelize on a large-core machine if 
> --parallelism is not set. This overparallelism seems to overload HMS and 
> cause failures in some DDL operations, such as the following.
> {noformat}
> 2024-12-15 07:28:08,946 Thread-2:  Failed on table tpch.customer
> Traceback (most recent call last):
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/tests/util/compute_table_stats.py",
>  line 41, in compute_stats_table
>     result = impala_client.execute(statement)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/tests/beeswax/impala_beeswax.py",
>  line 188, in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/tests/beeswax/impala_beeswax.py",
>  line 284, in __execute_query
>     self.wait_for_finished(handle)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/tests/beeswax/impala_beeswax.py",
>  line 314, in wait_for_finished
>     raise ImpalaBeeswaxException(error_log, None)
> ImpalaBeeswaxException: Query e248e193f7cd1d0d:3a2eee1d00000000 failed:
> ImpalaRuntimeException: Error making 'alter_table' RPC to Hive Metastore:
> CAUSED BY: InvalidOperationException: Alter table in REMOTE database is not 
> allowed{noformat}
> The stack trace in CatalogD is as follows:
> {noformat}
> E1215 07:28:08.935083 23281 JniUtil.java:184] 
> e248e193f7cd1d0d:3a2eee1d00000000] Error in ALTER_TABLE tpch.customer issued 
> by jenkins. Time spent: 1m
> I1215 07:28:08.935757 23281 jni-util.cc:321] 
> e248e193f7cd1d0d:3a2eee1d00000000] 
> org.apache.impala.common.ImpalaRuntimeException: Error making 'alter_table' 
> RPC to Hive Metastore:
>         at 
> org.apache.impala.service.CatalogOpExecutor.applyAlterTable(CatalogOpExecutor.java:6675)
>         at 
> org.apache.impala.service.CatalogOpExecutor.applyAlterTable(CatalogOpExecutor.java:6633)
>         at 
> org.apache.impala.service.CatalogOpExecutor.alterTableUpdateStatsInner(CatalogOpExecutor.java:2004)
>         at 
> org.apache.impala.service.CatalogOpExecutor.alterTableUpdateStats(CatalogOpExecutor.java:1932)
>         at 
> org.apache.impala.service.CatalogOpExecutor.alterTable(CatalogOpExecutor.java:1398)
>         at 
> org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:463)
>         at 
> org.apache.impala.service.JniCatalog.lambda$execDdl$3(JniCatalog.java:316)
>         at 
> org.apache.impala.service.JniCatalogOp.lambda$execAndSerialize$1(JniCatalogOp.java:90)
>         at org.apache.impala.service.JniCatalogOp.execOp(JniCatalogOp.java:58)
>         at 
> org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:89)
>         at 
> org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:100)
>         at 
> org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:245)
>         at 
> org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:259)
>         at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:315)
> Caused by: InvalidOperationException(message:Alter table in REMOTE database 
> is not allowed)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result$alter_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result$alter_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_req_result.read(ThriftHiveMetastore.java)
>         at 
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:88)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table_req(ThriftHiveMetastore.java:3002)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table_req(ThriftHiveMetastore.java:2989)
>         at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:489)
>         at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:460)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
>         at com.sun.proxy.$Proxy11.alter_table(Unknown Source)
>         at 
> org.apache.impala.service.CatalogOpExecutor.applyAlterTable(CatalogOpExecutor.java:6671)
>         ... 13 more
> CAUSED BY: InvalidOperationException: Alter table in REMOTE database is not 
> allowed
>     @          0x10e88c4
>     @          0x1cb723a
>     @          0x1099748
>     @          0x102e5a7
>     @           0xfe16b8
>     @           0xfc1393
>     @           0xfc8e5b
>     @          0x15355a2
>     @          0x1d9ccd9
>     @          0x26f5d47
>     @     0x7faf6d8b2ea5
>     @     0x7faf6a7adb0d
> E1215 07:28:08.936064 23281 catalog-server.cc:292] 
> e248e193f7cd1d0d:3a2eee1d00000000] ImpalaRuntimeException: Error making 
> 'alter_table' RPC to Hive Metastore:
> CAUSED BY: InvalidOperationException: Alter table in REMOTE database is not 
> allowed{noformat}
> In my ad-hoc experiment, capping parallelism at 16 let 
> compute-table-stats.sh pass without any errors.
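
The capping idea described above can be sketched as follows. This is a minimal illustration, not the committed change: it assumes the script's default parallelism is derived from the machine's core count, and the names MAX_DEFAULT_PARALLELISM and default_parallelism are hypothetical. The value 16 is taken from the ad-hoc experiment only.

```python
import multiprocessing

# Hypothetical cap, borrowed from the ad-hoc experiment in the description.
# It is not confirmed here as the default chosen by the actual fix.
MAX_DEFAULT_PARALLELISM = 16


def default_parallelism(requested=None, cpu_count=None):
    """Pick the number of concurrent COMPUTE STATS workers.

    requested: value of an explicit --parallelism option, or None if unset.
    cpu_count: override for the detected core count (useful in tests).
    """
    if requested is not None:
        # An explicit setting always wins; only validate it.
        if requested < 1:
            raise ValueError("parallelism must be at least 1")
        return requested
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    # Scale with the core count, but never exceed the cap, so that a
    # large-core machine does not flood HMS with concurrent DDL requests.
    return max(1, min(cpu_count, MAX_DEFAULT_PARALLELISM))
```

With this sketch, a 96-core host would default to 16 workers, an 8-core host would keep 8, and an explicit --parallelism value would be used unchanged.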


