[
https://issues.apache.org/jira/browse/IMPALA-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299996#comment-17299996
]
Quanlong Huang commented on IMPALA-10563:
-----------------------------------------
h1. Analysis for *test_failing_inserts*
The timeout of *test_failing_inserts* is due to slow file listing while loading
file metadata. The test passes if I increase the timeout; it finally takes
1790.68s.
In catalogd.INFO, we can see that loading the file metadata can take 9s:
{code:java}
I0310 05:29:51.962357 6705 CatalogOpExecutor.java:5307] Committed transaction:
895
I0310 05:29:51.972383 6705 HdfsTable.java:1161] Reloading metadata for all
partition(s) of test_failing_inserts_b980fc6.test_inserts_fail (INSERT)
I0310 05:29:51.976411 6705 HdfsTable.java:2684] Load Valid Write Id List Done.
Time taken: 3.826ms
...... skipping catalog update logs ......
I0310 05:30:01.293617 6705 HdfsTable.java:763] Loaded file and block metadata
for test_failing_inserts_b980fc6.test_inserts_fail partitions: . Time taken:
9s316ms
I0310 05:30:01.293763 6705 HdfsTable.java:1198] Incrementally loaded table
metadata for: test_failing_inserts_b980fc6.test_inserts_fail
W0310 05:30:01.293851 6705 HdfsTable.java:1226] Time taken on loading table
test_failing_inserts_b980fc6.test_inserts_fail exceeded warning threshold.
Time: 9s321ms{code}
After adding some tracing logs, I found that getting the FileStatus of each
subdir takes ~50ms, and there are too many delta dirs under the table:
{code:java}
I0311 17:44:03.150600 28739 CatalogOpExecutor.java:5307] Committed transaction:
234
I0311 17:44:03.161862 28739 HdfsTable.java:1161] Reloading metadata for all
partition(s) of test_failing_inserts_b980fc6.test_inserts_fail (INSERT)
I0311 17:44:03.161931 28739 HdfsTable.java:2653] Get valid writeIds for table:
test_failing_inserts_b980fc6.test_inserts_fail
I0311 17:44:03.166882 28739 HdfsTable.java:2658] Valid writeIds:
test_failing_inserts_b980fc6.test_inserts_fail:234:225:225:1,2,3,4,5,8,9,11,14,16,17,18,20,24,25,28,30,32,33,36,37,38,40,41,42,43,44,45,46,47,48,49,50,52,53,54,56,60,63,64,67,68,69,73,77,79,82,83,87,91,92,93,99,100,101,103,104,106,110,111,112,113,119,120,121,123,126,129,133,135,136,137,138,139,140,143,144,148,149,150,151,152,156,158,159,160,161,165,171,172,174,175,177,179,183,186,191,192,195,196,197,200,201,203,206,208,209,210,211,212,213,214,215,217,218,219,221,224,226,228,229,231,232
I0311 17:44:03.167032 28739 HdfsTable.java:2684] Load Valid Write Id List Done.
Time taken: 5.022ms
I0311 17:44:03.167337 28739 HdfsTable.java:1285] update unpartitioned table:
test_failing_inserts_b980fc6.test_inserts_fail
I0311 17:44:03.169451 28739 FileMetadataLoader.java:182] Refreshing file
metadata from path
gs://quanlong-impala-test-02/test-warehouse/managed/test_failing_inserts_b980fc6.db/test_inserts_fail
I0311 17:44:03.244565 28739 FileMetadataLoader.java:196] Got RemoteIterator
I0311 17:44:03.292613 28739 FileMetadataLoader.java:205] Got FileStatus for
gs://quanlong-impala-test-02/test-warehouse/managed/test_failing_inserts_b980fc6.db/test_inserts_fail/base_109
I0311 17:44:03.292798 28739 FileMetadataLoader.java:205] Got FileStatus for
gs://quanlong-impala-test-02/test-warehouse/managed/test_failing_inserts_b980fc6.db/test_inserts_fail/base_109/empty
I0311 17:44:03.345263 28739 FileMetadataLoader.java:205] Got FileStatus for
gs://quanlong-impala-test-02/test-warehouse/managed/test_failing_inserts_b980fc6.db/test_inserts_fail/base_118
I0311 17:44:03.345350 28739 FileMetadataLoader.java:205] Got FileStatus for
gs://quanlong-impala-test-02/test-warehouse/managed/test_failing_inserts_b980fc6.db/test_inserts_fail/base_118/empty
...... skipping similar logs ......
I0311 17:44:14.537648 28739 FileMetadataLoader.java:205] Got FileStatus for
gs://quanlong-impala-test-02/test-warehouse/managed/test_failing_inserts_b980fc6.db/test_inserts_fail/delta_233_233
I0311 17:44:14.537734 28739 FileMetadataLoader.java:205] Got FileStatus for
gs://quanlong-impala-test-02/test-warehouse/managed/test_failing_inserts_b980fc6.db/test_inserts_fail/delta_233_233/b24e5609cc97ac6d-89d65f3600000000_1641018731_data.0.txt
I0311 17:44:14.588214 28739 FileMetadataLoader.java:205] Got FileStatus for
gs://quanlong-impala-test-02/test-warehouse/managed/test_failing_inserts_b980fc6.db/test_inserts_fail/delta_234_234
I0311 17:44:14.588277 28739 FileMetadataLoader.java:205] Got FileStatus for
gs://quanlong-impala-test-02/test-warehouse/managed/test_failing_inserts_b980fc6.db/test_inserts_fail/delta_234_234/704276a64ce65329-24c3910200000000_1987007864_data.0.txt
I0311 17:44:14.590593 28739 FileMetadataLoader.java:251]
{path=gs://quanlong-impala-test-02/test-warehouse/managed/test_failing_inserts_b980fc6.db/test_inserts_fail,
loaded files=1, skipped files=12, uncommited files=248, superceded files=97}
I0311 17:44:14.590889 28739 HdfsTable.java:763] Loaded file and block metadata
for test_failing_inserts_b980fc6.test_inserts_fail partitions: . Time taken:
11s423ms
I0311 17:44:14.591003 28739 HdfsTable.java:1198] Incrementally loaded table
metadata for: test_failing_inserts_b980fc6.test_inserts_fail
W0311 17:44:14.591102 28739 HdfsTable.java:1226] Time taken on loading table
test_failing_inserts_b980fc6.test_inserts_fail exceeded warning threshold.
Time: 11s429ms
{code}
In the above case, there are around 250 subdirs under the table folder, and
listing each of them takes ~50ms, which sums up to ~12s.
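The logged total matches a back-of-envelope estimate (the numbers below are the rough figures from the logs above, not measured constants):
```java
// Back-of-envelope estimate of per-directory listing cost on GCS.
// ~250 base_*/delta_* subdirs, ~50ms per list round trip (from the logs above).
public class ListingCost {
    static long estimateMillis(int subDirs, long perDirListMillis) {
        return subDirs * perDirListMillis;
    }

    public static void main(String[] args) {
        // prints "12500ms", i.e. ~12s, close to the logged 11s423ms
        System.out.println(estimateMillis(250, 50) + "ms");
    }
}
```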
Thus, the timeout of test_failing_inserts is due to slow listing. gsutil has a
more performant flat listing, e.g. {{gsutil ls gs://mybucket/**}}. Maybe we can
use something similar. Will file a follow-up JIRA for fast listing.
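To illustrate the idea, here is a minimal local-filesystem analogy of flat listing: one recursive traversal of the table directory instead of a separate list call per subdir. (On GCS the equivalent would be Hadoop's {{FileSystem.listFiles(path, true)}}; the class and directory names below are made up for the sketch, not from the actual patch.)
```java
import java.io.IOException;
import java.nio.file.*;
import java.util.stream.Stream;

public class FlatListingDemo {
    // One recursive walk over the table directory, rather than one list
    // call per delta_*/base_* subdir (hypothetical local stand-in for a
    // flat listing on an object store).
    static long countFiles(Path tableDir) throws IOException {
        try (Stream<Path> s = Files.walk(tableDir)) {
            return s.filter(Files::isRegularFile).count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a toy ACID table layout: three delta dirs, one data file each.
        Path table = Files.createTempDirectory("test_inserts_fail_");
        for (int i = 1; i <= 3; i++) {
            Path delta = Files.createDirectory(table.resolve("delta_" + i + "_" + i));
            Files.createFile(delta.resolve("data." + i + ".txt"));
        }
        System.out.println("files=" + countFiles(table)); // prints "files=3"
    }
}
```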
> Stress insert tests timeout on GCS
> ----------------------------------
>
> Key: IMPALA-10563
> URL: https://issues.apache.org/jira/browse/IMPALA-10563
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Major
> Attachments: test_concurrent_inserts_catalogd.INFO.bz2,
> test_concurrent_inserts_hive-metastore.log.bz2,
> test_concurrent_inserts_impalad.INFO.bz2,
> test_concurrent_inserts_impalad_node1.INFO.bz2,
> test_concurrent_inserts_impalad_node2.INFO.bz2,
> test_failing_inserts_catalogd.INFO.bz2,
> test_failing_inserts_hive-metastore.log.bz2,
> test_failing_inserts_impalad.INFO.bz2,
> test_failing_inserts_impalad_node1.INFO.bz2,
> test_failing_inserts_impalad_node2.INFO.bz2
>
>
> The following tests failed with TimeoutError when running on GCS:
> {code:java}
> F
> stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]
> stress/test_acid_stress.py:255: in test_concurrent_inserts
> run_tasks(writers + checkers)
> stress/stress_util.py:43: in run_tasks
> pool.map_async(Task.run, tasks).get(timeout_seconds)
>
> ../toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:568:
> in get
> raise TimeoutError
> E TimeoutError
> F
> stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]
> stress/test_acid_stress.py:333: in test_failing_inserts
> self._run_test_failing_inserts(unique_database, is_partitioned)
> stress/test_acid_stress.py:324: in _run_test_failing_inserts
> run_tasks(writers + checkers)
> stress/stress_util.py:43: in run_tasks
> pool.map_async(Task.run, tasks).get(timeout_seconds)
>
> ../toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:568:
> in get
> raise TimeoutError
> E TimeoutError
> F
> stress/test_insert_stress.py::TestInsertStress::()::test_inserts[unique_database0]
> stress/test_insert_stress.py:102: in test_inserts
> run_tasks(writers + checkers)
> stress/stress_util.py:43: in run_tasks
> pool.map_async(Task.run, tasks).get(timeout_seconds)
>
> ../toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:568:
> in get
> raise TimeoutError
> E TimeoutError
> {code}
> It seems the statements spend too much time waiting for HMS, which slows
> down the whole run. Found some error logs in HMS's log:
> {code}
> 2021-03-08T01:29:10,479 INFO [pool-6-thread-181] txn.TxnHandler: Allocated
> writeId: 1 for txnId: 3667
> 2021-03-08T01:29:10,492 INFO [pool-6-thread-177] txn.TxnHandler: Added
> entries to MIN_HISTORY_LEVEL with a single query for current txn: [3669]
> 2021-03-08T01:29:10,493 INFO [pool-6-thread-181] txn.TxnHandler: Allocated
> write ids for dbName=test_concurrent_inserts_8933345c,
> tblName=test_concurrent_inserts (txnIds: [3667])
> 2021-03-08T01:29:10,502 INFO [pool-6-thread-195] txn.TxnHandler: Added
> entries to MIN_HISTORY_LEVEL with a single query for current txn: [3670]
> 2021-03-08T01:29:10,507 ERROR [pool-6-thread-197] txn.TxnHandler: Exception
> during write ids allocation for
> request=AllocateTableWriteIdsRequest(dbName:test_concurrent_inserts_8933345c,
> tableName:test_concurrent_inserts, txnIds:[3668]). Will retry if possible.
> org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique
> constraint "NEXT_WRITE_ID_IDX"
> Detail: Key ("NWI_DATABASE",
> "NWI_TABLE")=(test_concurrent_inserts_8933345c, test_concurrent_inserts)
> already exists.
> at
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532)
> ~[postgresql-42.2.14.jar:42.2.14]
> at
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2267)
> ~[postgresql-42.2.14.jar:42.2.14]
> at
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:312)
> ~[postgresql-42.2.14.jar:42.2.14]
> at
> org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:448)
> ~[postgresql-42.2.14.jar:42.2.14]
> at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:369)
> ~[postgresql-42.2.14.jar:42.2.14]
> at
> org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:153)
> ~[postgresql-42.2.14.jar:42.2.14]
> at
> org.postgresql.jdbc.PgPreparedStatement.execute(PgPreparedStatement.java:142)
> ~[postgresql-42.2.14.jar:42.2.14]
> at
> com.zaxxer.hikari.pool.ProxyPreparedStatement.execute(ProxyPreparedStatement.java:44)
> ~[HikariCP-2.6.1.jar:?]
> at
> com.zaxxer.hikari.pool.HikariProxyPreparedStatement.execute(HikariProxyPreparedStatement.java)
> ~[HikariCP-2.6.1.jar:?]
> at
> org.apache.hadoop.hive.metastore.txn.TxnHandler.allocateTableWriteIds(TxnHandler.java:1820)
> [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8370)
> [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> ~[?:1.8.0_282]
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> ~[?:1.8.0_282]
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> ~[?:1.8.0_282]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_282]
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
> [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
> at com.sun.proxy.$Proxy28.allocate_table_write_ids(Unknown Source)
> [?:?]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:20126)
> [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:20110)
> [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
> at
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
> [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
> at
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
> [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
> at java.security.AccessController.doPrivileged(Native Method)
> [?:1.8.0_282]
> at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_282]
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
> [hadoop-common-3.1.1.7.2.7.0-44.jar:?]
> at
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
> [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
> at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [?:1.8.0_282]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [?:1.8.0_282]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
> 2021-03-08T01:29:10,513 WARN [pool-6-thread-197] txn.TxnHandler: Retryable
> error detected in
> allocateTableWriteIds(AllocateTableWriteIdsRequest(dbName:test_concurrent_inserts_8933345c,
> tableName:test_concurrent_inserts, txnIds:[3668])). Will wait 2000ms and
> retry up to 10 times. Error: ERROR: duplicate key value violates unique
> constraint "NEXT_WRITE_ID_IDX"
> Detail: Key ("NWI_DATABASE",
> "NWI_TABLE")=(test_concurrent_inserts_8933345c, test_concurrent_inserts)
> already exists.
> 2021-03-08T01:29:10,554 INFO [pool-6-thread-177] txn.TxnHandler: Allocated
> writeId: 2 for txnId: 3669
> {code}