Philip Zeyliger created IMPALA-7980:
---------------------------------------
Summary: High system CPU time usage (and waste) when runtime
filters filter out files
Key: IMPALA-7980
URL: https://issues.apache.org/jira/browse/IMPALA-7980
Project: IMPALA
Issue Type: Task
Reporter: Philip Zeyliger
When running TPC-DS query 1 on scale factor 10,000 (10TB) on a 140-node cluster
with {{replica_preference=remote}}, we observed really high system CPU usage
for some of the scan nodes:
{code}
HDFS_SCAN_NODE (id=6):(Total: 59s107ms, non-child: 59s107ms, % non-child: 100.00%)
- BytesRead: 80.50 MB (84408563)
- ScannerThreadsSysTime: 36m17s
{code}
Using {{perf}}, we saw heavy use of {{futex_wait}}, {{pthread_cond_wait}}, and
similar calls. (We also used perf to record context switches and cycles.)
Interestingly, watching {{top}} showed that the high system CPU usage only
spiked some time into the query.
We believe what's going on is that we start many ScannerThread instances, which
first wait until the initial ranges have been issued and then grab data using
{{impala::io::ScanRange::GetNext()}}. They do this in a loop that takes two
locks per iteration, until the query is done or {{num_unqueued_files_}} reaches
zero. While {{num_unqueued_files_}} stays above zero, these threads just loop
through two lock acquisitions and nothing else. We believe that this hot loop
is eating system CPU aggressively.
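
To make the suspected loop concrete, here is a minimal single-threaded model of it. This is a sketch under stated assumptions, not Impala's actual code: {{ScanNodeModel}}, {{LoopStats}}, {{SimulateIdleScannerThread}}, and the two lock names are hypothetical, and the model assumes no scan range ever becomes available to this thread while one file is still unqueued.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the shared scan-node state. Only the idea of a
// "number of unqueued files" counter comes from the issue text; the field and
// lock names are invented for illustration.
struct ScanNodeModel {
  std::mutex state_lock;       // assumed: guards done / num_unqueued_files
  std::mutex range_lock;       // assumed: guards the queue of ready ranges
  int num_unqueued_files = 1;  // one file (the slow outlier) is still pending
  bool done = false;
};

struct LoopStats {
  int64_t wasted_iterations = 0;  // passes that found nothing to scan
  int64_t lock_acquisitions = 0;  // total mutex acquisitions
};

// Models one idle scanner thread. The pending file is resolved after
// `polls_until_resolved` passes through the loop, standing in for a slow
// remote read that finishes on another thread.
LoopStats SimulateIdleScannerThread(int64_t polls_until_resolved) {
  assert(polls_until_resolved >= 1);
  ScanNodeModel node;
  LoopStats stats;
  int64_t remaining = polls_until_resolved;
  while (true) {
    {
      // First lock: check whether the thread may exit.
      std::lock_guard<std::mutex> l(node.state_lock);
      ++stats.lock_acquisitions;
      if (node.done || node.num_unqueued_files == 0) break;
    }
    {
      // Second lock: look for a ready range. In this model there never is
      // one, so the thread does no useful work.
      std::lock_guard<std::mutex> l(node.range_lock);
      ++stats.lock_acquisitions;
    }
    ++stats.wasted_iterations;
    // Single-threaded stand-in for "another thread finally resolves the
    // last file"; a real implementation would update this under a lock.
    if (--remaining == 0) node.num_unqueued_files = 0;
  }
  return stats;
}
```

In this model, calling {{SimulateIdleScannerThread(n)}} performs n wasted passes and 2n + 1 lock acquisitions, so the cost of the loop grows linearly with how long the slowest file takes to resolve. Under real contention between many threads, each of those acquisitions can turn into a futex system call, which would be consistent with the system time seen in the profile.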
It's a bit interesting that this is exacerbated in the case with more remote
reads. Our best guess is that some of the reads take significantly longer in
this case, and a single outlier can extend this period of waste.