Thanks Tim. I had issues running compute stats on some of our tables (calling 
alter table on Hive was failing and I wasn’t able to resolve it) and I think 
this was one of them. I’ll try switching over to a shuffle join and see if that 
helps.
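In case it's useful context, the shuffle join I have in mind is just a distribution hint on the join, roughly along these lines (a sketch with placeholder table and column names, not our actual query):

  SELECT b.user_id, COUNT(*)
  FROM big_table b
    LEFT OUTER JOIN /* +SHUFFLE */ other_table o
      ON b.user_id = o.user_id
  GROUP BY b.user_id;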

-- Piyush


From: Tim Armstrong <tarmstr...@cloudera.com>
Reply-To: "user@impala.apache.org" <user@impala.apache.org>
Date: Friday, February 9, 2018 at 12:24 PM
To: "user@impala.apache.org" <user@impala.apache.org>
Subject: Re: Debugging Impala query that consistently hangs

I suspect it's busy building the hash tables in the join with id=7. If you 
drill down into the profile, I suspect you'll see a bunch of time spent there. 
The top-level time counter isn't necessarily updated live while the hash 
tables are being built, but the fact that it's using 179 GB of memory is a big 
hint that it's building some big hash tables.

The plan you're getting is really terrible btw. That join has > 2B rows on the 
right side and 0 rows on the left side, which is the exact opposite of what you 
want.

I'd suggest running compute stats on the input tables to get a better plan. I 
suspect that will solve your problem.
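Something like the following, using the table names visible in your summary 
(adjust database prefixes and add any other inputs I'm missing):

  COMPUTE STATS bi_arbitrage_full;
  COMPUTE STATS dim_publisher;
  COMPUTE STATS advertiser_event_rich;
  COMPUTE STATS bi_dim_campaign;

That gives the planner row counts and column stats so it can pick the join 
order and distribution itself.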

On Thu, Feb 8, 2018 at 12:06 PM, Piyush Narang <p.nar...@criteo.com> wrote:
Yeah like I mentioned, the summary tab isn’t getting updated either:
Operator              #Hosts   Avg Time    Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail
------------------------------------------------------------------------------------------------------------------------
22:AGGREGATE               1  142.755us   142.755us        0           1    4.00 KB       10.00 MB  FINALIZE
21:EXCHANGE                1    0.000ns     0.000ns        0           1          0              0  UNPARTITIONED
10:AGGREGATE               1  126.704us   126.704us        0           1    4.00 KB       10.00 MB
20:AGGREGATE               1  312.493us   312.493us        0          -1  119.12 KB      128.00 MB  FINALIZE
19:EXCHANGE                1    0.000ns     0.000ns        0          -1          0              0  HASH(day,country,…)
09:AGGREGATE               1  216.614us   216.614us        0          -1  119.12 KB      128.00 MB  STREAMING
18:AGGREGATE               1  357.918us   357.918us        0          -1  170.12 KB      128.00 MB  FINALIZE
17:EXCHANGE                1    0.000ns     0.000ns        0          -1          0              0  HASH(…)
08:AGGREGATE               1   27.985us    27.985us        0          -1  170.12 KB      128.00 MB  STREAMING
07:HASH JOIN               1    0.000ns     0.000ns        0          -1  179.72 GB        2.00 GB  LEFT OUTER JOIN, PARTITIONED
|--16:EXCHANGE             1      6m47s       6m47s    2.17B          -1          0              0  HASH(user_id)
|  05:HASH JOIN           22    8s927ms    14s258ms    2.17B          -1   68.07 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST
|  |--14:EXCHANGE         22   11s626ms    11s844ms    1.08M          -1          0              0  BROADCAST
|  |  04:SCAN HDFS         2  103.838ms   138.573ms    1.08M          -1   10.48 MB       96.00 MB  dim_publisher pub
|  03:SCAN HDFS           22      8m40s       10m9s    2.17B          -1    1.03 GB      616.00 MB  bi_arbitrage_full a
15:EXCHANGE                1   22s489ms    22s489ms        0          -1          0              0  HASH(uid)
06:HASH JOIN               1   51.613ms    51.613ms   88.70K          -1   46.04 MB        2.00 GB  INNER JOIN, BROADCAST
|--13:EXCHANGE             1   22s928ms    22s928ms  177.30K          -1          0              0  BROADCAST
|  02:SCAN HDFS           22   14s311ms    21s235ms  177.30K          -1  798.47 MB      440.00 MB  advertiser_event_rich
12:AGGREGATE               1    7.971ms     7.971ms       36          -1   36.18 MB      128.00 MB  FINALIZE
11:EXCHANGE                1    1s892ms     1s892ms       56          -1          0              0  HASH(..)
01:AGGREGATE               1    0.000ns     0.000ns       56          -1   35.73 MB      128.00 MB  STREAMING
00:SCAN HDFS               1    2s012ms     2s012ms      213          -1    3.34 MB      128.00 MB  bi_dim_campaign


-- Piyush


From: Jeszy <jes...@gmail.com>
Reply-To: "user@impala.apache.org" <user@impala.apache.org>
Date: Thursday, February 8, 2018 at 2:59 PM
To: "user@impala.apache.org" <user@impala.apache.org>
Subject: Re: Debugging Impala query that consistently hangs

Not sure that's what you're referring to, but scan progress isn't necessarily 
indicative of overall query progress. Can you attach the text profile of the 
cancelled query?
If you cannot upload attachments, the Summary section is the best starting 
point, so please include that.
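If you ran the query from impala-shell, you should be able to print both right 
after the query finishes or is cancelled with:

  summary;
  profile;

(The same text is also available from the query's page on the coordinator's 
debug web UI, port 25000 by default.)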

On 8 February 2018 at 20:53, Piyush Narang <p.nar...@criteo.com> wrote:
Hi folks,

I have a query running on Impala that consistently stops making progress after 
reaching 45-50%. It stays at that split count for a couple of hours (before I 
cancel it), and I don't see any progress on the summary page either. I'm 
running 2.11.0-cdh5.14.0 RELEASE (build 
d68206561bce6b26762d62c01a78e6cd27aa7690). The query appears to be stuck at an 
exchange hash step.
Has anyone run into this in the past? Any suggestions on the best way to debug 
it? (I could take stack dumps on the coordinator / workers, but I'm not sure if 
there's any other way.)

Thanks,

-- Piyush


