Let us know if we can help figure out what went wrong with compute stats.
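For readers following along: the statistics discussed throughout this thread are gathered with Impala's `COMPUTE STATS` statement. A minimal sketch, using a placeholder database name (the table name is taken from the plan below; adjust to your schema):

```sql
-- Gather table- and column-level statistics so the planner can
-- estimate row counts and pick join sides and strategies.
-- "mydb" is a placeholder database name.
COMPUTE STATS mydb.bi_arbitrage_full;

-- Verify the stats landed; the #Rows column should no longer be -1.
SHOW TABLE STATS mydb.bi_arbitrage_full;
SHOW COLUMN STATS mydb.bi_arbitrage_full;
```

With stats in place, the planner can put the smaller input on the build side of a join instead of guessing.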
- Tim

On Mon, Feb 12, 2018 at 6:07 AM, Piyush Narang <p.nar...@criteo.com> wrote:
> Got it, thanks for the explanation Tim. I’ll chase down the issue with
> compute stats for that table.
>
> -- Piyush
>
> From: Tim Armstrong <tarmstr...@cloudera.com>
> Reply-To: "user@impala.apache.org" <user@impala.apache.org>
> Date: Sunday, February 11, 2018 at 2:31 PM
> To: "user@impala.apache.org" <user@impala.apache.org>
> Subject: Re: Debugging Impala query that consistently hangs
>
> Piyush,
>
> I can't recommend strongly enough that you figure out how to get compute
> stats working. You will not have a good experience with Impala without
> statistics - there's no way you will get good plans for all your queries.
>
> - Tim
>
> On Fri, Feb 9, 2018 at 11:25 AM, Piyush Narang <p.nar...@criteo.com> wrote:
>
> Thanks Tim. I had issues running compute stats on some of our tables
> (calling alter table on Hive was failing and I wasn’t able to resolve it)
> and I think this was one of them. I’ll try switching over to a shuffle
> join and see if that helps.
>
> -- Piyush
>
> From: Tim Armstrong <tarmstr...@cloudera.com>
> Reply-To: "user@impala.apache.org" <user@impala.apache.org>
> Date: Friday, February 9, 2018 at 12:24 PM
> To: "user@impala.apache.org" <user@impala.apache.org>
> Subject: Re: Debugging Impala query that consistently hangs
>
> I suspect it's busy building the hash tables in the join with id=7. If
> you drill down into the profile I suspect you'll see a bunch of time
> spent there. The top-level time counter isn't necessarily updated live
> for the time spent building the hash tables, but the fact that it's using
> 179GB of memory is a big hint that it's building some big hash tables.
>
> The plan you're getting is really terrible, btw. That join has over 2B
> rows on the right side and 0 rows on the left side, which is the exact
> opposite of what you want.
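As background on the shuffle join Piyush mentions: Impala accepts join-distribution hints, so a partitioned (shuffle) join can be forced while stats are missing, and `STRAIGHT_JOIN` makes the planner keep the join order as written. A hedged sketch with placeholder table and column names:

```sql
-- Without stats the planner may pick a bad join order or strategy.
-- STRAIGHT_JOIN keeps tables in the order written; [SHUFFLE] forces a
-- partitioned join instead of a broadcast join.
-- big_fact / small_dim and the column names are placeholders.
SELECT STRAIGHT_JOIN a.user_id, p.name
FROM big_fact a
JOIN [SHUFFLE] small_dim p
  ON a.publisher_id = p.id;
```

Hints are a stopgap; as Tim says above, working statistics are the real fix.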
> I'd suggest running compute stats on the input tables to get a better
> plan. I suspect that will solve your problem.
>
> On Thu, Feb 8, 2018 at 12:06 PM, Piyush Narang <p.nar...@criteo.com> wrote:
>
> Yeah, like I mentioned, the summary tab isn’t getting updated either:
>
> Operator           #Hosts   Avg Time   Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail
> ------------------------------------------------------------------------------------------------------------------------
> 22:AGGREGATE            1  142.755us  142.755us        0           1    4.00 KB       10.00 MB  FINALIZE
> 21:EXCHANGE             1    0.000ns    0.000ns        0           1          0              0  UNPARTITIONED
> 10:AGGREGATE            1  126.704us  126.704us        0           1    4.00 KB       10.00 MB
> 20:AGGREGATE            1  312.493us  312.493us        0          -1  119.12 KB      128.00 MB  FINALIZE
> 19:EXCHANGE             1    0.000ns    0.000ns        0          -1          0              0  HASH(day,country,…)
> 09:AGGREGATE            1  216.614us  216.614us        0          -1  119.12 KB      128.00 MB  STREAMING
> 18:AGGREGATE            1  357.918us  357.918us        0          -1  170.12 KB      128.00 MB  FINALIZE
> 17:EXCHANGE             1    0.000ns    0.000ns        0          -1          0              0  HASH(…)
> 08:AGGREGATE            1   27.985us   27.985us        0          -1  170.12 KB      128.00 MB  STREAMING
> 07:HASH JOIN            1    0.000ns    0.000ns        0          -1  179.72 GB        2.00 GB  LEFT OUTER JOIN, PARTITIONED
> |--16:EXCHANGE          1      6m47s      6m47s    2.17B          -1          0              0  HASH(user_id)
> |  05:HASH JOIN        22    8s927ms   14s258ms    2.17B          -1   68.07 MB        2.00 GB  LEFT OUTER JOIN, BROADCAST
> |  |--14:EXCHANGE      22   11s626ms   11s844ms    1.08M          -1          0              0  BROADCAST
> |  |  04:SCAN HDFS      2  103.838ms  138.573ms    1.08M          -1   10.48 MB       96.00 MB  dim_publisher pub
> |  03:SCAN HDFS        22      8m40s      10m9s    2.17B          -1    1.03 GB      616.00 MB  bi_arbitrage_full a
> 15:EXCHANGE             1   22s489ms   22s489ms        0          -1          0              0  HASH(uid)
> 06:HASH JOIN            1   51.613ms   51.613ms   88.70K          -1   46.04 MB        2.00 GB  INNER JOIN, BROADCAST
> |--13:EXCHANGE          1   22s928ms   22s928ms  177.30K          -1          0              0  BROADCAST
> |  02:SCAN HDFS        22   14s311ms   21s235ms  177.30K          -1  798.47 MB      440.00 MB  advertiser_event_rich
> 12:AGGREGATE            1    7.971ms    7.971ms       36          -1   36.18 MB      128.00 MB  FINALIZE
> 11:EXCHANGE             1    1s892ms    1s892ms       56          -1          0              0  HASH(..)
> 01:AGGREGATE            1    0.000ns    0.000ns       56          -1   35.73 MB      128.00 MB  STREAMING
> 00:SCAN HDFS            1    2s012ms    2s012ms      213          -1    3.34 MB      128.00 MB  bi_dim_campaign
>
> -- Piyush
>
> From: Jeszy <jes...@gmail.com>
> Reply-To: "user@impala.apache.org" <user@impala.apache.org>
> Date: Thursday, February 8, 2018 at 2:59 PM
> To: "user@impala.apache.org" <user@impala.apache.org>
> Subject: Re: Debugging Impala query that consistently hangs
>
> Not sure that's what you're referring to, but scan progress isn't
> necessarily indicative of overall query progress. Can you attach the text
> profile of the cancelled query?
>
> If you cannot upload attachments, the Summary section is the best
> starting point, so please include that.
>
> On 8 February 2018 at 20:53, Piyush Narang <p.nar...@criteo.com> wrote:
>
> Hi folks,
>
> I have a query that I’m running on Impala that seems to consistently stop
> making progress after reaching 45-50%. It stays at that split number for
> a couple of hours (before I cancel it). I don’t see any progress on the
> summary page either. I’m running 2.11.0-cdh5.14.0 RELEASE (build
> d68206561bce6b26762d62c01a78e6cd27aa7690). It seems not to be making
> progress from an exchange hash step.
>
> Has anyone run into this in the past? Any suggestions on what’s the best
> way to debug this? (I could take stack dumps on the coordinator /
> workers, but not sure if there’s any other way.)
>
> Thanks,
>
> -- Piyush
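For anyone debugging a similar hang: the summary and text profile requested in this thread can be pulled directly from impala-shell, and the coordinator's debug web UI (default port 25000, under the /queries page) shows live profiles for in-flight queries. A minimal sketch of the impala-shell commands:

```sql
-- Inside an impala-shell session, after running (or cancelling) a query:
SUMMARY;   -- per-operator table of times, rows, and peak memory,
           -- like the one Piyush pasted above
PROFILE;   -- full text profile of the most recent query
```

Comparing `#Rows` against `Est. #Rows` in the summary (estimates of -1 mean no stats) is usually the quickest way to spot the kind of reversed join seen here.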