Most of the intelligence in the planning process relies on having stats, including the BROADCAST/SHUFFLE join mode selection.
If you compute stats you'll have a much better experience.

On Fri, Feb 9, 2018 at 11:44 AM, Piyush Narang <p.nar...@criteo.com> wrote:

> Actually, looking at this again, the hash join that is consuming 179GB is
> supposed to be partitioned right? How would stats change that?
>
> I checked the query I kicked off and I have this there, “left outer join
> /* +SHUFFLE */”. I think without it I end up with query failures.
>
> Is there something I’m missing?
>
> -- Piyush
>
> *From: *Tim Armstrong <tarmstr...@cloudera.com>
> *Reply-To: *"email@example.com" <firstname.lastname@example.org>
> *Date: *Friday, February 9, 2018 at 12:24 PM
> *To: *"email@example.com" <firstname.lastname@example.org>
> *Subject: *Re: Debugging Impala query that consistently hangs
>
> 07:HASH JOIN 1 0.000ns 0.000ns 0 -1 179.72 GB 2.00 GB LEFT OUTER JOIN, PARTITIONED