I suspect it's busy building the hash tables in the join with id=7. If you
drill down into the profile you'll likely see a bunch of time spent there.
The top-level time counter isn't necessarily updated live for the time spent
building the hash tables, but the fact that it's using 179GB of memory is a
strong hint.
Thanks Tim. I had issues running compute stats on some of our tables (calling
alter table on Hive was failing and I wasn’t able to resolve it) and I think
this was one of them. I’ll try switching over to a shuffle join and see if that
helps.
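
For reference, computing stats in Impala is a single statement per table; the database and table names below are made up for illustration:

```sql
-- Hypothetical table name; run from impala-shell.
COMPUTE STATS my_db.my_table;

-- Verify that row counts and column stats were recorded:
SHOW TABLE STATS my_db.my_table;
SHOW COLUMN STATS my_db.my_table;
```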
-- Piyush
From: Tim Armstrong
Actually, looking at this again, the hash join that is consuming 179GB is
supposed to be partitioned, right? How would stats change that?
I checked the query I kicked off and I have this in there: “left outer join /*
+SHUFFLE */”. I think without it I end up with query failures.
Is there something I
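
For context, join hints in Impala SQL go directly after the join keyword; the table and column names in this sketch are made up:

```sql
-- Hypothetical tables; the /* +SHUFFLE */ hint forces a partitioned
-- (shuffle) join instead of a broadcast join.
SELECT a.id, b.value
FROM big_table a
LEFT OUTER JOIN /* +SHUFFLE */ other_big_table b
  ON a.id = b.id;
```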
Most of the intelligence in the planning process relies on having stats,
including the BROADCAST/SHUFFLE join mode selection.
If you compute stats you'll have a much better experience.
On Fri, Feb 9, 2018 at 11:44 AM, Piyush Narang wrote:
To be clearer, the main problem with that plan is that the join order is
bad. Broadcast vs shuffle is a secondary issue. The query doesn't look that
complex so with stats you should get a reasonable plan without hinting.
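
One way to check what the planner decided (join order, and broadcast vs. partitioned joins) is to prefix the query with EXPLAIN; the query below is only an illustration with made-up names:

```sql
-- EXPLAIN shows the chosen plan without running the query; look for
-- "JOIN [BROADCAST]" vs "JOIN [PARTITIONED]" and the join order.
EXPLAIN
SELECT a.id, b.value
FROM big_table a
LEFT OUTER JOIN other_big_table b
  ON a.id = b.id;
```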
On 9 Feb. 2018 17:29, "Tim Armstrong" wrote: