Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12519 )

Change subject: IMPALA-8214: Fix bad plan in load_nested.py
......................................................................

IMPALA-8214: Fix bad plan in load_nested.py

The previous plan had the larger input on the build side of the join and
did a broadcast join, which is very suboptimal.
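
To illustrate why the choice of build side matters, here is a minimal Python
sketch of an in-memory hash join. It is not Impala's implementation and not
the change made in load_nested.py; the function name hash_join and the toy
tables are hypothetical. It only shows that the in-memory hash table grows with
the build input, so building on the larger input costs more memory. In a
broadcast join that build input is also sent to every node.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Toy hash join (hypothetical helper, for illustration only).

    Returns the joined rows and the number of rows held in the hash table.
    """
    # Build phase: every build-side row is kept in memory.
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)
    build_size = sum(len(rows) for rows in table.values())

    # Probe phase: probe-side rows are streamed past the hash table.
    joined = []
    for row in probe_rows:
        for match in table.get(row[key], []):
            joined.append({**match, **row})
    return joined, build_size


# Toy inputs: a 10-row table and a 1000-row table sharing the key "k".
small = [{"k": i, "name": f"n{i}"} for i in range(10)]
large = [{"k": i % 10, "val": i} for i in range(1000)]

# Smaller input on the build side.
good_rows, good_build = hash_join(small, large, "k")
# Larger input on the build side, as in the previous plan.
bad_rows, bad_build = hash_join(large, small, "k")
```

Both calls produce the same number of joined rows, but the second one holds
the entire larger input in its hash table.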

This speeds up data loading on my minicluster (18s vs 31s) and has a
more significant impact on a real cluster, where queries execute
much faster, the memory requirement is significantly reduced, and
the data loading can potentially be broken up into fewer chunks.

I also considered computing stats on the table to let Impala generate
the same plan, but this achieves the same goal more efficiently.

Testing:
Ran core tests. Resource estimates in planner tests changed slightly
because of the different distribution of data.

Change-Id: I55e0ca09590a90ba530efe4e8f8bf587dde3eeeb
Reviewed-on: http://gerrit.cloudera.org:8080/12519
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M testdata/bin/load_nested.py
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
3 files changed, 12 insertions(+), 12 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/12519
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I55e0ca09590a90ba530efe4e8f8bf587dde3eeeb
Gerrit-Change-Number: 12519
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
