Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12519 )

Change subject: IMPALA-8214: Fix bad plan in load_nested.py
......................................................................

IMPALA-8214: Fix bad plan in load_nested.py

The previous plan had the larger input on the build side of the join
and did a broadcast join, which is very suboptimal.

This speeds up data loading on my minicluster - 18s vs 31s and has
a more significant impact on a real cluster, where queries execute
much faster, the memory requirement is significantly reduced and
the data loading can potentially be broken up into fewer chunks.

I also considered computing stats on the table to let Impala generate
the same plan, but this achieves the same goal more efficiently.

Testing:
Run core tests. Resource estimates in planner tests changed slightly
because of the different distribution of data.

Change-Id: I55e0ca09590a90ba530efe4e8f8bf587dde3eeeb
Reviewed-on: http://gerrit.cloudera.org:8080/12519
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M testdata/bin/load_nested.py
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
3 files changed, 12 insertions(+), 12 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/12519
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I55e0ca09590a90ba530efe4e8f8bf587dde3eeeb
Gerrit-Change-Number: 12519
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
