Thomas Tauber-Marshall has uploaded a new patch set (#3).

Change subject: IMPALA-4731/IMPALA-397/IMPALA-4728: Materialize sort exprs
......................................................................

IMPALA-4731/IMPALA-397/IMPALA-4728: Materialize sort exprs

Previously, exprs used in sorts were evaluated lazily. This can
potentially be bad for performance if the exprs are expensive to
evaluate, and it can lead to crashes if the exprs are
non-deterministic, as this violates assumptions of our sorting
algorithm.

This patch addresses these issues by materializing ordering exprs.
It does so when the expr is non-deterministic (including when it
contains a UDF, which we cannot currently know if they are
non-deterministic), or when its cost exceeds a threshold (or the
cost is unknown).

Testing:
- Added e2e tests in test_sort.py.
- Updated planner tests.

Change-Id: Ifefdaff8557a30ac44ea82ed428e6d1ffbca2e9e
---
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M tests/query_test/test_sort.py
10 files changed, 319 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/6322/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6322
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifefdaff8557a30ac44ea82ed428e6d1ffbca2e9e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>

Reply via email to