spark git commit: [SPARK-10371][SQL] Implement subexpr elimination for UnsafeProjections
Repository: spark
Updated Branches:
  refs/heads/branch-1.6 5ccc1eb08 -> f38509a76

[SPARK-10371][SQL] Implement subexpr elimination for UnsafeProjections

This patch adds the building blocks for codegening subexpr elimination and implements it end to end for UnsafeProjection. The building blocks can be used to do the same thing for other operators.

It introduces some utilities to compute common sub expressions. Expressions can be added to this data structure. The expr and its children will be recursively matched against existing expressions (ones previously added) and grouped into common groups. This is built using the existing `semanticEquals`. It does not understand things like commutative or associative expressions. This can be done as future work.

After building this data structure, the codegen process takes advantage of it by:
  1. Generating a helper function in the generated class that computes the common subexpression. This is done for all common subexpressions that have at least two occurrences and the expression tree is sufficiently complex.
  2. When generating the apply() function, if the helper function exists, call that instead of regenerating the expression tree. Repeated calls to the helper function shortcircuit the evaluation logic.

Author: Nong Li
Author: Nong Li

This patch had conflicts when merged, resolved by
Committer: Michael Armbrust

Closes #9480 from nongli/spark-10371.

(cherry picked from commit 87aedc48c01dffbd880e6ca84076ed47c68f88d0)
Signed-off-by: Michael Armbrust

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f38509a7
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f38509a7
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f38509a7

Branch: refs/heads/branch-1.6
Commit: f38509a763816f43a224653fe65e4645894c9fc4
Parents: 5ccc1eb
Author: Nong Li
Authored: Tue Nov 10 11:28:53 2015 -0800
Committer: Michael Armbrust
Committed: Tue Nov 10 11:29:05 2015 -0800

--
 .../expressions/EquivalentExpressions.scala | 106 +
 .../sql/catalyst/expressions/Expression.scala | 50 +-
 .../sql/catalyst/expressions/Projection.scala | 16 ++
 .../expressions/codegen/CodeGenerator.scala | 110 -
 .../codegen/GenerateUnsafeProjection.scala | 36 -
 .../catalyst/expressions/namedExpressions.scala | 4 +
 .../SubexpressionEliminationSuite.scala | 153 +++
 .../scala/org/apache/spark/sql/SQLConf.scala | 8 +
 .../apache/spark/sql/execution/SparkPlan.scala | 5 +
 .../spark/sql/execution/basicOperators.scala | 3 +-
 .../org/apache/spark/sql/SQLQuerySuite.scala | 48 ++
 11 files changed, 523 insertions(+), 16 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/f38509a7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
new file mode 100644
index 000..e7380d2
--- /dev/null
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
@@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import scala.collection.mutable
+
+/**
+ * This class is used to compute equality of (sub)expression trees. Expressions can be added
+ * to this class and they subsequently query for expression equality. Expression trees are
+ * considered equal if for the same input(s), the same result is produced.
+ */
+class EquivalentExpressions {
+  /**
+   * Wrapper around an Expression that provides semantic equality.
+   */
+  case class Expr(e: Expression) {
+    val hash = e.semanticHash()
+    override def equals(o: Any): Boolean = o match {
+      case other: Expr =>

spark git commit: [SPARK-10371][SQL] Implement subexpr elimination for UnsafeProjections
Repository: spark
Updated Branches:
  refs/heads/master 53600854c -> 87aedc48c

[SPARK-10371][SQL] Implement subexpr elimination for UnsafeProjections

This patch adds the building blocks for codegening subexpr elimination and implements it end to end for UnsafeProjection. The building blocks can be used to do the same thing for other operators.

It introduces some utilities to compute common sub expressions. Expressions can be added to this data structure. The expr and its children will be recursively matched against existing expressions (ones previously added) and grouped into common groups. This is built using the existing `semanticEquals`. It does not understand things like commutative or associative expressions. This can be done as future work.

After building this data structure, the codegen process takes advantage of it by:
  1. Generating a helper function in the generated class that computes the common subexpression. This is done for all common subexpressions that have at least two occurrences and the expression tree is sufficiently complex.
  2. When generating the apply() function, if the helper function exists, call that instead of regenerating the expression tree. Repeated calls to the helper function shortcircuit the evaluation logic.

Author: Nong Li
Author: Nong Li

This patch had conflicts when merged, resolved by
Committer: Michael Armbrust

Closes #9480 from nongli/spark-10371.

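The two steps described in the commit message can be illustrated with a small, self-contained sketch. This is not the code from the patch: `Expr`, `SubexprCounter`, and `SubexprDemo` are hypothetical names, plain case-class equality stands in for `semanticEquals`, "sufficiently complex" is assumed to mean "not a leaf", and a per-row cache stands in for the generated helper function.

```scala
import scala.collection.mutable

// Toy expression tree (hypothetical, not Catalyst's Expression).
// Case-class equality plays the role of semanticEquals here.
sealed trait Expr { def children: Seq[Expr] }
case class Col(name: String) extends Expr { def children: Seq[Expr] = Nil }
case class Lit(value: Int) extends Expr { def children: Seq[Expr] = Nil }
case class Add(l: Expr, r: Expr) extends Expr { def children: Seq[Expr] = Seq(l, r) }
case class Mul(l: Expr, r: Expr) extends Expr { def children: Seq[Expr] = Seq(l, r) }

// Counts how often each non-leaf subtree occurs across all added trees.
class SubexprCounter {
  private val counts = mutable.LinkedHashMap.empty[Expr, Int]

  // Recursively register the expression and its children.
  // Assumption: leaves are never worth sharing, so they are skipped.
  def addTree(e: Expr): Unit = if (e.children.nonEmpty) {
    counts(e) = counts.getOrElse(e, 0) + 1
    e.children.foreach(addTree)
  }

  // Subexpressions with at least two occurrences.
  def common: Set[Expr] = counts.filter(_._2 >= 2).keys.toSet
}

object SubexprDemo {
  // Evaluates all projections against one row and returns the results
  // together with the number of arithmetic operations actually executed.
  def project(exprs: Seq[Expr], row: Map[String, Int]): (Seq[Int], Int) = {
    val counter = new SubexprCounter
    exprs.foreach(counter.addTree)
    val common = counter.common

    // Per-row cache: repeated requests for a common subexpression
    // short-circuit to the stored value instead of re-evaluating the tree.
    val cache = mutable.Map.empty[Expr, Int]
    var ops = 0

    def eval(e: Expr): Int = {
      def compute: Int = e match {
        case Col(n)    => row(n)
        case Lit(v)    => v
        case Add(l, r) => ops += 1; eval(l) + eval(r)
        case Mul(l, r) => ops += 1; eval(l) * eval(r)
      }
      if (common.contains(e)) cache.getOrElseUpdate(e, compute) else compute
    }

    (exprs.map(eval), ops)
  }

  def main(args: Array[String]): Unit = {
    val ab = Mul(Col("a"), Col("b"))
    val (values, ops) =
      project(Seq(Add(ab, Lit(1)), Mul(ab, Lit(2))), Map("a" -> 3, "b" -> 4))
    println(s"values=$values ops=$ops")
  }
}
```

In this example the shared subtree `a * b` is computed once, so the two projections evaluate to 13 and 24 using three arithmetic operations rather than four. As in the patch description, equality is purely structural: `a * b` and `b * a` would not be recognized as the same subexpression.
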
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/87aedc48
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/87aedc48
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/87aedc48

Branch: refs/heads/master
Commit: 87aedc48c01dffbd880e6ca84076ed47c68f88d0
Parents: 5360085
Author: Nong Li
Authored: Tue Nov 10 11:28:53 2015 -0800
Committer: Michael Armbrust
Committed: Tue Nov 10 11:28:53 2015 -0800

--
 .../expressions/EquivalentExpressions.scala | 106 +
 .../sql/catalyst/expressions/Expression.scala | 50 +-
 .../sql/catalyst/expressions/Projection.scala | 16 ++
 .../expressions/codegen/CodeGenerator.scala | 110 -
 .../codegen/GenerateUnsafeProjection.scala | 36 -
 .../catalyst/expressions/namedExpressions.scala | 4 +
 .../SubexpressionEliminationSuite.scala | 153 +++
 .../scala/org/apache/spark/sql/SQLConf.scala | 8 +
 .../apache/spark/sql/execution/SparkPlan.scala | 5 +
 .../spark/sql/execution/basicOperators.scala | 3 +-
 .../org/apache/spark/sql/SQLQuerySuite.scala | 48 ++
 11 files changed, 523 insertions(+), 16 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/87aedc48/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
new file mode 100644
index 000..e7380d2
--- /dev/null
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
@@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import scala.collection.mutable
+
+/**
+ * This class is used to compute equality of (sub)expression trees. Expressions can be added
+ * to this class and they subsequently query for expression equality. Expression trees are
+ * considered equal if for the same input(s), the same result is produced.
+ */
+class EquivalentExpressions {
+  /**
+   * Wrapper around an Expression that provides semantic equality.
+   */
+  case class Expr(e: Expression) {
+    val hash = e.semanticHash()
+    override def equals(o: Any): Boolean = o match {
+      case other: Expr => e.semanticEquals(other.e)
+      case _ => false
+    }
+    override def hashCode: Int = hash
+  }
+
+  // For