[GitHub] spark pull request #20670: [SPARK-23405] Generate additional constraints for...

2018-03-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20670


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20670: [SPARK-23405] Generate additional constraints for...

2018-02-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20670#discussion_r171463022
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/QueryPlanConstraints.scala
 ---
@@ -22,21 +22,24 @@ import org.apache.spark.sql.catalyst.expressions._
 
 trait QueryPlanConstraints { self: LogicalPlan =>
 
+  /**
+   * An [[ExpressionSet]] that contains an additional set of constraints, 
such as equality
+   * constraints and `isNotNull` constraints, etc.
+   */
+  lazy val allConstraints: ExpressionSet = ExpressionSet(validConstraints
--- End diff --

We still need `if (conf.constraintPropagationEnabled)`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20670: [SPARK-23405] Generate additional constraints for...

2018-02-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20670#discussion_r171462811
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/QueryPlanConstraints.scala
 ---
@@ -22,21 +22,24 @@ import org.apache.spark.sql.catalyst.expressions._
 
 trait QueryPlanConstraints { self: LogicalPlan =>
 
+  /**
+   * An [[ExpressionSet]] that contains an additional set of constraints, 
such as equality
+   * constraints and `isNotNull` constraints, etc.
+   */
+  lazy val allConstraints: ExpressionSet = ExpressionSet(validConstraints
+.union(inferAdditionalConstraints(validConstraints))
+.union(constructIsNotNullConstraints(validConstraints)))
--- End diff --

Nit: indents


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20670: [SPARK-23405] Generate additional constraints for...

2018-02-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20670#discussion_r171276798
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/InferFiltersFromConstraintsSuite.scala
 ---
@@ -192,4 +192,17 @@ class InferFiltersFromConstraintsSuite extends 
PlanTest {
 
 comparePlans(Optimize.execute(original.analyze), correct.analyze)
   }
+
+  test("SPARK-23405: left-semi equal-join should filter out null join keys 
on both sides") {
+val x = testRelation.subquery('x)
+val y = testRelation.subquery('y)
+val condition = Some("x.a".attr === "y.a".attr)
+val originalQuery = x.join(y, LeftSemi, condition).analyze
+val left = x.where(IsNotNull('a))
+val right = y.where(IsNotNull('a))
+val correctAnswer = left.join(right, LeftSemi, condition)
+.analyze
--- End diff --

this doesn't need to be in a new line


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20670: [SPARK-23405] Generate additional constraints for...

2018-02-28 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20670#discussion_r171201415
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/QueryPlanConstraints.scala
 ---
@@ -22,21 +22,30 @@ import org.apache.spark.sql.catalyst.expressions._
 
 trait QueryPlanConstraints { self: LogicalPlan =>
 
+  /**
+   * An [[ExpressionSet]] that contains an additional set of constraints 
about equality
--- End diff --

The comment is not acute, we may have various kinds of constraints.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20670: [SPARK-23405] Generate additional constraints for...

2018-02-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20670#discussion_r171182870
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/InferFiltersFromConstraintsSuite.scala
 ---
@@ -192,4 +192,17 @@ class InferFiltersFromConstraintsSuite extends 
PlanTest {
 
 comparePlans(Optimize.execute(original.analyze), correct.analyze)
   }
+
+  test("SPARK-23405:single left-semi join, filter out nulls on either side 
on equi-join keys") {
+val x = testRelation.subquery('x)
+val y = testRelation.subquery('y)
+val originalQuery = x.join(y, LeftSemi,
+  condition = Some("x.a".attr === "y.a".attr)).analyze
--- End diff --

nit: we can create a `val condition = Some("x.a".attr === "y.a".attr)` to 
reduce duplicated code


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20670: [SPARK-23405] Generate additional constraints for...

2018-02-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20670#discussion_r171182439
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/InferFiltersFromConstraintsSuite.scala
 ---
@@ -192,4 +192,17 @@ class InferFiltersFromConstraintsSuite extends 
PlanTest {
 
 comparePlans(Optimize.execute(original.analyze), correct.analyze)
   }
+
+  test("SPARK-23405:single left-semi join, filter out nulls on either side 
on equi-join keys") {
--- End diff --

nit: `SPARK-23405: left-semi equa-join should filter out null join keys on 
both sides`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20670: [SPARK-23405] Generate additional constraints for...

2018-02-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20670#discussion_r171182102
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/QueryPlanConstraints.scala
 ---
@@ -22,21 +22,30 @@ import org.apache.spark.sql.catalyst.expressions._
 
 trait QueryPlanConstraints { self: LogicalPlan =>
 
+  /**
+   * An [[ExpressionSet]] that contains an additional set of constraints 
about equality
+   * constraints and `isNotNull` constraints.
+   */
+  lazy val allConstraints: ExpressionSet = {
+if (conf.constraintPropagationEnabled) {
+  ExpressionSet(validConstraints
+.union(inferAdditionalConstraints(validConstraints))
+.union(constructIsNotNullConstraints(validConstraints)))
+} else {
+  ExpressionSet(Set.empty)
+}
+  }
+
   /**
* An [[ExpressionSet]] that contains invariants about the rows output 
by this operator. For
* example, if this set contains the expression `a = 2` then that 
expression is guaranteed to
* evaluate to `true` for all rows produced.
*/
   lazy val constraints: ExpressionSet = {
 if (conf.constraintPropagationEnabled) {
--- End diff --

now we don't need this if.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20670: [SPARK-23405] Generate additional constraints for...

2018-02-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20670#discussion_r171102033
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/QueryPlanConstraints.scala
 ---
@@ -23,20 +23,23 @@ import org.apache.spark.sql.catalyst.expressions._
 trait QueryPlanConstraints { self: LogicalPlan =>
 
   /**
-   * An [[ExpressionSet]] that contains invariants about the rows output 
by this operator. For
-   * example, if this set contains the expression `a = 2` then that 
expression is guaranteed to
-   * evaluate to `true` for all rows produced.
-   */
+* An [[ExpressionSet]] that contains an additional set of constraints 
about equality
+* constraints and `isNotNull` constraints.
+*/
+  lazy val allConstraints: ExpressionSet = ExpressionSet(validConstraints
--- End diff --

This should also be guarded by `constraintPropagationEnabled`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20670: [SPARK-23405] Generate additional constraints for...

2018-02-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20670#discussion_r170840898
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/QueryPlanConstraints.scala
 ---
@@ -27,16 +27,15 @@ trait QueryPlanConstraints { self: LogicalPlan =>
* example, if this set contains the expression `a = 2` then that 
expression is guaranteed to
* evaluate to `true` for all rows produced.
--- End diff --

The comment belongs to `constraints` not `allConstraints`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org