Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/11647#discussion_r56102487
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
---
@@ -152,7 +152,10 @@ abstract class Expression extends
Github user tedyu commented on a diff in the pull request:
https://github.com/apache/spark/pull/11647#discussion_r56065435
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
---
@@ -152,7 +152,10 @@ abstract class Expression extends
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/11647
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-196454296
LGTM, merging to master.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-196191931
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-196191923
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-196191580
**[Test build #53050 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53050/consoleFull)**
for PR 11647 at commit
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-196165607
Thanks for reviewing this.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-196165179
**[Test build #53050 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53050/consoleFull)**
for PR 11647 at commit
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-196165006
Thanks - I will let @marmbrus do the final review & merge.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/11647#discussion_r55959825
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
---
@@ -598,3 +601,61 @@ abstract class TernaryExpression
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/11647#discussion_r55958696
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
---
@@ -598,3 +601,61 @@ abstract class TernaryExpression
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-196110966
@rxin @marmbrus I've addressed comments. Any other change I should do for
this? Thanks!
---
If your project is set up for it, you can reply to this email and have your
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195655654
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195655657
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195655579
**[Test build #52986 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52986/consoleFull)**
for PR 11647 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195640380
**[Test build #52986 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52986/consoleFull)**
for PR 11647 at commit
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/11647#discussion_r55912815
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala
---
@@ -36,15 +36,16 @@ import
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/11647#discussion_r55872639
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala
---
@@ -36,15 +36,16 @@ import
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195307315
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195307309
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195306728
**[Test build #52918 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52918/consoleFull)**
for PR 11647 at commit
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195302187
I was measuring the time to finish the code snippet shown in pr
description. But I think I should only measure the time spent on `val actual =
Optimize.execute(plan)`.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195257957
**[Test build #52918 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52918/consoleFull)**
for PR 11647 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195251569
Merged build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195251580
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195250651
**[Test build #52909 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52909/consoleFull)**
for PR 11647 at commit
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195239995
Ah, the above is wrong. I forget to consider other side of each `Or`.
---
If your project is set up for it, you can reply to this email and have your
reply appear on
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195239016
Hmm, like this?
protected def splitDisjunctivePredicates(condition: Expression):
Seq[Expression] = {
condition match {
case
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195238948
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195238945
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195238751
**[Test build #52905 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52905/consoleFull)**
for PR 11647 at commit
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195231098
One way I can think of which is a bit ugly is to special case a sequence of
ors and optimize it without recursion.
---
If your project is set up for it, you can reply
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195230298
Should I update the title of this PR, as I simplify `Canonicalize` instead
of `BooleanSimplification`?
---
If your project is set up for it, you can reply to this
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195229737
For example, this case:
// Common factor elimination for conjunction
case and @ (left And right) =>
val lhs =
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195228890
I think it is the time needed to go through `BooleanSimplification` itself?
It is a big rule. A few pattern cases in it is complicated as I see.
---
If your project is
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195227387
What are the time spent beyond that?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195225784
**[Test build #52909 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52909/consoleFull)**
for PR 11647 at commit
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195225504
With latest change:
233 milliseconds
207 milliseconds
220 milliseconds
BTW, by completely removing the comparison of canonicalized
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195212540
**[Test build #52905 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52905/consoleFull)**
for PR 11647 at commit
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195212450
It still seem pretty slow. Any other way to speed this up by an order of
magnitude?
cc @marmbrus
---
If your project is set up for it, you can reply to this
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195211932
retest this please.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195201225
Merged build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195201219
**[Test build #52902 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52902/consoleFull)**
for PR 11647 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195201226
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195200420
**[Test build #52902 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52902/consoleFull)**
for PR 11647 at commit
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/11647#issuecomment-195198332
cc @davies @marmbrus
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this
GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/11647
[SPARK-13658][SQL] BooleanSimplification rule is slow with large boolean
expressions
JIRA: https://issues.apache.org/jira/browse/SPARK-13658
## What changes were proposed in this pull
48 matches
Mail list logo