[GitHub] spark pull request: [SPARK-13630][SQL] Adds optimizer rule collaps...

2016-04-19 Thread skambha
Github user skambha closed the pull request at:

https://github.com/apache/spark/pull/11480


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13630][SQL] Adds optimizer rule collaps...

2016-03-02 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/11480#discussion_r54840678
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -483,6 +484,25 @@ object CollapseRepartition extends Rule[LogicalPlan] {
 }
 
 /**
+ * Collapse two adjacent [[Sort]] operators into one if possible. Keep the last sort
+ * This rule applies to the scenario where the global is same for the Sort nodes and then
+ * either a) The sorts are adjacent or b) In between two Sort nodes, there is a Filter or
+ * a Project or a Limit
+ */
+object CollapseSorts extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
--- End diff --

And if we do it, we probably want to do it based on some output constraint of a plan.
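
One possible reading of "based on some output constraint of a plan" is that each plan node reports the ordering its output is already known to satisfy, and a sort is dropped when its child already guarantees that ordering. The sketch below illustrates that idea on a toy plan model. Everything in it (`EliminateSortSketch`, `outputOrdering`, `eliminate`, and the `Relation`, `Filter`, and `Sort` case classes) is hypothetical and is not Spark's Catalyst API. It also assumes that only global sorts are modelled and that a filter preserves row order.

```scala
object EliminateSortSketch {
  // Toy logical plan; hypothetical stand-ins, not Catalyst classes.
  sealed trait Plan
  // A source whose rows are already ordered by `ordering` (empty = unknown).
  case class Relation(name: String, ordering: Seq[String] = Nil) extends Plan
  case class Filter(condition: String, child: Plan) extends Plan
  case class Sort(order: Seq[String], child: Plan) extends Plan

  // The ordering that a plan's output is known to satisfy.
  def outputOrdering(plan: Plan): Seq[String] = plan match {
    case Relation(_, ordering) => ordering
    case Filter(_, child)      => outputOrdering(child) // assumed order-preserving
    case Sort(order, _)        => order
  }

  // Drops a Sort when its (already simplified) child guarantees that ordering,
  // i.e. the requested order is a prefix of the child's known ordering.
  def eliminate(plan: Plan): Plan = plan match {
    case Sort(order, child) =>
      val newChild = eliminate(child)
      if (outputOrdering(newChild).startsWith(order)) newChild
      else Sort(order, newChild)
    case Filter(condition, child) => Filter(condition, eliminate(child))
    case r: Relation              => r
  }
}
```

In this model, sorting by `a` on top of a filter over a relation already ordered by `a, b` is removed, while sorting the same relation by `b` is kept.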





[GitHub] spark pull request: [SPARK-13630][SQL] Adds optimizer rule collaps...

2016-03-02 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/11480#discussion_r54840644
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -483,6 +484,25 @@ object CollapseRepartition extends Rule[LogicalPlan] {
 }
 
 /**
+ * Collapse two adjacent [[Sort]] operators into one if possible. Keep the last sort
+ * This rule applies to the scenario where the global is same for the Sort nodes and then
+ * either a) The sorts are adjacent or b) In between two Sort nodes, there is a Filter or
+ * a Project or a Limit
+ */
+object CollapseSorts extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
--- End diff --

As it is, I'm not sure how useful this rule is.

I think what would be useful is an EliminateSort rule that actually covers
the more useful cases.






[GitHub] spark pull request: [SPARK-13630][SQL] Adds optimizer rule collaps...

2016-03-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11480#issuecomment-191508120
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-13630][SQL] Adds optimizer rule collaps...

2016-03-02 Thread skambha
GitHub user skambha opened a pull request:

https://github.com/apache/spark/pull/11480

[SPARK-13630][SQL] Adds optimizer rule collapsesorts to collapse adja…

## What changes were proposed in this pull request?

This patch does the following:

I) Adds a new optimizer rule, CollapseSorts, that applies when the adjacent
sorts have the same `global` flag:
a) Collapses adjacent sorts and keeps the last sort.
b) Collapses sorts that have a Project, a Limit, or a Filter in between, and
keeps the last sort.

II) Adds a new test suite, CollapseSortsSuite.
Note that one of the test cases (test("collapsesorts: test collapsesorts in
sort <- limit <- sort scenario")) does not compare against an expected plan,
because the unapply in Limit removes the LocalLimit from the plan. The test
therefore only checks that the rule was exercised, by counting the Sort nodes
in the plan.
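
As a rough illustration of cases a) and b), here is a minimal sketch on a toy plan model. It is not the code in this patch and does not use Catalyst: `CollapseSortsSketch`, `dropInnerSort`, and the `Relation`, `Filter`, `Project`, and `Sort` case classes are hypothetical names introduced for the example. The Limit case is deliberately left out of the sketch, because removing a sort beneath a limit can change which rows the limit keeps.

```scala
object CollapseSortsSketch {
  // Toy logical plan; hypothetical stand-ins, not Catalyst classes.
  sealed trait Plan
  case class Relation(name: String) extends Plan
  case class Filter(condition: String, child: Plan) extends Plan
  case class Project(columns: Seq[String], child: Plan) extends Plan
  case class Sort(order: Seq[String], global: Boolean, child: Plan) extends Plan

  // Removes the nearest Sort with the same `global` flag that is reachable
  // through Filter/Project nodes only. Returns None if no such Sort exists.
  private def dropInnerSort(plan: Plan, global: Boolean): Option[Plan] = plan match {
    case Sort(_, g, child) if g == global => Some(child)
    case Filter(condition, child) =>
      dropInnerSort(child, global).map(Filter(condition, _))
    case Project(columns, child) =>
      dropInnerSort(child, global).map(Project(columns, _))
    case _ => None
  }

  // Keeps the outer (last) sort and drops inner sorts made redundant by it.
  def apply(plan: Plan): Plan = plan match {
    case Sort(order, global, child) =>
      dropInnerSort(child, global) match {
        case Some(stripped) => apply(Sort(order, global, stripped)) // retry: more may remain
        case None           => Sort(order, global, apply(child))
      }
    case Filter(condition, child) => Filter(condition, apply(child))
    case Project(columns, child)  => Project(columns, apply(child))
    case r: Relation              => r
  }
}
```

With this model, `CollapseSortsSketch(Sort(Seq("a"), true, Filter("x > 0", Sort(Seq("b"), true, Relation("t")))))` keeps only the outer sort on `a`, while two sorts with different `global` flags are left unchanged.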

## How was this patch tested?
A) The following test suites and the Scala lint check were run, with no new
test failures:
build/sbt -Phive hive/test
build/sbt sql/test
build/sbt catalyst/test
dev/lint-scala

B) A new test suite, CollapseSortsSuite, was added with tests that exercise
the CollapseSorts rule.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/skambha/spark SPARK-13630

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11480.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11480


commit 7e2450efbf7b33cc33bb0b2946ba790cf27e1bac
Author: Sunitha Kambhampati 
Date:   2016-03-02T23:59:38Z

[SPARK-13630][SQL] Adds optimizer rule collapsesorts to collapse adjacent 
sorts and add new testsuite



