[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-20 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59860308
  
Thanks, merged to master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2825


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-17 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/2825#discussion_r19014836
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ExpressionOptimizationSuite.scala
 ---
@@ -30,7 +30,7 @@ class ExpressionOptimizationSuite extends 
ExpressionEvaluationSuite {
   expected: Any,
   inputRow: Row = EmptyRow): Unit = {
 val plan = Project(Alias(expression, sOptimized($expression))() :: 
Nil, NoRelation)
-val optimizedPlan = Optimizer(plan)
+val optimizedPlan = SparkOptimizer(plan)
 super.checkEvaluation(optimizedPlan.expressions.head, expected, 
inputRow)
   }
 }
--- End diff --

Would you mind to do a favor to add a new line here? :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-17 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/2825#discussion_r19014954
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -28,7 +28,9 @@ import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.catalyst.rules._
 import org.apache.spark.sql.catalyst.types._
 
-object Optimizer extends RuleExecutor[LogicalPlan] {
+abstract class Optimizer extends RuleExecutor[LogicalPlan]
+
+object SparkOptimizer extends Optimizer {
--- End diff --

I think `DefaultOptimizer` or `BasicOptimizer` may be better names. 
Conceptually, Catalyst is not tightly coupled with Spark (I know right now 
Catalyst depends on Spark code somehow, but we don't want to exacerbate this 
trend).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-17 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59506552
  
Two minor comments. This LGTM, thanks :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-17 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/2825#discussion_r19050063
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ExpressionOptimizationSuite.scala
 ---
@@ -30,7 +30,7 @@ class ExpressionOptimizationSuite extends 
ExpressionEvaluationSuite {
   expected: Any,
   inputRow: Row = EmptyRow): Unit = {
 val plan = Project(Alias(expression, sOptimized($expression))() :: 
Nil, NoRelation)
-val optimizedPlan = Optimizer(plan)
+val optimizedPlan = SparkOptimizer(plan)
 super.checkEvaluation(optimizedPlan.expressions.head, expected, 
inputRow)
   }
 }
--- End diff --

Of course not. I'll add a new line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-17 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/2825#discussion_r19050137
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -28,7 +28,9 @@ import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.catalyst.rules._
 import org.apache.spark.sql.catalyst.types._
 
-object Optimizer extends RuleExecutor[LogicalPlan] {
+abstract class Optimizer extends RuleExecutor[LogicalPlan]
+
+object SparkOptimizer extends Optimizer {
--- End diff --

Thank you for your suggestion.
I agree that Catalyst should not tightly coupled with Spark.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59592260
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21870/consoleFull)
 for   PR 2825 at commit 
[`abbc53c`](https://github.com/apache/spark/commit/abbc53cc9b1e02d19c2f2200947bcb86bf33511c).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59594304
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21870/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59594302
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21870/consoleFull)
 for   PR 2825 at commit 
[`abbc53c`](https://github.com/apache/spark/commit/abbc53cc9b1e02d19c2f2200947bcb86bf33511c).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59379459
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21808/consoleFull)
 for   PR 2825 at commit 
[`9547a23`](https://github.com/apache/spark/commit/9547a23fd5ca0058d1044d19f6a96bdbb1e3b810).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-16 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2825#discussion_r18964661
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -28,7 +28,9 @@ import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.catalyst.rules._
 import org.apache.spark.sql.catalyst.types._
 
-object Optimizer extends RuleExecutor[LogicalPlan] {
+abstract class Optimizer extends RuleExecutor[LogicalPlan]
+
+object Optimizer extends Optimizer {
--- End diff --

It is confusing to use the same name for both. 
Will this mean an API change?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59388681
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21808/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59388676
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21808/consoleFull)
 for   PR 2825 at commit 
[`9547a23`](https://github.com/apache/spark/commit/9547a23fd5ca0058d1044d19f6a96bdbb1e3b810).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-16 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/2825#discussion_r18995386
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -28,7 +28,9 @@ import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.catalyst.rules._
 import org.apache.spark.sql.catalyst.types._
 
-object Optimizer extends RuleExecutor[LogicalPlan] {
+abstract class Optimizer extends RuleExecutor[LogicalPlan]
+
+object Optimizer extends Optimizer {
--- End diff --

Hi @srowen, thank you for your comment.
I'll rename the `object`.
And yes, this PR is a very little API change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59450880
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21827/consoleFull)
 for   PR 2825 at commit 
[`4d2e1bc`](https://github.com/apache/spark/commit/4d2e1bc8126c815cd393e1b527fa19b727b8d4c4).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59453270
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21827/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59453264
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21827/consoleFull)
 for   PR 2825 at commit 
[`4d2e1bc`](https://github.com/apache/spark/commit/4d2e1bc8126c815cd393e1b527fa19b727b8d4c4).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-16 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59456929
  
I think people want to customize something other than default, a better way 
is to create an new `Context` class which derives from `SQLContext`, just like 
the `HiveContext` does. In the other hand, since the `Catalyst` focus on the 
logical plan optimization, why not just keep adding / updating the rules 
(object)?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-16 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59457217
  
Sorry, I mean probably we don't want to mix the logical plan  physical 
plan optimization, the logical plan optimization should be generic, if we want 
to do some customized optimization for specific engine (e.g. MapReduce / 
Spark), we'd better to do that in physical plan, e.g. `SparkStrategies.scala`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3969][SQL] Optimizer should have a supe...

2014-10-16 Thread ueshin
Github user ueshin commented on the pull request:

https://github.com/apache/spark/pull/2825#issuecomment-59461447
  
Hi @chenghao-intel, thank you for your comment.
Yes, that's right. I don't want to mix the logical plan  physical plan 
optimization and I'll extend `SparkStrategies` if I need.
I want to add some `Expression`s for my projects and optimize logical plan 
with them, but right now we can't replace `Optimizer` because it is an 
`object`, so I want to add an extension point for `Optimizer`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org