[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-12-09 Thread markhamstra
Github user markhamstra commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-163365306
  
Should this be closed now that https://github.com/apache/spark/pull/9961 is 
merged?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-12-09 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-163367236
  
Yes, using the magic words: do you mind closing this PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-12-09 Thread dilipbiswal
Github user dilipbiswal closed the pull request at:

https://github.com/apache/spark/pull/9844


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-12-09 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-163369667
  
@srowen closed. Sorry to have missed it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-24 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159499937
  
@cloud-fan Thanks a lot.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9844#discussion_r45705657
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
 ---
@@ -217,5 +217,23 @@ class AnalysisSuite extends AnalysisTest {
   nullResult,
   udf4)
 // checkUDF(udf4, expected4)
+}
+
+  test("SPARK-11863 mixture of aliases and real columns in orderby clause 
- tpcds 19,55,71") {
+val a = testRelation2.output(0)
+val c = testRelation2.output(2)
+val alias1 = a.as("a1")
+val alias2 = c.as("a2")
+val alias3 = count(a).as("a3")
+
+val plan = testRelation2.
+  groupBy('a, 'c) ('a.as("a1"), 'c.as("a2"), count('a).as("a3")).
+  orderBy('a1.asc, 'c.asc)
--- End diff --

code style: we should put the `.` at the beginning of a line, not at the 
end. And also remove the space between `groupBy('a, 'c)` and `('a.as("a1"),...`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159191093
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46588/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159191091
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159191057
  
**[Test build #46588 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46588/consoleFull)**
 for PR 9844 at commit 
[`cbf14ff`](https://github.com/apache/spark/commit/cbf14ffa49c456193e56946ae30181d7b6571139).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159191926
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159191928
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46589/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-24 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9844#discussion_r45715703
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
 ---
@@ -217,5 +217,23 @@ class AnalysisSuite extends AnalysisTest {
   nullResult,
   udf4)
 // checkUDF(udf4, expected4)
+}
+
+  test("SPARK-11863 mixture of aliases and real columns in orderby clause 
- tpcds 19,55,71") {
+val a = testRelation2.output(0)
+val c = testRelation2.output(2)
+val alias1 = a.as("a1")
+val alias2 = c.as("a2")
+val alias3 = count(a).as("a3")
+
+val plan = testRelation2.
+  groupBy('a, 'c) ('a.as("a1"), 'c.as("a2"), count('a).as("a3")).
+  orderBy('a1.asc, 'c.asc)
--- End diff --

@cloud-fan Will do. Wenchen, there are a few test failures. I am still 
looking at it. So i think our idea to NOT consider the already resolved 
attribute for pushdown decision is causing the issue.

Here are the tests
1) SELECT count(*) FROM orderByData GROUP BY a ORDER BY count(*)
In this case we want the sort attribute representing the count(*) to be 
replace by the
group by alias.
2) SELECT a FROM orderByData GROUP BY a ORDER BY a, count(*), sum(b)
In this case we want the count(*) to be pushed down to aggregate

In both these case, we are skipping pushdown processing because its a 
resolved attribute.
Given this wenchen, may i request you to look at the original fix. After 
learning more about
different conditions, it seems like that may be a safer fix. Let me know 
what you think.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9844#discussion_r45700102
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -610,9 +610,9 @@ class Analyzer(
*/
   object ResolveAggregateFunctions extends Rule[LogicalPlan] {
 def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
-  case filter @ Filter(havingCondition,
- aggregate @ Aggregate(grouping, originalAggExprs, child))
-  if aggregate.resolved =>
+  case filter@Filter(havingCondition,
+  aggregate@Aggregate(grouping, originalAggExprs, child))
+if aggregate.resolved =>
--- End diff --

keep the indent please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9844#discussion_r45700402
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -661,7 +666,9 @@ class Analyzer(
   val evaluatedOrderings = 
resolvedAliasedOrdering.zip(sortOrder).map {
 case (evaluated, order) =>
   val index = originalAggExprs.indexWhere {
-case Alias(child, _) => child semanticEquals 
evaluated.child
+case a@Alias(child, _) =>
+  (child semanticEquals evaluated.child) ||
+a.exprId == 
evaluated.child.asInstanceOf[AttributeReference].exprId
 case other => other semanticEquals evaluated.child
--- End diff --

We can put the `finalSortOrders` at the final place, after this 
`evaluatedOrderings`, it should be:
```
val sortOrdersMap = unresolvedSortOrders.map(new 
TreeNodeRef(_)).zip(evaluatedOrderings).toMap
val finalSortOrders = sortOrders.map(s => 
sortOrdersMap.getOrElse(TreeNodeRef(s), s))

if (sortOrder == finalSortOrders) {
..
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9844#discussion_r45701500
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -661,7 +666,9 @@ class Analyzer(
   val evaluatedOrderings = 
resolvedAliasedOrdering.zip(sortOrder).map {
 case (evaluated, order) =>
   val index = originalAggExprs.indexWhere {
-case Alias(child, _) => child semanticEquals 
evaluated.child
+case a@Alias(child, _) =>
+  (child semanticEquals evaluated.child) ||
+a.exprId == 
evaluated.child.asInstanceOf[AttributeReference].exprId
 case other => other semanticEquals evaluated.child
--- End diff --

@cloud-fan **THANKS !!** Somehow i kept thinking that we need the full list 
of sort attributes for pushdown determination. After your comment , it makes 
sense as any resolved sort order attribute(s) must be resolved from the 
aggregate expressions and hence its ok to not considered. Right ?

One other question , after our a change a lot of cases are now going 
through this codepath and here we add an extra projection above the sort. 
Should we add this only when we have added at least one pushdown attribute ? Or 
should we fix the tescases instead ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9844#discussion_r45702258
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -661,7 +666,9 @@ class Analyzer(
   val evaluatedOrderings = 
resolvedAliasedOrdering.zip(sortOrder).map {
 case (evaluated, order) =>
   val index = originalAggExprs.indexWhere {
-case Alias(child, _) => child semanticEquals 
evaluated.child
+case a@Alias(child, _) =>
+  (child semanticEquals evaluated.child) ||
+a.exprId == 
evaluated.child.asInstanceOf[AttributeReference].exprId
 case other => other semanticEquals evaluated.child
--- End diff --

I think our optimizer is smart enough to remove unnecessary `Project`, so 
we don't need to worry about it here :)
you can change your test case if the plan doesn't match.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9844#discussion_r45691681
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -661,7 +666,9 @@ class Analyzer(
   val evaluatedOrderings = 
resolvedAliasedOrdering.zip(sortOrder).map {
 case (evaluated, order) =>
   val index = originalAggExprs.indexWhere {
-case Alias(child, _) => child semanticEquals 
evaluated.child
+case a@Alias(child, _) =>
+  (child semanticEquals evaluated.child) ||
+a.exprId == 
evaluated.child.asInstanceOf[AttributeReference].exprId
 case other => other semanticEquals evaluated.child
--- End diff --

@cloud-fan Thanks for your help. I have made changes based on your 
comments. There is one change i am a little concerned about and need your input.
 After computing the finalSortOrders, i am passing it to the pushdown 
computation logic where it tries to compare the semantic equality between sort 
attributes and aggregate expression attributes. For the already resolved 
attribute, the comparison is failing and it considers the attribute as 
pushdownable and things just go wrong :-). So i am comparing the expression id 
of the resolved alias against the aggregation expression's attribute reference. 
It does not seem clean though :-)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159136027
  
**[Test build #46580 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46580/consoleFull)**
 for PR 9844 at commit 
[`d319524`](https://github.com/apache/spark/commit/d319524be7fb8daedb72a23587182f8307afd812).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159139182
  
**[Test build #46580 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46580/consoleFull)**
 for PR 9844 at commit 
[`d319524`](https://github.com/apache/spark/commit/d319524be7fb8daedb72a23587182f8307afd812).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159139215
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46580/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159139214
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159186937
  
**[Test build #46588 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46588/consoleFull)**
 for PR 9844 at commit 
[`cbf14ff`](https://github.com/apache/spark/commit/cbf14ffa49c456193e56946ae30181d7b6571139).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159187032
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159188789
  
@cloud-fan can you please help trigger a retest ? Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159187034
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46587/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-23 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-159188818
  
retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-22 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158869241
  
Thanks a lot @cloud-fan. Actually i do remember trying to do something 
similar. So i had tried to filter on resolved and was trying to only pick 
un-resolved attributes. But after i had done the execute  i had difficulty to 
stich together the two resolved attributes to restore the original sortorder. 
We need to keep the orginal order of the attribute, right ? So here is what i 
was trying ..
1) filter the sortorder attributes to only keep the unresolved ones.
2) copy them to aggregate expression and call execute
3) After this i wanted to form a sort order having both the already 
filtered attributes and the newly resolved ones. This is where i was having 
difficulty.
3) We then call checkAnalysis .. and rest of the logic follows.

I am sure you will have some magic here :-) Can you share your thoughts.. 
Thanks a lot.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-22 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158870885
  
how about this:
```
val unresolvedSortOrders = sortOrders.filterNot(_.resolved)
val resolvedSortOrders =  ... // the original logic that copy to aggregate 
expression and call execute
val sortOrdersMap = 
unresolvedSortOrders.map(TreeNodeRef(_)).zip(resolvedSortOrders).toMap
val finalSortOrders = sortOrders.map(s => 
sortOrdersMap.getOrElse(TreeNodeRef(s), s))
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-22 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158871599
  
@cloud-fan Wow.. thank you very much. I will try it. Thanks again,


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-22 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158846201
  
In 
[`ResolveAggregateFunctions`](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L633-L635),
 we pick up all `SortOrder`s and put them in aggregate list to resolve them. 
However, logically we should only pick unresolved `SortOrder`s as some of them 
may already be resolved at 
[`ResolveReferences`](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L431-L435).

In your case, the `a` in `order by a, c2` can and only can be resolved in 
`ResolveReferences`, so we should skip the `a` in `ResolveAggregateFunctions`.

Does it make sense to you?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-22 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158837039
  
@cloud-fan Thank you for the explanation as always. Trying to see if i 
understood your suggestion properly. Were you suggesting to add another case 
under ResolveSortReferences to deal with sort operator with   Aggregation ? Or 
you were thinking to call resolveAndFindMissing from within 
ResolveAggregateFunctions passing the grandchild ? Since most of the stuff were 
happening in that function , i thought its easier to just refactor the 
evaluatedorderings logic. However you understand this stuff a lot better. So i 
would go by your suggestion. Please let me know.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-22 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158831881
  
This is really a good catch, thanks @dilipbiswal 

The problems is that, normal operator should be resolved based on its 
child, but `Sort` operator can be resolved based on its grandchild. So we have 
3 rules that can resolve `Sort`: `ResolveReferences`, 
`ResolveSortReferences`(if grandchild is `Project`) and 
`ResolveAggregateFunctions`(if grandchild is `Aggregate`).

For your example, `select c1 as a , c2 as b from tab group by c1, c2 order 
by a, c2`, we need to resolve `a` and `c2` for `Sort`. Firstly `a` will be 
resolved in `ResolveReferences` based on its child, and when we reach 
`ResolveAggregateFunctions`, we will try to resolve both `a` and `c2` based on 
its grandchild, but failed because `a` is not a legal aggregate expression.

I think we can just fix the problem directly, i.e. only pick up unresolved 
`SortOrder`s and try to resolve it based on grandchild in 
`ResolveAggregateFunctions`. @dilipbiswal what do you think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158351303
  
**[Test build #46405 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46405/consoleFull)**
 for PR 9844 at commit 
[`ef4274a`](https://github.com/apache/spark/commit/ef4274a06a5859e0f233baaab4044a6d1b0e6418).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158351431
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46405/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158351430
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158320803
  
**[Test build #46405 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46405/consoleFull)**
 for PR 9844 at commit 
[`ef4274a`](https://github.com/apache/spark/commit/ef4274a06a5859e0f233baaab4044a6d1b0e6418).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158255401
  
**[Test build #46383 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46383/consoleFull)**
 for PR 9844 at commit 
[`954f919`](https://github.com/apache/spark/commit/954f9194377b759fa8414c3a66fb7ee5c74640b7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-19 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158254009
  
ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158205318
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-19 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/9844

[SPARK-11863][SQL][WIP] Unable to resolve order by if it contains mixture 
of aliases and real columns.

Compute the evaluatedOrderings by replacing the Alias names referenced by 
Sort
expression with the attributes in agregate expressions after checking the 
semantic equality.

Example : select c1 as a , c2 as b from tab group by c1, c2 order by a, c2


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark spark-11863

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9844.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9844


commit 954f9194377b759fa8414c3a66fb7ee5c74640b7
Author: Dilip Biswal 
Date:   2015-11-04T05:23:11Z

[SPARK-11863] Unable to resolve order by attributes if it contains mixture 
of aliases and real columns.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-19 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158205174
  
@cloud-fan Hi Wenchen, can you please look at this change and let me know 
your comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158271745
  
**[Test build #46383 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46383/consoleFull)**
 for PR 9844 at commit 
[`954f919`](https://github.com/apache/spark/commit/954f9194377b759fa8414c3a66fb7ee5c74640b7).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158271821
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...

2015-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9844#issuecomment-158271825
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46383/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org