[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-163365306 Should this be closed now that https://github.com/apache/spark/pull/9961 is merged? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-163367236 Yes, using the magic words: do you mind closing this PR? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user dilipbiswal closed the pull request at: https://github.com/apache/spark/pull/9844 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user dilipbiswal commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-163369667 @srowen closed. Sorry to have missed it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user dilipbiswal commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159499937 @cloud-fan Thanks a lot. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/9844#discussion_r45705657 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala --- @@ -217,5 +217,23 @@ class AnalysisSuite extends AnalysisTest { nullResult, udf4) // checkUDF(udf4, expected4) +} + + test("SPARK-11863 mixture of aliases and real columns in orderby clause - tpcds 19,55,71") { +val a = testRelation2.output(0) +val c = testRelation2.output(2) +val alias1 = a.as("a1") +val alias2 = c.as("a2") +val alias3 = count(a).as("a3") + +val plan = testRelation2. + groupBy('a, 'c) ('a.as("a1"), 'c.as("a2"), count('a).as("a3")). + orderBy('a1.asc, 'c.asc) --- End diff -- code style: we should put the `.` at the beginning of a line, not at the end. And also remove the space between `groupBy('a, 'c)` and `('a.as("a1"),...` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159191093 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46588/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159191091 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159191057 **[Test build #46588 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46588/consoleFull)** for PR 9844 at commit [`cbf14ff`](https://github.com/apache/spark/commit/cbf14ffa49c456193e56946ae30181d7b6571139). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159191926 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159191928 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46589/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user dilipbiswal commented on a diff in the pull request: https://github.com/apache/spark/pull/9844#discussion_r45715703 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala --- @@ -217,5 +217,23 @@ class AnalysisSuite extends AnalysisTest { nullResult, udf4) // checkUDF(udf4, expected4) +} + + test("SPARK-11863 mixture of aliases and real columns in orderby clause - tpcds 19,55,71") { +val a = testRelation2.output(0) +val c = testRelation2.output(2) +val alias1 = a.as("a1") +val alias2 = c.as("a2") +val alias3 = count(a).as("a3") + +val plan = testRelation2. + groupBy('a, 'c) ('a.as("a1"), 'c.as("a2"), count('a).as("a3")). + orderBy('a1.asc, 'c.asc) --- End diff -- @cloud-fan Will do. Wenchen, there are a few test failures. I am still looking at it. So i think our idea to NOT consider the already resolved attribute for pushdown decision is causing the issue. Here are the tests 1) SELECT count(*) FROM orderByData GROUP BY a ORDER BY count(*) In this case we want the sort attribute representing the count(*) to be replace by the group by alias. 2) SELECT a FROM orderByData GROUP BY a ORDER BY a, count(*), sum(b) In this case we want the count(*) to be pushed down to aggregate In both these case, we are skipping pushdown processing because its a resolved attribute. Given this wenchen, may i request you to look at the original fix. After learning more about different conditions, it seems like that may be a safer fix. Let me know what you think. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/9844#discussion_r45700102 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -610,9 +610,9 @@ class Analyzer( */ object ResolveAggregateFunctions extends Rule[LogicalPlan] { def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators { - case filter @ Filter(havingCondition, - aggregate @ Aggregate(grouping, originalAggExprs, child)) - if aggregate.resolved => + case filter@Filter(havingCondition, + aggregate@Aggregate(grouping, originalAggExprs, child)) +if aggregate.resolved => --- End diff -- keep the indent please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/9844#discussion_r45700402 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -661,7 +666,9 @@ class Analyzer( val evaluatedOrderings = resolvedAliasedOrdering.zip(sortOrder).map { case (evaluated, order) => val index = originalAggExprs.indexWhere { -case Alias(child, _) => child semanticEquals evaluated.child +case a@Alias(child, _) => + (child semanticEquals evaluated.child) || +a.exprId == evaluated.child.asInstanceOf[AttributeReference].exprId case other => other semanticEquals evaluated.child --- End diff -- We can put the `finalSortOrders` at the final place, after this `evaluatedOrderings`, it should be: ``` val sortOrdersMap = unresolvedSortOrders.map(new TreeNodeRef(_)).zip(evaluatedOrderings).toMap val finalSortOrders = sortOrders.map(s => sortOrdersMap.getOrElse(TreeNodeRef(s), s)) if (sortOrder == finalSortOrders) { .. ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user dilipbiswal commented on a diff in the pull request: https://github.com/apache/spark/pull/9844#discussion_r45701500 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -661,7 +666,9 @@ class Analyzer( val evaluatedOrderings = resolvedAliasedOrdering.zip(sortOrder).map { case (evaluated, order) => val index = originalAggExprs.indexWhere { -case Alias(child, _) => child semanticEquals evaluated.child +case a@Alias(child, _) => + (child semanticEquals evaluated.child) || +a.exprId == evaluated.child.asInstanceOf[AttributeReference].exprId case other => other semanticEquals evaluated.child --- End diff -- @cloud-fan **THANKS !!** Somehow i kept thinking that we need the full list of sort attributes for pushdown determination. After your comment , it makes sense as any resolved sort order attribute(s) must be resolved from the aggregate expressions and hence its ok to not considered. Right ? One other question , after our a change a lot of cases are now going through this codepath and here we add an extra projection above the sort. Should we add this only when we have added at least one pushdown attribute ? Or should we fix the tescases instead ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/9844#discussion_r45702258 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -661,7 +666,9 @@ class Analyzer( val evaluatedOrderings = resolvedAliasedOrdering.zip(sortOrder).map { case (evaluated, order) => val index = originalAggExprs.indexWhere { -case Alias(child, _) => child semanticEquals evaluated.child +case a@Alias(child, _) => + (child semanticEquals evaluated.child) || +a.exprId == evaluated.child.asInstanceOf[AttributeReference].exprId case other => other semanticEquals evaluated.child --- End diff -- I think our optimizer is smart enough to remove unnecessary `Project`, so we don't need to worry about it here :) you can change your test case if the plan doesn't match. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user dilipbiswal commented on a diff in the pull request: https://github.com/apache/spark/pull/9844#discussion_r45691681 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -661,7 +666,9 @@ class Analyzer( val evaluatedOrderings = resolvedAliasedOrdering.zip(sortOrder).map { case (evaluated, order) => val index = originalAggExprs.indexWhere { -case Alias(child, _) => child semanticEquals evaluated.child +case a@Alias(child, _) => + (child semanticEquals evaluated.child) || +a.exprId == evaluated.child.asInstanceOf[AttributeReference].exprId case other => other semanticEquals evaluated.child --- End diff -- @cloud-fan Thanks for your help. I have made changes based on your comments. There is one change i am a little concerned about and need your input. After computing the finalSortOrders, i am passing it to the pushdown computation logic where it tries to compare the semantic equality between sort attributes and aggregate expression attributes. For the already resolved attribute, the comparison is failing and it considers the attribute as pushdownable and things just go wrong :-). So i am comparing the expression id of the resolved alias against the aggregation expression's attribute reference. It does not seem clean though :-) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159136027 **[Test build #46580 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46580/consoleFull)** for PR 9844 at commit [`d319524`](https://github.com/apache/spark/commit/d319524be7fb8daedb72a23587182f8307afd812). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159139182 **[Test build #46580 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46580/consoleFull)** for PR 9844 at commit [`d319524`](https://github.com/apache/spark/commit/d319524be7fb8daedb72a23587182f8307afd812). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159139215 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46580/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159139214 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159186937 **[Test build #46588 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46588/consoleFull)** for PR 9844 at commit [`cbf14ff`](https://github.com/apache/spark/commit/cbf14ffa49c456193e56946ae30181d7b6571139). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159187032 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user dilipbiswal commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159188789 @cloud-fan can you please help trigger a retest ? Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159187034 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46587/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-159188818 retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user dilipbiswal commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158869241 Thanks a lot @cloud-fan. Actually i do remember trying to do something similar. So i had tried to filter on resolved and was trying to only pick un-resolved attributes. But after i had done the execute i had difficulty to stich together the two resolved attributes to restore the original sortorder. We need to keep the orginal order of the attribute, right ? So here is what i was trying .. 1) filter the sortorder attributes to only keep the unresolved ones. 2) copy them to aggregate expression and call execute 3) After this i wanted to form a sort order having both the already filtered attributes and the newly resolved ones. This is where i was having difficulty. 3) We then call checkAnalysis .. and rest of the logic follows. I am sure you will have some magic here :-) Can you share your thoughts.. Thanks a lot. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158870885 how about this: ``` val unresolvedSortOrders = sortOrders.filterNot(_.resolved) val resolvedSortOrders = ... // the original logic that copy to aggregate expression and call execute val sortOrdersMap = unresolvedSortOrders.map(TreeNodeRef(_)).zip(resolvedSortOrders).toMap val finalSortOrders = sortOrders.map(s => sortOrdersMap.getOrElse(TreeNodeRef(s), s)) ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user dilipbiswal commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158871599 @cloud-fan Wow.. thank you very much. I will try it. Thanks again, --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158846201 In [`ResolveAggregateFunctions`](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L633-L635), we pick up all `SortOrder`s and put them in aggregate list to resolve them. However, logically we should only pick unresolved `SortOrder`s as some of them may already be resolved at [`ResolveReferences`](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L431-L435). In your case, the `a` in `order by a, c2` can and only can be resolved in `ResolveReferences`, so we should skip the `a` in `ResolveAggregateFunctions`. Does it make sense to you? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user dilipbiswal commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158837039 @cloud-fan Thank you for the explanation as always. Trying to see if i understood your suggestion properly. Were you suggesting to add another case under ResolveSortReferences to deal with sort operator with Aggregation ? Or you were thinking to call resolveAndFindMissing from within ResolveAggregateFunctions passing the grandchild ? Since most of the stuff were happening in that function , i thought its easier to just refactor the evaluatedorderings logic. However you understand this stuff a lot better. So i would go by your suggestion. Please let me know. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158831881 This is really a good catch, thanks @dilipbiswal The problems is that, normal operator should be resolved based on its child, but `Sort` operator can be resolved based on its grandchild. So we have 3 rules that can resolve `Sort`: `ResolveReferences`, `ResolveSortReferences`(if grandchild is `Project`) and `ResolveAggregateFunctions`(if grandchild is `Aggregate`). For your example, `select c1 as a , c2 as b from tab group by c1, c2 order by a, c2`, we need to resolve `a` and `c2` for `Sort`. Firstly `a` will be resolved in `ResolveReferences` based on its child, and when we reach `ResolveAggregateFunctions`, we will try to resolve both `a` and `c2` based on its grandchild, but failed because `a` is not a legal aggregate expression. I think we can just fix the problem directly, i.e. only pick up unresolved `SortOrder`s and try to resolve it based on grandchild in `ResolveAggregateFunctions`. @dilipbiswal what do you think? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158351303 **[Test build #46405 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46405/consoleFull)** for PR 9844 at commit [`ef4274a`](https://github.com/apache/spark/commit/ef4274a06a5859e0f233baaab4044a6d1b0e6418). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158351431 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46405/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158351430 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158320803 **[Test build #46405 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46405/consoleFull)** for PR 9844 at commit [`ef4274a`](https://github.com/apache/spark/commit/ef4274a06a5859e0f233baaab4044a6d1b0e6418). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158255401 **[Test build #46383 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46383/consoleFull)** for PR 9844 at commit [`954f919`](https://github.com/apache/spark/commit/954f9194377b759fa8414c3a66fb7ee5c74640b7). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158254009 ok to test --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158205318 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
GitHub user dilipbiswal opened a pull request: https://github.com/apache/spark/pull/9844 [SPARK-11863][SQL][WIP] Unable to resolve order by if it contains mixture of aliases and real columns. Compute the evaluatedOrderings by replacing the Alias names referenced by Sort expression with the attributes in agregate expressions after checking the semantic equality. Example : select c1 as a , c2 as b from tab group by c1, c2 order by a, c2 You can merge this pull request into a Git repository by running: $ git pull https://github.com/dilipbiswal/spark spark-11863 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9844.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9844 commit 954f9194377b759fa8414c3a66fb7ee5c74640b7 Author: Dilip BiswalDate: 2015-11-04T05:23:11Z [SPARK-11863] Unable to resolve order by attributes if it contains mixture of aliases and real columns. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user dilipbiswal commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158205174 @cloud-fan Hi Wenchen, can you please look at this change and let me know your comments. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158271745 **[Test build #46383 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46383/consoleFull)** for PR 9844 at commit [`954f919`](https://github.com/apache/spark/commit/954f9194377b759fa8414c3a66fb7ee5c74640b7). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158271821 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11863][SQL][WIP] Unable to resolve orde...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9844#issuecomment-158271825 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46383/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org