[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-16 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-157208292
  
Can we close this now that #9216 is merged?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-16 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-157239704
  
Sure. Close it. Thank you for your time! 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-16 Thread gatorsmile
Github user gatorsmile closed the pull request at:

https://github.com/apache/spark/pull/9385


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-14 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-156730810
  
@marmbrus CachedTableSuite failed due to the same reason. We did not clean 
up the subquery names. Thus, it is unable to give a correct result when 
deciding if Exchange is needed. 

I did the fix by using `semanticEquals`. Please check if the changes 
appropriate. https://github.com/apache/spark/pull/9216

Now, all the test cases passed. Thanks. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-13 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-156612181
  
Hi, @marmbrus 

Originally, I thought quantifiers are part of identifiers, like schema name 
in traditional RDBMS. Based on your explanation, this is not true. 

I did a code change. Please check if the latest changes make sense. 
`semanticEquals` is used. Now, all the test cases passed. 
https://github.com/gatorsmile/spark/commit/8e72b17561e4cc1a6cce86fc70f6ed968ebf5b38

Just did a merge to the latest master. Thank you for your time.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-12 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-156238962
  
I'm sorry, I don't see where this `expand()` function you are talking about 
is, or why it should not use `semanticEquals`.  The whole point of 
`semanticEquals` is "are these two attributes referring to the same data, and 
in many cases this is actually what you want instead of structural equality 
(structural equality matters, for example, when you are checking to see if a 
tree has reached fixed point.  otherwise we could not use rules to make changes 
to Attributes that are cosmetic in nature).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-10 Thread gatorsmile
Github user gatorsmile closed the pull request at:

https://github.com/apache/spark/pull/9385


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-10 Thread gatorsmile
GitHub user gatorsmile reopened a pull request:

https://github.com/apache/spark/pull/9385

[SPARK-11433] [SQL] Cleanup the subquery name after eliminating subquery

This fix is to remove the subquery name in qualifiers after eliminating 
subquery.  

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark eliminateSubQ

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9385.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9385


commit db69ccf2c60679c4ca111a618190258d5b5cef62
Author: Xiao Li 
Date:   2015-10-30T22:12:19Z

cleanup the subquery name after eliminating subquery

commit a26763d758bc58dacf81be171428ede215775532
Author: gatorsmile 
Date:   2015-11-04T00:04:31Z

flatmap->map




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-10 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-155605985
  
@marmbrus 

After rechecking the root reason why Expand failed, I still think we should 
cleanup the subquery name after subquery elimination. My current fix needs a 
change to enable a deeper clean of subquery. 

Let me explain what happened in Expand. 

Before subquery elimination, the subquery name "mytable" is shown in all 
the three upper levels (Aggregate, Expand and Project). 
```scala
Aggregate [(a#2 + b#3)#7,b#3,grouping__id#6], [(a#2 + b#3)#7 AS 
_c0#4,b#3,sum(cast((a#2 - b#3) as bigint)) AS _c2#5L]
 Expand [0,1,3], [(a#2 + b#3)#7,b#3], grouping__id#6
  Project [a#2,b#3,(a#2 + b#3) AS (a#2 + b#3)#7]
   Subquery mytable
Project [_1#0 AS a#2,_2#1 AS b#3]
 LocalRelation [_1#0,_2#1], [[1,2],[2,4]]
```

After subquery elimination, the subquery name "mytable" is not removed in 
these three levels. 

 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-10 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-15565
  
Hi, @marmbrus 

After digging the root reason why Expand cases failed, I found we  still 
need a deeper clean of subquery after elimination. 

Let me use the following example to explain what happened in Expand. This 
query works well if we do not compare the qualifiers when comparing two 
AttributeReferences. I think this is a bug if merging 
https://github.com/apache/spark/pull/9216, right? 

```scala
val sqlDF = sql("select a, b, sum(a) from mytable group by a, b with 
rollup").explain(true)
```

Before subquery elimination, the subquery name "mytable" is shown in all 
the two upper layers (Aggregate and Expand). 
```scala
Aggregate [a#2,b#3,grouping__id#5], [a#2,b#3,sum(cast(a#2 as bigint)) AS 
_c2#4L]
 Expand [0,1,3], [a#2,b#3], grouping__id#5
  Subquery mytable
   Project [_1#0 AS a#2,_2#1 AS b#3]
LocalRelation [_1#0,_2#1], [[1,2],[2,4]]
```

After subquery elimination, the subquery name "mytable" is not removed in 
these two upper layers. 
```scala
Aggregate [a#2,b#3,grouping__id#5], [a#2,b#3,sum(cast(a#2 as bigint)) AS 
_c2#4L]
 Expand [0,1,3], [a#2,b#3], grouping__id#5
  Project [_1#0 AS a#2,_2#1 AS b#3]
   LocalRelation [_1#0,_2#1], [[1,2],[2,4]]
```

In SparkStrategies, we create an array of Projections for the child 
projection of Expand.

```scala
  case e @ logical.Expand(_, _, _, child) =>
execution.Expand(e.projections, e.output, planLater(child)) :: Nil
```

`e.projections` calls the function `expand()`. Inside the function 
`expand()`, I do not think we should use `semanticEquals` there. 

Let me post the incorrect physical plan

```scala
TungstenAggregate(key=[a#2,b#3,grouping__id#12], functions=[(sum(cast(a#2 
as bigint)),mode=Final,isDistinct=false)], output=[a#2,b#3,_c2#11L])
 TungstenExchange hashpartitioning(a#2,b#3,grouping__id#12,5)
  TungstenAggregate(key=[a#2,b#3,grouping__id#12], functions=[(sum(cast(a#2 
as bigint)),mode=Partial,isDistinct=false)], 
output=[a#2,b#3,grouping__id#12,currentSum#15L])
   Expand [List(a#2, b#3, 0),List(a#2, b#3, 1),List(a#2, b#3, 3)], 
[a#2,b#3,grouping__id#12]
LocalTableScan [a#2,b#3], [[1,2],[2,4]]
```

For you convenience, below is the correct one:

```scala
TungstenAggregate(key=[a#2,b#3,grouping__id#12], functions=[(sum(cast(a#2 
as bigint)),mode=Final,isDistinct=false)], output=[a#2,b#3,_c2#11L])
 TungstenExchange hashpartitioning(a#2,b#3,grouping__id#12,5)
  TungstenAggregate(key=[a#2,b#3,grouping__id#12], functions=[(sum(cast(a#2 
as bigint)),mode=Partial,isDistinct=false)], 
output=[a#2,b#3,grouping__id#12,currentSum#15L])
   Expand [List(null, null, 0),List(a#2, null, 1),List(a#2, b#3, 3)], 
[a#2,b#3,grouping__id#12]
LocalTableScan [a#2,b#3], [[1,2],[2,4]]
```

My current fix does not fix this issue yet. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-06 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-154594596
  
Thanks for your contribution, but I'm tempted to not make this change 
unless there is actually a bug.  We are eliminating the subqueries because they 
will impact optimization and planning.  However, keeping the qualifiers around 
could actually be useful if we want to give better error messages.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-06 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-154650945
  
@marmbrus Thanks! 

I will try to change equals to semanticEquals in the pull request 
https://github.com/apache/spark/pull/9216. Then, you can decide if this is a 
right solution. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-06 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-154609690
  
@marmbrus I already hit this issue when resolving 
https://issues.apache.org/jira/browse/SPARK-8658. That means, when comparing 
two AttributeReferences, we should not compare their qualifiers. That looks a 
strange fix, right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-06 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-154640515
  
`Expression` instances have two different kinds of equality:
 - `equals` full equality, i.e. capitalization, qualifiers, etc.  This is 
important for example to know if a tree transformation has reached fixed point.
 - `semanticEquals` equivalent in execution (i.e. will return the same 
result)

If you ran into problems adding qualifiers to `equals` then I think it is 
likely that that code should actually be using `semanticEquals`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/9385#discussion_r43817697
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1019,7 +1019,16 @@ class Analyzer(
  * scoping information for attributes and can be removed once analysis is 
complete.
  */
 object EliminateSubQueries extends Rule[LogicalPlan] {
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {
--- End diff --

Thank you for your comments! If we doing transformUp, subquery will be 
removed at first. Then, Project(projectList, child: Subquery) will not be 
applicable in this case. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153536438
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153536456
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153529973
  
@cloud-fan @dbtsai , Jenkins did not start the testing. Could you let 
Jenkins to test it? 

Thank you! 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/9385#discussion_r43826123
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1019,7 +1019,16 @@ class Analyzer(
  * scoping information for attributes and can be removed once analysis is 
complete.
  */
 object EliminateSubQueries extends Rule[LogicalPlan] {
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {
+case Project(projectList, child: Subquery) => {
+  Project(
+projectList.flatMap {
--- End diff --

Thank you! I did the change based on your suggestion. : )


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153536898
  
**[Test build #44981 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44981/consoleFull)**
 for PR 9385 at commit 
[`a26763d`](https://github.com/apache/spark/commit/a26763d758bc58dacf81be171428ede215775532).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread dbtsai
Github user dbtsai commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153536164
  
I think there is some issue in Jenkins. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread dbtsai
Github user dbtsai commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153536227
  
Jenkins, add to whitelist


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153554640
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153554643
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44981/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153554468
  
**[Test build #44981 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44981/consoleFull)**
 for PR 9385 at commit 
[`a26763d`](https://github.com/apache/spark/commit/a26763d758bc58dacf81be171428ede215775532).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153620226
  
@dbtsai Thank you! 

Please let me know if you need any extra code change. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-02 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/9385#discussion_r43608456
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1019,7 +1019,16 @@ class Analyzer(
  * scoping information for attributes and can be removed once analysis is 
complete.
  */
 object EliminateSubQueries extends Rule[LogicalPlan] {
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {
--- End diff --

Any particular reason for switching to transformDown?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-02 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/9385#discussion_r43608157
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1019,7 +1019,16 @@ class Analyzer(
  * scoping information for attributes and can be removed once analysis is 
complete.
  */
 object EliminateSubQueries extends Rule[LogicalPlan] {
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {
+case Project(projectList, child: Subquery) => {
+  Project(
+projectList.flatMap {
--- End diff --

Could we simplify this to a map and remove the :: Nil we have in the two 
sub cases? since it seems we are always returning a single element list for 
every case so it 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-10-31 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-152699107
  
Logically when we remove `Subquery`, we should remove related `qualifiers`, 
however, can you construct a case that hit problems because of not removing 
`qualifiers`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-10-31 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-152708351
  
So far, I just observed this strange ghosting values when I read the 
optimized logical tree, but my query did not trigger any issue. 

Based on my understanding, usage of qualifiers is still limited in the 
current code base. It could be a potential issue when we support more complex 
SQL syntax/functions. Thus, I submitted this pull request to resolve this 
issue. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-10-30 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/9385#discussion_r43559826
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1019,7 +1019,16 @@ class Analyzer(
  * scoping information for attributes and can be removed once analysis is 
complete.
  */
 object EliminateSubQueries extends Rule[LogicalPlan] {
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {
+case Project(projectList, child: Subquery) => {
+  Project(
+projectList.flatMap {
+  case ar: AttributeReference if 
ar.qualifiers.contains(child.alias) =>
--- End diff --

Should I use NamedExpression to replace AttributeReference? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-10-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-152664745
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-10-30 Thread dbtsai
Github user dbtsai commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-152669562
  
Jenkins, okay to test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-10-30 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/9385

[SPARK-11433] [SQL] Cleanup the subquery name after eliminating subquery

This fix is to remove the subquery name in qualifiers after eliminating 
subquery.  

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark eliminateSubQ

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9385.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9385


commit db69ccf2c60679c4ca111a618190258d5b5cef62
Author: Xiao Li 
Date:   2015-10-30T22:12:19Z

cleanup the subquery name after eliminating subquery




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org