[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18779


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131432438
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
@@ -2023,4 +2023,11 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
       assert(df1.join(df2, $"t1.i" === $"t2.i").cache().count() == 1)
     }
   }
+
+  test("order-by ordinal.") {
+    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
+    checkAnswer(
+      df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
--- End diff --

The same here. 





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131432337
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
@@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
     }
     assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
   }
+
+  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
+    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
+    checkAnswer(
+      df.groupBy(lit(3), lit(4)).agg(lit(6), lit(7), sum("b")),
+      Seq(Row(3, 4, 6, 7, 9)))
+    checkAnswer(
+      df.groupBy(lit(3), lit(4)).agg(lit(6), 'b, sum("b")),
+      Seq(Row(3, 4, 6, 1, 3), Row(3, 4, 6, 2, 6)))
+
+    df.createOrReplaceTempView("data")
+    checkAnswer(
+      spark.sql("select 3, 4, sum(b) from data group by 1, 2"),
--- End diff --

Nit: please use upper case for SQL keywords.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131432378
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
@@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
     }
     assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
   }
+
+  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
+    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
+    checkAnswer(
+      df.groupBy(lit(3), lit(4)).agg(lit(6), lit(7), sum("b")),
+      Seq(Row(3, 4, 6, 7, 9)))
+    checkAnswer(
+      df.groupBy(lit(3), lit(4)).agg(lit(6), 'b, sum("b")),
+      Seq(Row(3, 4, 6, 1, 3), Row(3, 4, 6, 2, 6)))
+
+    df.createOrReplaceTempView("data")
+    checkAnswer(
+      spark.sql("select 3, 4, sum(b) from data group by 1, 2"),
+      Seq(Row(3, 4, 9)))
+    checkAnswer(
+      spark.sql("select 3 as c, 4 as d, sum(b) from data group by c, d"),
--- End diff --

The same here.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131432191
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
@@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
     }
     assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
   }
+
+  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
+    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
+    checkAnswer(
+      df.groupBy(lit(3), lit(4)).agg(lit(6), lit(7), sum("b")),
+      Seq(Row(3, 4, 6, 7, 9)))
+    checkAnswer(
+      df.groupBy(lit(3), lit(4)).agg(lit(6), 'b, sum("b")),
+      Seq(Row(3, 4, 6, 1, 3), Row(3, 4, 6, 2, 6)))
+
+    df.createOrReplaceTempView("data")
--- End diff --

One more comment: you need to use `withTempView`.
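A sketch of the suggested pattern, assuming the suite mixes in Spark's `SQLTestUtils` trait (which provides `withTempView`); the query shown is illustrative, taken from the test under review:

```scala
// Sketch only: withTempView drops the "data" view even if an assertion fails,
// so the view does not leak into other tests in the suite.
withTempView("data") {
  df.createOrReplaceTempView("data")
  checkAnswer(
    spark.sql("SELECT 3, 4, SUM(b) FROM data GROUP BY 1, 2"),
    Seq(Row(3, 4, 9)))
}
```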





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131361398
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
@@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
     }
     assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
   }
+
+  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
+    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
+    checkAnswer(
+      df.groupBy(lit(3), lit(4)).agg(lit(6), lit(7), sum("b")),
+      Seq(Row(3, 4, 6, 7, 9)))
+    checkAnswer(
+      df.groupBy(lit(3), lit(4)).agg(lit(6), 'b, sum("b")),
+      Seq(Row(3, 4, 6, 1, 3), Row(3, 4, 6, 2, 6)))
+
+    df.createOrReplaceTempView("data")
+    checkAnswer(
+      spark.sql("select 3, 4, sum(b) from data group by 1, 2"),
+      Seq(Row(3, 4, 9)))
+    checkAnswer(
+      spark.sql("select 3 as c, 4 as d, sum(b) from data group by c, d"),
+      Seq(Row(3, 4, 9)))
--- End diff --

I've verified that the above tests will fail without this fix.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131360956
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
@@ -2023,4 +2023,11 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
       assert(df1.join(df2, $"t1.i" === $"t2.i").cache().count() == 1)
     }
   }
+
+  test("order-by ordinal.") {
+    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
+    checkAnswer(
+      df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
+      Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
--- End diff --

Hmm, maybe we can still keep this test in case someone changes the Sort/Project relationship in the future.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131360533
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
@@ -2023,4 +2023,11 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
       assert(df1.join(df2, $"t1.i" === $"t2.i").cache().count() == 1)
     }
   }
+
+  test("order-by ordinal.") {
+    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
+    checkAnswer(
+      df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
+      Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
--- End diff --

OK, I see. When we write a query like `df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3))`, the query plan looks like:

Sort [7#22 ASC NULLS FIRST, a#5 ASC NULLS FIRST, b#6 ASC NULLS FIRST], true
+- Project [7 AS 7#22, a#5, b#6]
   +- Project [_1#2 AS a#5, _2#3 AS b#6]
      +- LocalRelation [_1#2, _2#3]

We have a `Project` below the `Sort`. The ordinal `1` is replaced with the attribute `7#22`, so we won't get an int literal 7 here. That is why it passes.

Can you write a test for ordinal order-by that shows different behavior? If not, I think ordinal order-by is safe from this bug and we don't need to add this test.
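The failure mode under discussion, a literal produced by one resolution pass being re-read as an ordinal by a later pass, can be illustrated without Spark. This is only a sketch: `Expr`, `IntLit`, `Col`, and `substituteOrdinals` are hypothetical names standing in for Spark's analyzer rule, and the idempotency guard is what `resolveOperators` provides over a bare `transform`.

```scala
// Sketch (no Spark) of the SPARK-21580 failure mode: an ordinal-substitution
// rule that is correct when applied exactly once, but corrupts the plan if
// re-applied, because the literals it produces look like ordinals again.
sealed trait Expr
case class IntLit(v: Int) extends Expr    // a literal, e.g. SELECT 3
case class Col(name: String) extends Expr // a resolved column

// The select list that GROUP BY ordinals refer to: SELECT 3, 4, b ...
val selectList: Vector[Expr] = Vector(IntLit(3), IntLit(4), Col("b"))

// Treat every in-range IntLit in the GROUP BY clause as an ordinal.
def substituteOrdinals(groupBy: List[Expr]): List[Expr] = groupBy.map {
  case IntLit(i) if i >= 1 && i <= selectList.length => selectList(i - 1)
  case other => other
}

val once = substituteOrdinals(List(IntLit(1), IntLit(2)))
// Correct: ordinals 1 and 2 resolve to the select-list items 3 and 4.
val twice = substituteOrdinals(once)
// Wrong: the literal 3 produced by the first pass is re-read as ordinal 3
// and becomes Col("b"), which is the bug this PR guards against.
```

Running the rule once yields `List(IntLit(3), IntLit(4))`; running it again turns the first element into `Col("b")`, which is why the substitution must only fire on not-yet-resolved operators.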







[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131353213
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
@@ -2023,4 +2023,11 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
       assert(df1.join(df2, $"t1.i" === $"t2.i").cache().count() == 1)
     }
   }
+
+  test("order-by ordinal.") {
+    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
+    checkAnswer(
+      df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
+      Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
--- End diff --

I've run this test. Even without the `transform` -> `resolveOperators` change, it still works. Can you check it again?





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131351217
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/order-by-ordinal.sql ---
@@ -34,3 +34,8 @@ set spark.sql.orderByOrdinal=false;
 -- 0 is now a valid literal
 select * from data order by 0;
 select * from data sort by 0;
+
+select 4 as k, 5, b from data order by k, 2, 3;
+
+set spark.sql.orderByOrdinal=true;
+select 4 as k, 5, b from data order by k, 2, 3;
--- End diff --

It is not necessary; I will remove it. Thanks.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131349806
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/order-by-ordinal.sql ---
@@ -34,3 +34,8 @@ set spark.sql.orderByOrdinal=false;
 -- 0 is now a valid literal
 select * from data order by 0;
 select * from data sort by 0;
+
+select 4 as k, 5, b from data order by k, 2, 3;
+
+set spark.sql.orderByOrdinal=true;
+select 4 as k, 5, b from data order by k, 2, 3;
--- End diff --

Do we still need to keep this change?





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131349769
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,14 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
--- End diff --

Why do we need a change like this?





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131334204
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,18 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 3, 4, sum(b) from data group by 1, 2;
+select 3 as c, 4 as d, sum(b) from data group by c, d;
--- End diff --

Oh, I see, thanks. They are not necessary; I will remove them. Also, these test cases already exist in `DataFrameAggregateSuite`.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131330713
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,18 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 3, 4, sum(b) from data group by 1, 2;
+select 3 as c, 4 as d, sum(b) from data group by c, d;
--- End diff --

Because I ran them without the `transform` -> `resolveOperators` change, and there was no error. Do you get a different result?





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131329917
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,18 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 3, 4, sum(b) from data group by 1, 2;
+select 3 as c, 4 as d, sum(b) from data group by c, d;
--- End diff --

You mean these test cases are not necessary?





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131329942
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
@@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
     }
     assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
   }
+
+  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
--- End diff --

OK, thanks.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131328405
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
@@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
     }
     assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
   }
+
+  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
--- End diff --

Please also add an order-by test, maybe in `DataFrameSuite`.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131328048
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,18 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 3, 4, sum(b) from data group by 1, 2;
+select 3 as c, 4 as d, sum(b) from data group by c, d;
--- End diff --

Do we need to add those query tests? They actually have no effect in testing this bug.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-04 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131323389
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+select 3, 4, sum(b) from data group by 1, 2;
--- End diff --

OK, I will do that.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131311220
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+select 3, 4, sum(b) from data group by 1, 2;
--- End diff --

We can move those test queries to a test suite like `DataFrameAggregateSuite`.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131311047
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+select 3, 4, sum(b) from data group by 1, 2;
--- End diff --

Only methods like `Dataset.show` cause re-entrance of already-analyzed plans; `collect` won't. I guess the queries here are evaluated with `collect`.

scala> sql("select 4, b, sum(b) from data group by 1, 2").show
org.apache.spark.sql.AnalysisException: GROUP BY position 4 is not in select list (valid range is [1, 3]); line 1 pos 7

scala> sql("select 4, b, sum(b) from data group by 1, 2").collect
res2: Array[org.apache.spark.sql.Row] = Array([4,3,3], [4,4,4], [4,2,2])








[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131309985
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+select 3, 4, sum(b) from data group by 1, 2;
--- End diff --

Can those test this bug? Whether I use `transform` or `resolveOperators`, they generate the same results.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131309398
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
@@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+select 3, 4, sum(b) from data group by 1, 2;
--- End diff --

Can those tests catch this bug? No matter whether I use `transform` or 
`resolveOperators`, they generate the same results.




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131309163
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql 
---
@@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 
having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
--- End diff --

OK, thanks




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131309076
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql 
---
@@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 
having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+select 3, 4, sum(b) from data group by 1, 2;
--- End diff --

Also, it fixes this query: `select 3 as c, 4 as d, sum(b) from data group by c, d;`





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131308502
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql 
---
@@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 
having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
--- End diff --

Could you move this line to just before line 68?




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131308421
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql 
---
@@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 
having a > 0;
 -- mixed cases: group-by ordinals and aliases
 select a, a AS k, count(b) from data group by k, 1;
 
--- turn of group by ordinal
+-- turn off group by ordinal
 set spark.sql.groupByOrdinal=false;
 
 -- can now group by negative literal
 select sum(b) from data group by -1;
+
+-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
+select 4, b from data group by 1, 2;
+
+set spark.sql.groupByOrdinal=true;
+
+select 4, b from data group by 1, 2;
+
+select 3, 4, sum(b) from data group by 1, 2;
--- End diff --

Other than this line, does this PR fix any other queries?




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131305760
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala
 ---
@@ -1,54 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, 
SortOrder}
-import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, 
LogicalPlan, Sort}
-import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
-import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.IntegerType
-
-/**
- * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal 
expression.
- */
-class SubstituteUnresolvedOrdinals(conf: SQLConf) extends 
Rule[LogicalPlan] {
-  private def isIntLiteral(e: Expression) = e match {
-case Literal(_, IntegerType) => true
-case _ => false
-  }
-
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
--- End diff --

Aha, sorry, I made a mistake here in inspecting the Analyzer...

Yeah, the whole bug is due to re-entrance of an analyzed plan. Currently we 
can solve it by using `resolveOperators`.
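To make the re-entrance problem concrete, here is a toy sketch, not the real Catalyst API and with purely illustrative names, of why a `transform`-style rule misbehaves when an already-analyzed plan is fed back through the analyzer, while a `resolveOperators`-style rule does not:

```scala
// Toy model of the re-entrance problem. In the real bug, after analysis the
// GROUP BY ordinals 1, 2 have been replaced by the select-list expressions,
// which for `select 3, 4, sum(b) ...` are plain integer literals. A rule that
// blindly rewrites int literals back into ordinals then corrupts the plan on
// a second analyzer pass.
case class ToyAggregate(groupingExprs: Seq[Any], analyzed: Boolean)

// `transform`-style: fires on every node, every time the rule runs, so a
// resolved integer literal is mistaken for a group-by ordinal again.
def substituteAlways(agg: ToyAggregate): ToyAggregate =
  agg.copy(groupingExprs = agg.groupingExprs.map {
    case n: Int => s"ordinal($n)"
    case e      => e
  })

// `resolveOperators`-style: skips operators that are already analyzed, so the
// substitution happens at most once per operator.
def substituteOnce(agg: ToyAggregate): ToyAggregate =
  if (agg.analyzed) agg else substituteAlways(agg).copy(analyzed = true)
```

The point of the sketch is only the guard in `substituteOnce`: re-running it on an analyzed plan is a no-op, whereas `substituteAlways` rewrites literals unconditionally.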




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131300093
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala
 ---
@@ -1,54 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, 
SortOrder}
-import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, 
LogicalPlan, Sort}
-import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
-import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.IntegerType
-
-/**
- * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal 
expression.
- */
-class SubstituteUnresolvedOrdinals(conf: SQLConf) extends 
Rule[LogicalPlan] {
-  private def isIntLiteral(e: Expression) = e match {
-case Literal(_, IntegerType) => true
-case _ => false
-  }
-
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
--- End diff --

 Run `SubstituteUnresolvedOrdinals` with `fixedPoint`.




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131299271
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala
 ---
@@ -1,54 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, 
SortOrder}
-import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, 
LogicalPlan, Sort}
-import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
-import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.IntegerType
-
-/**
- * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal 
expression.
- */
-class SubstituteUnresolvedOrdinals(conf: SQLConf) extends 
Rule[LogicalPlan] {
-  private def isIntLiteral(e: Expression) = e match {
-case Literal(_, IntegerType) => true
-case _ => false
-  }
-
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
--- End diff --

@10110346 May I ask which one you run `SubstituteUnresolvedOrdinals` with: 
`Once` or `fixedPoint`?




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131298714
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala
 ---
@@ -1,54 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, 
SortOrder}
-import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, 
LogicalPlan, Sort}
-import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
-import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.IntegerType
-
-/**
- * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal 
expression.
- */
-class SubstituteUnresolvedOrdinals(conf: SQLConf) extends 
Rule[LogicalPlan] {
-  private def isIntLiteral(e: Expression) = e match {
-case Literal(_, IntegerType) => true
-case _ => false
-  }
-
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
--- End diff --

For this change (`transform` -> `resolveOperators`): it can fix the whole bug; 
I have tested it. @gatorsmile @viirya




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131294567
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala
 ---
@@ -1,54 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, 
SortOrder}
-import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, 
LogicalPlan, Sort}
-import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
-import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.IntegerType
-
-/**
- * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal 
expression.
- */
-class SubstituteUnresolvedOrdinals(conf: SQLConf) extends 
Rule[LogicalPlan] {
-  private def isIntLiteral(e: Expression) = e match {
-case Literal(_, IntegerType) => true
-case _ => false
-  }
-
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
--- End diff --

Because it keeps the redundant Aliases at the top level. That may not be what 
we want.




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131292299
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala
 ---
@@ -1,54 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, 
SortOrder}
-import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, 
LogicalPlan, Sort}
-import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
-import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.IntegerType
-
-/**
- * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal 
expression.
- */
-class SubstituteUnresolvedOrdinals(conf: SQLConf) extends 
Rule[LogicalPlan] {
-  private def isIntLiteral(e: Expression) = e match {
-case Literal(_, IntegerType) => true
-case _ => false
-  }
-
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
--- End diff --

Changing `Aggregate(grouping.map(trimAliases), cleanedAggs, child)` to 
`Aggregate(grouping.map(trimNonTopLevelAliases), cleanedAggs, child)` in 
`CleanupAliases` looks like it can fix the whole bug.
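For readers unfamiliar with the two helpers, the difference can be sketched on a toy expression tree; this is only an illustration of the trimming behavior, not the actual Catalyst `CleanupAliases` code:

```scala
// Toy expression tree illustrating alias trimming in grouping expressions.
sealed trait ToyExpr
case class Attr(name: String) extends ToyExpr
case class Lit(value: Int) extends ToyExpr
case class ToyAlias(child: ToyExpr, name: String) extends ToyExpr

// Strip every alias, all the way down: `3 as c` degrades to the bare literal 3.
def trimAliases(e: ToyExpr): ToyExpr = e match {
  case ToyAlias(child, _) => trimAliases(child)
  case other              => other
}

// Keep the outermost alias, so a grouping key like `3 as c` stays matchable
// by the name `c`, but strip any aliases nested underneath it.
def trimNonTopLevelAliases(e: ToyExpr): ToyExpr = e match {
  case ToyAlias(child, name) => ToyAlias(trimAliases(child), name)
  case other                 => other
}
```

Under this reading, `trimNonTopLevelAliases` preserves the `c` and `d` aliases of the grouping keys, which would explain why the suggested change also helps `select 3 as c, 4 as d, sum(b) from data group by c, d`.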




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131290132
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -528,7 +541,15 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   } else {
 groupByExpressions
   }
-  Aggregate(mappedGroupByExpressions, selectExpressions, query)
+
+  // Replaces ordinal in 'group by' with UnresolvedOrdinal expression
+  val newGroups = mappedGroupByExpressions.map {
--- End diff --

ditto.




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131290115
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -245,14 +246,26 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   query: LogicalPlan): LogicalPlan = withOrigin(ctx) {
 import ctx._
 
+// Replaces ordinal in 'order by' or 'sort by' with UnresolvedOrdinal 
expression
+def replacesOrdinal(orders: Seq[SortOrder]): Seq[SortOrder] = {
--- End diff --

Because we need to apply the same substitution in the Dataset API, we may need 
to extract this out so we can reuse it for the Dataset API.




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131287607
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala
 ---
@@ -1,54 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, 
SortOrder}
-import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, 
LogicalPlan, Sort}
-import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
-import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.IntegerType
-
-/**
- * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal 
expression.
- */
-class SubstituteUnresolvedOrdinals(conf: SQLConf) extends 
Rule[LogicalPlan] {
-  private def isIntLiteral(e: Expression) = e match {
-case Literal(_, IntegerType) => true
-case _ => false
-  }
-
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
--- End diff --

If we change this to `resolveOperators` and let 
`SubstituteUnresolvedOrdinals` run with `Once`, it looks like that can also 
solve those issues.




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131286674
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinalsSuite.scala
 ---
@@ -1,66 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.analysis.TestRelations.testRelation2
-import org.apache.spark.sql.catalyst.dsl.expressions._
-import org.apache.spark.sql.catalyst.dsl.plans._
-import org.apache.spark.sql.catalyst.expressions.Literal
-import org.apache.spark.sql.internal.SQLConf
-
-class SubstituteUnresolvedOrdinalsSuite extends AnalysisTest {
-  private lazy val a = testRelation2.output(0)
-  private lazy val b = testRelation2.output(1)
-
-  test("unresolved ordinal should not be unresolved") {
-// Expression OrderByOrdinal is unresolved.
-assert(!UnresolvedOrdinal(0).resolved)
-  }
-
-  test("order by ordinal") {
--- End diff --

Similarly, we should add two tests like these to a parser test suite such as 
`PlanParserSuite`.

In the new tests, we test if order-by/group-by SQL queries can be correctly 
parsed.
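Parser-level tests along the lines being suggested might look like the following sketch for `PlanParserSuite`, using the Catalyst test DSL; the exact expected plans and helper names are illustrative, not a definitive implementation:

```scala
// Sketch: check that integer literals in ORDER BY / GROUP BY are parsed into
// UnresolvedOrdinal rather than left as plain literals. `assertEqual` and the
// `table(...)` DSL follow the existing PlanParserSuite conventions.
test("order by ordinal is parsed as UnresolvedOrdinal") {
  assertEqual(
    "select a, b from t order by 1, 2",
    table("t").select('a, 'b)
      .orderBy(UnresolvedOrdinal(1).asc, UnresolvedOrdinal(2).asc))
}

test("group by ordinal is parsed as UnresolvedOrdinal") {
  assertEqual(
    "select a, b from t group by 1, 2",
    table("t").groupBy(UnresolvedOrdinal(1), UnresolvedOrdinal(2))('a, 'b))
}
```

Such tests pin the substitution at parse time, independently of whether the analyzer later re-enters the plan.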




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131286224
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinalsSuite.scala
 ---
@@ -1,66 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.analysis.TestRelations.testRelation2
-import org.apache.spark.sql.catalyst.dsl.expressions._
-import org.apache.spark.sql.catalyst.dsl.plans._
-import org.apache.spark.sql.catalyst.expressions.Literal
-import org.apache.spark.sql.internal.SQLConf
-
-class SubstituteUnresolvedOrdinalsSuite extends AnalysisTest {
-  private lazy val a = testRelation2.output(0)
-  private lazy val b = testRelation2.output(1)
-
-  test("unresolved ordinal should not be unresolved") {
--- End diff --

Could this test be moved to somewhere like `AnalysisSuite`?




[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131285941
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinalsSuite.scala
 ---
@@ -1,66 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.analysis.TestRelations.testRelation2
-import org.apache.spark.sql.catalyst.dsl.expressions._
-import org.apache.spark.sql.catalyst.dsl.plans._
-import org.apache.spark.sql.catalyst.expressions.Literal
-import org.apache.spark.sql.internal.SQLConf
-
-class SubstituteUnresolvedOrdinalsSuite extends AnalysisTest {
--- End diff --

We may move those tests to parser-related test suites.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131285585
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala
 ---
@@ -1,54 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, SortOrder}
-import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Sort}
-import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
-import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.IntegerType
-
-/**
- * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression.
- */
-class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
-  private def isIntLiteral(e: Expression) = e match {
-    case Literal(_, IntegerType) => true
-    case _ => false
-  }
-
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
--- End diff --

By replacing `transform` with `resolveOperators`, we can fix the issue caused by `SubstituteUnresolvedOrdinals` being applied to an already-analyzed plan.

The other issues, like int literals in aggregation expressions being taken as ordinals, are still there.
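The difference can be sketched with a toy, self-contained example (plain Scala, not Spark's actual `transform`/`resolveOperators` APIs; all names here are illustrative):

```scala
// Toy model of the bug shape: an ordinal-substitution rule must only fire
// on unresolved plans. A `transform`-style rule visits every node on every
// run, so re-running it on an already-analyzed plan rewrites a real integer
// literal into an ordinal; a `resolveOperators`-style traversal skips
// operators that are already resolved.
object TransformVsResolve extends App {
  sealed trait Expr
  case class Lit(v: Int) extends Expr
  case class Ordinal(pos: Int) extends Expr

  case class Plan(groupBy: Seq[Expr], analyzed: Boolean)

  // Fires unconditionally, like `plan transform { ... }`.
  def substituteAlways(p: Plan): Plan =
    p.copy(groupBy = p.groupBy.map {
      case Lit(v) => Ordinal(v)
      case e      => e
    })

  // Skips already-analyzed operators, like `resolveOperators`.
  def substituteIfUnresolved(p: Plan): Plan =
    if (p.analyzed) p else substituteAlways(p)

  // After analysis, 7 is real data, not a GROUP BY position.
  val analyzedPlan = Plan(Seq(Lit(7)), analyzed = true)
  assert(substituteAlways(analyzedPlan).groupBy == Seq(Ordinal(7)))   // the bug
  assert(substituteIfUnresolved(analyzedPlan).groupBy == Seq(Lit(7))) // the fix
}
```

This only models why the one-line change is safe for re-analysis; as noted above, it does not address literals in aggregation expressions.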







[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131200152
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala
 ---
@@ -1,54 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, SortOrder}
-import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Sort}
-import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
-import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.IntegerType
-
-/**
- * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression.
- */
-class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
-  private def isIntLiteral(e: Expression) = e match {
-    case Literal(_, IntegerType) => true
-    case _ => false
-  }
-
-  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
--- End diff --

Let us fix the whole bug with a one-line change. :)

`transform` -> `resolveOperators`





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-02 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131058058
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala
 ---
@@ -42,13 +42,5 @@ class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
 case other => other
   }
   withOrigin(s.origin)(s.copy(order = newOrders))
-
-case a: Aggregate if conf.groupByOrdinal && a.groupingExpressions.exists(isIntLiteral) =>
--- End diff --

Yeah, we should do it. We should completely remove 
`SubstituteUnresolvedOrdinals`.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-02 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131053346
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala
 ---
@@ -42,13 +42,5 @@ class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
 case other => other
   }
   withOrigin(s.origin)(s.copy(order = newOrders))
-
-case a: Aggregate if conf.groupByOrdinal && a.groupingExpressions.exists(isIntLiteral) =>
--- End diff --

Since you are moving group by to the parser, how about moving both to the 
parser?





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-02 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r131053311
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -528,7 +529,15 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
  } else {
 groupByExpressions
   }
-  Aggregate(mappedGroupByExpressions, selectExpressions, query)
+
+  // Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression
--- End diff --

`order by`?





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-01 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130613115
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
+var filterGroups: Seq[Expression] = Nil
+newGroups.foreach { ng =>
+  ng match {
+case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
+case _ => filterGroups :+= ng
+  }
+}
+Aggregate(filterGroups, aggs, child)
--- End diff --

@10110346 Please see previous comment: 
https://github.com/apache/spark/pull/18779#issuecomment-319007812

The basic idea is, instead of resolving `UnresolvedOrdinal` to actual aggregate expressions, we resolve it to `ResolvedOrdinal`.

Then, at the beginning of optimization, we replace all `ResolvedOrdinal`s with the actual aggregate expressions.

What do you think?
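The two-phase idea above can be sketched with self-contained toy types (these are illustrative stand-ins, not Spark's real `UnresolvedOrdinal`/`ResolvedOrdinal` classes):

```scala
// Two-phase ordinal resolution: analysis only maps UnresolvedOrdinal(n) to
// ResolvedOrdinal(n), so nothing downstream can confuse a plain integer
// literal with an ordinal; the start of optimization then expands
// ResolvedOrdinal into the referenced select-list expression.
object TwoPhaseOrdinals extends App {
  sealed trait Expr
  case class Lit(v: Int) extends Expr
  case class UnresolvedOrdinal(pos: Int) extends Expr
  case class ResolvedOrdinal(pos: Int) extends Expr

  // Analysis phase: validate the position and mark it, but do not substitute yet.
  def analyze(groupBy: Seq[Expr], selectSize: Int): Seq[Expr] = groupBy.map {
    case UnresolvedOrdinal(n) if n >= 1 && n <= selectSize => ResolvedOrdinal(n)
    case e => e
  }

  // Start of optimization: expand markers into select-list expressions.
  def expand(groupBy: Seq[Expr], selectList: Seq[Expr]): Seq[Expr] = groupBy.map {
    case ResolvedOrdinal(n) => selectList(n - 1) // ordinals are 1-based
    case e => e
  }

  val select   = Seq(Lit(7), Lit(8))
  val analyzed = analyze(Seq(UnresolvedOrdinal(2), Lit(1)), select.size)
  // Lit(1) survives both phases untouched; only the marker is expanded.
  assert(analyzed == Seq(ResolvedOrdinal(2), Lit(1)))
  assert(expand(analyzed, select) == Seq(Lit(8), Lit(1)))
}
```

The point of the design is that between analysis and optimization there is never a bare int literal standing in for an ordinal, so re-running analysis rules cannot misclassify it.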





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-07-31 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130507794
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
+var filterGroups: Seq[Expression] = Nil
+newGroups.foreach { ng =>
+  ng match {
+case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
+case _ => filterGroups :+= ng
+  }
+}
+Aggregate(filterGroups, aggs, child)
--- End diff --

Thanks, I'm not very familiar with this module. How should I do it?





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-07-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130506005
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
+var filterGroups: Seq[Expression] = Nil
+newGroups.foreach { ng =>
+  ng match {
+case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
--- End diff --

This rule is before `CleanupAliases`.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-07-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130505545
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
+var filterGroups: Seq[Expression] = Nil
+newGroups.foreach { ng =>
+  ng match {
+case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
+case _ => filterGroups :+= ng
+  }
+}
+Aggregate(filterGroups, aggs, child)
--- End diff --

@maropu Yeah, I think so, because `RemoveLiteralFromGroupExpressions` takes care of this.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-07-31 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130505351
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
+var filterGroups: Seq[Expression] = Nil
+newGroups.foreach { ng =>
+  ng match {
+case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
+case _ => filterGroups :+= ng
+  }
+}
+Aggregate(filterGroups, aggs, child)
--- End diff --

If we reimplement this using `ResolvedOrdinal`, we probably have no chance of hitting the empty-grouping issue here, I think.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-07-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130504819
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
+var filterGroups: Seq[Expression] = Nil
+newGroups.foreach { ng =>
+  ng match {
+case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
+case _ => filterGroups :+= ng
+  }
+}
+Aggregate(filterGroups, aggs, child)
--- End diff --

For aggregates over an empty input, if we remove all grouping expressions, this triggers the ungrouped code path (which always returns a single row), so the query semantics are changed.

Did you mean this is not an issue anymore?
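The semantics concern above can be illustrated with plain Scala collections (a toy analogy, not Spark's actual execution paths):

```scala
// Grouped aggregation over an empty input yields zero groups (zero result
// rows), while a global (ungrouped) aggregate always yields exactly one
// row, so dropping ALL grouping expressions changes the query's result.
object EmptyInputGrouping extends App {
  val empty = Seq.empty[Int]

  // "SELECT count(*) ... GROUP BY k" analogue: no rows => no groups => no output.
  val groupedCounts = empty.groupBy(identity).map { case (_, vs) => vs.size }
  assert(groupedCounts.isEmpty)

  // Global aggregate analogue: always exactly one output row.
  val globalCount = Seq(empty.size)
  assert(globalCount == Seq(0))
}
```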





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-07-31 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130504796
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
+var filterGroups: Seq[Expression] = Nil
+newGroups.foreach { ng =>
+  ng match {
+case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
--- End diff --

Have they been trimmed in `aggregateExpressions`?





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-07-31 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130503562
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
+var filterGroups: Seq[Expression] = Nil
+newGroups.foreach { ng =>
+  ng match {
+case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
+case _ => filterGroups :+= ng
+  }
+}
+Aggregate(filterGroups, aggs, child)
--- End diff --

Actually, we already support `GlobalAggregates` and `group by null`.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-07-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130336844
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
+var filterGroups: Seq[Expression] = Nil
+newGroups.foreach { ng =>
+  ng match {
+case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
+case _ => filterGroups :+= ng
+  }
+}
+Aggregate(filterGroups, aggs, child)
--- End diff --

Please see the comments at 
https://github.com/apache/spark/blob/75b168fd30bb9a52ae223b6f1df73da4b1316f2e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala#L1254-L1257.
 We should not make the grouping expressions empty.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-07-31 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130330782
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
+var filterGroups: Seq[Expression] = Nil
+newGroups.foreach { ng =>
+  ng match {
+case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
+case _ => filterGroups :+= ng
+  }
+}
+Aggregate(filterGroups, aggs, child)
--- End diff --

It looks like there is no problem here, even though we drop all grouping expressions.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-07-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130300193
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
+var filterGroups: Seq[Expression] = Nil
+newGroups.foreach { ng =>
+  ng match {
+case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
+case _ => filterGroups :+= ng
+  }
+}
+Aggregate(filterGroups, aggs, child)
--- End diff --

Similar to `RemoveLiteralFromGroupExpressions`, we shouldn't drop all grouping expressions if they are all int literals after being resolved from `UnresolvedOrdinal`.





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-07-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130293284
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
+var filterGroups: Seq[Expression] = Nil
+newGroups.foreach { ng =>
+  ng match {
+case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
--- End diff --

I think we may need to trim all possible `Alias`es, e.g., `Alias(Alias(1, 'a'), 'b')`.
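Trimming nested aliases can be sketched with toy types (illustrative stand-ins; Spark's `Alias` similarly wraps a child expression):

```scala
// Recursively peel every alias layer before testing whether the
// underlying expression is an int literal.
object TrimAliasesSketch extends App {
  sealed trait Expr
  case class Lit(v: Int) extends Expr
  case class Alias(child: Expr, name: String) extends Expr

  @annotation.tailrec
  def trimAliases(e: Expr): Expr = e match {
    case Alias(child, _) => trimAliases(child) // peel one alias layer
    case other           => other
  }

  def isIntLiteral(e: Expr): Boolean = trimAliases(e) match {
    case Lit(_) => true
    case _      => false
  }

  assert(trimAliases(Alias(Alias(Lit(1), "a"), "b")) == Lit(1))
  assert(isIntLiteral(Alias(Alias(Lit(1), "a"), "b")))
}
```

Checking only a single `Alias` level, as in `case a: Alias => isIntLiteral(a.child)`, would miss the doubly-nested case above.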





[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-07-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18779#discussion_r130292918
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1010,7 +1014,16 @@ class Analyzer(
 s"(valid range is [1, ${aggs.size}])")
   case o => o
 }
-Aggregate(newGroups, aggs, child)
+// If an aggregateExpression is integer, after replaced with this aggregateExpression, the
+// groupExpression will be considered as an ordinal still, so we should filter it.
--- End diff --

Please also add a comment on why this filtering is safe, i.e., why it does not change the grouping result. Thanks.

