[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-06 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r552672436



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -659,12 +670,23 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with 
SQLConfHelper with Logg
 AttributeReference("value", StringType)()), true)
 }
 
-// Create the transform.
+val plan = visitCommonSelectQueryClausePlan(
+  relation,
+  lateralView,
+  transformClause.namedExpressionSeq,
+  whereClause,
+  aggregationClause,
+  havingClause,
+  windowClause,
+  isDistinct = false)
+
+// Create the transform. Here we pass UnresolvedStar as ScriptTransformation's input.
+// In the analyzer, after the child plan is resolved, we resolve UnresolvedStar to the child's output.
 ScriptTransformation(
-  expressions,
+  Seq(UnresolvedStar(None)),

Review comment:
   > hm, on second thought, can't we remove this param `input` from 
`ScriptTransformation` in this PR? Since the input exprs of the current 
`ScriptTransformation` implementation always come from the child's output, IIUC we 
don't need this param anymore?
   
   Yes, it looks like that. We can replace `input` with `child.output` directly. 
That's a really nice suggestion, since the current way (converting it through the 
LogicalPlan) looks a little weird. If we remove this `input` parameter and add a 
proper comment, the whole process reads more naturally. I tried it locally and it 
produces a pretty big diff. How about a new ticket for that refactoring?
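   For illustration only, a rough sketch of the shape such a follow-up refactor might take. This is an assumption, not the actual patch: the node name, the derived `input`, and the import paths are mine; the constructor order mirrors the `ScriptTransformation(input, script, output, child, ioschema)` calls quoted in the test diffs below.
   ```scala
   import org.apache.spark.sql.catalyst.expressions.{Attribute, Expression}
   import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, ScriptInputOutputSchema, UnaryNode}

   // Hypothetical sketch: ScriptTransformation without the `input` parameter.
   // The expressions fed to the external script are always just the child's output.
   case class ScriptTransformationWithoutInput(
       script: String,
       output: Seq[Attribute],
       child: LogicalPlan,
       ioschema: ScriptInputOutputSchema)
     extends UnaryNode {

     // Replacement for the old `input` parameter, derived directly from the child.
     def input: Seq[Expression] = child.output
   }
   ```
   With that shape the analyzer no longer needs to expand stars into `input` at all.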









[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-05 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551973037



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/SparkSqlParserSuite.scala
##
@@ -301,6 +302,185 @@ class SparkSqlParserSuite extends AnalysisTest {
 List.empty, List.empty, None, None, false)))
   }
 
+  test("SPARK-28227: script transform with row format delimit with 
aggregation") {
+assertEqual(
+  """
+|SELECT TRANSFORM(a, sum(b), max(c))
+|  ROW FORMAT DELIMITED
+|  FIELDS TERMINATED BY ','
+|  COLLECTION ITEMS TERMINATED BY '#'
+|  MAP KEYS TERMINATED BY '@'
+|  LINES TERMINATED BY '\n'
+|  NULL DEFINED AS 'null'
+|  USING 'cat' AS (a, b, c)
+|  ROW FORMAT DELIMITED
+|  FIELDS TERMINATED BY ','
+|  COLLECTION ITEMS TERMINATED BY '#'
+|  MAP KEYS TERMINATED BY '@'
+|  LINES TERMINATED BY '\n'
+|  NULL DEFINED AS 'NULL'
+|FROM testData
+|GROUP BY a
+|HAVING sum(b) > 10
+  """.stripMargin,
+  ScriptTransformation(
+Seq(UnresolvedStar(None)),
+"cat",
+Seq(AttributeReference("a", StringType)(),
+  AttributeReference("b", StringType)(),
+  AttributeReference("c", StringType)()),
+UnresolvedHaving(
+  GreaterThan(
+UnresolvedFunction("sum", Seq(UnresolvedAttribute("b")), 
isDistinct = false),
+Literal(10)),
+  Aggregate(
+Seq('a),
+Seq(
+  'a,
+  UnresolvedAlias(
+UnresolvedFunction("sum", Seq(UnresolvedAttribute("b")), 
isDistinct = false), None),
+  UnresolvedAlias(
+UnresolvedFunction("max", Seq(UnresolvedAttribute("c")), 
isDistinct = false), None)
+),
+UnresolvedRelation(TableIdentifier("testData",
+ScriptInputOutputSchema(
+  Seq(("TOK_TABLEROWFORMATFIELD", ","),
+("TOK_TABLEROWFORMATCOLLITEMS", "#"),
+("TOK_TABLEROWFORMATMAPKEYS", "@"),
+("TOK_TABLEROWFORMATNULL", "null"),
+("TOK_TABLEROWFORMATLINES", "\n")),
+  Seq(("TOK_TABLEROWFORMATFIELD", ","),
+("TOK_TABLEROWFORMATCOLLITEMS", "#"),
+("TOK_TABLEROWFORMATMAPKEYS", "@"),
+("TOK_TABLEROWFORMATNULL", "NULL"),
+("TOK_TABLEROWFORMATLINES", "\n")), None, None,

Review comment:
   > I feel this part is redundant, so could you pull out this part as a 
shared variable?
   
   How about the current change? I merged the two UTs.
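   For reference, a minimal sketch of what pulling the row-format schema out as a shared value could look like; the `val` name is made up, the import path is assumed, and the positional arguments simply mirror the `ScriptInputOutputSchema(...)` literal in the test quoted above.
   ```scala
   import org.apache.spark.sql.catalyst.plans.logical.ScriptInputOutputSchema

   // Shared row-format schema reused by the merged TRANSFORM parser tests,
   // so the expected plans don't repeat the same literal twice.
   private val delimitedRowFormat = ScriptInputOutputSchema(
     Seq(("TOK_TABLEROWFORMATFIELD", ","),
       ("TOK_TABLEROWFORMATCOLLITEMS", "#"),
       ("TOK_TABLEROWFORMATMAPKEYS", "@"),
       ("TOK_TABLEROWFORMATNULL", "null"),
       ("TOK_TABLEROWFORMATLINES", "\n")),
     Seq(("TOK_TABLEROWFORMATFIELD", ","),
       ("TOK_TABLEROWFORMATCOLLITEMS", "#"),
       ("TOK_TABLEROWFORMATMAPKEYS", "@"),
       ("TOK_TABLEROWFORMATNULL", "NULL"),
       ("TOK_TABLEROWFORMATLINES", "\n")),
     None, None,              // no input/output SerDe classes
     List.empty, List.empty,  // no SerDe properties
     None, None,              // no record reader/writer
     false)                   // not schema-less
   ```
   Each test's expected `ScriptTransformation` would then take `delimitedRowFormat` as its last argument.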








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-05 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551959690



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,126 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, 
l FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM(b, a, CAST(c AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1, 2, 3)
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1, 2)
+  USING 'cat' AS (a INT, b INT)
+FROM script_trans
+LIMIT 1;
+
+SELECT TRANSFORM(
+  b AS d5, a,
+  CASE
+WHEN c > 100 THEN 1
+WHEN c < 100 THEN 2
+  ELSE 3 END)
+  USING 'cat' AS (a, b,  c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(b, a, c + 1)
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(*)
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;

Review comment:
   > I just want to know why these tests are added in this PR... That's 
because these tests seem to be unrelated to aggregation/window/lateralView.
   
   Yeah, the intent here is to show that, after this PR's change, each kind of 
expression works well, such as aliases, `CASE WHEN`, binary arithmetic, etc.








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-05 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551771493



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -692,13 +712,41 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with 
SQLConfHelper with Logg
   havingClause: HavingClauseContext,
   windowClause: WindowClauseContext,
   relation: LogicalPlan): LogicalPlan = withOrigin(ctx) {
+val isDistinct = selectClause.setQuantifier() != null &&
+  selectClause.setQuantifier().DISTINCT() != null
+
+// Visit common project

Review comment:
   Just remove this








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-05 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551771170



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -659,12 +670,21 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with 
SQLConfHelper with Logg
 AttributeReference("value", StringType)()), true)
 }
 
+val plan = visitCommonSelectQueryClausePlan(relation,
+  lateralView,
+  transformClause.namedExpressionSeq,
+  whereClause,
+  aggregationClause,
+  havingClause,
+  windowClause,
+  isDistinct = false)
+
 // Create the transform.
 ScriptTransformation(
-  expressions,
+  Seq(UnresolvedStar(None)),

Review comment:
   Done








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-05 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551771073



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,126 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, 
l FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM(b, a, CAST(c AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1, 2, 3)
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1, 2)
+  USING 'cat' AS (a INT, b INT)
+FROM script_trans
+LIMIT 1;
+
+SELECT TRANSFORM(
+  b AS d5, a,
+  CASE
+WHEN c > 100 THEN 1
+WHEN c < 100 THEN 2
+  ELSE 3 END)
+  USING 'cat' AS (a, b,  c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(b, a, c + 1)
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(*)
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(b AS d, MAX(a) as max_a, CAST(SUM(c) AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b;
+
+SELECT TRANSFORM(b AS d, MAX(a) FILTER (WHERE a > 3) AS max_a, CAST(SUM(c) AS 
STRING))
+  USING 'cat' AS (a,b,c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b;
+
+SELECT TRANSFORM(b, MAX(a) as max_a, CAST(sum(c) AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 2
+GROUP BY b;
+
+SELECT TRANSFORM(b, MAX(a) as max_a, CAST(SUM(c) AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b
+HAVING max_a > 0;
+
+SELECT TRANSFORM(b, MAX(a) as max_a, CAST(SUM(c) AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b
+HAVING max(a) > 1;
+
+SELECT TRANSFORM(b, MAX(a) OVER w as max_a, CAST(SUM(c) OVER w AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+WINDOW w AS (PARTITION BY b ORDER BY a);
+
+SELECT TRANSFORM(b, MAX(a) as max_a, CAST(SUM(c) AS STRING), myCol, myCol2)
+  USING 'cat' AS (a, b, c, d, e)
+FROM script_trans
+LATERAL VIEW explode(array(array(1,2,3))) myTable AS myCol
+LATERAL VIEW explode(myTable.myCol) myTable2 AS myCol2
+WHERE a <= 4
+GROUP BY b, myCol, myCol2
+HAVING max(a) > 1;
+
+

Review comment:
   Done








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-05 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551770899



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,126 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, 
l FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM(b, a, CAST(c AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1, 2, 3)
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1, 2)
+  USING 'cat' AS (a INT, b INT)
+FROM script_trans
+LIMIT 1;
+
+SELECT TRANSFORM(
+  b AS d5, a,
+  CASE
+WHEN c > 100 THEN 1
+WHEN c < 100 THEN 2
+  ELSE 3 END)
+  USING 'cat' AS (a, b,  c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(b, a, c + 1)
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(*)
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(b AS d, MAX(a) as max_a, CAST(SUM(c) AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b;
+
+SELECT TRANSFORM(b AS d, MAX(a) FILTER (WHERE a > 3) AS max_a, CAST(SUM(c) AS 
STRING))
+  USING 'cat' AS (a,b,c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b;
+
+SELECT TRANSFORM(b, MAX(a) as max_a, CAST(sum(c) AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 2
+GROUP BY b;
+
+SELECT TRANSFORM(b, MAX(a) as max_a, CAST(SUM(c) AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b
+HAVING max_a > 0;
+
+SELECT TRANSFORM(b, MAX(a) as max_a, CAST(SUM(c) AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b
+HAVING max(a) > 1;
+
+SELECT TRANSFORM(b, MAX(a) OVER w as max_a, CAST(SUM(c) OVER w AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+WINDOW w AS (PARTITION BY b ORDER BY a);
+
+SELECT TRANSFORM(b, MAX(a) as max_a, CAST(SUM(c) AS STRING), myCol, myCol2)
+  USING 'cat' AS (a, b, c, d, e)
+FROM script_trans
+LATERAL VIEW explode(array(array(1,2,3))) myTable AS myCol
+LATERAL VIEW explode(myTable.myCol) myTable2 AS myCol2
+WHERE a <= 4
+GROUP BY b, myCol, myCol2
+HAVING max(a) > 1;
+
+
+FROM(
+  FROM script_trans
+  SELECT TRANSFORM(a, b)
+USING 'cat' AS (`a` INT, b STRING)
+) t
+SELECT a + 1;
+
+FROM(
+  SELECT TRANSFORM(a, SUM(b) b)
+USING 'cat' AS (`a` INT, b STRING)
+  FROM script_trans
+  GROUP BY a
+) t
+SELECT (b + 1) AS result
+ORDER BY result;
+
+MAP k / 10 USING 'cat' AS (one) FROM (SELECT 10 AS k);
+
+FROM (SELECT 1 AS key, 100 AS value) src
+MAP src.*, src.key, CAST(src.key / 10 AS INT), CAST(src.key % 10 AS INT), 
src.value
+  USING 'cat' AS (k, v, tkey, ten, one, tvalue);
+
+SELECT TRANSFORM(1)
+  USING 'cat' AS (a)
+FROM script_trans
+HAVING true;
+
+SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=true;
+
+SELECT TRANSFORM(1)
+  USING 'cat' AS (a)
+FROM script_trans
+HAVING true;
+
+SET spark.sql.parser.quotedRegexColumnNames=true;
+
+SELECT TRANSFORM(`(a|b)?+.+`)
+ USING 'cat' AS (c)
+FROM script_trans;

Review comment:
   Done

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-05 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551770515



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/SparkSqlParserSuite.scala
##
@@ -301,6 +302,187 @@ class SparkSqlParserSuite extends AnalysisTest {
 List.empty, List.empty, None, None, false)))
   }
 
+  test("SPARK-28227: script transform with row format delimit with 
aggregation") {
+assertEqual(
+  """
+|SELECT TRANSFORM(a, sum(b), max(c))
+|  ROW FORMAT DELIMITED
+|  FIELDS TERMINATED BY ','
+|  COLLECTION ITEMS TERMINATED BY '#'
+|  MAP KEYS TERMINATED BY '@'
+|  LINES TERMINATED BY '\n'
+|  NULL DEFINED AS 'null'
+|  USING 'cat' AS (a, b, c)
+|  ROW FORMAT DELIMITED
+|  FIELDS TERMINATED BY ','
+|  COLLECTION ITEMS TERMINATED BY '#'
+|  MAP KEYS TERMINATED BY '@'
+|  LINES TERMINATED BY '\n'
+|  NULL DEFINED AS 'NULL'
+|FROM testData
+|GROUP BY a
+|HAVING sum(b) > 10
+  """.stripMargin,
+  ScriptTransformation(
+Seq(UnresolvedStar(None)),
+"cat",
+Seq(AttributeReference("a", StringType)(),
+  AttributeReference("b", StringType)(),
+  AttributeReference("c", StringType)()),
+UnresolvedHaving(
+  GreaterThan(
+UnresolvedFunction("sum", Seq(UnresolvedAttribute("b")), 
isDistinct = false),
+Literal(10)),
+  Aggregate(
+Seq('a),
+Seq(
+  'a,
+  UnresolvedAlias(
+UnresolvedFunction("sum", Seq(UnresolvedAttribute("b")), 
isDistinct = false), None),
+  UnresolvedAlias(
+UnresolvedFunction("max", Seq(UnresolvedAttribute("c")), 
isDistinct = false), None)
+),
+UnresolvedRelation(TableIdentifier("testData",
+ScriptInputOutputSchema(
+  Seq(("TOK_TABLEROWFORMATFIELD", ","),
+("TOK_TABLEROWFORMATCOLLITEMS", "#"),
+("TOK_TABLEROWFORMATMAPKEYS", "@"),
+("TOK_TABLEROWFORMATNULL", "null"),
+("TOK_TABLEROWFORMATLINES", "\n")),
+  Seq(("TOK_TABLEROWFORMATFIELD", ","),
+("TOK_TABLEROWFORMATCOLLITEMS", "#"),
+("TOK_TABLEROWFORMATMAPKEYS", "@"),
+("TOK_TABLEROWFORMATNULL", "NULL"),
+("TOK_TABLEROWFORMATLINES", "\n")), None, None,
+  List.empty, List.empty, None, None, false)))
+
+

Review comment:
   Done

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551769365



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,126 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, 
l FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM(b, a, CAST(c AS STRING))
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1, 2, 3)
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1, 2)
+  USING 'cat' AS (a INT, b INT)
+FROM script_trans
+LIMIT 1;
+
+SELECT TRANSFORM(
+  b AS d5, a,
+  CASE
+WHEN c > 100 THEN 1
+WHEN c < 100 THEN 2
+  ELSE 3 END)
+  USING 'cat' AS (a, b,  c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(b, a, c + 1)
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(*)
+  USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;

Review comment:
   > These queries above are not supported in the current master?
   
   They are supported; what's wrong? They produce the correct answer.








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551711403



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/SparkSqlParserSuite.scala
##
@@ -301,6 +302,61 @@ class SparkSqlParserSuite extends AnalysisTest {
 List.empty, List.empty, None, None, false)))
   }
 
+  test("SPARK-28227: script transform with row format delimit with 
aggregation") {

Review comment:
   > Please add tests for window/lateral view cases.
   
   Added

##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -659,12 +670,21 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with 
SQLConfHelper with Logg
 AttributeReference("value", StringType)()), true)
 }
 
+val plan = visitCommonSelectQueryClausePlan(relation,
+  lateralView,
+  transformClause.namedExpressionSeq,
+  whereClause,
+  aggregationClause,
+  havingClause,
+  windowClause,
+  isDistinct = false)

Review comment:
   > I think we don't need to support it in this PR and it's okay to just 
follow the Hive behaviour.
   
   Yeah.








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551677043



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -659,12 +670,21 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with 
SQLConfHelper with Logg
 AttributeReference("value", StringType)()), true)
 }
 
+val plan = visitCommonSelectQueryClausePlan(relation,
+  lateralView,
+  transformClause.namedExpressionSeq,
+  whereClause,
+  aggregationClause,
+  havingClause,
+  windowClause,
+  isDistinct = false)

Review comment:
   > In hive, one cannot use `distinct` for TRANSFORM? e.g., `SELECT 
TRANSFORM(distinct a, b)`?
   
   It's not supported. Do we need to support it?
   ```
   hive>
       >
       > select transform(distinct a) using 'cat' from (select 1 as a) temp;
   NoViableAltException(96@[104:1: selectExpression : ( ( tableAllColumns )=> tableAllColumns | expression );])
       at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA19.specialStateTransition(HiveParser_SelectClauseParser.java:5380)
       at org.antlr.runtime.DFA.predict(DFA.java:80)
       at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectExpression(HiveParser_SelectClauseParser.java:2215)
       at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectExpressionList(HiveParser_SelectClauseParser.java:2297)
       at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectTrfmClause(HiveParser_SelectClauseParser.java:1299)
       at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:968)
       at org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:41964)
       at org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:36720)
       at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:36987)
       at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:36633)
       at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:35822)
       at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:35710)
       at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2284)
       at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1333)
       at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:208)
       at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:77)
       at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:70)
       at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468)
       at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
       at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
       at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
       at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
       at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
       at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:606)
       at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
       at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
   FAILED: ParseException line 1:17 cannot recognize input near 'distinct' 'a' ')' in select expression
   ```








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551675638



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##
@@ -1519,12 +1519,7 @@ class Analyzer(override val catalogManager: 
CatalogManager)
 }
   // If the script transformation input contains Stars, expand it.

Review comment:
   Done








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551672696



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##
@@ -1519,12 +1519,7 @@ class Analyzer(override val catalogManager: 
CatalogManager)
 }
   // If the script transformation input contains Stars, expand it.
   case t: ScriptTransformation if containsStar(t.input) =>
-t.copy(
-  input = t.input.flatMap {
-case s: Star => s.expand(t.child, resolver)
-case o => o :: Nil
-  }
-)
+t.copy(input = t.child.output)

Review comment:
   Done, it can be supported.








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551670573



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,114 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, 
l FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM (b, a, CAST(c AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (1, 2, 3)
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1, 2)
+USING 'cat' AS (a INT, b INT)
+FROM script_trans
+LIMIT 1;
+
+SELECT TRANSFORM (
+b AS d5, a,
+CASE
+  WHEN c > 100 THEN 1
+  WHEN c < 100 THEN 2
+ELSE 3 END)
+USING 'cat' AS (a, b,  c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (b, a, c + 1)
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (*)
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (b AS d, MAX(a) as max_a, CAST(SUM(c) AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b;
+
+SELECT TRANSFORM (b AS d, MAX(a) FILTER (WHERE a > 3) AS max_a, CAST(SUM(c) AS 
STRING))
+USING 'cat' AS (a,b,c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b;
+
+SELECT TRANSFORM (b, MAX(a) as max_a, CAST(sum(c) AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 2
+GROUP BY b;
+
+SELECT TRANSFORM (b, MAX(a) as max_a, CAST(SUM(c) AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b
+HAVING max_a > 0;
+
+SELECT TRANSFORM (b, MAX(a) as max_a, CAST(SUM(c) AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b
+HAVING max(a) > 1;
+
+SELECT TRANSFORM (b, MAX(a) OVER w as max_a, CAST(SUM(c) OVER w AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+WINDOW w AS (PARTITION BY b ORDER BY a);
+
+set spark.sql.legacy.bucketedTableScan.outputOrdering=true;
+
+SELECT TRANSFORM (MAX(a) as max_a, CAST(SUM(c) AS STRING))
+USING 'cat' AS (a, b)
+FROM script_trans
+WHERE a <= 4
+HAVING max(a) > 1;
+
+SELECT TRANSFORM (b, MAX(a) as max_a, CAST(SUM(c) AS STRING), myCol, myCol2)
+USING 'cat' AS (a, b, c, d, e)
+FROM script_trans
+LATERAL VIEW explode(array(array(1,2,3))) myTable AS myCol
+LATERAL VIEW explode(myTable.myCol) myTable2 AS myCol2
+WHERE a <= 4
+GROUP BY b, myCol, myCol2
+HAVING max(a) > 1;
+
+
+FROM
+(FROM script_trans SELECT TRANSFORM(a, b) USING 'cat' AS (`a` INT, b STRING)) t
+SELECT a + 1;
+
+FROM
+(SELECT TRANSFORM(a, SUM(b) b)
+USING 'cat' AS (`a` INT, b STRING)

Review comment:
   Done

##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,114 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, 
l FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM (b, a, CAST(c AS STRING))

Review comment:
   Done

##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,114 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, 
l FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM (b, a, CAST(c AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (1, 2, 3)
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1, 2)
+USING 'cat' AS (a INT, b INT)
+FROM script_trans
+LIMIT 1;
+
+SELECT TRANSFORM (
+b AS d5, a,

Review comment:
   Done

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551668776



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,114 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, 
l FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM (b, a, CAST(c AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (1, 2, 3)
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1, 2)
+USING 'cat' AS (a INT, b INT)
+FROM script_trans
+LIMIT 1;
+
+SELECT TRANSFORM (
+b AS d5, a,
+CASE
+  WHEN c > 100 THEN 1
+  WHEN c < 100 THEN 2
+ELSE 3 END)
+USING 'cat' AS (a, b,  c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (b, a, c + 1)
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (*)
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (b AS d, MAX(a) as max_a, CAST(SUM(c) AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b;
+
+SELECT TRANSFORM (b AS d, MAX(a) FILTER (WHERE a > 3) AS max_a, CAST(SUM(c) AS 
STRING))
+USING 'cat' AS (a,b,c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b;
+
+SELECT TRANSFORM (b, MAX(a) as max_a, CAST(sum(c) AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 2
+GROUP BY b;
+
+SELECT TRANSFORM (b, MAX(a) as max_a, CAST(SUM(c) AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b
+HAVING max_a > 0;
+
+SELECT TRANSFORM (b, MAX(a) as max_a, CAST(SUM(c) AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+GROUP BY b
+HAVING max(a) > 1;
+
+SELECT TRANSFORM (b, MAX(a) OVER w as max_a, CAST(SUM(c) OVER w AS STRING))
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4
+WINDOW w AS (PARTITION BY b ORDER BY a);
+
+set spark.sql.legacy.bucketedTableScan.outputOrdering=true;

Review comment:
   This is a test for https://github.com/apache/spark/pull/29087#discussion_r551342009 (the legacy HAVING-without-GROUP-BY behaviour).








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551668632



##
File path: sql/core/src/test/resources/sql-tests/inputs/pivot.sql
##
@@ -46,7 +46,7 @@ SELECT * FROM (
   SELECT course, earnings FROM courseSales
 )
 PIVOT (
-  sum(earnings)
+  sum(earnings)LATERAL VIEW

Review comment:
   My mistake; reverted.








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation/window/lateralView

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551668237



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -659,12 +669,39 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with 
SQLConfHelper with Logg
 AttributeReference("value", StringType)()), true)
 }
 
+val namedExpressions = expressions.map {
+  case e: NamedExpression => e
+  case e: Expression => UnresolvedAlias(e)
+}
+
+def createProject() = if (namedExpressions.nonEmpty) {
+  Project(namedExpressions, withFilter)
+} else {
+  withFilter
+}
+
+val withProject = if (aggregationClause == null && havingClause != null) {
+  if (conf.getConf(SQLConf.LEGACY_HAVING_WITHOUT_GROUP_BY_AS_WHERE)) {
+// If the legacy conf is set, treat HAVING without GROUP BY as WHERE.
+withHavingClause(havingClause, createProject())
+  } else {
+// According to SQL standard, HAVING without GROUP BY means global 
aggregate.
+withHavingClause(havingClause, Aggregate(Nil, namedExpressions, 
withFilter))
+  }
+} else if (aggregationClause != null) {
+  val aggregate = withAggregationClause(aggregationClause, 
namedExpressions, withFilter)
+  aggregate.optionalMap(havingClause)(withHavingClause)
+} else {
+  // When hitting this branch, `having` must be null.
+  createProject()
+}

Review comment:
   > Is this part just copied from `withSelectQuerySpecification`? If so, 
could you share it between `withSelectQuerySpecification` and 
`withTransformQuerySpecification`?
   
   Done
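   To make the sharing concrete, a small sketch of the two call sites after that change. The TRANSFORM-side call is taken from the AstBuilder diff quoted earlier in this thread; the SELECT-side call is an assumption about how `withSelectQuerySpecification` would use the same helper.
   ```scala
   // Regular SELECT path (assumed): the project list comes from the select clause,
   // and DISTINCT is whatever the query's set quantifier says.
   val selectPlan = visitCommonSelectQueryClausePlan(
     relation,
     lateralView,
     selectClause.namedExpressionSeq,
     whereClause,
     aggregationClause,
     havingClause,
     windowClause,
     isDistinct = isDistinct)

   // TRANSFORM path (from the quoted diff): same clause handling, but DISTINCT is
   // never allowed, matching the Hive behaviour discussed elsewhere in this thread.
   val transformPlan = visitCommonSelectQueryClausePlan(
     relation,
     lateralView,
     transformClause.namedExpressionSeq,
     whereClause,
     aggregationClause,
     havingClause,
     windowClause,
     isDistinct = false)
   ```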








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551391778



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,84 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, l 
FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM (b, a, CAST(c AS STRING) )
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (1,2,3)
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1,2)
+USING 'cat' AS (a INT,b INT)
+FROM script_trans
+LIMIT 1;
+
+SELECT TRANSFORM (
+b AS d5, a,
+CASE
+  WHEN c > 100 THEN 1
+  WHEN c < 100 THEN 2
+ELSE 3 END)
+USING 'cat' AS (a,b,c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (b, a, c + 1 )
+USING 'cat' AS (a,b,c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (*)
+USING 'cat' AS (a,b,c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (b AS d, MAX(a) as max_a, CAST(SUM(c) AS STRING))

Review comment:
   > In hive, one can use an aggregate filter, e.g., `SELECT TRANSFORM (b 
AS d, MAX(a) FILTER (WHERE a > 3) as max_a)`? Could you add tests for it?
   
   Added a test case.








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551342009



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -659,12 +669,39 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with 
SQLConfHelper with Logg
 AttributeReference("value", StringType)()), true)
 }
 
+val namedExpressions = expressions.map {
+  case e: NamedExpression => e
+  case e: Expression => UnresolvedAlias(e)
+}
+
+def createProject() = if (namedExpressions.nonEmpty) {
+  Project(namedExpressions, withFilter)
+} else {
+  withFilter
+}
+
+val withProject = if (aggregationClause == null && havingClause != null) {
+  if (conf.getConf(SQLConf.LEGACY_HAVING_WITHOUT_GROUP_BY_AS_WHERE)) {

Review comment:
   > Since this is a new feature, we don't need to follow this config here?
   
   We can follow it, why not? But I need to add a test case for it.
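   For completeness, a small sketch of how that behaviour can be exercised end to end. It assumes a local `SparkSession`, a `cat` binary on the path, and re-creates a toy `script_trans` table; the conf key and the query come from transform.sql, everything else is illustrative.
   ```scala
   import org.apache.spark.sql.SparkSession

   val spark = SparkSession.builder().master("local[1]").appName("having-demo").getOrCreate()
   spark.range(1, 5).selectExpr("id AS a", "id * 10 AS b", "id * 100 AS c")
     .createOrReplaceTempView("script_trans")

   val query =
     """SELECT TRANSFORM(1)
       |  USING 'cat' AS (a)
       |FROM script_trans
       |HAVING true""".stripMargin

   // Default: HAVING without GROUP BY means a global aggregate,
   // so the script sees a single aggregated row.
   spark.sql(query).show()

   // Legacy behaviour: HAVING without GROUP BY is treated as WHERE,
   // so the script sees one row per (filtered) input row.
   spark.conf.set("spark.sql.legacy.parser.havingWithoutGroupByAsWhere", "true")
   spark.sql(query).show()
   ```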








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551327184



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,84 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, l 
FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM (b, a, CAST(c AS STRING) )
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (1,2,3)
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1,2)
+USING 'cat' AS (a INT,b INT)
+FROM script_trans
+LIMIT 1;
+
+SELECT TRANSFORM (
+b AS d5, a,
+CASE
+  WHEN c > 100 THEN 1
+  WHEN c < 100 THEN 2
+ELSE 3 END)
+USING 'cat' AS (a,b,c)

Review comment:
   Done

##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,84 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, l 
FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM (b, a, CAST(c AS STRING) )
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (1,2,3)
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1,2)

Review comment:
   Done








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551326940



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala
##
@@ -1041,11 +1041,11 @@ class PlanParserSuite extends AnalysisTest {
 |FROM testData
   """.stripMargin,
   ScriptTransformation(
-Seq('a, 'b, 'c),
+Seq(UnresolvedStar(None)),

Review comment:
   > Why do we need to update this?
   
   Because at the parser level we now pass `UnresolvedStar` in place of the 
child's output expressions, and then resolve it to the child's output at the 
analyzer level; that is what enables supporting aggregation with TRANSFORM.
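   In code, the parsed shape now looks roughly like the plan below, pieced together from the PlanParserSuite/SparkSqlParserSuite expectations quoted in this thread; the empty `ScriptInputOutputSchema` stands in for whatever row format the query declares.
   ```scala
   import org.apache.spark.sql.catalyst.TableIdentifier
   import org.apache.spark.sql.catalyst.analysis.{UnresolvedAttribute, UnresolvedRelation, UnresolvedStar}
   import org.apache.spark.sql.catalyst.expressions.AttributeReference
   import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, ScriptInputOutputSchema, ScriptTransformation}
   import org.apache.spark.sql.types.StringType

   // SELECT TRANSFORM(a) USING 'cat' AS (a) FROM testData GROUP BY a
   // The parser no longer lists the TRANSFORM expressions as the script's input;
   // it emits a single UnresolvedStar over the aggregated child plan instead.
   val parsed = ScriptTransformation(
     Seq(UnresolvedStar(None)),                    // placeholder input
     "cat",                                        // the script
     Seq(AttributeReference("a", StringType)()),   // declared output columns
     Aggregate(
       Seq(UnresolvedAttribute("a")),
       Seq(UnresolvedAttribute("a")),
       UnresolvedRelation(TableIdentifier("testData"))),
     ScriptInputOutputSchema(Nil, Nil, None, None, Nil, Nil, None, None, false))

   // The analyzer later rewrites the star to the resolved child's output:
   //   case t: ScriptTransformation if containsStar(t.input) => t.copy(input = t.child.output)
   ```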








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551326208



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,84 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, l 
FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM (b, a, CAST(c AS STRING) )
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (1,2,3)
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1,2)
+USING 'cat' AS (a INT,b INT)
+FROM script_trans
+LIMIT 1;
+
+SELECT TRANSFORM (
+b AS d5, a,
+CASE
+  WHEN c > 100 THEN 1
+  WHEN c < 100 THEN 2
+ELSE 3 END)
+USING 'cat' AS (a,b,c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (b, a, c + 1 )

Review comment:
   Done








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551326110



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,84 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, l 
FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM (b, a, CAST(c AS STRING) )
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (1,2,3)
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM(1,2)
+USING 'cat' AS (a INT,b INT)

Review comment:
   Done








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551325918



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,84 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, l 
FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM (b, a, CAST(c AS STRING) )
+USING 'cat' AS (a, b, c)
+FROM script_trans
+WHERE a <= 4;
+
+SELECT TRANSFORM (1,2,3)

Review comment:
   Done








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551325708



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -183,3 +189,84 @@ SELECT a, b, decode(c, 'UTF-8'), d, e, f, g, h, i, j, k, l 
FROM (
 NULL DEFINED AS 'NULL'
   FROM t
 ) tmp;
+
+SELECT TRANSFORM (b, a, CAST(c AS STRING) )

Review comment:
   Done








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551325000



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -582,7 +584,13 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with 
SQLConfHelper with Logg
 val from = OneRowRelation().optional(ctx.fromClause) {
   visitFromClause(ctx.fromClause)
 }
-withTransformQuerySpecification(ctx, ctx.transformClause, ctx.whereClause, 
from)
+withTransformQuerySpecification(
+  ctx,
+  ctx.transformClause,
+  ctx.whereClause,
+  ctx.aggregationClause,
+  ctx.havingClause, from

Review comment:
   Done








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2021-01-04 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r551323515



##
File path: 
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
##
@@ -506,7 +506,9 @@ fromStatementBody
 querySpecification
 : transformClause
   fromClause?
-  whereClause?  
#transformQuerySpecification
+  whereClause?
+  aggregationClause?
+  havingClause?   
#transformQuerySpecification

Review comment:
   Done
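   A quick way to see the effect of the grammar change, as a sketch: it assumes the catalyst parser entry point `CatalystSqlParser`, which is what the catalyst plan-parser tests exercise.
   ```scala
   import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

   // With aggregationClause?/havingClause? added to transformQuerySpecification,
   // this now parses into an unresolved plan instead of failing at GROUP BY.
   val plan = CatalystSqlParser.parsePlan(
     """SELECT TRANSFORM(a, sum(b))
       |  USING 'cat' AS (a, b)
       |FROM testData
       |GROUP BY a
       |HAVING sum(b) > 10""".stripMargin)
   println(plan.treeString)
   ```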








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2020-12-22 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r547288935



##
File path: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
##
@@ -2558,6 +2558,131 @@ abstract class SQLQuerySuiteBase extends QueryTest with 
SQLTestUtils with TestHi
   }
 }
   }
+
+  test("SPARK-28227: test script transform with aggregation") {

Review comment:
   @maropu 
   The UTs have been moved to `transform.sql`.








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2020-07-14 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r454102278



##
File path: 
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
##
@@ -496,7 +496,9 @@ fromStatementBody
 querySpecification
 : transformClause
   fromClause?
-  whereClause?  
#transformQuerySpecification
+  whereClause?
+  aggregationClause?
+  havingClause?   
#transformQuerySpecification

Review comment:
   > Could you update the SQL doc, too?
   
   Can we add this after everything else is done? We would need to add a new page 
like the `WHERE clause` one, plus a lot of cross-references.











[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2020-07-13 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r454045113



##
File path: 
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
##
@@ -496,7 +496,9 @@ fromStatementBody
 querySpecification
 : transformClause
   fromClause?
-  whereClause?  
#transformQuerySpecification
+  whereClause?
+  aggregationClause?
+  havingClause?   
#transformQuerySpecification

Review comment:
   > Could you update the SQL doc, too?
   
   Yea.








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29087: [SPARK-28227][SQL] Support TRANSFORM with aggregation

2020-07-13 Thread GitBox


AngersZh commented on a change in pull request #29087:
URL: https://github.com/apache/spark/pull/29087#discussion_r454101882



##
File path: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
##
@@ -2558,6 +2558,131 @@ abstract class SQLQuerySuiteBase extends QueryTest with 
SQLTestUtils with TestHi
   }
 }
   }
+
+  test("SPARK-28227: test script transform with aggregation") {

Review comment:
   > Could you move the tests into `SQLQueryTestSuite`?
   
   This should wait for https://github.com/apache/spark/pull/29085, since 
currently we can't use script transformation in sql/core.







