[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20133


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159431742
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -1366,6 +1388,15 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
     }
   }
 
+  private def validateRowFormatFileFormat(
+      rowFormatCtx: Seq[RowFormatContext],
+      createFileFormatCtx: Seq[CreateFileFormatContext],
+      parentCtx: ParserRuleContext): Unit = {
+    if (rowFormatCtx.size == 1 && createFileFormatCtx.size == 1) {
+      validateRowFormatFileFormat(rowFormatCtx.head, createFileFormatCtx.head, parentCtx)
--- End diff --

Will do it in a follow-up PR


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159431317
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -1180,7 +1202,7 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
         ctx)
     }
 
-    val hasStorageProperties = (ctx.createFileFormat != null) || (ctx.rowFormat != null)
+    val hasStorageProperties = (ctx.createFileFormat.size != 0) || (ctx.rowFormat.size != 0)
--- End diff --

Sure


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159431206
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -1104,28 +1118,36 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
         "CREATE TEMPORARY TABLE is not supported yet. " +
           "Please use CREATE TEMPORARY VIEW as an alternative.", ctx)
     }
-    if (ctx.skewSpec != null) {
+    if (ctx.skewSpec.size > 0) {
       operationNotAllowed("CREATE TABLE ... SKEWED BY", ctx)
     }
 
+    checkDuplicateClauses(ctx.TBLPROPERTIES, "TBLPROPERTIES", ctx)
+    checkDuplicateClauses(ctx.PARTITIONED, "PARTITIONED BY", ctx)
+    checkDuplicateClauses(ctx.COMMENT, "COMMENT", ctx)
+    checkDuplicateClauses(ctx.bucketSpec(), "CLUSTERED BY", ctx)
+    checkDuplicateClauses(ctx.createFileFormat, "STORED AS/BY", ctx)
+    checkDuplicateClauses(ctx.rowFormat, "ROW FORMAT", ctx)
+    checkDuplicateClauses(ctx.locationSpec, "LOCATION", ctx)
+
     val dataCols = Option(ctx.columns).map(visitColTypeList).getOrElse(Nil)
     val partitionCols = Option(ctx.partitionColumns).map(visitColTypeList).getOrElse(Nil)
-    val properties = Option(ctx.tablePropertyList).map(visitPropertyKeyValues).getOrElse(Map.empty)
+    val properties = Option(ctx.tableProps).map(visitPropertyKeyValues).getOrElse(Map.empty)
--- End diff --

The last one, if we have multiple clauses. However, we block this in the 
checks above.


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159364965
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -1180,7 +1202,7 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
         ctx)
     }
 
-    val hasStorageProperties = (ctx.createFileFormat != null) || (ctx.rowFormat != null)
+    val hasStorageProperties = (ctx.createFileFormat.size != 0) || (ctx.rowFormat.size != 0)
--- End diff --

shall we use `> 0` to be consistent with other places?


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159364941
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -1366,6 +1388,15 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
     }
   }
 
+  private def validateRowFormatFileFormat(
+      rowFormatCtx: Seq[RowFormatContext],
+      createFileFormatCtx: Seq[CreateFileFormatContext],
+      parentCtx: ParserRuleContext): Unit = {
+    if (rowFormatCtx.size == 1 && createFileFormatCtx.size == 1) {
+      validateRowFormatFileFormat(rowFormatCtx.head, createFileFormatCtx.head, parentCtx)
--- End diff --

shall we just combine this method and the old `validateRowFormatFileFormat`?


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159364923
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -1104,28 +1118,36 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
         "CREATE TEMPORARY TABLE is not supported yet. " +
           "Please use CREATE TEMPORARY VIEW as an alternative.", ctx)
     }
-    if (ctx.skewSpec != null) {
+    if (ctx.skewSpec.size > 0) {
       operationNotAllowed("CREATE TABLE ... SKEWED BY", ctx)
     }
 
+    checkDuplicateClauses(ctx.TBLPROPERTIES, "TBLPROPERTIES", ctx)
+    checkDuplicateClauses(ctx.PARTITIONED, "PARTITIONED BY", ctx)
+    checkDuplicateClauses(ctx.COMMENT, "COMMENT", ctx)
+    checkDuplicateClauses(ctx.bucketSpec(), "CLUSTERED BY", ctx)
+    checkDuplicateClauses(ctx.createFileFormat, "STORED AS/BY", ctx)
+    checkDuplicateClauses(ctx.rowFormat, "ROW FORMAT", ctx)
+    checkDuplicateClauses(ctx.locationSpec, "LOCATION", ctx)
+
     val dataCols = Option(ctx.columns).map(visitColTypeList).getOrElse(Nil)
     val partitionCols = Option(ctx.partitionColumns).map(visitColTypeList).getOrElse(Nil)
-    val properties = Option(ctx.tablePropertyList).map(visitPropertyKeyValues).getOrElse(Map.empty)
+    val properties = Option(ctx.tableProps).map(visitPropertyKeyValues).getOrElse(Map.empty)
--- End diff --

What's the meaning of `ctx.tableProps` now? The union of all TABLE PROPERTY 
lists?


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159364557
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -383,23 +383,34 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
    * {{{
    *   CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db_name.]table_name
    *   USING table_provider
-   *   [OPTIONS table_property_list]
-   *   [PARTITIONED BY (col_name, col_name, ...)]
-   *   [CLUSTERED BY (col_name, col_name, ...)
-   *     [SORTED BY (col_name [ASC|DESC], ...)]
-   *     INTO num_buckets BUCKETS
-   *   ]
-   *   [LOCATION path]
-   *   [COMMENT table_comment]
-   *   [TBLPROPERTIES (property_name=property_value, ...)]
+   *   create_table_clauses
    *   [[AS] select_statement];
+   *
+   *   create_table_clauses (order insensitive):
+   *     [OPTIONS table_property_list]
+   *     [PARTITIONED BY (col_name, col_name, ...)]
+   *     [CLUSTERED BY (col_name, col_name, ...)
+   *       [SORTED BY (col_name [ASC|DESC], ...)]
+   *       INTO num_buckets BUCKETS
+   *     ]
+   *     [LOCATION path]
+   *     [COMMENT table_comment]
+   *     [TBLPROPERTIES (property_name=property_value, ...)]
    * }}}
    */
   override def visitCreateTable(ctx: CreateTableContext): LogicalPlan = withOrigin(ctx) {
     val (table, temp, ifNotExists, external) = visitCreateTableHeader(ctx.createTableHeader)
     if (external) {
       operationNotAllowed("CREATE EXTERNAL TABLE ... USING", ctx)
     }
+
+    checkDuplicateClauses(ctx.TBLPROPERTIES, "TBLPROPERTIES", ctx)
+    checkDuplicateClauses(ctx.OPTIONS, "OPTIONS", ctx)
+    checkDuplicateClauses(ctx.PARTITIONED, "PARTITIONED BY", ctx)
+    checkDuplicateClauses(ctx.COMMENT, "COMMENT", ctx)
+    checkDuplicateClauses(ctx.bucketSpec(), "CLUSTERED BY", ctx)
+    checkDuplicateClauses(ctx.locationSpec, "LOCATION", ctx)
--- End diff --

nit `ctx.LOCATION`?


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159269896
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -1971,8 +1971,8 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
       s"""
          |CREATE TABLE t(a int, b int, c int, d int)
          |USING parquet
-         |PARTITIONED BY(a, b)
          |LOCATION "${dir.toURI}"
+         |PARTITIONED BY(a, b)
--- End diff --

Oh, I see. +1.


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159170627
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -875,12 +875,13 @@ class HiveDDLSuite
 
   test("desc table for Hive table - bucketed + sorted table") {
     withTable("tbl") {
-      sql(s"""
-        CREATE TABLE tbl (id int, name string)
-        PARTITIONED BY (ds string)
-        CLUSTERED BY(id)
-        SORTED BY(id, name) INTO 1024 BUCKETS
-        """)
+      sql(
+        s"""
+          |CREATE TABLE tbl (id int, name string)
+          |CLUSTERED BY(id)
+          |SORTED BY(id, name) INTO 1024 BUCKETS
+          |PARTITIONED BY (ds string)
+        """.stripMargin)
--- End diff --

The same here.


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159170624
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -1971,8 +1971,8 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
       s"""
         |CREATE TABLE t(a int, b int, c int, d int)
         |USING parquet
-        |PARTITIONED BY(a, b)
        |LOCATION "${dir.toURI}"
+        |PARTITIONED BY(a, b)
--- End diff --

This is an end-to-end test for `ORDER-INSENSITIVENESS`. I do not want to 
introduce a new one for it.


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159170240
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -384,22 +384,31 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
    *   CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db_name.]table_name
    *   USING table_provider
    *   [OPTIONS table_property_list]
-   *   [PARTITIONED BY (col_name, col_name, ...)]
-   *   [CLUSTERED BY (col_name, col_name, ...)
-   *     [SORTED BY (col_name [ASC|DESC], ...)]
-   *     INTO num_buckets BUCKETS
-   *   ]
-   *   [LOCATION path]
-   *   [COMMENT table_comment]
-   *   [TBLPROPERTIES (property_name=property_value, ...)]
+   *   create_table_clauses
    *   [[AS] select_statement];
+   *
+   *   create_table_clauses (order insensitive):
+   *     [PARTITIONED BY (col_name, col_name, ...)]
--- End diff --

forgot it.


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159170197
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -408,9 +417,17 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
         .map(visitIdentifierList(_).toArray)
         .getOrElse(Array.empty[String])
     val properties = Option(ctx.tableProps).map(visitPropertyKeyValues).getOrElse(Map.empty)
-    val bucketSpec = Option(ctx.bucketSpec()).map(visitBucketSpec)
+    val bucketSpec = if (ctx.bucketSpec().size > 1) {
+      duplicateClausesNotAllowed("CLUSTERED BY", ctx)
--- End diff --

Sure


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159170195
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParserUtils.scala ---
@@ -39,6 +41,17 @@ object ParserUtils {
     throw new ParseException(s"Operation not allowed: $message", ctx)
   }
 
+  def duplicateClausesNotAllowed(message: String, ctx: ParserRuleContext): Nothing = {
+    throw new ParseException(s"Found duplicate clauses: $message", ctx)
--- End diff --

Sounds good to me! 


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159168832
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLParserSuite.scala ---
@@ -1153,65 +1191,165 @@ class DDLParserSuite extends PlanTest with SharedSQLContext {
     }
   }
 
+  test("Test CTAS against data source tables") {
+    val s1 =
+      """
+        |CREATE TABLE IF NOT EXISTS mydb.page_view
+        |USING parquet
+        |COMMENT 'This is the staging page view table'
+        |LOCATION '/user/external/page_view'
+        |TBLPROPERTIES ('p1'='v1', 'p2'='v2')
+        |AS SELECT * FROM src
+      """.stripMargin
+
+    val s2 =
+      """
+        |CREATE TABLE IF NOT EXISTS mydb.page_view
+        |USING parquet
+        |LOCATION '/user/external/page_view'
+        |COMMENT 'This is the staging page view table'
+        |TBLPROPERTIES ('p1'='v1', 'p2'='v2')
+        |AS SELECT * FROM src
+      """.stripMargin
+
+    val s3 =
+      """
+        |CREATE TABLE IF NOT EXISTS mydb.page_view
+        |USING parquet
+        |COMMENT 'This is the staging page view table'
+        |LOCATION '/user/external/page_view'
+        |TBLPROPERTIES ('p1'='v1', 'p2'='v2')
+        |AS SELECT * FROM src
+      """.stripMargin
+
+    checkParsing(s1)
+    checkParsing(s2)
+    checkParsing(s3)
+
+    def checkParsing(sql: String): Unit = {
+      val (desc, exists) = extractTableDesc(sql)
+      assert(exists)
+      assert(desc.identifier.database == Some("mydb"))
+      assert(desc.identifier.table == "page_view")
+      assert(desc.storage.locationUri == Some(new URI("/user/external/page_view")))
+      assert(desc.schema.isEmpty) // will be populated later when the table is actually created
+      assert(desc.comment == Some("This is the staging page view table"))
+      assert(desc.viewText.isEmpty)
+      assert(desc.viewDefaultDatabase.isEmpty)
+      assert(desc.viewQueryColumnNames.isEmpty)
+      assert(desc.partitionColumnNames.isEmpty)
+      assert(desc.provider == Some("parquet"))
+      assert(desc.properties == Map("p1" -> "v1", "p2" -> "v2"))
+    }
+  }
+
   test("Test CTAS #1") {
     val s1 =
-      """CREATE EXTERNAL TABLE IF NOT EXISTS mydb.page_view
+      """
+        |CREATE EXTERNAL TABLE IF NOT EXISTS mydb.page_view
         |COMMENT 'This is the staging page view table'
         |STORED AS RCFILE
         |LOCATION '/user/external/page_view'
         |TBLPROPERTIES ('p1'='v1', 'p2'='v2')
-        |AS SELECT * FROM src""".stripMargin
+        |AS SELECT * FROM src
+       """.stripMargin
--- End diff --

nit. extra space before `"""`.


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159168804
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -875,12 +875,13 @@ class HiveDDLSuite
 
   test("desc table for Hive table - bucketed + sorted table") {
     withTable("tbl") {
-      sql(s"""
-        CREATE TABLE tbl (id int, name string)
-        PARTITIONED BY (ds string)
-        CLUSTERED BY(id)
-        SORTED BY(id, name) INTO 1024 BUCKETS
-        """)
+      sql(
+        s"""
+          |CREATE TABLE tbl (id int, name string)
+          |CLUSTERED BY(id)
+          |SORTED BY(id, name) INTO 1024 BUCKETS
+          |PARTITIONED BY (ds string)
+        """.stripMargin)
--- End diff --

Can we keep the original `HiveDDLSuite.scala` file, too?


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159168759
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -1971,8 +1971,8 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
       s"""
         |CREATE TABLE t(a int, b int, c int, d int)
        |USING parquet
-        |PARTITIONED BY(a, b)
        |LOCATION "${dir.toURI}"
+        |PARTITIONED BY(a, b)
--- End diff --

Is it a relevant change? Since the PR is about ORDER-INSENSITIVENESS, can 
we keep the original code instead of making an irrelevant change like this?


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-01 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159150554
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -384,22 +384,31 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
    *   CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db_name.]table_name
    *   USING table_provider
    *   [OPTIONS table_property_list]
-   *   [PARTITIONED BY (col_name, col_name, ...)]
-   *   [CLUSTERED BY (col_name, col_name, ...)
-   *     [SORTED BY (col_name [ASC|DESC], ...)]
-   *     INTO num_buckets BUCKETS
-   *   ]
-   *   [LOCATION path]
-   *   [COMMENT table_comment]
-   *   [TBLPROPERTIES (property_name=property_value, ...)]
+   *   create_table_clauses
    *   [[AS] select_statement];
+   *
+   *   create_table_clauses (order insensitive):
+   *     [PARTITIONED BY (col_name, col_name, ...)]
--- End diff --

Isn't `[OPTIONS table_property_list]` one of `create_table_clauses`?


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-01 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159164626
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -408,9 +417,17 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
         .map(visitIdentifierList(_).toArray)
         .getOrElse(Array.empty[String])
     val properties = Option(ctx.tableProps).map(visitPropertyKeyValues).getOrElse(Map.empty)
-    val bucketSpec = Option(ctx.bucketSpec()).map(visitBucketSpec)
+    val bucketSpec = if (ctx.bucketSpec().size > 1) {
+      duplicateClausesNotAllowed("CLUSTERED BY", ctx)
--- End diff --

Can you split the validation logic and the extraction logic? In this case 
I'd move the check to line 411 and do the extraction on line 420.
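
A minimal sketch of the "validate first, extract later" shape suggested here. `BucketSpec`, the `Seq`, and `require` are simplified stand-ins for Spark's parser types and `duplicateClausesNotAllowed`, not the real code:

```scala
object ValidateThenExtract {
  // Stand-in for Spark's bucketing spec; illustrative only.
  final case class BucketSpec(numBuckets: Int)

  def parseBucketSpec(specs: Seq[BucketSpec]): Option[BucketSpec] = {
    // Validation up front, next to the other clause checks.
    require(specs.size <= 1, "Found duplicate clauses: CLUSTERED BY")
    // Extraction afterwards, now guaranteed to see at most one element.
    specs.headOption
  }

  def main(args: Array[String]): Unit = {
    println(parseBucketSpec(Seq(BucketSpec(1024)))) // Some(BucketSpec(1024))
    println(parseBucketSpec(Seq.empty))             // None
  }
}
```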


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2018-01-01 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20133#discussion_r159156756
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParserUtils.scala ---
@@ -39,6 +41,17 @@ object ParserUtils {
     throw new ParseException(s"Operation not allowed: $message", ctx)
   }
 
+  def duplicateClausesNotAllowed(message: String, ctx: ParserRuleContext): Nothing = {
+    throw new ParseException(s"Found duplicate clauses: $message", ctx)
--- End diff --

Can't we merge these two functions into one that checks for the duplication?
e.g.,
```
  def checkDuplicateClauses[T](nodes: util.List[T], clauseName: String, ctx: ParserRuleContext): Unit = {
    if (nodes.size() > 1) {
      throw new ParseException(s"Found duplicate clauses: $clauseName", ctx)
    }
  }
```
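
For reference, a self-contained, runnable version of that helper. The `ParserRuleContext` and `ParseException` classes below are stubs standing in for ANTLR's and Spark's real types:

```scala
import java.util

// Stub for org.antlr.v4.runtime.ParserRuleContext.
class ParserRuleContext
// Stub for org.apache.spark.sql.catalyst.parser.ParseException.
class ParseException(message: String, ctx: ParserRuleContext) extends Exception(message)

object CheckDuplicateClauses {
  // The merged helper from the review comment: reject more than one occurrence.
  def checkDuplicateClauses[T](nodes: util.List[T], clauseName: String, ctx: ParserRuleContext): Unit = {
    if (nodes.size() > 1) {
      throw new ParseException(s"Found duplicate clauses: $clauseName", ctx)
    }
  }

  def main(args: Array[String]): Unit = {
    val ctx = new ParserRuleContext
    // Zero or one occurrence passes silently.
    checkDuplicateClauses(util.Arrays.asList("COMMENT 'a'"), "COMMENT", ctx)
    // Two occurrences of the same clause are rejected.
    val caught =
      try { checkDuplicateClauses(util.Arrays.asList("L1", "L2"), "LOCATION", ctx); false }
      catch { case e: ParseException => e.getMessage == "Found duplicate clauses: LOCATION" }
    println(s"duplicate LOCATION rejected: $caught")
  }
}
```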


---




[GitHub] spark pull request #20133: [SPARK-22934] [SQL] Make optional clauses order i...

2017-12-31 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/20133

[SPARK-22934] [SQL] Make optional clauses order insensitive for CREATE 
TABLE SQL statement

## What changes were proposed in this pull request?
Currently, our CREATE TABLE syntax requires the EXACT order of clauses. It 
is pretty hard to remember the exact order. Thus, this PR makes the optional 
clauses order insensitive for the `CREATE TABLE` SQL statement.

```
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db_name.]table_name
[(col_name1 col_type1 [COMMENT col_comment1], ...)]
USING datasource
[OPTIONS (key1=val1, key2=val2, ...)]
[PARTITIONED BY (col_name1, col_name2, ...)]
[CLUSTERED BY (col_name3, col_name4, ...) INTO num_buckets BUCKETS]
[LOCATION path]
[COMMENT table_comment]
[TBLPROPERTIES (key1=val1, key2=val2, ...)]
[AS select_statement]
```

The proposal is to make the following clauses order insensitive. 
```
[OPTIONS (key1=val1, key2=val2, ...)]
[PARTITIONED BY (col_name1, col_name2, ...)]
[CLUSTERED BY (col_name3, col_name4, ...) INTO num_buckets BUCKETS]
[LOCATION path]
[COMMENT table_comment]
[TBLPROPERTIES (key1=val1, key2=val2, ...)]
```
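
For illustration, once these clauses are order insensitive, both of the following statements (same clauses, different order; the table, column, and path names are made up) should be accepted and produce equivalent plans:

```sql
-- Clauses in the documented order.
CREATE TABLE IF NOT EXISTS mydb.page_view (id INT, url STRING)
USING parquet
OPTIONS (compression 'snappy')
PARTITIONED BY (id)
LOCATION '/user/external/page_view'
COMMENT 'page view table'
TBLPROPERTIES ('p1'='v1');

-- Same clauses, reordered: accepted after this change,
-- rejected by the parser before it.
CREATE TABLE IF NOT EXISTS mydb.page_view (id INT, url STRING)
USING parquet
TBLPROPERTIES ('p1'='v1')
COMMENT 'page view table'
LOCATION '/user/external/page_view'
PARTITIONED BY (id)
OPTIONS (compression 'snappy');
```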

The same idea is also applicable to Create Hive Table. 
```
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
[(col_name1[:] col_type1 [COMMENT col_comment1], ...)]
[COMMENT table_comment]
[PARTITIONED BY (col_name2[:] col_type2 [COMMENT col_comment2], ...)]
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION path]
[TBLPROPERTIES (key1=val1, key2=val2, ...)]
[AS select_statement]
```

The proposal is to make the following clauses order insensitive. 
```
[COMMENT table_comment]
[PARTITIONED BY (col_name2[:] col_type2 [COMMENT col_comment2], ...)]
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION path]
[TBLPROPERTIES (key1=val1, key2=val2, ...)]
```

## How was this patch tested?
Added test cases

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark createDataSourceTableDDL

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20133.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20133


commit 8ae8f1832a62caf10a62511f339402c0d94f89ea
Author: gatorsmile 
Date:   2018-01-01T07:03:01Z

fix




---
