cloud-fan commented on a change in pull request #23259: [SPARK-26215][SQL] Define reserved/non-reserved keywords based on the ANSI SQL standard
URL: https://github.com/apache/spark/pull/23259#discussion_r258481481
 
 

 ##########
 File path: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
 ##########
 @@ -744,40 +750,66 @@ number
     | MINUS? BIGDECIMAL_LITERAL       #bigDecimalLiteral
     ;
 
+// A list of reserved keywords in Spark SQL. These keywords are reserved when `spark.sql.parser.ansi.enabled` = true.
+// Currently, we only reserve the ANSI keywords below that almost all the ANSI SQL standards (SQL-92, SQL-99,
+// SQL-2003, SQL-2008, SQL-2011, and SQL-2016) and PostgreSQL reserve.
+//
+// NOTE: The ANTLR tokens in `SqlBase.g4` must exist in either `ansiReserved` or `ansiNonReserved`. Therefore,
+// when one adds a new token in this file, ones must update one of the two rules, too.
+ansiReserved
+    : ALL | AND | ANY | AS | AUTHORIZATION | BOTH | CASE | CAST | CHECK | COLLATE | COLUMN | CONSTRAINT | CREATE
+    | CROSS | CURRENT_DATE | CURRENT_TIME | CURRENT_TIMESTAMP | CURRENT_USER | DISTINCT | ELSE | END | EXCEPT
+    | FALSE | FETCH | FOR | FOREIGN | FROM | FULL | GRANT | GROUP | HAVING | IN | INNER | INTERSECT | INTO | IS
+    | JOIN | LEADING | LEFT | NATURAL | NOT | NULL | ON | ONLY | OR | ORDER | OUTER | OVERLAPS | PRIMARY
+    | REFERENCES | RIGHT | SELECT | SESSION_USER | SOME | TABLE | THEN | TO | TRAILING | UNION | UNIQUE | USER
+    | USING | WHEN | WHERE | WITH
+    ;
+
+// When `spark.sql.parser.ansi.enabled` = true, the `ansiNonReserved` keywords can be used for identifiers.
+// Otherwise (`spark.sql.parser.ansi.enabled` = false), we follow the existing Spark SQL behaviour until v3.0:
+// the `nonReserved` keywords can be used instead.
+ansiNonReserved
+    : ADD | AFTER | ALTER | ANALYZE | ANTI | ARCHIVE | ARRAY | ASC | AT | BETWEEN | BUCKET | BUCKETS | BY | CACHE
+    | CASCADE | CHANGE | CLEAR | CLUSTER | CLUSTERED | CODEGEN | COLLECTION | COLUMNS | COMMENT | COMMIT
+    | COMPACT | COMPACTIONS | COMPUTE | CONCATENATE | COST | CUBE | CURRENT | DATA | DATABASE | DATABASES
+    | DBPROPERTIES | DEFINED | DELETE | DELIMITED | DESC | DESCRIBE | DFS | DIRECTORIES | DIRECTORY | DISTRIBUTE
+    | DIV | DROP | ESCAPED | EXCHANGE | EXISTS | EXPLAIN | EXPORT | EXTENDED | EXTERNAL | EXTRACT | FIELDS
+    | FILEFORMAT | FIRST | FOLLOWING | FORMAT | FORMATTED | FUNCTION | FUNCTIONS | GLOBAL | GROUPING | IF
+    | IGNORE | IMPORT | INDEX | INDEXES | INPATH | INPUTFORMAT | INSERT | INTERVAL | ITEMS | KEYS | LAST
+    | LATERAL | LAZY | LIKE | LIMIT | LINES | LIST | LOAD | LOCAL | LOCATION | LOCK | LOCKS | LOGICAL | MACRO
+    | MAP | MSCK | NO | NULLS | OF | OPTION | OPTIONS | OUT | OUTPUTFORMAT | OVER | OVERWRITE | PARTITION
+    | PARTITIONED | PARTITIONS | PERCENT | PERCENTLIT | PIVOT | PRECEDING | PRINCIPALS | PURGE | RANGE
+    | RECORDREADER | RECORDWRITER | RECOVER | REDUCE | REFRESH | RENAME | REPAIR | REPLACE | RESET | RESTRICT
+    | REVOKE | RLIKE | ROLE | ROLES | ROLLBACK | ROLLUP | ROW | ROWS | SCHEMA | SEMI | SEPARATED | SERDE
+    | SERDEPROPERTIES | SET | SETMINUS | SETS | SHOW | SKEWED | SORT | SORTED | START | STATISTICS | STORED | STRATIFY
+    | STRUCT | TABLES | TABLESAMPLE | TBLPROPERTIES | TEMPORARY | TERMINATED | TOUCH | TRANSACTION | TRANSACTIONS
+    | TRANSFORM | TRUE | TRUNCATE | UNARCHIVE | UNBOUNDED | UNCACHE | UNLOCK | UNSET | USE | VALUES | VIEW | WINDOW
+    ;
+
 nonReserved
-    : SHOW | TABLES | COLUMNS | COLUMN | PARTITIONS | FUNCTIONS | DATABASES
-    | ADD
-    | OVER | PARTITION | RANGE | ROWS | PRECEDING | FOLLOWING | CURRENT | ROW | LAST | FIRST | AFTER
-    | MAP | ARRAY | STRUCT
-    | PIVOT | LATERAL | WINDOW | REDUCE | TRANSFORM | SERDE | SERDEPROPERTIES | RECORDREADER
-    | DELIMITED | FIELDS | TERMINATED | COLLECTION | ITEMS | KEYS | ESCAPED | LINES | SEPARATED
-    | EXTENDED | REFRESH | CLEAR | CACHE | UNCACHE | LAZY | GLOBAL | TEMPORARY | OPTIONS
-    | GROUPING | CUBE | ROLLUP
-    | EXPLAIN | FORMAT | LOGICAL | FORMATTED | CODEGEN | COST
-    | TABLESAMPLE | USE | TO | BUCKET | PERCENTLIT | OUT | OF
-    | SET | RESET
-    | VIEW | REPLACE
-    | IF
-    | POSITION
-    | EXTRACT
-    | NO | DATA
-    | START | TRANSACTION | COMMIT | ROLLBACK | IGNORE
-    | SORT | CLUSTER | DISTRIBUTE | UNSET | TBLPROPERTIES | SKEWED | STORED | DIRECTORIES | LOCATION
-    | EXCHANGE | ARCHIVE | UNARCHIVE | FILEFORMAT | TOUCH | COMPACT | CONCATENATE | CHANGE
-    | CASCADE | RESTRICT | BUCKETS | CLUSTERED | SORTED | PURGE | INPUTFORMAT | OUTPUTFORMAT
-    | DBPROPERTIES | DFS | TRUNCATE | COMPUTE | LIST
-    | STATISTICS | ANALYZE | PARTITIONED | EXTERNAL | DEFINED | RECORDWRITER
-    | REVOKE | GRANT | LOCK | UNLOCK | MSCK | REPAIR | RECOVER | EXPORT | IMPORT | LOAD | VALUES | COMMENT | ROLE
-    | ROLES | COMPACTIONS | PRINCIPALS | TRANSACTIONS | INDEX | INDEXES | LOCKS | OPTION | LOCAL | INPATH
-    | ASC | DESC | LIMIT | RENAME | SETS
-    | AT | NULLS | OVERWRITE | ALL | ANY | ALTER | AS | BETWEEN | BY | CREATE | DELETE
-    | DESCRIBE | DROP | EXISTS | FALSE | FOR | GROUP | IN | INSERT | INTO | IS |LIKE
-    | NULL | ORDER | OUTER | TABLE | TRUE | WITH | RLIKE
-    | AND | CASE | CAST | DISTINCT | DIV | ELSE | END | FUNCTION | INTERVAL | MACRO | OR | STRATIFY | THEN
-    | UNBOUNDED | WHEN
-    | DATABASE | SELECT | FROM | WHERE | HAVING | TO | TABLE | WITH | NOT
-    | DIRECTORY
-    | BOTH | LEADING | TRAILING
+    : ADD | AFTER | ALL | ALTER | ANALYZE | AND | ANY | ARCHIVE | ARRAY | AS | ASC | AT | AUTHORIZATION | BETWEEN
+    | BOTH | BUCKET | BUCKETS | BY | CACHE | CASCADE | CASE | CAST | CHANGE | CHECK | CLEAR | CLUSTER | CLUSTERED
+    | CODEGEN | COLLATE | COLLECTION | COLUMN | COLUMNS | COMMENT | COMMIT | COMPACT | COMPACTIONS | COMPUTE
+    | CONCATENATE | CONSTRAINT | COST | CREATE | CUBE | CURRENT | CURRENT_DATE | CURRENT_TIME | CURRENT_TIMESTAMP
+    | CURRENT_USER | DATA | DATABASE | DATABASES | DBPROPERTIES | DEFINED | DELETE | DELIMITED | DESC | DESCRIBE | DFS
+    | DIRECTORIES | DIRECTORY | DISTINCT | DISTRIBUTE | DIV | DROP | ELSE | END | ESCAPED | EXCHANGE | EXISTS | EXPLAIN
+    | EXPORT | EXTENDED | EXTERNAL | EXTRACT | FALSE | FETCH | FIELDS | FILEFORMAT | FIRST | FOLLOWING | FOR | FOREIGN
+    | FORMAT | FORMATTED | FROM | FUNCTION | FUNCTIONS | GLOBAL | GRANT | GROUP | GROUPING | HAVING | IF | IGNORE
+    | IMPORT | IN | INDEX | INDEXES | INPATH | INPUTFORMAT | INSERT | INTERVAL | INTO | IS | ITEMS | KEYS | LAST
+    | LATERAL | LAZY | LEADING | LIKE | LIMIT | LINES | LIST | LOAD | LOCAL | LOCATION | LOCK | LOCKS | LOGICAL | MACRO
+    | MAP | MSCK | NO | NOT | NULL | NULLS | OF | ONLY | OPTION | OPTIONS | OR | ORDER | OUT | OUTER | OUTPUTFORMAT
+    | OVER | OVERLAPS | OVERWRITE | PARTITION | PARTITIONED | PARTITIONS | PERCENTLIT | PIVOT | POSITION | PRECEDING
+    | PRIMARY | PRINCIPALS | PURGE | RANGE | RECORDREADER | RECORDWRITER | RECOVER | REDUCE | REFERENCES | REFRESH
+    | RENAME | REPAIR | REPLACE | RESET | RESTRICT | REVOKE | RLIKE | ROLE | ROLES | ROLLBACK | ROLLUP | ROW | ROWS
+    | SELECT | SEPARATED | SERDE | SERDEPROPERTIES | SESSION_USER | SET | SETS | SHOW | SKEWED | SOME | SORT | SORTED
+    | START | STATISTICS | STORED | STRATIFY | STRUCT | TABLE | TABLES | TABLESAMPLE | TBLPROPERTIES | TEMPORARY
+    | TERMINATED | THEN | TO | TOUCH | TRAILING | TRANSACTION | TRANSACTIONS | TRANSFORM | TRUE | TRUNCATE | UNARCHIVE
+    | UNBOUNDED | UNCACHE | UNLOCK | UNIQUE | UNSET | USE | USER | VALUES | VIEW | WHEN | WHERE | WINDOW | WITH
+    ;
+
+strictNonReserved
 
 Review comment:
   ```
   | {!ansi}? reserved       #unquotedIdentifier
   | {!ansi}? nonReserved    #unquotedIdentifier
   ```
   We should only keep `| {!ansi}? nonReserved    #unquotedIdentifier`
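
For illustration, here is a minimal Python model of the mode switch described in the grammar comments of this diff: with `spark.sql.parser.ansi.enabled` = true only the `ansiNonReserved` keywords can be used as identifiers, otherwise the `nonReserved` keywords can. The sets below are a small subset copied from the diff, and the function name `usable_as_unquoted_identifier` is hypothetical. It is only a sketch of the intended behaviour, not the parser's implementation.

```python
# Abbreviated keyword sets (assumption: a tiny sample of the rules in the diff).
ANSI_NON_RESERVED = {"ADD", "CACHE", "VIEW", "WINDOW"}
# In the diff, `nonReserved` also lists keywords that `ansiReserved` reserves.
NON_RESERVED = ANSI_NON_RESERVED | {"SELECT", "TABLE", "FROM"}


def usable_as_unquoted_identifier(keyword: str, ansi: bool) -> bool:
    """Model the predicate-gated choice: one keyword set applies per mode."""
    allowed = ANSI_NON_RESERVED if ansi else NON_RESERVED
    return keyword.upper() in allowed


if __name__ == "__main__":
    for kw in ("view", "select"):
        for ansi in (True, False):
            print(kw, "ansi =", ansi, "->", usable_as_unquoted_identifier(kw, ansi))
```

In this model a single lookup per mode is enough, which mirrors the idea of keeping only one `{!ansi}?` alternative.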
   
   Also, `reserved` + `nonReserved` together make up all the keywords. We should mention that as well.
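
The requirement that a pair of rules covers every keyword (as the NOTE in the diff states for `ansiReserved` and `ansiNonReserved`) can be checked mechanically. Below is a minimal Python sketch, assuming the token names have already been extracted from the grammar into plain collections. The helper name `check_keyword_coverage` and the sample data are hypothetical.

```python
def check_keyword_coverage(all_keywords, reserved, non_reserved):
    """Return (missing, overlapping) keywords for a reserved/non-reserved split.

    missing: keywords that appear in neither rule.
    overlapping: keywords that appear in both rules.
    """
    all_keywords = set(all_keywords)
    reserved = set(reserved)
    non_reserved = set(non_reserved)
    missing = all_keywords - (reserved | non_reserved)
    overlapping = reserved & non_reserved
    return sorted(missing), sorted(overlapping)


if __name__ == "__main__":
    # Sample data only; FETCH is deliberately left out of both rules.
    missing, overlapping = check_keyword_coverage(
        all_keywords=["SELECT", "VIEW", "FETCH"],
        reserved=["SELECT"],
        non_reserved=["VIEW"],
    )
    print("missing:", missing)
    print("overlapping:", overlapping)
```

A check along these lines could run as a unit test so that a newly added token that is missing from both rules is reported early.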
