Reynold Xin created SPARK-12362:
-----------------------------------
Summary: Inline a full-fledged SQL parser
Key: SPARK-12362
URL: https://issues.apache.org/jira/browse/SPARK-12362
Project: Spark
Issue Type: Improvement
Components: SQL
Reporter: Reynold Xin
Spark currently uses two SQL parsers: a simple one based on Scala parser
combinators, and another based on Hive's parser.
Neither is a good long-term solution. The parser combinator one gives users poor
error messages and does not warn about conflicts in the defined grammar. The
Hive one depends directly on Hive itself, which makes it very difficult to
introduce new grammar rules.
The goal of the ticket is to create a single SQL query parser that is powerful
enough to replace the existing ones. The requirements for the new parser are:
1. Can support almost all of HiveQL
2. Can support everything the existing SQL parser built on Scala parser combinators supports
3. Can be used for expression parsing in addition to SQL query parsing
4. Can provide good error messages for incorrect syntax
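To make requirements 3 and 4 concrete, here is a minimal sketch of one parser that exposes both an expression entry point and a query entry point, and that reports what it expected and where. It is a hand-written recursive descent parser in Python over a made-up toy grammar. It is not Spark's, Hive's, or Calcite's grammar, and every name in it (`Parser`, `parse_expression`, `parse_query`, `ParseError`) is hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical toy grammar for illustration only (not Spark's or Hive's):
#   query      := SELECT expression FROM identifier [WHERE expression]
#   expression := term (('+' | '-') term)*
#   term       := NUMBER | IDENT | '(' expression ')'
TOKEN_RE = re.compile(r"(?P<number>\d+)|(?P<word>[A-Za-z_]\w*)|(?P<symbol>.)")
KEYWORDS = {"SELECT", "FROM", "WHERE"}


class ParseError(Exception):
    pass


@dataclass
class Token:
    kind: str  # "NUMBER", "IDENT", "KEYWORD", "SYMBOL", or "EOF"
    text: str
    pos: int   # character offset in the input, used in error messages


def tokenize(sql):
    tokens, pos = [], 0
    while pos < len(sql):
        if sql[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(sql, pos)
        text = m.group(0)
        if m.group("number"):
            kind = "NUMBER"
        elif m.group("word"):
            kind = "KEYWORD" if text.upper() in KEYWORDS else "IDENT"
        else:
            kind = "SYMBOL"
        tokens.append(Token(kind, text, pos))
        pos = m.end()
    tokens.append(Token("EOF", "", len(sql)))
    return tokens


class Parser:
    def __init__(self, sql):
        self.tokens = tokenize(sql)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def error(self, expected):
        tok = self.peek()
        found = "end of input" if tok.kind == "EOF" else repr(tok.text)
        raise ParseError(
            f"expected {expected} but found {found} at position {tok.pos}")

    def accept(self, kind, text=None):
        # Consume and return the next token if it matches, else return None.
        tok = self.peek()
        if tok.kind == kind and (text is None or tok.text.upper() == text):
            self.i += 1
            return tok
        return None

    def expect(self, kind, text=None):
        tok = self.accept(kind, text)
        if tok is None:
            self.error(text or kind)
        return tok

    def expression(self):
        node = self.term()
        while True:
            op = self.accept("SYMBOL", "+") or self.accept("SYMBOL", "-")
            if op is None:
                return node
            node = (op.text, node, self.term())

    def term(self):
        tok = self.accept("NUMBER")
        if tok:
            return ("num", int(tok.text))
        tok = self.accept("IDENT")
        if tok:
            return ("col", tok.text)
        if self.accept("SYMBOL", "("):
            node = self.expression()
            self.expect("SYMBOL", ")")
            return node
        self.error("a number, identifier, or '('")

    def query(self):
        self.expect("KEYWORD", "SELECT")
        projection = self.expression()
        self.expect("KEYWORD", "FROM")
        table = self.expect("IDENT").text
        where = self.expression() if self.accept("KEYWORD", "WHERE") else None
        return ("select", projection, table, where)


def parse_expression(text):
    # Entry point for standalone expressions (requirement 3).
    parser = Parser(text)
    node = parser.expression()
    parser.expect("EOF")
    return node


def parse_query(text):
    # Entry point for full queries; shares the same expression rules.
    parser = Parser(text)
    node = parser.query()
    parser.expect("EOF")
    return node
```

In this sketch, `parse_query("SELECT a FORM t")` raises a `ParseError` that names the expected keyword and the offset of the offending token, while `parse_expression("a + 1")` reuses the same expression rules without going through the query rule. A parser generated from a grammar file would be structured differently. The point here is only the shared entry points and the positioned error messages.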
Rather than building one from scratch, we should investigate whether we can
leverage existing open source projects such as Hive (by inlining the parser
part) or Calcite.