[
https://issues.apache.org/jira/browse/HIVE-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084596#comment-17084596
]
Krisztian Kasa commented on HIVE-19064:
---------------------------------------
The non-standard functionality allows string literals enclosed in double
quotes, for example:
{code:java}
SELECT "This is a string" FROM t;
{code}
This functionality is enabled by the setting
*hive.support.quoted.identifiers=column*.
The following query is a simplified version of the query in test *quote2.q*:
{code:java}
set hive.support.quoted.identifiers=column;
CREATE TABLE t (c1 int);
SELECT "a\"" FROM t;
{code}
With this patch, which enables the SQL standard functionality, this query fails:
{code:java}
org.apache.hadoop.hive.ql.parse.ParseException: line 3:10 cannot recognize
input near 'a' '""' 'FROM' in selection target
{code}
Cause: to enable double-quoted identifiers, the *QuotedIdentifier* rule in
*HiveLexer.g* was extended:
{code:java}
fragment
QuotedIdentifier
:
{allowQuotedId() == Quotation.BACKTICKS}? ('`' ( '``' | ~('`') )* '`')
    { setText(StringUtils.replace(getText().substring(1, getText().length() - 1), "``", "`")); }
| {allowQuotedId() == Quotation.STANDARD}? ('\"' ( '\"\"' | ~('\"') )* '\"')
    { setText(StringUtils.replace(getText().substring(1, getText().length() - 1), "\"\"", "\"")); }
;
{code}
Following the SQL standard, the new rule escapes a double quote inside an
identifier by doubling the double quote character, for example:
{code:java}
set hive.support.quoted.identifiers=standard;
CREATE TABLE t ("col0 ""Zero"" " int);
SELECT "col0 ""Zero"" " FROM t;
{code}
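The action attached to each alternative strips the enclosing delimiters and
collapses every doubled delimiter into a single one. A minimal Java sketch of
that transformation follows. It is not the Hive code: the class name
*UnquoteSketch* and the helper *unquote* are hypothetical, and the JDK's
String.replace stands in for StringUtils.replace.

```java
public class UnquoteSketch {
    // Hypothetical helper mirroring the lexer action: drop the first and last
    // character of the token, then collapse each doubled delimiter into one.
    static String unquote(String token, char delimiter) {
        String d = String.valueOf(delimiter);
        String inner = token.substring(1, token.length() - 1);
        return inner.replace(d + d, d);
    }

    public static void main(String[] args) {
        // SQL standard style: "col0 ""Zero"" " becomes col0 "Zero" (with a trailing space)
        System.out.println("[" + unquote("\"col0 \"\"Zero\"\" \"", '"') + "]");
        // Backtick style: `a``b` becomes a`b
        System.out.println("[" + unquote("`a``b`", '`') + "]");
    }
}
```

The brackets in the printed output are only there to make the trailing space of
the column name visible.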
When the parser reads the first double quote character while parsing the query
*SELECT "a\"" FROM t;* in *column* mode, it first tries to apply the
*QuotedIdentifier* rule. The semantic predicate
{code}
{allowQuotedId() == Quotation.STANDARD}?
{code}
in the rule prevents applying the subrule
{code}
('\"' ( '\"\"' | ~('\"') )* '\"')
{code}
which makes sense since we want to treat the text
{code}
"a\""
{code}
as a String literal.
However, the parser doesn't rewind the input stream to the *"* character; it
reads the next one, which is 'a'. It does not fall back to the StringLiteral
rule, which would accept the text.
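To illustrate why the text only fits the string literal reading, here is a
small hand-written Java scanner that applies the two escaping conventions to
the same input. It is a sketch, not the ANTLR-generated lexer and not a
reproduction of the failure above: the class *QuoteScanSketch* and both methods
are hypothetical, and the backslash convention is my assumption about how the
string literal is meant to be read.

```java
public class QuoteScanSketch {
    // SQL standard convention: "" inside the token is an escaped quote.
    // Returns the index just past the closing quote, or -1 if unterminated.
    static int scanDoubled(String s, int start) {
        int i = start + 1; // skip the opening quote
        while (i < s.length()) {
            if (s.charAt(i) == '"') {
                if (i + 1 < s.length() && s.charAt(i + 1) == '"') {
                    i += 2; // doubled quote: stay inside the token
                    continue;
                }
                return i + 1; // single quote: end of token
            }
            i++;
        }
        return -1;
    }

    // Backslash convention (assumed for string literals): \x escapes x.
    // Returns the index just past the closing quote, or -1 if unterminated.
    static int scanBackslash(String s, int start) {
        int i = start + 1; // skip the opening quote
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                i += 2; // skip the backslash and the escaped character
                continue;
            }
            if (c == '"') {
                return i + 1;
            }
            i++;
        }
        return -1;
    }

    public static void main(String[] args) {
        String query = "SELECT \"a\\\"\" FROM t;"; // SELECT "a\"" FROM t;
        int open = query.indexOf('"');
        System.out.println("backslash convention ends at: " + scanBackslash(query, open));
        System.out.println("doubled convention ends at:   " + scanDoubled(query, open));
    }
}
```

Under the backslash convention the token ends right after *"a\""*, while under
the doubling convention the trailing *""* is read as an escaped quote and the
token never terminates.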
> Add mode to support delimited identifiers enclosed within double quotation
> --------------------------------------------------------------------------
>
> Key: HIVE-19064
> URL: https://issues.apache.org/jira/browse/HIVE-19064
> Project: Hive
> Issue Type: Improvement
> Components: Parser, SQL
> Affects Versions: 3.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Krisztian Kasa
> Priority: Major
> Attachments: HIVE-19064.01.patch, HIVE-19064.02.patch,
> HIVE-19064.03.patch, HIVE-19064.4.patch
>
>
> As per SQL standard. Hive currently uses `` (backticks). Default will
> continue being backticks, but we will support identifiers within double
> quotation via configuration parameter.
> This issue will also extend support for arbitrary char sequences, e.g.,
> containing {{~ ! @ # $ % ^ & * () , < >}}, in database and table names.
> Currently, special characters are only supported for column names.