[jira] [Comment Edited] (CALCITE-528) Creating output row type of a Join does not obey case-sensitivity flags

Jinfeng Ni (JIRA) Wed, 17 Dec 2014 12:24:45 -0800

    [ 
https://issues.apache.org/jira/browse/CALCITE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250467#comment-14250467
 ]


Jinfeng Ni edited comment on CALCITE-528 at 12/17/14 8:23 PM:
--------------------------------------------------------------

Hi Julian,

It seems to me that the Calcite code base does not hit this case-sensitivity issue 
simply because the Calcite connection config uses LEX.ORACLE by default:

{code}
CalciteConnectionProperty.java 

 /** Lexical policy. */
  LEX("lex", Type.ENUM, Lex.ORACLE, false),

Lex.java

 /** Lexical policy similar to Oracle. The case of identifiers enclosed in
   * double-quotes is preserved; unquoted identifiers are converted to
   * upper-case; after which, identifiers are matched case-sensitively. */
  ORACLE(Quoting.DOUBLE_QUOTE, Casing.TO_UPPER, Casing.UNCHANGED, true),

  /** Lexical policy similar to MySQL. (To be precise: MySQL on Windows;
   * MySQL on Linux uses case-sensitive matching, like the Linux file system.)
   * The case of identifiers is preserved whether or not they quoted;
   * after which, identifiers are matched case-insensitively.
   * Back-ticks allow identifiers to contain non-alphanumeric characters. */
  MYSQL(Quoting.BACK_TICK, Casing.UNCHANGED, Casing.UNCHANGED, false),

{code}

In SqlToRelConverter, caseSensitive is hard-coded to "true". This works for 
LEX.ORACLE, since all unquoted identifiers have already been converted to 
upper case. However, it will not work for LEX.MYSQL or any other LEX that uses 
"UNCHANGED" casing. 

{code}
SqlToRelConverter.java:3323

      final boolean caseSensitive = true; // name already fully-qualified
      e = rexBuilder.makeFieldAccess(e, name, caseSensitive);
{code}

Therefore, I think we need to pass the SqlParser's LEX case-sensitivity setting 
to the validator, and probably to SqlToRelConverter as well. That way, it will 
work for LEX.ORACLE as well as for other LEX configurations.
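
To illustrate the idea, here is a minimal, self-contained sketch in plain Java. It is not Calcite code: LexSketch, FieldLookupSketch and findField are hypothetical names, and the field list is made up. It only shows how a lookup behaves when the flag comes from the lexical policy instead of being hard-coded to "true".

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a lexical policy; this is NOT Calcite's Lex enum. */
enum LexSketch {
  ORACLE_LIKE(true),   // identifiers are matched case-sensitively
  MYSQL_LIKE(false);   // identifiers are matched case-insensitively

  final boolean caseSensitive;

  LexSketch(boolean caseSensitive) {
    this.caseSensitive = caseSensitive;
  }
}

final class FieldLookupSketch {
  private FieldLookupSketch() {}

  /** Returns the index of the first field matching {@code name}, or -1 if none matches. */
  static int findField(List<String> fieldNames, String name, boolean caseSensitive) {
    for (int i = 0; i < fieldNames.size(); i++) {
      String candidate = fieldNames.get(i);
      boolean match = caseSensitive
          ? candidate.equals(name)
          : candidate.equalsIgnoreCase(name);
      if (match) {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // Made-up field names, for illustration only.
    List<String> fields = Arrays.asList("empid", "deptno", "name");

    // Hard-coded "true": the lookup misses when the parser left the case unchanged.
    System.out.println(findField(fields, "EmpId", true)); // prints -1

    // Flag taken from the (hypothetical) lexical policy instead.
    System.out.println(
        findField(fields, "EmpId", LexSketch.MYSQL_LIKE.caseSensitive)); // prints 0
  }
}
```

With an Oracle-like policy the flag stays "true", so existing behavior would be unchanged; only policies that match case-insensitively would see a difference.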

Also, I see that almost all the queries in JdbcTest.java use quoted column 
names.  If I change a column name to unquoted, Calcite complains with a "Column 
'EMPID' not found in any table" error for the following simple query: 

{code}
select empid from "hr"."emps"
{code}

That probably happens because CatalogReader uses case-sensitive=true, while the 
unquoted identifier has been converted to upper case (EMPID). 

Any suggestions? Thanks!
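
The suspected mechanism can be reproduced in isolation with a small sketch. This is plain Java under stated assumptions, not the actual CatalogReader logic: UnquotedIdentifierSketch, normalize and columnExists are hypothetical names, and the column list is assumed to hold lower-case names such as "empid".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class UnquotedIdentifierSketch {
  private UnquotedIdentifierSketch() {}

  /**
   * Mimics an Oracle-like policy: unquoted identifiers are upper-cased,
   * quoted identifiers keep their case.
   */
  static String normalize(String identifier, boolean quoted) {
    return quoted ? identifier : identifier.toUpperCase(Locale.ROOT);
  }

  /** Case-sensitive lookup of the normalized identifier (like caseSensitive=true). */
  static boolean columnExists(List<String> columns, String identifier, boolean quoted) {
    return columns.contains(normalize(identifier, quoted));
  }

  public static void main(String[] args) {
    // Assumed lower-case column names, for illustration only.
    List<String> empsColumns = Arrays.asList("empid", "deptno", "name");

    // select empid from ...   -> looks for "EMPID", which is not in the list
    System.out.println(columnExists(empsColumns, "empid", false)); // prints false

    // select "empid" from ... -> looks for "empid", which is found
    System.out.println(columnExists(empsColumns, "empid", true)); // prints true
  }
}
```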



> Creating output row type of a Join does not obey case-sensitivity flags
> -----------------------------------------------------------------------
>
>                 Key: CALCITE-528
>                 URL: https://issues.apache.org/jira/browse/CALCITE-528
>             Project: Calcite
>          Issue Type: Bug
>    Affects Versions: 0.9.1-incubating
>            Reporter: Aman Sinha
>            Assignee: Julian Hyde
>
> In JoinRelBase.createJoinType(), which creates the row type of the output row, a 
> HashSet of String is used to keep track of unique field names. The field 
> names 'column1' and 'Column1' will both be stored. This creates a problem 
> for systems that treat identifiers as case-insensitive (such as 
> Drill) and rely on a Project below a Join to create unique names if the 
> join columns have the same name (regardless of case). 
> Ideally, this comparison should be done based on the criteria 
> specified in the Lex settings when instantiating the 
> SqlParser.ParserConfigImpl. So, if the parser was created with MYSQL Lex 
> settings (see Lex.java), those settings should be obeyed by 
> JoinRelBase.createJoinType(). 
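
As a rough illustration of the behavior described in the issue, the sketch below compares a HashSet with a case-insensitive TreeSet when tracking unique field names. It is plain Java, not the JoinRelBase implementation: UniqueFieldNamesSketch and uniquify are hypothetical names, and the "_" + counter renaming scheme is an assumption made only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class UniqueFieldNamesSketch {
  private UniqueFieldNamesSketch() {}

  /**
   * Returns the names with duplicates renamed. When caseSensitive is false,
   * names differing only by case count as duplicates.
   */
  static List<String> uniquify(List<String> names, boolean caseSensitive) {
    Set<String> seen = caseSensitive
        ? new HashSet<String>()
        : new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
    List<String> result = new ArrayList<>();
    for (String name : names) {
      String candidate = name;
      int suffix = 0;
      // Set.add returns false if an equal element is already present.
      while (!seen.add(candidate)) {
        candidate = name + "_" + suffix++;
      }
      result.add(candidate);
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> joinFields = Arrays.asList("column1", "Column1");

    // HashSet: both names are kept, which is the behavior the issue describes.
    System.out.println(uniquify(joinFields, true));  // prints [column1, Column1]

    // Case-insensitive set: the second name is treated as a duplicate and renamed.
    System.out.println(uniquify(joinFields, false)); // prints [column1, Column1_0]
  }
}
```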



