[ 
https://issues.apache.org/jira/browse/CALCITE-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795601#comment-16795601
 ] 

Danny Chan commented on CALCITE-2928:
-------------------------------------

[~julianhyde] I found these docs: 

1. MYSQL   [1] 
2. ORACLE [2] [3]

Oracle is case-sensitive for database internal, but for unquoted identifier 
Oracle will change them to uppercase silently, this is also what Calcite's 
default behavior.

So for users, identifiers like <userName> and <USERNAME> are the same, which is 
case-insensitive.

Now the default behavior for Calcite is in-consistent for ListOperatorTable, 
the unquoted identifier  is parsed to all uppercase but ListOperatorTable look 
up operators case-sensitively. So if a use register a function with lowerCase 
name, it will not work anymore.

[1] https://dev.mysql.com/doc/refman/8.0/en/identifier-case-sensitivity.html
[2] https://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements008.htm
[3] 
https://stackoverflow.com/questions/7425153/reason-why-oracle-is-case-sensitive

> Make UDF lookup default to case insensitive
> -------------------------------------------
>
>                 Key: CALCITE-2928
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2928
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Danny Chan
>            Assignee: Danny Chan
>            Priority: Major
>             Fix For: 1.19.0
>
>
> Now for Calcite, we make default parser config unquotedCasing to 
> Lex.ORACLE.unquotedCasing(to uppercase)[1], and caseSensitive to 
> Lex.ORACLE.caseSensitive(case sensitive true).
> So if we have a UDAF named my_func and query with sql like:
> {code:java}
> select f0, my_func(f1) from table1 group by f0;
> {code}
> We would got a unparsed sql:
> {code:java}
> SELECT F0, MY_FUNC(F1) FROM TABLE1 GROUP BY F0;
> {code}
> For CalciteCatalogReader we hard code the function look up to case sensitive 
> true[2],
> For ListSqlOperatorTable we make the operator name lookup case sensitive 
> true[3].
> For ReflectiveSqlOperatorTable, we make built-in operators 
> case-insensitively[4].
> For most of the cases, we use ListSqlOperatorTable to register our UDFs[5] 
> chained with SqlStdOperatorTable(which composite a ChainedSqlOperatorTable), 
> which finally passed to CalciteCatalogReader for validation.
> So there are some questions i have:
> 1. Why we make built-in operators look up case-insensitively while 
> ListSqlOperatorTable(for UDFs) case-sensitively, with default unquotedCasing 
> of TO_UPPERCASE.
> 2. What is the usage of CalciteCatalogReader#lookupOperatorOverloads i only 
> saw it used in a unit test LookupOperatorOverloadsTest.
> It seems that make UDF look up case-sensitively does not make any sense, 
> users will never distinguish their function with just word cases. And i 
> checked also MYSQL, ORACLE, POSTGRES, their UDFs are all registered 
> case-insensitively.
> [1] 
> https://github.com/apache/calcite/blob/ffca956be03a99cd11e440d652b09674aaa727e6/core/src/main/java/org/apache/calcite/sql/parser/SqlParser.java#L231
> [2] 
> https://github.com/apache/calcite/blob/ffca956be03a99cd11e440d652b09674aaa727e6/core/src/main/java/org/apache/calcite/prepare/CalciteCatalogReader.java#L166
> [3] 
> https://github.com/apache/calcite/blob/ffca956be03a99cd11e440d652b09674aaa727e6/core/src/main/java/org/apache/calcite/sql/util/ListSqlOperatorTable.java#L63
> [4] 
> https://github.com/apache/calcite/blob/ffca956be03a99cd11e440d652b09674aaa727e6/core/src/main/java/org/apache/calcite/sql/util/ReflectiveSqlOperatorTable.java#L103
> [5] 
> https://github.com/apache/calcite/blob/ffca956be03a99cd11e440d652b09674aaa727e6/core/src/test/java/org/apache/calcite/test/MockSqlOperatorTable.java#L46



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to