GitHub user yhuai opened a pull request:

    https://github.com/apache/spark/pull/1317

    [SPARK-2339][SQL] SQL parser in sql-core is case sensitive, but a table 
alias is converted to lower case when we create Subquery

    Reported by 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-Join-throws-exception-td8599.html
    When a table retrieved from the catalog has an alias, we temporarily insert 
a Subquery for it. The table alias is then converted to lower case regardless 
of whether the parser is case sensitive.
    To see the issue ...
    ```
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.createSchemaRDD
    
    case class Person(name: String, age: Int)
    
    val people = sc.textFile("examples/src/main/resources/people.txt")
      .map(_.split(","))
      .map(p => Person(p(0), p(1).trim.toInt))
    people.registerAsTable("people")
    
    sqlContext.sql("select PEOPLE.name from people PEOPLE")
    ```
    The plan is ...
    ```
    == Query Plan ==
    Project ['PEOPLE.name]
     ExistingRdd [name#0,age#1], MapPartitionsRDD[4] at mapPartitions at basicOperators.scala:176
    ```
    Note that `PEOPLE.name` is not resolved.
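    
    The mismatch can be illustrated outside Spark with a minimal sketch. The 
names below (`AliasMismatch`, `Subquery`, `resolve`) are hypothetical stand-ins 
for illustration only and are not Spark's actual classes or resolution logic. 
The sketch assumes a case-sensitive comparison between the qualifier in the 
query and the alias stored on the plan node.
    ```scala
    // Hypothetical stand-ins for illustration; not Spark's actual classes.
    object AliasMismatch {
      // A relation wrapped under an alias, exposing its column names.
      case class Subquery(alias: String, columns: Seq[String])

      // Case-sensitive resolution of a qualified reference such as "PEOPLE.name".
      def resolve(ref: String, plan: Subquery): Option[String] = {
        val Array(qualifier, column) = ref.split("\\.", 2)
        if (qualifier == plan.alias) plan.columns.find(_ == column) else None
      }

      def main(args: Array[String]): Unit = {
        val columns = Seq("name", "age")
        // Alias forced to lower case, as described above.
        val lowered = Subquery("PEOPLE".toLowerCase, columns)
        // Alias kept exactly as written in the query.
        val preserved = Subquery("PEOPLE", columns)

        println(resolve("PEOPLE.name", lowered))   // None
        println(resolve("PEOPLE.name", preserved)) // Some(name)
      }
    }
    ```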
    
    This PR introduces three changes.
    1.  If a table has an alias, the catalog will not lowercase the alias. If a 
lowercase alias is needed, the analyzer will do the work.
    2.  A catalog has a new val caseSensitive that indicates if this catalog is 
case sensitive or not. For example, a SimpleCatalog is case sensitive, but 
    3.  Corresponding unit tests.
    With this PR, case sensitivity of database names and table names is handled 
by the catalog. Case sensitivity of other identifiers is handled by the 
analyzer.
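    
    The division of responsibility described above can be sketched roughly as 
follows. This is a simplified illustration, not the code in this PR: only the 
`caseSensitive` val comes from the description, while `SketchCatalog`, 
`registerTable`, `lookupRelation`, and `normalizeIdentifier` are hypothetical 
names with assumed signatures.
    ```scala
    import scala.collection.mutable

    // Hypothetical sketch; names and signatures are assumptions, not Spark's API.
    object CaseSensitivitySketch {
      // The catalog owns case handling for table names only.
      class SketchCatalog(val caseSensitive: Boolean) {
        private val tables = mutable.Map.empty[String, Seq[String]]

        private def key(name: String): String =
          if (caseSensitive) name else name.toLowerCase

        def registerTable(name: String, columns: Seq[String]): Unit = {
          tables.update(key(name), columns)
        }

        // The alias is returned exactly as given; the catalog never lowercases it.
        def lookupRelation(name: String, alias: Option[String]): Option[(String, Seq[String])] =
          tables.get(key(name)).map(cols => (alias.getOrElse(name), cols))
      }

      // The analyzer side decides whether other identifiers (such as aliases)
      // are lowercased.
      def normalizeIdentifier(id: String, caseSensitive: Boolean): String =
        if (caseSensitive) id else id.toLowerCase

      def main(args: Array[String]): Unit = {
        val catalog = new SketchCatalog(caseSensitive = true)
        catalog.registerTable("people", Seq("name", "age"))

        println(catalog.lookupRelation("people", Some("PEOPLE")).map(_._1)) // Some(PEOPLE)
        println(catalog.lookupRelation("People", None))                     // None
        println(normalizeIdentifier("PEOPLE", caseSensitive = false))       // people
      }
    }
    ```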
    
    JIRA: https://issues.apache.org/jira/browse/SPARK-2339

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yhuai/spark SPARK-2339

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1317.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1317
    
----
commit 12d8006f738e299a08621c382bef4a0a23a72b6f
Author: Yin Huai <[email protected]>
Date:   2014-07-07T16:55:59Z

    Handling case sensitivity correctly.
    This patch introduces three changes.
    1. If a table has an alias, the catalog will not lowercase the alias. If a 
lowercase alias is needed, the analyzer will do the work.
    2. A catalog has a new val caseSensitive that indicates if this catalog is 
case sensitive or not. For example, a SimpleCatalog is case sensitive, but
    3. Corresponding unit tests.
    With this patch, case sensitivity of database names and table names is 
handled by the catalog. Case sensitivity of other identifiers is handled by the 
analyzer.

----


