[jira] [Comment Edited] (PHOENIX-6668) Spark3 connector cannot distinguish column name cases

Attila Zsolt Piros (Jira) Mon, 19 Sep 2022 01:37:06 -0700


    [ https://issues.apache.org/jira/browse/PHOENIX-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606489#comment-17606489 ]


Attila Zsolt Piros edited comment on PHOENIX-6668 at 9/19/22 8:36 AM:
----------------------------------------------------------------------

It might be interesting how I came to this.
I just loaded the table and printed its schema:
{noformat}
    val dfR = spark.sqlContext.read
      .format("phoenix")
      .options(Map("table" -> "TABLE2", PhoenixDataSource.ZOOKEEPER_URL -> quorumAddress))
      .load()

    dfR.printSchema()
{noformat}

The printed schema was:

{noformat}
root
 |-- ID: long (nullable = true)
 |-- TABLE1_ID: long (nullable = true)
 |-- t2col1: string (nullable = true)
{noformat}


Then I realised it was complaining about all the columns (when the old 
schema was used):
{noformat}
Cannot find data for output column 'ID'
Cannot find data for output column 'TABLE1_ID'
Cannot find data for output column 't2col1'
{noformat}
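
One way to rule out a plain naming mismatch is to compare the column names of the schema you pass in against the names the connector reports. Below is a minimal sketch of that check in plain Scala, with no Spark dependency. The `ColumnCaseCheck` object and its `compare` helper are hypothetical names introduced here for illustration; they are not part of Spark or the Phoenix connector. In a real session the second argument would presumably come from `dfR.schema.fieldNames`.

```scala
// Hypothetical helper for illustration only; not part of Spark or Phoenix.
object ColumnCaseCheck {

  /** Result of comparing requested column names with the loaded ones.
    *
    * @param exact    requested names found with identical case
    * @param caseOnly (requested, loaded) pairs that differ only in case
    * @param missing  requested names with no match even when case is ignored
    */
  final case class Report(
      exact: Seq[String],
      caseOnly: Seq[(String, String)],
      missing: Seq[String])

  def compare(requested: Seq[String], loaded: Seq[String]): Report = {
    val (exact, rest) = requested.partition(loaded.contains)
    // Pair each remaining name with a loaded name that matches ignoring case.
    val caseOnly = rest.flatMap { r =>
      loaded.find(_.equalsIgnoreCase(r)).map(l => (r, l)).toList
    }
    val missing = rest.filterNot(r => loaded.exists(_.equalsIgnoreCase(r)))
    Report(exact, caseOnly, missing)
  }

  def main(args: Array[String]): Unit = {
    // Names as printed by dfR.printSchema() above.
    val loaded = Seq("ID", "TABLE1_ID", "t2col1")

    // Names taken from the error messages above: all match exactly,
    // so the failure is not a simple typo in the schema.
    println(compare(Seq("ID", "TABLE1_ID", "t2col1"), loaded))

    // Assumed example: a hand-written schema that upper-cases every name.
    println(compare(Seq("ID", "TABLE1_ID", "T2COL1"), loaded))
  }
}
```

With the names from the error messages, every column lands in `exact`, which is consistent with the observation that the complaint covers all columns and not only the lowercase one.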



was (Author: attilapiros):
It might be interesting how I came to this.
I just loaded the table and printed its schema:
{noformat}
    val dfR = spark.sqlContext.read
      .format("phoenix")
      .options(Map("table" -> "TABLE2", PhoenixDataSource.ZOOKEEPER_URL -> quorumAddress))
      .load()

    dfR.printSchema()
{noformat}

Then I realised it was complaining about all the columns (when the old 
schema was used):
{noformat}
Cannot find data for output column 'ID'
Cannot find data for output column 'TABLE1_ID'
Cannot find data for output column 't2col1'
{noformat}


> Spark3 connector cannot distinguish column name cases
> -----------------------------------------------------
>
>                 Key: PHOENIX-6668
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6668
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Istvan Toth
>            Priority: Major
>
> The Spark2 connector handled lowercase and mixed-case column names correctly 
> in DataFrame definitions.
> Spark3 only does case-insensitive column resolution, and neither 
> _spark.sql.caseSensitive_ nor backquoting seems to have any effect.
> Again, this is not something that can likely be fixed from the Phoenix side 
> without changes in Spark, and this ticket is mainly for documenting this 
> regression.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
