[jira] [Commented] (SPARK-25132) Spark returns NULL for a column whose Hive metastore schema and Parquet schema are in different letter cases

yucai (JIRA) Thu, 16 Aug 2018 08:10:19 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-25132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16582667#comment-16582667
 ]


yucai commented on SPARK-25132:
-------------------------------

If Spark allows data source case insensitive, query t2 should return number.
If Spark does not allow data source case insensitive, Spark should remind user 
with warning, return NULL may lead to the potential issue that is very 
difficult to find.

> Spark returns NULL for a column whose Hive metastore schema and Parquet 
> schema are in different letter cases
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-25132
>                 URL: https://issues.apache.org/jira/browse/SPARK-25132
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.1
>            Reporter: Chenxiao Mao
>            Priority: Major
>
> Spark SQL returns NULL for a column whose Hive metastore schema and Parquet 
> schema are in different letter cases, regardless of spark.sql.caseSensitive 
> set to true or false.
> Here is a simple example to reproduce this issue:
> scala> spark.range(5).toDF.write.mode("overwrite").saveAsTable("t1")
> spark-sql> show create table t1;
> CREATE TABLE `t1` (`id` BIGINT)
> USING parquet
> OPTIONS (
>  `serialization.format` '1'
> )
> spark-sql> CREATE TABLE `t2` (`ID` BIGINT)
>  > USING parquet
>  > LOCATION 'hdfs://localhost/user/hive/warehouse/t1';
> spark-sql> select * from t1;
> 0
> 1
> 2
> 3
> 4
> spark-sql> select * from t2;
> NULL
> NULL
> NULL
> NULL
> NULL
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-25132) Spark returns NULL for a column whose Hive metastore schema and Parquet schema are in different letter cases

Reply via email to