[ https://issues.apache.org/jira/browse/HIVE-19580 ]
Steve Loughran resolved HIVE-19580.
-----------------------------------
    Resolution: Not A Problem

OK, closing. I tried to work out how best to classify this (e.g. "Cannot Reproduce", "Invalid"); settled on "Not A Problem", as it is not an ASF issue.

> Hive 2.3.2 with ORC files stored on S3 is case sensitive on EMR
> ----------------------------------------------------------------
>
>                 Key: HIVE-19580
>                 URL: https://issues.apache.org/jira/browse/HIVE-19580
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.3.2
>         Environment: EMR s3:// connector
> Spark 2.3, but also true for lower versions
> Hive 2.3.2
>            Reporter: Arthur Baudry
>            Priority: Major
>             Fix For: 2.3.2
>
>
> The original file is CSV:
> COL1,COL2
> 1,2
> The ORC file is created with Spark 2.3:
> scala> val df = spark.read.option("header","true").csv("/user/hadoop/file")
> scala> df.printSchema
> root
>  |-- COL1: string (nullable = true)
>  |-- COL2: string (nullable = true)
> scala> df.write.orc("s3://bucket/prefix")
> In Hive:
> hive> CREATE EXTERNAL TABLE test_orc(COL1 STRING, COL2 STRING) STORED AS ORC
> LOCATION 's3://bucket/prefix';
> hive> SELECT * FROM test_orc;
> OK
> NULL NULL
> *Every field is NULL. However, if the fields are generated in lower case in the Spark schema, everything works.*
> The reason I'm raising this bug is that we have customers using Hive 2.3.2 to read files we generate through Spark, and our entire code base addresses fields in upper case, which is incompatible with their Hive instance.
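For anyone hitting the same mismatch: Hive lower-cases column identifiers in its metastore, while Spark preserves the case it writes into the ORC file schema, so a name mismatch makes the Hive ORC reader return NULLs. A minimal writer-side workaround sketch (not from the original report; "bucket"/"prefix" are placeholders) is to lower-case the column names before writing:

scala> // rename every column to lower case so the ORC schema
scala> // matches Hive's lower-cased table columns
scala> val lowered = df.toDF(df.columns.map(_.toLowerCase): _*)
scala> lowered.write.orc("s3://bucket/prefix")

After this, the SELECT above should return the data rather than NULLs, with no change needed on the Hive side.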