Hi Alexandre,

This is the result of a feature in Hive 2.3.2. In Hive 1.2, the reader matched the file's schema to the table's schema by column position rather than by column name. In Hive 2.3, we moved to matching by name, which is better but is a change in behavior.
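To make the difference concrete, here is a minimal sketch in Python. It is not Hive or ORC code: the function `match_columns` and the mode names are made up for illustration, and it assumes that name matching is case sensitive unless told otherwise, which is consistent with the NULL values you are seeing.

```python
def match_columns(table_cols, file_cols, mode):
    """For each table column, return the index of the file column it would
    read from, or None when nothing matches (the column reads as NULL).

    Hypothetical illustration only; the real reader logic lives in ORC.
    """
    lowered = [c.lower() for c in file_cols]
    result = []
    for pos, name in enumerate(table_cols):
        if mode == "positional":
            # Old behavior: the Nth table column reads the Nth file column.
            idx = pos if pos < len(file_cols) else None
        elif mode == "by_name":
            # Name matching, assumed case sensitive here.
            idx = file_cols.index(name) if name in file_cols else None
        elif mode == "by_name_ignore_case":
            # Name matching that ignores case.
            key = name.lower()
            idx = lowered.index(key) if key in lowered else None
        else:
            raise ValueError(f"unknown mode: {mode}")
        result.append(idx)
    return result


# Hive normalizes table column names to lower case; the file kept upper case.
table_cols = ["id", "name"]
file_cols = ["ID", "NAME"]

for mode in ("positional", "by_name", "by_name_ignore_case"):
    print(mode, match_columns(table_cols, file_cols, mode))
```

With upper-cased names in the file and lower-cased names in the table, only the case-sensitive name match finds nothing, so every column comes back NULL.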
You can get the old behavior by setting orc.force.positional.evolution to true. In ORC 1.5, which is included in Hive 3.1, you can also set orc.schema.evolution.case.sensitive to false to keep name matching but ignore case. As always in ORC, you can either change the property globally in your configuration or set it as a table property for a more localized change.

.. Owen

On Mon, Oct 8, 2018 at 12:41 PM Alexandre Crayssac <[email protected]> wrote:

> Hello everyone,
>
> I observed a different behavior between Hive version 1.2.1 and 2.3.2
> (that's the only two versions I've been able to test).
>
> When creating an external table pointing to ORC files and having upper
> cased column names in the ORC files metadata I'm able to read the data on
> 1.2.1 but not on 2.3.2 (i.e. all rows have NULL value).
>
> I tested with both upper cased and lower cased column names in my CREATE
> TABLE statement and it does not work in both cases. Looks like normal since
> column names are normalized to lower case in Hive.
>
> So, I would like to know if this is a feature or a bug in Hive 2.3.2 ?
>
> In fact, if this is a feature it would be impossible to have upper case
> column names in ORC files if we want to use them as an external table in
> Hive 2.3.2.
>
> I already posted an issue on Hive's JIRA but someone told me it would be
> better to ask here (https://issues.apache.org/jira/browse/HIVE-20693)
>
> Please, let me know if you need more informations.
>
> Kind regards,
>
> Alexandre
>
