Re: Case-sensitivity for column names when reading from ORC files in Hive

Owen O'Malley Mon, 08 Oct 2018 12:59:17 -0700

Hi Alexandre,
  It is the result of a feature in Hive 2.3.2. In Hive 1.2, the reader
matched the file's schema to the table's schema by column position rather
than by column name. In Hive 2.3, we've moved to matching by name, which is
better, but is a change in behavior. Since Hive lower-cases the table's
column names, upper-case names in the file no longer match, and the columns
read as NULL.


You can get the old behavior by setting orc.force.positional.evolution to
true. In ORC 1.5, which is included in Hive 3.1, you can instead set
orc.schema.evolution.case.sensitive to false to keep name matching but
ignore case. As always in ORC, you can either change the property globally
in your configuration or set it as a table property for a more localized
change.
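
To make the three behaviors concrete, here is a rough sketch of how the
table's columns get paired with the file's columns in each mode. It is only a
toy model, not the ORC reader's code: the function map_columns and the mode
names are made up for illustration, and the column names are an assumed
example of an upper-cased file schema.

```python
from typing import List, Optional


def map_columns(table_cols: List[str], file_cols: List[str],
                mode: str) -> List[Optional[int]]:
    """For each table column, return the index of the file column it
    reads from, or None when nothing matches (the column reads as NULL).

    Hypothetical helper for illustration only; not part of ORC or Hive.
    """
    if mode == "positional":
        # Hive 1.2 style: the i-th table column reads the i-th file column.
        return [i if i < len(file_cols) else None
                for i in range(len(table_cols))]
    if mode == "name":
        # Hive 2.3 style: exact, case-sensitive name match.
        index = {name: i for i, name in enumerate(file_cols)}
        return [index.get(name) for name in table_cols]
    if mode == "name_ignore_case":
        # Name match that ignores case.
        index = {name.lower(): i for i, name in enumerate(file_cols)}
        return [index.get(name.lower()) for name in table_cols]
    raise ValueError(f"unknown mode: {mode}")


# Assumed example: the file was written with upper-case names, while
# Hive normalizes the table's column names to lower case.
file_cols = ["ID", "NAME"]
table_cols = ["id", "name"]

positional = map_columns(table_cols, file_cols, "positional")
by_name = map_columns(table_cols, file_cols, "name")
ignore_case = map_columns(table_cols, file_cols, "name_ignore_case")

print(positional)   # [0, 1]
print(by_name)      # [None, None]
print(ignore_case)  # [0, 1]
```

The [None, None] result for plain name matching corresponds to the all-NULL
rows reported on 2.3.2.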

.. Owen

On Mon, Oct 8, 2018 at 12:41 PM Alexandre Crayssac wrote:

> Hello everyone,
>
> I observed different behavior between Hive versions 1.2.1 and 2.3.2
> (those are the only two versions I've been able to test).
>
> When creating an external table pointing to ORC files that have upper-case
> column names in their metadata, I'm able to read the data on 1.2.1 but not
> on 2.3.2 (i.e. all rows have NULL values).
>
> I tested with both upper-case and lower-case column names in my CREATE
> TABLE statement and it does not work in either case. That seems expected,
> since column names are normalized to lower case in Hive.
>
> So, I would like to know if this is a feature or a bug in Hive 2.3.2?
>
> In fact, if this is a feature it would be impossible to have upper case
> column names in ORC files if we want to use them as an external table in
> Hive 2.3.2.
>
> I already posted an issue on Hive's JIRA but someone told me it would be
> better to ask here (https://issues.apache.org/jira/browse/HIVE-20693)
>
> Please let me know if you need more information.
>
> Kind regards,
>
> Alexandre
>
