Re: Case-sensitivity for column names when reading from ORC files in Hive

Alexandre Crayssac Mon, 08 Oct 2018 13:03:02 -0700

Hello Owen,

Thanks for your quick and precise answer! It's now clearer to me.


Alexandre

On Mon, Oct 8, 2018, 21:58 Owen O'Malley <[email protected]> wrote:

> Hi Alexandre,
>   It is the result of a feature in Hive 2.3.2. What is going on is that in
> Hive 1.2, the reader would match the file's schema using position rather
> than the column's name. In Hive 2.3, we've moved to name equivalence, which
> is better, but is a change in behavior.
>
> You can get the old behavior by setting
> orc.force.positional.evolution to true. In ORC 1.5, which is included in
> Hive 3.1, you can also set orc.schema.evolution.case.sensitive to false to
> keep name matching but ignore case. As always in ORC, you can either change
> the property globally in your configuration or set it as a table property
> for a more localized change.
>
> .. Owen
>
> On Mon, Oct 8, 2018 at 12:41 PM Alexandre Crayssac <[email protected]>
> wrote:
>
>> Hello everyone,
>>
>> I observed different behavior between Hive versions 1.2.1 and 2.3.2
>> (those are the only two versions I've been able to test).
>>
>> When I create an external table pointing to ORC files whose metadata has
>> upper-case column names, I can read the data on 1.2.1 but not on 2.3.2
>> (i.e. all rows have NULL values).
>>
>> I tested with both upper-case and lower-case column names in my CREATE
>> TABLE statement and it does not work in either case. That seems expected,
>> since column names are normalized to lower case in Hive.
>>
>> So, I would like to know whether this is a feature or a bug in Hive 2.3.2.
>>
>> In fact, if this is a feature it would be impossible to have upper case
>> column names in ORC files if we want to use them as an external table in
>> Hive 2.3.2.
>>
>> I already posted an issue on Hive's JIRA but someone told me it would be
>> better to ask here (https://issues.apache.org/jira/browse/HIVE-20693)
>>
>> Please let me know if you need more information.
>>
>> Kind regards,
>>
>> Alexandre
>>
>
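
To make the three matching modes discussed in this thread concrete, here is a minimal Python sketch. It is only an illustration of the idea, not ORC's or Hive's actual code: the function name `map_columns` and the mode labels are hypothetical, and the real readers handle nested types and type conversion as well. It shows why a lower-cased Hive schema reads NULLs from a file with upper-case column names under case-sensitive name matching, and why positional or case-insensitive matching recovers the data.

```python
from typing import List, Optional


def map_columns(reader_cols: List[str],
                file_cols: List[str],
                mode: str = "name") -> List[Optional[int]]:
    """For each reader (table) column, return the index of the matching
    file column, or None when nothing matches (the column reads as NULL).

    Hypothetical modes, loosely mirroring the settings named in the thread:
      "positional"       ~ orc.force.positional.evolution = true
      "name"             ~ case-sensitive name matching
      "name_ignore_case" ~ orc.schema.evolution.case.sensitive = false
    """
    if mode == "positional":
        # The i-th reader column maps to the i-th file column, if present.
        return [i if i < len(file_cols) else None
                for i in range(len(reader_cols))]
    if mode == "name":
        lookup = {name: i for i, name in enumerate(file_cols)}
        return [lookup.get(name) for name in reader_cols]
    if mode == "name_ignore_case":
        lookup = {name.lower(): i for i, name in enumerate(file_cols)}
        return [lookup.get(name.lower()) for name in reader_cols]
    raise ValueError(f"unknown mode: {mode}")


# Hive normalizes table column names to lower case; the file kept upper case.
table_cols = ["id", "name"]
orc_file_cols = ["ID", "NAME"]

for mode in ("positional", "name", "name_ignore_case"):
    print(mode, map_columns(table_cols, orc_file_cols, mode))
```

With these inputs, only the case-sensitive "name" mode fails to find any file column, which corresponds to the all-NULL rows reported above.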
