[
https://issues.apache.org/jira/browse/HIVE-25453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ádám Szita resolved HIVE-25453.
-------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed
Committed to master. Thanks for the review [~pvary]!
> Add LLAP IO support for Iceberg ORC tables
> ------------------------------------------
>
> Key: HIVE-25453
> URL: https://issues.apache.org/jira/browse/HIVE-25453
> Project: Hive
> Issue Type: New Feature
> Reporter: Ádám Szita
> Assignee: Ádám Szita
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> Adding support for reading Iceberg ORC tables via LLAP.
> The easy part is swapping out the plain VectorizedOrcRecordReader for
> LlapRecordReader.
> The hard part is maintaining correctness even after a series of schema
> changes that Iceberg/ORC normally allows, but that plain ORC, and therefore
> LLAP, did not. To make it all work, LLAP had to be extended to support
> broader schema evolution.
> Before this change, LLAP made the simple assumption that the reader and file
> schemas match in all columns. Now separate physical and logical read schemas
> and corresponding include lists are used instead. This change also adds
> logicalOrderedColumnIds, which holds indices from the reader schema, but
> in file schema order. It is a necessary tool for mapping the results produced
> by LLAP, because LLAP always reads columns in the order in which they are
> written out in the file.
> Also added a new CLI driver class for testing the cached reads from
> Iceberg/ORC tables via LLAP.
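
The idea behind logicalOrderedColumnIds, as described in the quoted text, can be illustrated with a minimal sketch. This is not the code committed for HIVE-25453: the class name, the method signature, and the matching of columns by name are all assumptions made for illustration, and the real implementation may resolve columns differently. The sketch only shows what "indices from the reader schema, but in file schema order" could look like.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Hive's or LLAP's actual classes.
public class LogicalOrderSketch {

    // For every file column that the reader schema also contains, emit the
    // column's index in the reader schema, walking the columns in file order.
    // Matching by column name is an assumption of this sketch.
    public static List<Integer> logicalOrderedColumnIds(
            List<String> fileColumns, List<String> readerColumns) {
        Map<String, Integer> readerIndex = new HashMap<>();
        for (int i = 0; i < readerColumns.size(); i++) {
            readerIndex.put(readerColumns.get(i), i);
        }
        List<Integer> ordered = new ArrayList<>();
        for (String column : fileColumns) {
            Integer idx = readerIndex.get(column);
            if (idx != null) {
                ordered.add(idx); // reader-schema index, placed in file order
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        // The file was written as (id, name, price); after schema evolution
        // the reader expects (price, id, added_later).
        List<String> file = List.of("id", "name", "price");
        List<String> reader = List.of("price", "id", "added_later");
        System.out.println(logicalOrderedColumnIds(file, reader)); // prints [1, 0]
    }
}
```

In this example the file yields `id` first and `price` second, so the list `[1, 0]` tells a consumer that the first column read belongs at reader position 1 and the second at reader position 0. The dropped column `name` and the newly added column `added_later` have no entry, since they are not present in both schemas.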