[
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16039900#comment-16039900
]
Teddy Choi commented on HIVE-12631:
-----------------------------------
The 10th patch fixed a data inconsistency bug and made
OrcAcidEncodedDataConsumer use VectorizedOrcAcidRowBatchReader.
[~ekoifman], the changes in VectorizedOrcAcidRowBatchReader are minor.
- Allowed ColumnVector arrays instead of VectorizedRowBatch.
- Replaced BitSet objects with int arrays to reuse objects.
- Removed the exception-raising code in the memory size check.
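
To illustrate the second point, here is a minimal sketch of tracking the
surviving rows of a batch with a reusable int array rather than allocating a
new BitSet per batch. It is not taken from the patch: the class
ReusableSelection and its methods are hypothetical names, and the boolean[]
delete flags are an assumption made to keep the example self-contained.

```java
import java.util.Arrays;

/**
 * Hypothetical helper (not from the patch): keeps the indices of rows that
 * survive delete filtering in an int[] that is reused across batches.
 */
final class ReusableSelection {
    private int[] selected; // indices of surviving rows, valid in [0, size)
    private int size;

    ReusableSelection(int initialCapacity) {
        this.selected = new int[initialCapacity];
    }

    /** Marks all rows in [0, batchSize) as selected; reallocates only if the buffer is too small. */
    void reset(int batchSize) {
        if (selected.length < batchSize) {
            selected = new int[batchSize];
        }
        for (int i = 0; i < batchSize; i++) {
            selected[i] = i;
        }
        size = batchSize;
    }

    /** Drops every selected row whose flag in deleted[] is true, compacting in place. */
    void removeDeleted(boolean[] deleted) {
        int kept = 0;
        for (int i = 0; i < size; i++) {
            int row = selected[i];
            if (!deleted[row]) {
                selected[kept++] = row;
            }
        }
        size = kept;
    }

    int size() {
        return size;
    }

    /** Returns a copy of the surviving row indices (for inspection only). */
    int[] toArray() {
        return Arrays.copyOf(selected, size);
    }

    public static void main(String[] args) {
        ReusableSelection sel = new ReusableSelection(1024);
        sel.reset(5);
        sel.removeDeleted(new boolean[] {false, true, false, false, true});
        System.out.println(Arrays.toString(sel.toArray())); // prints [0, 2, 3]
    }
}
```

The same ReusableSelection instance can be reset for each following batch, so
the steady state allocates nothing as long as the batch size does not grow.
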
LlapIoImpl uses ColumnVectorProducer for async I/O and LowLevelCache for
caching. I made OrcColumnVectorProducer produce ColumnVectorBatch from ORC ACID
files, OrcEncodedDataReader cache original files, and
VectorizedOrcAcidRowBatchReader cache delta files.
Thank you.
> LLAP: support ORC ACID tables
> -----------------------------
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
> Issue Type: Bug
> Components: llap, Transactions
> Reporter: Sergey Shelukhin
> Assignee: Teddy Choi
> Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch,
> HIVE-12631.1.patch, HIVE-12631.2.patch, HIVE-12631.3.patch,
> HIVE-12631.4.patch, HIVE-12631.5.patch, HIVE-12631.6.patch,
> HIVE-12631.7.patch, HIVE-12631.8.patch, HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and
> parallelization of reads and processing. This path does not support ACID. As
> far as I remember, the ACID logic is embedded inside the ORC format; we need
> to refactor it to sit on top of some interface, if practical, or just port it
> to the LLAP read path.
> Another consideration is how the logic will work with cache. The cache is
> currently low-level (CB-level in ORC), so we could just use it to read bases
> and deltas (deltas should be cached with higher priority) and merge as usual.
> We could also cache a merged representation in the future.