[ 
https://issues.apache.org/jira/browse/ORC-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206700#comment-16206700
 ] 

Alan Gates commented on ORC-255:
--------------------------------

Comments on the general approach:

I am assuming the reader is aware it is reading an ACID 2 file and understands 
the layout of ACID files.  I did put support in AcidConstants to get to the 
appropriate columns.  I also added a separate call in OrcFile to make clear 
that the reader was expecting an ACID file.  It does return a Reader, just like 
createReader, but it is a different implementation.  I chose to add a separate 
call for two reasons.  One, the reader must supply at least a ValidTxnsList in 
order to get a valid AcidReader.  Two, this prevents OrcFile from needing to 
examine the file(s) to be read to see if they are ACID or not, which would have 
introduced additional work for the majority non-ACID case.  

The first thing createReaderForAcidFile will do is parse the ACID directory, 
using the passed-in ValidTxnList, to determine which files are valid for it and 
which are not.  If the caller is generating splits and wishes to avoid every 
createReader call parsing the directory, it can call AcidDirectoryParser itself 
and attach the resulting ParsedAcidDirectory to the options when calling 
createReaderForAcidFile.  In this case OrcFileAcidHelper will use the provided 
ParsedAcidDirectory rather than parsing the directory itself.
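
To make the parse-once idea concrete, here is a minimal sketch, not the code in 
the patch.  AcidDirSketch, parse and resolve are hypothetical names, and the 
validity rule is deliberately simplified: a delta or delete delta directory is 
kept only if every transaction in its min..max range is valid, and base 
directories are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

/** Hypothetical stand-in for a parsed ACID directory; not the ORC-255 class. */
final class AcidDirSketch {
    /** Counts how often the (potentially expensive) parse ran. */
    static int parseCalls = 0;

    /** Directory names that are visible under the given transaction list. */
    final List<String> validDirs = new ArrayList<>();

    /**
     * Simplified parse: understands delta_<min>_<max> and
     * delete_delta_<min>_<max> names and skips everything else.
     * isValidTxn plays the role of the transaction list.
     */
    static AcidDirSketch parse(List<String> dirNames, LongPredicate isValidTxn) {
        parseCalls++;
        AcidDirSketch parsed = new AcidDirSketch();
        for (String name : dirNames) {
            int minIdx = name.startsWith("delete_delta_") ? 2
                       : name.startsWith("delta_") ? 1 : -1;
            String[] parts = name.split("_");
            if (minIdx < 0 || parts.length < minIdx + 2) {
                continue; // base directories and unrelated names: ignored here
            }
            long min = Long.parseLong(parts[minIdx]);
            long max = Long.parseLong(parts[minIdx + 1]);
            boolean allValid = true;
            for (long t = min; t <= max && allValid; t++) {
                allValid = isValidTxn.test(t);
            }
            if (allValid) {
                parsed.validDirs.add(name);
            }
        }
        return parsed;
    }

    /** Uses the caller-supplied parse result if one was attached, else parses. */
    static AcidDirSketch resolve(AcidDirSketch attached, List<String> dirNames,
                                 LongPredicate isValidTxn) {
        return attached != null ? attached : parse(dirNames, isValidTxn);
    }
}
```

A split generator would call parse once and pass the result to every resolve 
call, so the directory is only examined a single time.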

While the code does expect the reader to understand the ACID file format, it 
does not expect the reader to understand when particular files in an ACID 
directory are valid and when they aren't.  If a reader is created for a file 
that is invalid based on the provided ValidTxnsList, it will return a null 
reader that returns no rows.

AcidReader extends ReaderImpl with the only difference being that it returns an 
AcidRecordReader.

An AcidRecordReader uses RecordReaderImpl to fetch a batch and then applies up 
to two filters.  The first is the ValidTxnsList, which may select out some or 
all rows from the batch.  The second is the set of delete deltas, which is 
applied to the batch if any are present.  As a result, batches coming back from 
the AcidRecordReader will most likely have the selectedInUse boolean set to 
true, which usually would not be expected for batches coming out of ORC.
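
As an illustration of how two successive filters end up in the selection 
vector, here is a minimal sketch.  BatchSketch is a hypothetical stand-in that 
only mirrors the selected, selectedInUse and size fields of a row batch; it is 
not the actual AcidRecordReader code.

```java
import java.util.function.IntPredicate;

/** Hypothetical, stripped-down row batch holding only the selection state. */
final class BatchSketch {
    final int[] selected;   // row indices that survive, valid when selectedInUse
    boolean selectedInUse;  // false means "all rows 0..size-1 are live"
    int size;               // number of live rows

    BatchSketch(int numRows) {
        this.selected = new int[numRows];
        this.size = numRows;
    }

    /**
     * Keeps only the live rows accepted by keep. Can be called repeatedly,
     * so a transaction filter and a delete filter compose naturally.
     */
    void filter(IntPredicate keep) {
        int out = 0;
        for (int i = 0; i < size; i++) {
            int row = selectedInUse ? selected[i] : i;
            if (keep.test(row)) {
                selected[out++] = row; // out <= i, so unread entries are not clobbered
            }
        }
        if (out < size) {
            selectedInUse = true;      // at least one row was filtered out
        }
        size = out;
    }
}
```

A reader following this pattern would call filter once with the transaction 
validity check and, if delete deltas exist, a second time with the delete 
check.  Downstream code then has to honor selectedInUse rather than assume rows 
0..size-1 are all live.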

Deletes are handled in one of two ways.  When a directory is parsed, a valid set 
of delete deltas is determined based on the ValidTxnsList.  If we estimate that 
they will fit into memory (based on a config variable giving the number of 
deletes allowed), they are put into a trie of hash tables keyed by original 
transaction id.  Each input file (base or delta) is sorted by original 
transaction id, so as it is read the appropriate hash table can be fetched from 
the trie and each record checked to see whether it has been deleted.  If the 
deletes will not fit in memory, then as each insert file is read the deletes 
are read via a merge (since they are already sorted).  This is much less 
efficient, since every delete file has to be read and every delete record 
examined for every insert file, but it is guaranteed to run in cases where the 
deletes cannot all fit into memory.
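
The two strategies can be sketched as follows.  This is an illustration under 
stated assumptions, not the patch's implementation: a TreeMap stands in for the 
trie, a row is identified only by {original transaction id, row id} (the bucket 
is left out), the merge path sees a single already-merged sorted delete stream, 
and DeleteSketch and maxInMemoryDeletes are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch; each row or delete is {originalTxnId, rowId}. */
final class DeleteSketch {
    static int compare(long[] a, long[] b) {
        int c = Long.compare(a[0], b[0]);
        return c != 0 ? c : Long.compare(a[1], b[1]);
    }

    /** In-memory path: one hash set of deleted row ids per original txn id. */
    static List<long[]> applyInMemory(long[][] rows, long[][] deletes) {
        Map<Long, Set<Long>> byTxn = new TreeMap<>(); // stand-in for the trie
        for (long[] d : deletes) {
            byTxn.computeIfAbsent(d[0], k -> new HashSet<>()).add(d[1]);
        }
        List<long[]> live = new ArrayList<>();
        boolean haveTxn = false;
        long currentTxn = 0;
        Set<Long> deleted = Collections.emptySet();
        for (long[] r : rows) {
            // Rows are sorted by original txn id, so look up only on change.
            if (!haveTxn || r[0] != currentTxn) {
                haveTxn = true;
                currentTxn = r[0];
                deleted = byTxn.getOrDefault(r[0], Collections.emptySet());
            }
            if (!deleted.contains(r[1])) {
                live.add(r);
            }
        }
        return live;
    }

    /** Merge path: walk the sorted rows and sorted deletes in lockstep. */
    static List<long[]> applyByMerge(long[][] rows, long[][] deletes) {
        List<long[]> live = new ArrayList<>();
        int d = 0;
        for (long[] r : rows) {
            while (d < deletes.length && compare(deletes[d], r) < 0) {
                d++; // skip deletes that sort before the current row
            }
            if (d == deletes.length || compare(deletes[d], r) != 0) {
                live.add(r);
            }
        }
        return live;
    }

    /** Picks a strategy from a configured limit on in-memory deletes. */
    static List<long[]> apply(long[][] rows, long[][] deletes, int maxInMemoryDeletes) {
        return deletes.length <= maxInMemoryDeletes
            ? applyInMemory(rows, deletes)
            : applyByMerge(rows, deletes);
    }
}
```

Both paths produce the same surviving rows.  The in-memory path builds the 
lookup structure once and can reuse it across insert files, whereas the merge 
path has to walk the deletes again for every insert file.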

> Add support for reading ACID2 files to ORC
> ------------------------------------------
>
>                 Key: ORC-255
>                 URL: https://issues.apache.org/jira/browse/ORC-255
>             Project: ORC
>          Issue Type: New Feature
>          Components: ACID
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>
> When ORC was split out of Hive the reading and writing of ACID files was left 
> in Hive.  This blocks non-Hive users from reading or writing ACID.  I propose 
> to add support for ACID to ORC.
> At this point I only propose to add support for ACID 2 (that is, the version 
> that will be released in Hive 3, which simplifies the storage to have only 
> inserts and deletes (updates are an insert plus a delete)).
> Also, note that to use this, readers and writers would still have to interact 
> with the Hive metastore to get a list of valid transactions and acquire 
> appropriate locks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
