GitHub user alanfgates opened a pull request:

    https://github.com/apache/orc/pull/179

    ORC-255

    This is not ready for commit.  I'm just putting it up so people can
    start looking at it and giving feedback.

    As noted in the JIRA, this only deals with ACID2 and the vector batch
    interface.

    This depends on an unreleased version of Hive's storage-api.
    TestRecordReaderImpl also currently fails because of changes to
    storage-api's DiskRangeList.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/alanfgates/orc orc255

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/orc/pull/179.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #179
    
----
commit 96026a342bf531c9c12b3cc8a127f33026cba6b9
Author: Alan Gates <[email protected]>
Date:   2017-09-15T18:40:01Z

    WIP Ported parsing parts of Hive's AcidUtils into AcidDirectoryParser
    and supporting classes.  Haven't finished the testing yet.

commit 12477e216caee814fd3c6545a3a7c938d54369b8
Author: Alan Gates <[email protected]>
Date:   2017-09-26T22:57:50Z

    Finished testing AcidDirectoryParser.

commit 096072c6c6628f1bcee4ec931ec785e136e11e23
Author: Alan Gates <[email protected]>
Date:   2017-09-27T00:15:28Z

    Changed AcidVersionedDirectory to track txn information for files in
    addition to just FileStatus.

commit df66d52047938c239948bd04559c95d4fcac2227
Author: Alan Gates <[email protected]>
Date:   2017-09-27T22:45:46Z

    Moved AcidVersionedDirectory to ParsedAcidDirectory to better fit with
    terminology of AcidDirectoryParser and ParsedAcidFile.  Added ability
    to determine whether a given input file from the directory should be
    read and to determine which delete deltas to use for a given input
    file.  Fixed a number of bugs I found along the way.

commit 111e1308a0cbf79862e80c97f5c6ca9c78b38273
Author: Alan Gates <[email protected]>
Date:   2017-09-28T20:00:12Z

    Added ability to read insert files (base and normal delta).  Haven't
    yet done delete files.

commit b8a7e6d7da40e83d193140d565463caf83379ee1
Author: Alan Gates <[email protected]>
Date:   2017-09-30T00:59:09Z

    WIP, wrote the initial code for handling the deletes.  Haven't tested
    it yet.

commit 9146dd6020d63694e0b5773b2f092c102e78b0da
Author: Alan Gates <[email protected]>
Date:   2017-10-03T19:54:09Z

    Fixed a bunch of errors in delete handling.  Added unit tests for
    delete testing.

commit 90ff039b83c2a198b5b7117b8c554c989a374af7
Author: Alan Gates <[email protected]>
Date:   2017-10-04T23:37:25Z

    Went overboard on caching delete sets.  I'm going to simplify this a
    bunch and remove the caching.  But checking in now in case I change my
    mind and decide to go back to the caching.

commit acaabe6272e57e2bce0c9af5f74d61a2e1510709
Author: Alan Gates <[email protected]>
Date:   2017-10-05T00:53:30Z

    Simplified delete sets to be attached to a ParsedAcidDirectory instead
    of trying to cache them.  That leaves it up to the user to make sure
    there aren't too many ParsedAcidDirectories live in a process, each
    with its own DeleteSet.

commit ed77b1e89a390c2c451b821a84f4a76595ad3cda
Author: Alan Gates <[email protected]>
Date:   2017-10-12T00:04:23Z

    Most likely useless changes.  I don't think I need the
    MergingAcidRecordReader.  But keeping it for now in case I turn out to
    be wrong.  It has happened before.

----
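
For reviewers who are not familiar with the ACID directory layout, the
commit of 2017-09-27T22:45:46Z above mentions deciding which delete deltas
to use for a given input file.  The following is only a rough sketch of
that idea and is not taken from the pull request.  It assumes Hive's
directory naming (base_N, delta_MIN_MAX, delete_delta_MIN_MAX, optionally
followed by a statement id).  The class AcidDirName and the method mayApply
are hypothetical names, and the matching rule is a simplifying assumption
that ignores the reader's list of valid transactions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not the PR's AcidDirectoryParser. */
final class AcidDirName {
  enum Kind { BASE, DELTA, DELETE_DELTA }

  // base_N
  private static final Pattern BASE = Pattern.compile("base_(\\d+)");
  // delta_MIN_MAX or delete_delta_MIN_MAX, optionally followed by _STATEMENTID
  private static final Pattern DELTA =
      Pattern.compile("(delete_delta|delta)_(\\d+)_(\\d+)(?:_(\\d+))?");

  final Kind kind;
  final long minTxn;
  final long maxTxn;

  private AcidDirName(Kind kind, long minTxn, long maxTxn) {
    this.kind = kind;
    this.minTxn = minTxn;
    this.maxTxn = maxTxn;
  }

  /** Returns the parsed name, or null if it does not look like an ACID directory. */
  static AcidDirName parse(String name) {
    Matcher m = BASE.matcher(name);
    if (m.matches()) {
      // A base holds everything up to and including transaction N.
      return new AcidDirName(Kind.BASE, 0L, Long.parseLong(m.group(1)));
    }
    m = DELTA.matcher(name);
    if (m.matches()) {
      Kind kind = m.group(1).equals("delta") ? Kind.DELTA : Kind.DELETE_DELTA;
      return new AcidDirName(kind, Long.parseLong(m.group(2)), Long.parseLong(m.group(3)));
    }
    return null;
  }

  /**
   * Assumed rule, for illustration only: deletes at or below a base's
   * transaction are taken to be folded into the base already, and a delete
   * delta can only affect an insert delta if it reaches that delta's first
   * transaction or later.
   */
  static boolean mayApply(AcidDirName deleteDelta, AcidDirName insertDir) {
    if (deleteDelta.kind != Kind.DELETE_DELTA || insertDir.kind == Kind.DELETE_DELTA) {
      return false;
    }
    long firstRelevantTxn =
        insertDir.kind == Kind.BASE ? insertDir.maxTxn + 1 : insertDir.minTxn;
    return deleteDelta.maxTxn >= firstRelevantTxn;
  }
}
```

With this sketch, delete_delta_0000007_0000007 would be considered for both
base_0000005 and delta_0000006_0000008, while delete_delta_0000003_0000004
would be skipped for both.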

