Eugene Koifman created HIVE-17296:
-------------------------------------

             Summary: Acid tests with multiple splits
                 Key: HIVE-17296
                 URL: https://issues.apache.org/jira/browse/HIVE-17296
             Project: Hive
          Issue Type: Test
          Components: Transactions
    Affects Versions: 3.0.0
            Reporter: Eugene Koifman
            Assignee: Eugene Koifman
            Priority: Critical


Data files in an Acid table are ORC files, which may have multiple stripes.
Such files in base/ or delta/ (and original files, in the case of non-acid to
acid conversion) are split by OrcInputFormat into multiple (stripe-sized) chunks.
There is additional logic in OrcRawRecordMerger
(discoverKeyBounds/discoverOriginalKeyBounds) that is not tested by any E2E
tests, since none of them have enough data to generate multiple stripes in a
single file.
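
To make the untested path concrete, here is a minimal sketch of the kind of
key-bound discovery a multi-stripe split needs. It is an illustration only, not
the Hive implementation: the Stripe class, the discover_key_bounds function and
the integer keys are all hypothetical simplifications (the real keys are record
identifiers, not plain integers). It assumes a stripe belongs to a split when
its start offset falls inside the split, and that the last key of each stripe
is known.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Stripe:
    """Hypothetical, simplified view of one ORC stripe."""
    offset: int    # byte offset at which the stripe starts in the file
    last_key: int  # last row key stored in the stripe (simplified to an int)


def discover_key_bounds(
    stripes: List[Stripe], split_offset: int, split_length: int
) -> Tuple[Optional[int], Optional[int]]:
    """Return (min_key, max_key) for a split; None means unbounded.

    min_key is exclusive (last key of the stripe before the split),
    max_key is inclusive (last key of the final stripe in the split).
    """
    split_end = split_offset + split_length
    # Assumption: a stripe is owned by the split if it starts inside it.
    owned = [i for i, s in enumerate(stripes)
             if split_offset <= s.offset < split_end]
    if not owned:
        raise ValueError("sketch assumes each split covers at least one stripe")
    first, last = owned[0], owned[-1]
    # No lower bound when the split starts at the first stripe of the file.
    min_key = stripes[first - 1].last_key if first > 0 else None
    # No upper bound when the split reaches the last stripe of the file.
    max_key = stripes[last].last_key if last < len(stripes) - 1 else None
    return min_key, max_key


# Three stripes of 100 bytes each, read as three stripe-sized splits.
stripes = [Stripe(0, 9), Stripe(100, 19), Stripe(200, 29)]
bounds = [discover_key_bounds(stripes, off, 100) for off in (0, 100, 200)]
```

With a single stripe per file, both bounds are always None, which is why tests
on small data never exercise the bounded cases (the middle split above).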

testRecordReaderOldBaseAndDelta/testRecordReaderNewBaseAndDelta/testOriginalReaderPair
in TestOrcRawRecordMerger have some logic to test this, but it really needs E2E 
tests.

With ORC-228 it will be possible to write such tests.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
