[ 
https://issues.apache.org/jira/browse/HIVE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731580#comment-13731580
 ] 

Prasanth J commented on HIVE-4123:
----------------------------------

The following fixes were added to this patch:
 - Removed FIXMEs.
 - Added a new encoding type, DICTIONARY_V2, so that the integer encoding 
(DIRECT/DIRECT_V2) used by dictionaries can be determined. DICTIONARY_V2 uses 
DIRECT_V2 encoding for the dictionary data and length streams. In the earlier 
patch, there was no way to determine whether dictionaries used DIRECT or 
DIRECT_V2 encoding; this patch addresses that. I am not sure whether there is 
any other way to determine this without adding a new encoding type.
 - Addressed the code review comment about the if/then/else in the flush() 
method of RunLengthIntegerWriterV2.
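
To illustrate the DICTIONARY_V2 point, here is a minimal sketch of how a reader could pick the integer run-length decoder from the column encoding kind alone. It is not code from the patch: the class EncodingKindSketch, the IntegerRle enum, and the method integerRleFor() are hypothetical names, and the Kind enum only mirrors the four encoding names mentioned above.

```java
class EncodingKindSketch {
    // Hypothetical mirror of the encoding kinds named in this comment.
    enum Kind { DIRECT, DICTIONARY, DIRECT_V2, DICTIONARY_V2 }

    // Hypothetical label for which integer run-length decoder to use.
    enum IntegerRle { V1, V2 }

    // With a separate DICTIONARY_V2 kind, the encoding kind by itself says
    // whether the dictionary data and length streams use the newer encoding.
    static IntegerRle integerRleFor(Kind kind) {
        switch (kind) {
            case DIRECT_V2:
            case DICTIONARY_V2:
                return IntegerRle.V2;
            default:
                return IntegerRle.V1;
        }
    }

    public static void main(String[] args) {
        for (Kind kind : Kind.values()) {
            System.out.println(kind + " -> " + integerRleFor(kind));
        }
    }
}
```

Without the extra kind, DICTIONARY would be ambiguous in this sketch, which is the problem described in the second item.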

                
> The RLE encoding for ORC can be improved
> ----------------------------------------
>
>                 Key: HIVE-4123
>                 URL: https://issues.apache.org/jira/browse/HIVE-4123
>             Project: Hive
>          Issue Type: New Feature
>          Components: File Formats
>    Affects Versions: 0.12.0
>            Reporter: Owen O'Malley
>            Assignee: Prasanth J
>              Labels: orcfile
>             Fix For: 0.12.0
>
>         Attachments: HIVE-4123.1.git.patch.txt, HIVE-4123.2.git.patch.txt, 
> HIVE-4123.3.patch.txt, HIVE-4123.4.patch.txt, HIVE-4123.5.txt, 
> ORC-Compression-Ratio-Comparison.xlsx
>
>
> The run length encoding of integers can be improved:
> * tighter bit packing
> * allow delta encoding
> * allow longer runs
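
As a rough illustration of why delta encoding combined with tighter bit packing can help, the sketch below computes the differences between consecutive values and the number of bits needed for the largest value. It is only an assumption-laden example, not the format implemented by RunLengthIntegerWriterV2: the class DeltaSketch and its methods are hypothetical, and values are assumed to be non-negative and increasing.

```java
class DeltaSketch {
    // Differences between consecutive values (values.length - 1 entries).
    static long[] deltas(long[] values) {
        long[] out = new long[Math.max(0, values.length - 1)];
        for (int i = 1; i < values.length; i++) {
            out[i - 1] = values[i] - values[i - 1];
        }
        return out;
    }

    // Bits needed to represent the largest value; assumes non-negative input.
    static int bitsNeeded(long[] values) {
        long max = 0;
        for (long v : values) {
            max = Math.max(max, v);
        }
        return Math.max(1, 64 - Long.numberOfLeadingZeros(max));
    }

    public static void main(String[] args) {
        long[] values = {1000, 1003, 1004, 1010, 1011};
        System.out.println("bits per raw value: " + bitsNeeded(values));
        System.out.println("bits per delta:     " + bitsNeeded(deltas(values)));
    }
}
```

For this sample input the raw values need 10 bits each, while the deltas (3, 1, 6, 1) fit in 3 bits each, plus the cost of storing the first value once.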

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
