[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC

Sergey Shelukhin (JIRA) Thu, 10 Mar 2016 19:11:28 -0800

     [ 
https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sergey Shelukhin updated HIVE-9660:
-----------------------------------
    Attachment: HIVE-9660.WIP0.patch

WIP patch that takes care of the reading; the writing is only done for the 
compressed path and is not done for the string writer yet because its logic 
is different. Whether it works at all is an open question.
Also, my head hurts now. It feels like the aftermath of researching how 
Kerberos works.


> store end offset of compressed data for RG in RowIndex in ORC
> -------------------------------------------------------------
>
>                 Key: HIVE-9660
>                 URL: https://issues.apache.org/jira/browse/HIVE-9660
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-9660.WIP0.patch
>
>
> Right now the end offset is estimated, which in some cases results in a lot 
> of extra data being read.
> We can add a separate array to RowIndex (positions_v2?) that stores the 
> number of compressed buffers for each RG, or the end offset, or something 
> similar, to remove this estimation magic.
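
To make the idea in the description concrete, here is a rough sketch of the difference between an estimated and a stored end offset for a row group (RG). Everything in it is an assumption for illustration: the class and method names, the 3-byte chunk header, and the "next RG start plus two worst-case compressed buffers" slop formula are hypothetical and are not taken from the patch or from the ORC reader.

```java
public class RgEndOffsetSketch {
    // Assumed size of a compression chunk header, for illustration only.
    static final int HEADER_SIZE = 3;

    // Estimated end: the next RG's start offset plus a worst-case slop of two
    // compressed buffers, capped at the stream length. The slop formula is an
    // assumption chosen to show why the estimate can overshoot.
    static long estimatedEnd(long nextRgStart, long streamLength,
                             int bufferSize, boolean isLast) {
        if (isLast) {
            return streamLength;
        }
        long slop = 2L * (HEADER_SIZE + bufferSize);
        return Math.min(streamLength, nextRgStart + slop);
    }

    // Stored end: read directly from a hypothetical per-RG array kept next to
    // the existing positions in the index (the "positions_v2" idea).
    static long storedEnd(long[] endOffsets, int rg) {
        return endOffsets[rg];
    }

    public static void main(String[] args) {
        long streamLength = 1_000_000L;
        int bufferSize = 256 * 1024;       // 262144-byte compression buffer
        long nextRgStart = 1_000L;         // RG 1 starts at byte 1000
        long[] endOffsets = {1_200L, streamLength}; // made-up stored ends

        long estimated = estimatedEnd(nextRgStart, streamLength, bufferSize, false);
        long stored = storedEnd(endOffsets, 0);

        System.out.println("estimated end of RG 0: " + estimated);
        System.out.println("stored end of RG 0:    " + stored);
        System.out.println("extra bytes read:      " + (estimated - stored));
    }
}
```

With these made-up numbers the estimate reaches 525294 bytes into the stream while the stored offset stops at 1200, which is the kind of over-read the issue wants to avoid. Storing a count of compressed buffers per RG instead of an end offset would serve the same purpose, at the cost of the reader having to walk the buffer headers.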



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
