[ 
https://issues.apache.org/jira/browse/HIVE-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12450:
-----------------------------------------
    Attachment: HIVE-12450.4.patch

Added the test cases to Tez as well.

> OrcFileMergeOperator does not use correct compression buffer size
> -----------------------------------------------------------------
>
>                 Key: HIVE-12450
>                 URL: https://issues.apache.org/jira/browse/HIVE-12450
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 1.2.0, 1.3.0, 1.2.1, 2.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: HIVE-12450.1.patch, HIVE-12450.2.patch, 
> HIVE-12450.3.patch, HIVE-12450.4.patch, zlib-hang.png
>
>
> OrcFileMergeOperator checks for compatibility before merging ORC files. This
> compatibility check includes checking the compression buffer size. But the
> output file that is created does not honor the compression buffer size and
> always defaults to 256KB. This is not a problem when reading the ORC file,
> but it can create unwanted memory pressure because of wasted space within
> the compression buffer.
> This issue can also make the merged file unreadable in certain cases. For
> example, if the original compression buffer size is 8KB and
> hive.exec.orc.default.buffer.size is set to 4KB, the merge file operator will
> use 4KB instead of the actual 8KB, which can cause the ORC reader to hang
> (more specifically, ZlibCodec will wait for more compression buffers).
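
To make the 8KB/4KB example in the description concrete, here is a minimal
sketch of the intended behaviour: the merged file should carry the buffer size
of its (already compatibility-checked) inputs rather than the configured
default. The class MergeBufferSize and the method outputBufferSize are
hypothetical names used only for illustration. They are not part of Hive or
ORC, and this is not the code in the attached patch.

```java
import java.util.List;

public class MergeBufferSize {

    // Hypothetical helper, not Hive API: picks the compression buffer size
    // that the merged output file should declare.
    // - no inputs: fall back to the configured default
    //   (standing in for hive.exec.orc.default.buffer.size)
    // - inputs disagree: they are not merge-compatible, so refuse
    // - otherwise: reuse the inputs' common buffer size
    static int outputBufferSize(List<Integer> inputBufferSizes, int configuredDefault) {
        if (inputBufferSizes.isEmpty()) {
            return configuredDefault;
        }
        int first = inputBufferSizes.get(0);
        for (int size : inputBufferSizes) {
            if (size != first) {
                throw new IllegalArgumentException(
                    "incompatible compression buffer sizes: " + first + " vs " + size);
            }
        }
        return first;
    }

    public static void main(String[] args) {
        int configuredDefault = 4 * 1024;                 // 4KB from configuration
        List<Integer> inputs = List.of(8 * 1024, 8 * 1024); // both inputs written with 8KB

        // The output should declare 8KB (8192), not the 4KB default.
        System.out.println(outputBufferSize(inputs, configuredDefault));
    }
}
```

With the two 8KB inputs from the example, the sketch selects 8192 bytes for
the output, so the merged file's declared buffer size matches the size its
data was compressed with.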



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
