[ https://issues.apache.org/jira/browse/HIVE-15290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829464#comment-15829464 ]
Prasanth Jayachandran commented on HIVE-15290:
----------------------------------------------
Please file this issue under the ORC project, as the orc module will be removed
from Hive soon.
> Stripe size smaller than specified.
> -----------------------------------
>
> Key: HIVE-15290
> URL: https://issues.apache.org/jira/browse/HIVE-15290
> Project: Hive
> Issue Type: Bug
> Components: ORC
> Affects Versions: 1.2.0, 1.2.1, 2.0.0, 2.1.0, 2.0.1
> Reporter: Yuxing Yao
>
> In Hive-1.2.0, the actual stripe size of an output ORC file is very small if
> most of the table data is empty, which results in too many Column Statistics
> objects that consume most of the memory.
> I found that this improved in Hive-2.0.1, but the stripe size is still much
> smaller than expected.
> I saw that a Jira item, https://issues.apache.org/jira/browse/HIVE-13232,
> moved `compressed = null` out of the if block. This change helps a lot, but
> to fix this completely, another change is needed in
> `OutStream.getBufferSize()`.
> I've created the PR:
> https://github.com/apache/hive/pull/118
> Please take a look.