[ https://issues.apache.org/jira/browse/HIVE-7685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151792#comment-14151792 ]

Brock Noland edited comment on HIVE-7685 at 9/29/14 3:31 PM:
-------------------------------------------------------------

Hi Dong,

Ok, thank you for the investigation. I think we can either put the Parquet 
memory manager in Parquet itself or add APIs to expose the information required 
to implement the memory manager in Hive. Either approach is fine by me; we can 
take this work up in PARQUET-108.

Brock



> Parquet memory manager
> ----------------------
>
>                 Key: HIVE-7685
>                 URL: https://issues.apache.org/jira/browse/HIVE-7685
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>            Reporter: Brock Noland
>
> Similar to HIVE-4248, Parquet tries to write very large "row groups". 
> This causes Hive to run out of memory during dynamic partition inserts, when 
> a reducer may have many Parquet files open at a given time.
> As such, we should implement a memory manager which ensures that we don't run 
> out of memory due to writing too many row groups within a single JVM.
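For reference, a minimal sketch of the proportional-scaling idea such a manager
could take: track each open writer's requested row group size and shrink all of
them once the total exceeds a fixed share of the heap. This is not the API that
PARQUET-108 settled on; every name here (MemoryManager, RowGroupWriter,
setRowGroupSizeBytes, POOL_RATIO) is hypothetical and for illustration only.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not Parquet's actual memory manager. Assumes each
// writer can be told to shrink its in-memory row group buffer on demand.
public class MemoryManager {

    /** Minimal interface a row-group writer would need to expose (assumption). */
    public interface RowGroupWriter {
        void setRowGroupSizeBytes(long bytes);
    }

    /** Fraction of the JVM heap the shared pool may use (illustrative value). */
    private static final double POOL_RATIO = 0.5;

    private final long totalPoolBytes =
        (long) (Runtime.getRuntime().maxMemory() * POOL_RATIO);

    /** Requested row group size per open writer in this JVM. */
    private final Map<RowGroupWriter, Long> writers = new ConcurrentHashMap<>();

    public synchronized void addWriter(RowGroupWriter writer, long requestedBytes) {
        writers.put(writer, requestedBytes);
        rescale();
    }

    public synchronized void removeWriter(RowGroupWriter writer) {
        writers.remove(writer);
        rescale();
    }

    /** Shrink every writer's row group proportionally when the pool is oversubscribed. */
    private void rescale() {
        long requested = writers.values().stream().mapToLong(Long::longValue).sum();
        double scale = requested <= totalPoolBytes
                ? 1.0
                : (double) totalPoolBytes / requested;
        for (Map.Entry<RowGroupWriter, Long> e : writers.entrySet()) {
            e.getKey().setRowGroupSizeBytes((long) (e.getValue() * scale));
        }
    }
}
{code}

With this shape, a dynamic partition insert that opens many writers in one
reducer would see each writer's buffer shrink as more files open, instead of
every writer buffering a full-size row group until the JVM runs out of heap.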



