[ 
https://issues.apache.org/jira/browse/PARQUET-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated PARQUET-1006:
-------------------------------------
    Affects Version/s: 1.12.0

> ColumnChunkPageWriter uses only heap memory.
> --------------------------------------------
>
>                 Key: PARQUET-1006
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1006
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.8.0, 1.12.0
>            Reporter: Vitalii Diravka
>            Assignee: Vitalii Diravka
>            Priority: Major
>
> After PARQUET-160 was resolved, ColumnChunkPageWriter started using 
> ConcatenatingByteArrayCollector. It collects all data in a List of byte[] 
> before the page is written, so there is no way to use direct memory for 
> allocating buffers. A ByteBufferAllocator is present in the 
> [ColumnChunkPageWriter|https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageWriteStore.java#L73]
>  class, but it is never used.
> In some cases, relying on Java heap space can cause OOM exceptions or 
> excessive GC overhead. 
> ByteBufferAllocator should be used in ConcatenatingByteArrayCollector or in 
> the OutputStream classes.
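
The proposal above could be sketched roughly as follows. This is only an illustration of the idea, not the parquet-mr implementation or the eventual fix: the Allocator interface is a hypothetical stand-in for ByteBufferAllocator so that the example compiles without Parquet on the classpath, and DirectAllocator, ByteBufferCollector and roundTrip are names invented for this sketch. It assumes each page arrives as a byte[] and is copied into an allocator-provided buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.util.ArrayList;
import java.util.List;

public class DirectCollectorSketch {

  /** Hypothetical stand-in for Parquet's ByteBufferAllocator. */
  interface Allocator {
    ByteBuffer allocate(int size);
    void release(ByteBuffer buffer);
  }

  /** Allocates off-heap buffers, so collected pages do not occupy Java heap. */
  static final class DirectAllocator implements Allocator {
    @Override public ByteBuffer allocate(int size) {
      return ByteBuffer.allocateDirect(size);
    }
    @Override public void release(ByteBuffer buffer) {
      // Nothing to do here: the JVM reclaims direct buffers once unreachable.
    }
  }

  /** Collects page bytes into allocator-provided buffers instead of byte[]. */
  static final class ByteBufferCollector implements AutoCloseable {
    private final Allocator allocator;
    private final List<ByteBuffer> slabs = new ArrayList<>();
    private long size;

    ByteBufferCollector(Allocator allocator) {
      this.allocator = allocator;
    }

    void collect(byte[] bytes) {
      ByteBuffer slab = allocator.allocate(bytes.length);
      slab.put(bytes);
      slab.flip(); // switch the slab from writing to reading
      slabs.add(slab);
      size += bytes.length;
    }

    long size() {
      return size;
    }

    void writeAllTo(WritableByteChannel out) throws IOException {
      for (ByteBuffer slab : slabs) {
        ByteBuffer view = slab.duplicate(); // leave the slab's position untouched
        while (view.hasRemaining()) {
          out.write(view);
        }
      }
    }

    @Override public void close() {
      for (ByteBuffer slab : slabs) {
        allocator.release(slab);
      }
      slabs.clear();
      size = 0;
    }
  }

  /** Collects the given pages off-heap and returns them concatenated. */
  public static byte[] roundTrip(byte[]... pages) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (ByteBufferCollector collector = new ByteBufferCollector(new DirectAllocator())) {
      for (byte[] page : pages) {
        collector.collect(page);
      }
      collector.writeAllTo(Channels.newChannel(sink));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toByteArray();
  }

  public static void main(String[] args) {
    byte[] out = roundTrip(new byte[] {1, 2, 3}, new byte[] {4, 5});
    System.out.println("bytes written: " + out.length);
  }
}
```

Passing the allocator in through the constructor leaves the heap-versus-direct decision with the caller, which is the role ByteBufferAllocator already appears intended to play in ColumnChunkPageWriter.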



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
