[
https://issues.apache.org/jira/browse/PARQUET-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gang Wu updated PARQUET-1006:
-----------------------------
Fix Version/s: (was: 1.13.0)
> ColumnChunkPageWriter uses only heap memory.
> --------------------------------------------
>
> Key: PARQUET-1006
> URL: https://issues.apache.org/jira/browse/PARQUET-1006
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Affects Versions: 1.8.0, 1.12.0
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Priority: Major
>
> After PARQUET-160 was resolved, ColumnChunkPageWriter started using
> ConcatenatingByteArrayCollector, which collects all data in a List of
> byte[] before the page is written. There is no way to use direct memory
> for allocating these buffers. A ByteBufferAllocator is present in the
> [ColumnChunkPageWriter|https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageWriteStore.java#L73]
> class, but it is never used.
> Relying on Java heap space can, in some cases, cause OOM exceptions or GC
> overhead.
> ByteBufferAllocator should be used in the ConcatenatingByteArrayCollector or
> OutputStream classes.
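
A minimal sketch of the kind of change proposed above, assuming a collector that obtains its buffers from an allocator instead of keeping a List of byte[]. Every type here (SketchAllocator, DirectSketchAllocator, ByteBufferCollectorSketch) is a hypothetical stand-in written for illustration; none of it is the parquet-mr implementation or its ByteBufferAllocator interface.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical allocator abstraction (an assumption, not the parquet-mr interface). */
interface SketchAllocator {
  ByteBuffer allocate(int size);
  void release(ByteBuffer buffer);
}

/** Hands out off-heap buffers, so collected page data does not occupy Java heap. */
final class DirectSketchAllocator implements SketchAllocator {
  @Override
  public ByteBuffer allocate(int size) {
    return ByteBuffer.allocateDirect(size);
  }

  @Override
  public void release(ByteBuffer buffer) {
    // No-op in this sketch: the JVM reclaims direct memory once the buffer is unreachable.
  }
}

/** Hypothetical collector that stores slabs as allocator-provided ByteBuffers. */
final class ByteBufferCollectorSketch implements AutoCloseable {
  private final SketchAllocator allocator;
  private final List<ByteBuffer> slabs = new ArrayList<>();
  private long size;

  ByteBufferCollectorSketch(SketchAllocator allocator) {
    this.allocator = allocator;
  }

  /** Copies the bytes into a buffer obtained from the allocator. */
  void collect(byte[] bytes) {
    ByteBuffer slab = allocator.allocate(bytes.length);
    slab.put(bytes);
    slab.flip(); // switch the slab from writing to reading
    slabs.add(slab);
    size += bytes.length;
  }

  long size() {
    return size;
  }

  /** Writes all collected slabs to the stream in insertion order. */
  void writeAllTo(OutputStream out) throws IOException {
    WritableByteChannel channel = Channels.newChannel(out);
    for (ByteBuffer slab : slabs) {
      ByteBuffer view = slab.duplicate(); // leave the slab's own position untouched
      while (view.hasRemaining()) {
        channel.write(view);
      }
    }
  }

  /** Returns every slab to the allocator and resets the collector. */
  @Override
  public void close() {
    for (ByteBuffer slab : slabs) {
      allocator.release(slab);
    }
    slabs.clear();
    size = 0;
  }
}
```

With this shape, the choice between heap and direct memory is made by whichever allocator the caller passes in, and the collector itself never allocates a byte[] for page data.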
--
This message was sent by Atlassian Jira
(v8.20.10#820010)