Steven Paster created PARQUET-1465:
--------------------------------------
Summary: CLONE - Add a way to append encoded blocks in ParquetFileWriter
Key: PARQUET-1465
URL: https://issues.apache.org/jira/browse/PARQUET-1465
Project: Parquet
Issue Type: New Feature
Components: parquet-mr
Affects Versions: 1.8.0
Reporter: Steven Paster
Assignee: Ryan Blue
Fix For: 1.9.0, 1.8.2
Concatenating two files currently requires reading the source files and
rewriting their content from scratch. This uses a lot of memory, even when
the data is already encoded correctly and the blocks only need to be appended
and have their metadata updated. Merging two files should be fast and should
not need much memory.
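
To illustrate the idea of appending already-encoded blocks, here is a minimal sketch on a toy in-memory model. It is not the Parquet file format and not the ParquetFileWriter API: the names BlockAppendSketch, ToyFile, BlockMeta, writeBlock and appendBlocksFrom are hypothetical, and the per-block metadata is assumed to be just an offset and a length. The point it shows is that the block bytes are copied verbatim and only the offsets are rewritten.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class BlockAppendSketch {

    /** Hypothetical stand-in for per-block metadata: start offset and byte length. */
    static final class BlockMeta {
        final long offset;
        final int length;

        BlockMeta(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Hypothetical in-memory "file": encoded bytes plus metadata for each block. */
    static final class ToyFile {
        final ByteArrayOutputStream data = new ByteArrayOutputStream();
        final List<BlockMeta> blocks = new ArrayList<>();

        /** Write one already-encoded block at the current end of the file. */
        void writeBlock(byte[] encoded) {
            blocks.add(new BlockMeta(data.size(), encoded.length));
            data.write(encoded, 0, encoded.length);
        }

        /**
         * Append every block of src without decoding it: the bytes are copied
         * as-is and only the offset in the metadata changes. This toy version
         * materializes the source bytes in memory; a real implementation
         * would presumably stream them instead.
         */
        void appendBlocksFrom(ToyFile src) {
            byte[] srcBytes = src.data.toByteArray();
            for (BlockMeta b : src.blocks) {
                long newOffset = data.size();
                data.write(srcBytes, (int) b.offset, b.length);
                blocks.add(new BlockMeta(newOffset, b.length));
            }
        }
    }

    public static void main(String[] args) {
        ToyFile first = new ToyFile();
        first.writeBlock(new byte[] {1, 2, 3});

        ToyFile second = new ToyFile();
        second.writeBlock(new byte[] {4, 5});
        second.writeBlock(new byte[] {6});

        first.appendBlocksFrom(second);
        for (BlockMeta b : first.blocks) {
            System.out.println("block at offset " + b.offset + ", length " + b.length);
        }
    }
}
```

In this model the cost of a merge is proportional to the bytes copied, and no block is ever decoded or re-encoded, which is the behavior the issue asks for.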