Ryan Blue created PARQUET-382:
---------------------------------

             Summary: Add a way to append encoded blocks in ParquetFileWriter
                 Key: PARQUET-382
                 URL: https://issues.apache.org/jira/browse/PARQUET-382
             Project: Parquet
          Issue Type: New Feature
          Components: parquet-mr
    Affects Versions: 1.8.0
            Reporter: Ryan Blue
            Assignee: Ryan Blue

Concatenating two files currently requires reading the source files and rewriting their content from scratch. This uses a lot of memory, even when the data is already encoded correctly and the blocks only need to be appended and have their metadata updated. Merging two files should be fast and should not need much memory.
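
To make the idea concrete, here is a minimal sketch of appending already-encoded blocks without decoding them. It is not the parquet-mr API: `BlockMeta`, `BlockFile`, `BlockAppender`, `appendFile` and `finish` are hypothetical names, and the "file" is a toy in-memory byte array with a list of block offsets standing in for the footer. The point it illustrates is that each block's bytes are copied verbatim and only its offset in the metadata is rewritten.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical footer entry: where an encoded block starts, its size, its row count. */
final class BlockMeta {
    final long offset;
    final int length;
    final long rowCount;

    BlockMeta(long offset, int length, long rowCount) {
        this.offset = offset;
        this.length = length;
        this.rowCount = rowCount;
    }
}

/** Toy stand-in for a file: encoded block bytes plus the metadata describing them. */
final class BlockFile {
    final byte[] data;
    final List<BlockMeta> blocks;

    BlockFile(byte[] data, List<BlockMeta> blocks) {
        this.data = data;
        this.blocks = blocks;
    }
}

/** Appends blocks from existing files by copying their encoded bytes as-is. */
final class BlockAppender {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private final List<BlockMeta> footer = new ArrayList<>();

    /** Copies every block of src to the output and records its new offset. */
    void appendFile(BlockFile src) {
        for (BlockMeta m : src.blocks) {
            long newOffset = out.size();                      // position in the merged output
            out.write(src.data, (int) m.offset, m.length);    // raw copy, no decoding
            footer.add(new BlockMeta(newOffset, m.length, m.rowCount));
        }
    }

    /** Returns the merged bytes together with the updated metadata. */
    BlockFile finish() {
        return new BlockFile(out.toByteArray(), new ArrayList<>(footer));
    }
}
```

In this sketch the working set per block is just the copy buffer and one small metadata entry. A real implementation would presumably stream bytes between files rather than hold them in memory, and would also have to check that the files being merged have compatible schemas, which the toy ignores.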