Andrew Baranec created PARQUET-2184:
---------------------------------------
Summary: Improve SnappyCompressor buffer expansion performance
Key: PARQUET-2184
URL: https://issues.apache.org/jira/browse/PARQUET-2184
Project: Parquet
Issue Type: Improvement
Components: parquet-mr
Affects Versions: 1.13.0
Reporter: Andrew Baranec
The existing implementation of SnappyCompressor allocates only enough bytes
to hold the data passed into setInput(). This leads to suboptimal performance
when write patterns cause repeated buffer expansions. In the worst case, it
must copy the entire buffer on every invocation of setInput().

Instead of allocating a buffer of size current + write length, there should be
an expansion strategy that reduces the amount of copying required.
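
One possible expansion strategy is geometric growth: when the buffer is too
small, grow it to at least double its current capacity. The following is a
minimal sketch of that idea only, not the parquet-mr code. The class name
GrowableInputBuffer, its methods, and the use of a heap ByteBuffer are all
assumptions made for illustration; the growth factor of 2 is likewise just one
reasonable choice.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of a doubling growth policy.
// This is NOT the actual SnappyCompressor implementation.
final class GrowableInputBuffer {
    // Assumption: a heap buffer is used here to keep the sketch simple.
    private ByteBuffer buffer = ByteBuffer.allocate(0);
    private int reallocations = 0;

    // Appends len bytes from src, growing the buffer geometrically if needed.
    void append(byte[] src, int off, int len) {
        if (buffer.remaining() < len) {
            // Minimum capacity needed to hold existing data plus the new write.
            int required = Math.addExact(buffer.position(), len);
            // Grow to at least twice the current capacity, so later small
            // writes usually fit without another copy.
            long doubled = 2L * buffer.capacity();
            int newCapacity = (int) Math.min(Integer.MAX_VALUE, Math.max(required, doubled));

            ByteBuffer bigger = ByteBuffer.allocate(newCapacity);
            buffer.flip();       // switch the old buffer to read mode
            bigger.put(buffer);  // copy the existing bytes once
            buffer = bigger;
            reallocations++;
        }
        buffer.put(src, off, len);
    }

    int size() { return buffer.position(); }

    int capacity() { return buffer.capacity(); }

    int reallocations() { return reallocations; }
}
```

With a policy like this, the number of reallocations grows logarithmically
with the total input size instead of once per write, so the total bytes copied
stay proportional to the final buffer size. The trade-off is that up to half
of the allocated capacity may go unused.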