Ignalina commented on issue #9835:
URL: https://github.com/apache/arrow-rs/issues/9835#issuecomment-4513457017
I have a compression archive format (znippy) that stores already-compressed
files (such as .jar and .gz) directly as LargeBinary in an Arrow IPC stream.
On a 32-core AMD machine with a PCIe 4.0 NVMe drive:
- Direct write_all() to NVMe: 2775 MB/s
- Arrow IPC StreamWriter: 777 MB/s
The bottleneck is write_buffer() at writer.rs:2083:
arrow_data.extend_from_slice(buffer);
A path that avoids this redundant copy should recover most or all of the gap
to the 2775 MB/s direct-write figure.
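
To make the cost concrete, here is a minimal sketch of the two write shapes using only the Rust standard library. It is not the arrow-rs code: write_staged and write_direct are hypothetical names, the sink is any std::io::Write, and IPC framing, metadata, and alignment padding are deliberately left out. The staged version imitates the extend_from_slice pattern, where every payload byte is copied into an intermediate Vec before it reaches the sink.

```rust
use std::io::{self, Write};

/// Staged path (hypothetical helper): copy every buffer into an intermediate
/// Vec, then write that Vec once. This imitates the shape of
/// `arrow_data.extend_from_slice(buffer)`; it is not the arrow-rs implementation.
fn write_staged<W: Write>(sink: &mut W, buffers: &[&[u8]]) -> io::Result<usize> {
    let mut staging: Vec<u8> = Vec::new();
    for buffer in buffers {
        // Extra memcpy of every payload byte before any I/O happens.
        staging.extend_from_slice(buffer);
    }
    sink.write_all(&staging)?;
    Ok(staging.len())
}

/// Direct path (hypothetical helper): hand each buffer to the sink as-is,
/// with no intermediate staging copy.
fn write_direct<W: Write>(sink: &mut W, buffers: &[&[u8]]) -> io::Result<usize> {
    let mut written = 0;
    for buffer in buffers {
        sink.write_all(buffer)?;
        written += buffer.len();
    }
    Ok(written)
}
```

Both functions produce identical bytes in the sink; they differ only in whether the payload passes through a second in-memory buffer first. For large, already-compressed LargeBinary values that extra pass is pure overhead, which is the effect described above.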
https://github.com/Ignalina/znippy
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]