tustvold opened a new issue, #6309:
URL: https://github.com/apache/arrow-rs/issues/6309

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   The ArrowWriter buffers encoded data pages as `RecordBatch`es are submitted, 
so as to keep a lid on memory usage - 
https://github.com/apache/arrow-rs/pull/4280.
   
   It then calls 
[SerializedRowGroupWriter::append_column](https://docs.rs/parquet/latest/parquet/file/writer/struct.SerializedRowGroupWriter.html#method.append_column)
 to write these buffered pages. This reads the data as `Bytes` using 
`ChunkReader` and then writes them out to the underlying `Write`.
   
   AsyncArrowWriter also uses this. However, although `AsyncFileWriter` 
supports `Bytes`, which is what the data has been buffered as, the use of the 
slice-based `Write` forces an unnecessary copy.
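
To illustrate where the copy comes from, here is a minimal, std-only sketch. It is not the parquet API: `Arc<[u8]>` stands in for `bytes::Bytes`, and `SharedBuf`, `OwnedSink`, `write_owned` and `write_via_slice` are hypothetical names made up for this example. It only shows that a slice-based `Write` has to copy the bytes, whereas a sink that accepts an owned, reference-counted buffer can keep the same allocation.

```rust
use std::io::Write;
use std::sync::Arc;

/// Stand-in for `bytes::Bytes` (assumption): a cheaply clonable,
/// reference-counted byte buffer.
type SharedBuf = Arc<[u8]>;

/// Hypothetical sink that takes ownership of buffers, loosely modelling
/// a writer that accepts `Bytes` directly.
#[derive(Default)]
struct OwnedSink {
    chunks: Vec<SharedBuf>,
}

impl OwnedSink {
    /// Stores the buffer handle itself; no bytes are copied.
    fn write_owned(&mut self, buf: SharedBuf) {
        self.chunks.push(buf);
    }
}

/// Slice-based path: `Write` only ever sees `&[u8]`, so the destination
/// must copy the bytes into its own storage.
fn write_via_slice<W: Write>(w: &mut W, page: &SharedBuf) -> std::io::Result<()> {
    w.write_all(page)
}

fn main() -> std::io::Result<()> {
    // A buffered "page", already held as a shared buffer.
    let page: SharedBuf = Arc::from(vec![1u8, 2, 3, 4]);

    // Path 1: through `Write`. The Vec ends up with its own copy.
    let mut copied: Vec<u8> = Vec::new();
    write_via_slice(&mut copied, &page)?;
    assert_eq!(&copied[..], &page[..]);
    assert_ne!(copied.as_ptr(), page[..].as_ptr());

    // Path 2: hand over a clone of the handle. Same allocation, no copy.
    let mut sink = OwnedSink::default();
    sink.write_owned(Arc::clone(&page));
    assert!(Arc::ptr_eq(&sink.chunks[0], &page));

    Ok(())
}
```

In this sketch the second path only bumps a reference count, which is the behaviour a `Bytes`-aware write path could in principle preserve.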
   
   **Describe the solution you'd like**
   
   Some mechanism to avoid needing to perform this copy.
   
   **Describe alternatives you've considered**
   
   We could not do this. In practice the overhead of any network IO is likely 
to massively dominate that of a single memcpy, but I am creating this issue to 
document it.
   
   **Additional context**
   

