zenyanle opened a new pull request, #7098:
URL: https://github.com/apache/opendal/pull/7098
Previously, `write_from` called `copy_to_bytes` with the full remaining
length. If the input `Buf` was non-contiguous (e.g., a `Chain`), this forced a
deep copy to merge memory into a single contiguous block.
This commit optimizes the implementation by:
1. Introducing a fast path for contiguous buffers to avoid allocating a `Vec`.
2. Iterating over chunks for non-contiguous buffers, collecting them into a
`Vec<Bytes>` to preserve zero-copy behavior where possible.
The fix has been verified with added unit tests. Local validation with the
`bytes-utils` crate's `SegmentedBuf` also confirmed the expected zero-copy
behavior.
# Which issue does this PR close?
Closes # (fill in issue number if exists)
# Rationale for this change
When using `write_from` with chained buffers like `Chain<Bytes, Bytes>`, the
previous implementation called `copy_to_bytes(remaining())` on the entire
buffer. For `Chain`, this triggers a deep copy because `Chain::copy_to_bytes`
must allocate new memory to merge non-contiguous chunks into a single
contiguous block.
Because `copy_to_bytes` on `Bytes` is zero-copy (it only increments the
reference count) when the requested length falls within a single chunk, we can
preserve this property by extracting each chunk individually. Each `Bytes`
chunk is then handed over without copying its data.
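The cost difference between the two strategies can be illustrated without the `bytes` crate. The sketch below is a simplified model, not OpenDAL or `bytes` code: `Arc<[u8]>` stands in for a reference-counted `Bytes` chunk, and the helper names `merge` and `share_each` are hypothetical.

```rust
use std::sync::Arc;

/// Models merging non-contiguous chunks into one contiguous block:
/// every byte is copied into a fresh allocation.
/// (Hypothetical helper; `Arc<[u8]>` is an assumed stand-in for `Bytes`.)
fn merge(chunks: &[Arc<[u8]>]) -> Arc<[u8]> {
    let total: usize = chunks.iter().map(|c| c.len()).sum();
    let mut out: Vec<u8> = Vec::with_capacity(total);
    for c in chunks {
        out.extend_from_slice(c);
    }
    Arc::from(out)
}

/// Models taking each chunk individually: no bytes are copied,
/// only the reference count of each chunk is incremented.
fn share_each(chunks: &[Arc<[u8]>]) -> Vec<Arc<[u8]>> {
    chunks.iter().map(Arc::clone).collect()
}
```

In this model, every handle returned by `share_each` satisfies `Arc::ptr_eq` against its original chunk, while the block returned by `merge` lives in newly allocated memory.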
# What changes are included in this PR?
- Optimized `Writer::write_from` with two code paths:
  - **Fast path**: When `chunk().len() == remaining()`, the buffer is
    contiguous, so we directly call `copy_to_bytes(remaining())` without `Vec`
    allocation.
  - **Slow path**: For non-contiguous buffers, iterate using `while
    has_remaining()` and collect chunks via `copy_to_bytes(chunk().len())`,
    preserving zero-copy for `Bytes` chunks.
- Added unit test `test_writer_write_from_chain` to verify the behavior with
  chained buffers.
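The branch structure of the two code paths can be sketched as follows. This is a simplified, standard-library-only model rather than the PR's actual code: `Segments`, `Collected`, and `collect` are hypothetical names, `Arc<[u8]>` stands in for `Bytes`, and `chunk_len` and `take_chunk` only approximate `chunk().len()` and `copy_to_bytes(chunk().len())`.

```rust
use std::collections::VecDeque;
use std::sync::Arc;

/// Hypothetical stand-in for a possibly non-contiguous `Buf`:
/// a queue of reference-counted segments.
struct Segments(VecDeque<Arc<[u8]>>);

impl Segments {
    /// Bytes left across all segments (models `remaining()`).
    fn remaining(&self) -> usize {
        self.0.iter().map(|s| s.len()).sum()
    }

    /// Length of the current contiguous chunk (models `chunk().len()`).
    fn chunk_len(&self) -> usize {
        self.0.front().map_or(0, |s| s.len())
    }

    /// Hands out the current chunk without copying its bytes.
    fn take_chunk(&mut self) -> Option<Arc<[u8]>> {
        self.0.pop_front()
    }
}

/// Result of collecting: one block, or a list of blocks.
enum Collected {
    Single(Arc<[u8]>),
    Many(Vec<Arc<[u8]>>),
}

fn collect(mut buf: Segments) -> Collected {
    // Fast path: the first chunk covers everything that remains,
    // so no Vec is allocated.
    if buf.chunk_len() == buf.remaining() {
        let only = buf
            .take_chunk()
            .unwrap_or_else(|| Arc::from(Vec::<u8>::new()));
        return Collected::Single(only);
    }

    // Slow path: walk the chunks one at a time; each one is shared,
    // not copied.
    let mut parts = Vec::new();
    while let Some(chunk) = buf.take_chunk() {
        parts.push(chunk);
    }
    Collected::Many(parts)
}
```

Under these assumptions, a single-segment input yields `Collected::Single` pointing at the original allocation, and a two-segment input yields `Collected::Many` whose entries each share memory with the corresponding input segment.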
# Are there any user-facing changes?
No breaking changes. This is a performance optimization that reduces memory
copies when using `write_from` with non-contiguous `Buf` implementations. The
API remains unchanged.
# AI Usage Statement
This PR was developed with assistance from GitHub Copilot.