[I] Serialization performance tuning how-to? [avro-rs]

via GitHub Tue, 28 Oct 2025 00:47:59 -0700


brent-statsig opened a new issue, #322:
URL: https://github.com/apache/avro-rs/issues/322


   I'm writing a service that downloads `n` files concurrently, then writes 
them to a single Avro file.
   - Is there a recommended way to serialize data in parallel? I see that 
`Writer` calls `maybe_write_header` in all public append APIs, as well as in 
`into_inner`, which makes it hard to get just the raw bytes of serialized rows 
without the header attached. The raw `Serializer` impl in `ser.rs` doesn't 
appear to be public either. What is the recommended way to split serialization 
work across cores?
   - Per-value schema validation on append is expensive. It would be really 
nice to have compile-time flags so it can be stripped out for production, or a 
configurable sampling rate so that some runtime safety is retained.
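
   To make the first question concrete, here is a minimal, std-only sketch of 
the split I have in mind. It does not use any avro-rs API: `encode_long` is a 
hand-written stand-in for a per-row encoder (it follows the Avro binary 
encoding for `long`, zigzag plus varint), and `encode_chunks_parallel` is a 
hypothetical helper name. Each worker produces a headerless byte buffer, and 
the buffers are concatenated in input order by a single writer.

```rust
use std::thread;

/// Stand-in for a per-row encoder (NOT an avro-rs API): Avro binary
/// encoding of a `long`, i.e. zigzag followed by a base-128 varint.
fn encode_long(n: i64, out: &mut Vec<u8>) {
    let mut z = ((n << 1) ^ (n >> 63)) as u64;
    while z >= 0x80 {
        out.push(((z as u8) & 0x7f) | 0x80);
        z >>= 7;
    }
    out.push(z as u8);
}

/// Hypothetical helper: encode `rows` on up to `workers` threads and
/// return one headerless buffer per chunk, in input order.
fn encode_chunks_parallel(rows: &[i64], workers: usize) -> Vec<Vec<u8>> {
    if rows.is_empty() {
        return Vec::new();
    }
    let workers = workers.max(1);
    let chunk_len = (rows.len() + workers - 1) / workers;
    thread::scope(|s| {
        // Spawn every worker first, then join in spawn order so the
        // output order matches the input order.
        let handles: Vec<_> = rows
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    let mut buf = Vec::new();
                    for &row in chunk {
                        encode_long(row, &mut buf);
                    }
                    buf
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("encoder thread panicked"))
            .collect()
    })
}

fn main() {
    let rows: Vec<i64> = (-1000..1000).collect();
    let parts = encode_chunks_parallel(&rows, 4);
    // A single writer would emit the file header once, then append these bytes.
    let body: Vec<u8> = parts.concat();
    println!("{} rows -> {} bytes in {} chunks", rows.len(), body.len(), parts.len());
}
```

   What I'm missing is a supported way to get the equivalent of these 
per-chunk buffers out of the crate for real records, so that only the final 
append has to be single-threaded.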


