adriangb commented on code in PR #7015:
URL: https://github.com/apache/arrow-rs/pull/7015#discussion_r1992326453
##########
arrow-json/src/writer/encoder.rs:
##########
@@ -196,9 +236,18 @@ impl Encoder for StructArrayEncoder<'_> {
let mut is_first = true;
// Nulls can only be dropped in explicit mode
let drop_nulls = (self.struct_mode == StructMode::ObjectOnly) &&
!self.explicit_nulls;
- for field_encoder in &mut self.encoders {
- let is_null = is_some_and(field_encoder.nulls.as_ref(), |n|
n.is_null(idx));
- if drop_nulls && is_null {
+
+ // Collect all the field nulls buffers up front to avoid dynamic
dispatch in the loop
+ // This creates a temporary Vec, but avoids repeated virtual calls
which should be a net win
Review Comment:
I had run it and there was a measurable (~30%) slowdown. This makes sense,
it was doing 2 dynamic dispatch calls for every row instead of just 1.
I've now refactored this to avoid the dynamic dispatch altogether by
encapsulating the dynamic encoder + null buffer in a helper struct. So it's
basically the same as before except that things get passed around as a struct
instead of a tuple of unrelated values. I don't love the name
`EncoderWithNullBuffer`, open to suggestions.
Benchmarks now show no difference vs. `main`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]