alamb commented on code in PR #9093:
URL: https://github.com/apache/arrow-rs/pull/9093#discussion_r3093236379
##########
parquet/src/arrow/array_reader/fixed_len_byte_array.rs:
##########
@@ -284,6 +291,15 @@ fn move_values<F>(
}
impl ValuesBuffer for FixedLenByteArrayBuffer {
+ fn with_capacity(_capacity: usize) -> Self {
+ // byte_length is not known at trait level, so we return a default buffer
+ // The decoder will pre-allocate when it knows both capacity and byte_length
+ Self {
+ buffer: Vec::new(),
+ byte_length: None,
+ }
+ }
+
fn pad_nulls(
Review Comment:
And the matching use-site change would be:
```suggestion
None => {
    out.byte_length = Some(self.byte_length);
    if out.buffer.is_empty()
        && let Some(values_capacity) = out.values_capacity.take()
    {
        // Now that the byte length per output element is known,
        // allocate the actual needed space.
        let byte_capacity = values_capacity.saturating_mul(self.byte_length);
        out.buffer = Vec::with_capacity(byte_capacity);
    }
}
```
##########
parquet/src/arrow/array_reader/fixed_len_byte_array.rs:
##########
@@ -284,6 +291,15 @@ fn move_values<F>(
}
impl ValuesBuffer for FixedLenByteArrayBuffer {
+ fn with_capacity(_capacity: usize) -> Self {
+ // byte_length is not known at trait level, so we return a default buffer
+ // The decoder will pre-allocate when it knows both capacity and byte_length
+ Self {
+ buffer: Vec::new(),
+ byte_length: None,
+ }
+ }
+
fn pad_nulls(
Review Comment:
You could potentially improve the allocation in the fixed size buffer too
by keeping the requested value count as `values_capacity` and deferring the
allocation until the byte length is known:
```suggestion
fn with_capacity(capacity: usize) -> Self {
    // `byte_length` is not known initially, so preserve the value-count
    // hint so the first decode can allocate the exact byte capacity.
    Self {
        buffer: Vec::new(),
        byte_length: None,
        values_capacity: Some(capacity),
    }
}
```
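For illustration, here is a minimal standalone sketch of how the two suggestions could fit together. It is not the arrow-rs code: `DeferredBuffer` and `set_byte_length` are hypothetical names standing in for `FixedLenByteArrayBuffer` and the `None => { ... }` arm of the decoder, and it uses a nested `if let` instead of a let chain so it compiles on older editions.

```rust
/// Hypothetical stand-in for `FixedLenByteArrayBuffer` (not the arrow-rs type).
#[derive(Debug)]
struct DeferredBuffer {
    buffer: Vec<u8>,
    byte_length: Option<usize>,
    /// Value-count hint, kept only until `byte_length` is known.
    values_capacity: Option<usize>,
}

impl DeferredBuffer {
    /// Records the requested number of values without allocating yet,
    /// because the width of each value is still unknown.
    fn with_capacity(capacity: usize) -> Self {
        Self {
            buffer: Vec::new(),
            byte_length: None,
            values_capacity: Some(capacity),
        }
    }

    /// Hypothetical helper: called by the decoder once it knows the width.
    fn set_byte_length(&mut self, byte_length: usize) {
        match self.byte_length {
            // Width already recorded: it must not change between reads.
            Some(existing) => assert_eq!(existing, byte_length),
            None => {
                self.byte_length = Some(byte_length);
                if self.buffer.is_empty() {
                    // `take()` consumes the hint so it is applied only once.
                    if let Some(values_capacity) = self.values_capacity.take() {
                        let byte_capacity = values_capacity.saturating_mul(byte_length);
                        self.buffer = Vec::with_capacity(byte_capacity);
                    }
                }
            }
        }
    }
}
```

With this shape, `DeferredBuffer::with_capacity(1024)` allocates nothing up front, and the first `set_byte_length(16)` call reserves at least `1024 * 16` bytes in a single allocation.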