nevi-me commented on a change in pull request #7306:
URL: https://github.com/apache/arrow/pull/7306#discussion_r432927349
##########
File path: rust/arrow/src/array/builder.rs
##########
@@ -500,6 +500,49 @@ impl<T: ArrowPrimitiveType> PrimitiveBuilder<T> {
Ok(())
}
+ /// Appends values from a slice of type `T` and a validity byte slice
+ pub fn append_values(
+ &mut self,
+ values: &[T::Native],
+ is_valid: &[u8],
+ offset: usize,
+ ) -> Result<()> {
Review comment:
I've given it more thought, @houqp. I think adding offsets is out of scope
for this function, because the JIRA was originally created to address the
need to manually pass primitive data together with its validity map
(`append_slice()`, but with validity).
I'm thinking a simpler function like the one below could work:
```rust
/// Appends values from a slice of type `T` and a validity boolean slice
pub fn append_values(
    &mut self,
    values: &[T::Native],
    is_valid: &[bool],
) -> Result<()> {
    if values.len() != is_valid.len() {
        return Err(ArrowError::InvalidArgumentError(
            "Value and validity lengths must be equal".to_string(),
        ));
    }
    self.bitmap_builder.append_slice(is_valid)?;
    self.values_builder.append_slice(values)
}
```
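To make the intended behaviour concrete, here is a rough, self-contained sketch
of the same idea. `ToyBuilder` and `null_count` are hypothetical stand-ins, not
Arrow types: plain `Vec`s replace the real buffer builders and a `String`
replaces `ArrowError`, so this only illustrates the length check and the
append order, not the actual implementation.

```rust
// Hypothetical stand-in for PrimitiveBuilder<T>. It uses plain Vecs instead
// of Arrow's buffer builders so that the sketch compiles on its own.
#[derive(Debug)]
struct ToyBuilder<T> {
    values: Vec<T>,
    validity: Vec<bool>,
}

impl<T: Copy> ToyBuilder<T> {
    fn new() -> Self {
        ToyBuilder {
            values: Vec::new(),
            validity: Vec::new(),
        }
    }

    /// Appends values and their validity flags, rejecting mismatched lengths
    /// before anything is written.
    fn append_values(&mut self, values: &[T], is_valid: &[bool]) -> Result<(), String> {
        if values.len() != is_valid.len() {
            return Err("Value and validity lengths must be equal".to_string());
        }
        // Both appends are bulk copies of the caller's slices.
        self.validity.extend_from_slice(is_valid);
        self.values.extend_from_slice(values);
        Ok(())
    }

    /// Counts slots whose validity flag is false (helper for illustration only).
    fn null_count(&self) -> usize {
        self.validity.iter().filter(|valid| !**valid).count()
    }
}
```

With this sketch, appending `&[1, 2, 3]` with `&[true, false, true]` succeeds
and leaves one null slot, while a call with one value and two validity flags
returns an error and leaves the builder unchanged.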
Then we can implement an `append_data(&mut self, data: ArrayDataRef) ->
Result<()>;` which generalises over all builder types (with booleans and other
types that need specialisation handled separately).
The problem with changing `append_values` above to take `ArrayDataRef` is
that there's no guarantee that the underlying data type would meet the `T:
ArrowPrimitiveType` constraint.
@sunchao @paddyhoran this originates from an old comment
(https://github.com/apache/arrow/pull/2858/files#r228808948) where the
suggestion was to use `memcpy` to efficiently achieve the above. If `memcpy`
would work, perhaps it should be applied to `append_slice`? Also, are you fine
with `is_valid: &[bool]` instead of it being bit-packed? If it's packed, it's
less convenient for everyone: the end user has to build the packed validity
map, and we have to unpack it.
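To illustrate the extra work a bit-packed `is_valid` would involve, here is a
minimal sketch of both directions. The helper names `pack_validity` and
`unpack_validity` are hypothetical and not part of the Arrow API; the sketch
assumes the least-significant-bit-first ordering that Arrow validity bitmaps
use.

```rust
/// Packs one bool per slot into bytes, least-significant bit first.
/// This is what a caller would have to do before passing a packed map.
fn pack_validity(is_valid: &[bool]) -> Vec<u8> {
    // Round up so that a partial final byte is still allocated.
    let mut packed = vec![0u8; (is_valid.len() + 7) / 8];
    for (i, &valid) in is_valid.iter().enumerate() {
        if valid {
            packed[i / 8] |= 1u8 << (i % 8);
        }
    }
    packed
}

/// Unpacks the first `len` bits back into one bool per slot.
/// This is what the builder would have to do on its side.
/// Panics if `packed` holds fewer than `len` bits.
fn unpack_validity(packed: &[u8], len: usize) -> Vec<bool> {
    (0..len)
        .map(|i| (packed[i / 8] & (1u8 << (i % 8))) != 0)
        .collect()
}
```

Note that the packed form cannot carry its own length, since the last byte may
be only partly used, so the caller would also have to pass the number of slots
separately. With `&[bool]` the slice length already provides that.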