Jefffrey commented on code in PR #9988:
URL: https://github.com/apache/arrow-rs/pull/9988#discussion_r3486910743
##########
arrow-data/src/data.rs:
##########
@@ -791,6 +794,179 @@ impl ArrayData {
unsafe { builder.build_unchecked() }
}
+ /// Returns a new [`ArrayData`] valid for `data_type` containing `len`
+ /// default (non-null) values.
+ ///
+ /// Unlike [`Self::new_null`], the returned array never has a null mask;
+ /// its values are the zero / empty representation for the type:
+ ///
+ /// * primitive types: a zeroed values buffer (e.g. `0` for ints, `0.0` for floats)
+ /// * `Boolean`: a zeroed bitmap (all `false`)
+ /// * `Binary` / `Utf8` / `LargeBinary` / `LargeUtf8` / view variants: empty
+ /// byte slices
+ /// * `FixedSizeBinary(n)`: `n` zero bytes per element
+ /// * `List` / `LargeList` / `Map` / `ListView` / `LargeListView`: empty inner
+ /// lists (offsets/sizes all zero, child is empty)
+ /// * `FixedSizeList`: recursively default-filled child of length `list_len * len`
+ /// * `Struct`: each child field is recursively default-filled
+ /// * `Union`: in `Sparse` mode every child is default-filled; in `Dense` mode
+ /// only the first child is default-filled and the rest are empty
+ /// * `RunEndEncoded`: a single run of `len` default values
+ /// * `Dictionary`: keys all point to index `0`, which holds a single
+ /// default value (or no values, if `len == 0`)
+ pub fn new_default(data_type: &DataType, len: usize) -> Self {
Review Comment:
I do wonder about having this as a new public API, as it seems like a bit of a niche use case 🤔
How often would users need an array that is initialized to "default" values?
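
For reference, a minimal std-only sketch of two of the layouts the doc comment describes: a zeroed fixed-width values buffer, and all-zero offsets over an empty data buffer for variable-length values. This is not the arrow-rs implementation; `default_primitive`, `default_var_binary`, and `var_binary_value` are hypothetical names used only for illustration, and the sketch assumes 32-bit offsets as used by `Binary` / `Utf8`.

```rust
// Std-only model of two "default" layouts; NOT the arrow-rs implementation.
// All names here are hypothetical and exist only for illustration.

/// Fixed-width primitive values: `len * width` zero bytes, no null mask.
fn default_primitive(width: usize, len: usize) -> Vec<u8> {
    vec![0u8; width * len]
}

/// Variable-length binary/string values: `len + 1` zero offsets and an
/// empty data buffer, so every element is an empty byte slice.
fn default_var_binary(len: usize) -> (Vec<i32>, Vec<u8>) {
    (vec![0i32; len + 1], Vec::new())
}

/// Reads element `i` back out of an offsets + data pair.
fn var_binary_value<'a>(offsets: &[i32], data: &'a [u8], i: usize) -> &'a [u8] {
    &data[offsets[i] as usize..offsets[i + 1] as usize]
}

fn main() {
    // Three i32 slots: 12 zero bytes, each slot decoding to 0.
    let ints = default_primitive(4, 3);
    let first = i32::from_le_bytes([ints[0], ints[1], ints[2], ints[3]]);
    println!("{} bytes, first value = {}", ints.len(), first);

    // Three strings: every offset is 0, so every slice is empty.
    let (offsets, data) = default_var_binary(3);
    let second = var_binary_value(&offsets, &data, 1);
    println!("offsets = {:?}, value[1] = {:?}", offsets, second);
}
```

Both layouts need only a zero-filled allocation and no validity bitmap, which matches the "never has a null mask" wording in the doc comment.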