alamb commented on a change in pull request #923:
URL: https://github.com/apache/arrow-rs/pull/923#discussion_r745032837
##########
File path: arrow/src/datatypes/field.rs
##########
@@ -107,6 +107,47 @@ impl Field {
self.nullable
}
+ /// Returns a (flattened) vector containing all fields contained within
this field (including it self)
+ pub(crate) fn fields(&self) -> Vec<&Field> {
+ let mut collected_fields = vec![self];
+ match &self.data_type {
+ DataType::Struct(fields) | DataType::Union(fields) =>
collected_fields
+ .append(&mut fields.iter().map(|f|
f.fields()).flatten().collect()),
Review comment:
I think you can save an intermediate `collect` here by doing something
like:
```suggestion
DataType::Struct(fields) | DataType::Union(fields) =>
collected_fields
.extend(fields.iter().map(|f| f.fields()).flatten()),
```
Probably doesn't make any practical difference, mostly a stylistic difference
##########
File path: arrow/src/datatypes/field.rs
##########
@@ -107,6 +107,47 @@ impl Field {
self.nullable
}
+ /// Returns a (flattened) vector containing all fields contained within
this field (including it self)
+ pub(crate) fn fields(&self) -> Vec<&Field> {
+ let mut collected_fields = vec![self];
+ match &self.data_type {
+ DataType::Struct(fields) | DataType::Union(fields) =>
collected_fields
+ .append(&mut fields.iter().map(|f|
f.fields()).flatten().collect()),
+ DataType::List(field)
+ | DataType::LargeList(field)
+ | DataType::FixedSizeList(field, _)
+ | DataType::Map(field, _) => collected_fields.push(field),
+ _ => (),
+ }
+
+ collected_fields
+ }
+
+ /// Adds all (potentially nested) `Field` instances selected by the
dictionary ID they use into
+ /// the `dictfields` vector.
+ fn collect_fields_with_dict_id<'a>(
+ &'a self,
+ dictfields: &'_ mut Vec<&'a Field>,
+ id: i64,
+ ) {
+ for field in self.fields() {
+ if let DataType::Dictionary(_, _) = field.data_type() {
+ if field.dict_id == id {
+ dictfields.push(field);
+ }
+ }
+ }
+ }
+
+ /// Returns a vector containing all (potentially nested) `Field` instances
selected by the
+ /// dictionary ID they use
+ #[inline]
+ pub(crate) fn fields_with_dict_id(&self, id: i64) -> Vec<&Field> {
+ let mut fields = Vec::new();
+ self.collect_fields_with_dict_id(&mut fields, id);
+ fields
+ }
Review comment:
If you are into the Rust style of functional programming, I think you
can avoid the use of `mut` and intermediate `Vec`s by using
https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.filter_map
Something like
```suggestion
/// Returns a vector containing all (potentially nested) `Field`
instances selected by the
/// dictionary ID they use
#[inline]
pub(crate) fn fields_with_dict_id(&self, id: i64) -> Vec<&Field> {
self.fields()
.into_iter()
.filter_map(|field| {
match field.data_type() {
DataType::Dictionary(_, _) if field.dict_id == id => {
Some(field)
}
_ => None
}
})
.collect()
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]