alamb commented on a change in pull request #923: URL: https://github.com/apache/arrow-rs/pull/923#discussion_r745032837
########## File path: arrow/src/datatypes/field.rs ########## @@ -107,6 +107,47 @@ impl Field { self.nullable } + /// Returns a (flattened) vector containing all fields contained within this field (including it self) + pub(crate) fn fields(&self) -> Vec<&Field> { + let mut collected_fields = vec![self]; + match &self.data_type { + DataType::Struct(fields) | DataType::Union(fields) => collected_fields + .append(&mut fields.iter().map(|f| f.fields()).flatten().collect()), Review comment: I think you can save an intermediate `collect` here by doing something like: ```suggestion DataType::Struct(fields) | DataType::Union(fields) => collected_fields .extend(fields.iter().map(|f| f.fields()).flatten()), ``` Probably doesn't make any practical difference, mostly a stylistic difference ########## File path: arrow/src/datatypes/field.rs ########## @@ -107,6 +107,47 @@ impl Field { self.nullable } + /// Returns a (flattened) vector containing all fields contained within this field (including it self) + pub(crate) fn fields(&self) -> Vec<&Field> { + let mut collected_fields = vec![self]; + match &self.data_type { + DataType::Struct(fields) | DataType::Union(fields) => collected_fields + .append(&mut fields.iter().map(|f| f.fields()).flatten().collect()), + DataType::List(field) + | DataType::LargeList(field) + | DataType::FixedSizeList(field, _) + | DataType::Map(field, _) => collected_fields.push(field), + _ => (), + } + + collected_fields + } + + /// Adds all (potentially nested) `Field` instances selected by the dictionary ID they use into + /// the `dictfields` vector. + fn collect_fields_with_dict_id<'a>( + &'a self, + dictfields: &'_ mut Vec<&'a Field>, + id: i64, + ) { + for field in self.fields() { + if let DataType::Dictionary(_, _) = field.data_type() { + if field.dict_id == id { + dictfields.push(field); + } + } + } + } + + /// Returns a vector containing all (potentially nested) `Field` instances selected by the + /// dictionary ID they use + #[inline] + pub(crate) fn fields_with_dict_id(&self, id: i64) -> Vec<&Field> { + let mut fields = Vec::new(); + self.collect_fields_with_dict_id(&mut fields, id); + fields + } Review comment: If you are into the Rust style of functional programming, I think you can avoid the use of `mut` and intermediate `Vec`s by using https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.filter_map Something like ```suggestion /// Returns a vector containing all (potentially nested) `Field` instances selected by the /// dictionary ID they use #[inline] pub(crate) fn fields_with_dict_id(&self, id: i64) -> Vec<&Field> { self.fields() .into_iter() .filter_map(|field| { match field.data_type() { DataType::Dictionary(_, _) if field.dict_id == id => { Some(field) } _ => None } }) .collect() } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org