alamb commented on a change in pull request #923:
URL: https://github.com/apache/arrow-rs/pull/923#discussion_r745032837



##########
File path: arrow/src/datatypes/field.rs
##########
@@ -107,6 +107,47 @@ impl Field {
         self.nullable
     }
 
+    /// Returns a (flattened) vector containing all fields contained within 
this field (including it self)
+    pub(crate) fn fields(&self) -> Vec<&Field> {
+        let mut collected_fields = vec![self];
+        match &self.data_type {
+            DataType::Struct(fields) | DataType::Union(fields) => 
collected_fields
+                .append(&mut fields.iter().map(|f| 
f.fields()).flatten().collect()),

Review comment:
       I think you can save an intermediate `collect` here by doing something 
like:
   
   ```suggestion
               DataType::Struct(fields) | DataType::Union(fields) => 
collected_fields
                   .extend(fields.iter().map(|f| f.fields()).flatten()),
   ```
   
   Probably doesn't make any practical difference, mostly a stylistic difference

##########
File path: arrow/src/datatypes/field.rs
##########
@@ -107,6 +107,47 @@ impl Field {
         self.nullable
     }
 
+    /// Returns a (flattened) vector containing all fields contained within 
this field (including it self)
+    pub(crate) fn fields(&self) -> Vec<&Field> {
+        let mut collected_fields = vec![self];
+        match &self.data_type {
+            DataType::Struct(fields) | DataType::Union(fields) => 
collected_fields
+                .append(&mut fields.iter().map(|f| 
f.fields()).flatten().collect()),
+            DataType::List(field)
+            | DataType::LargeList(field)
+            | DataType::FixedSizeList(field, _)
+            | DataType::Map(field, _) => collected_fields.push(field),
+            _ => (),
+        }
+
+        collected_fields
+    }
+
+    /// Adds all (potentially nested) `Field` instances selected by the 
dictionary ID they use into
+    /// the `dictfields` vector.
+    fn collect_fields_with_dict_id<'a>(
+        &'a self,
+        dictfields: &'_ mut Vec<&'a Field>,
+        id: i64,
+    ) {
+        for field in self.fields() {
+            if let DataType::Dictionary(_, _) = field.data_type() {
+                if field.dict_id == id {
+                    dictfields.push(field);
+                }
+            }
+        }
+    }
+
+    /// Returns a vector containing all (potentially nested) `Field` instances 
selected by the
+    /// dictionary ID they use
+    #[inline]
+    pub(crate) fn fields_with_dict_id(&self, id: i64) -> Vec<&Field> {
+        let mut fields = Vec::new();
+        self.collect_fields_with_dict_id(&mut fields, id);
+        fields
+    }

Review comment:
       If you are into the Rust style of functional programming, I think you 
can avoid the use of `mut` and intermediate `Vec`s by using 
https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.filter_map
   
   Something like
   
   ```suggestion
       /// Returns a vector containing all (potentially nested) `Field` 
instances selected by the
       /// dictionary ID they use
       #[inline]
       pub(crate) fn fields_with_dict_id(&self, id: i64) -> Vec<&Field> {
           self.fields()
               .into_iter()
               .filter_map(|field| {
                   match field.data_type() {
                       DataType::Dictionary(_, _) if field.dict_id == id => {
                           Some(field)
                       }
                       _ => None
                   }
               })
               .collect()
       }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to