alamb commented on code in PR #8848:
URL: https://github.com/apache/arrow-datafusion/pull/8848#discussion_r1462373018


##########
datafusion/core/src/datasource/physical_plan/parquet/page_filter.rs:
##########
@@ -164,17 +169,23 @@ impl PagePruningPredicate {
         let mut row_selections = 
Vec::with_capacity(page_index_predicates.len());
         for predicate in page_index_predicates {
             // find column index by looking in the row group metadata.
-            let col_idx = find_column_index(predicate, &groups[0]);
 
+            let name = find_column_name(predicate);
+            let mut parquet_col = None;
+            if name.is_some() {
+                parquet_col =
+                    parquet_column(parquet_schema, arrow_schema, 
name.unwrap().as_str());
+            }

Review Comment:
   I think a more rust ideomatic way of doing this pattern that avoids the 
unwrap and mut is something like
   
   ```suggestion
               let parquet_col = name
                 .and_then(|name| {
                       parquet_column(parquet_schema, arrow_schema, 
name.as_str());
               });
   ```



##########
datafusion/core/src/datasource/physical_plan/parquet/mod.rs:
##########
@@ -2136,4 +2144,65 @@ mod tests {
         let execution_props = ExecutionProps::new();
         create_physical_expr(expr, &df_schema, &execution_props).unwrap()
     }
+
+    #[tokio::test]

Review Comment:
   I verified that this test fails without the code in this PR 👍 
   
   ```
   thread 
'datasource::physical_plan::parquet::tests::test_struct_filter_parquet' 
panicked at datafusion/core/src/datasource/physical_plan/parquet/mod.rs:2156:9:
   assertion `left == right` failed
     left: 0
    right: 1
   stack backtrace:
   ```



##########
datafusion/core/src/datasource/physical_plan/parquet/page_filter.rs:
##########
@@ -249,10 +260,7 @@ impl PagePruningPredicate {
 /// that `extract_page_index_push_down_predicates` only return
 /// predicate with one col)
 ///
-fn find_column_index(
-    predicate: &PruningPredicate,
-    row_group_metadata: &RowGroupMetaData,
-) -> Option<usize> {
+fn find_column_name(predicate: &PruningPredicate) -> Option<String> {

Review Comment:
   This function docstring needs to be updated if we make this change. However 
I have a slightly different proposal here: 
https://github.com/manoj-inukolunu/arrow-datafusion/pull/1



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to