QuakeWang commented on code in PR #225:
URL: https://github.com/apache/paimon-rust/pull/225#discussion_r3045782179


##########
crates/paimon/src/arrow/filtering.rs:
##########
@@ -26,6 +26,69 @@ pub(crate) fn reader_pruning_predicates(data_predicates: Vec<Predicate>) -> Vec<
         .collect()
 }
 
+/// Remap predicates from table-level indices to file-level indices.
+/// Predicates referencing fields not present in the file are dropped.
+pub(crate) fn remap_predicates_to_file(
+    predicates: &[Predicate],
+    table_fields: &[DataField],
+    file_fields: &[DataField],
+) -> Vec<Predicate> {
+    let mapping = build_field_mapping(table_fields, file_fields);
+    predicates
+        .iter()
+        .filter_map(|p| remap_predicate(p, &mapping))
+        .collect()
+}
+
+fn remap_predicate(predicate: &Predicate, mapping: &[Option<usize>]) -> Option<Predicate> {
+    match predicate {
+        Predicate::Leaf {
+            column,
+            index,
+            data_type,
+            op,
+            literals,
+        } => {
+            let file_index = mapping.get(*index).copied().flatten()?;

Review Comment:
   Dropping the leaf here changes the schema-evolution semantics for missing 
columns. For an older file that does not contain `new_col`, filters like 
`new_col = 1`, `new_col > 0`, or `new_col IS NOT NULL` should become “cannot 
match”, not disappear entirely. Once we remove the leaf at remap time, the 
downstream pruning/read path never sees that constraint and can still read rows 
from old files.
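
   One possible shape for the fix, as a minimal sketch rather than the actual paimon-rust API: the `Pred` and `Op` types and the `remap` function below are hypothetical stand-ins for `Predicate` and `remap_predicate`, and the sketch assumes a column missing from an old file reads as NULL for every row (no default value). Under that assumption a leaf on a missing column folds to a constant instead of being dropped, so `IS NULL` becomes "always matches" and every other operator becomes "cannot match".

```rust
// Hypothetical, simplified predicate model; not the real paimon-rust types.
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Eq,
    Gt,
    IsNull,
    IsNotNull,
}

#[derive(Debug, Clone, PartialEq)]
enum Pred {
    Leaf { index: usize, op: Op, literals: Vec<i64> },
    And(Vec<Pred>),
    Or(Vec<Pred>),
    AlwaysTrue,
    AlwaysFalse,
}

/// Remap table-level field indices to file-level indices.
/// `mapping[table_index]` is `Some(file_index)` when the file contains the
/// field and `None` when it does not.
fn remap(pred: &Pred, mapping: &[Option<usize>]) -> Pred {
    match pred {
        Pred::Leaf { index, op, literals } => {
            match mapping.get(*index).copied().flatten() {
                // Field exists in the file: only the index changes.
                Some(file_index) => Pred::Leaf {
                    index: file_index,
                    op: op.clone(),
                    literals: literals.clone(),
                },
                // Field is missing: assumed to be NULL for every row, so the
                // leaf folds to a constant instead of disappearing.
                None => match op {
                    Op::IsNull => Pred::AlwaysTrue,
                    _ => Pred::AlwaysFalse,
                },
            }
        }
        Pred::And(children) => {
            Pred::And(children.iter().map(|c| remap(c, mapping)).collect())
        }
        Pred::Or(children) => {
            Pred::Or(children.iter().map(|c| remap(c, mapping)).collect())
        }
        Pred::AlwaysTrue => Pred::AlwaysTrue,
        Pred::AlwaysFalse => Pred::AlwaysFalse,
    }
}
```

   With this shape, `a > 0 AND new_col = 1` on an old file remaps to `a > 0 AND AlwaysFalse`, so the constraint stays visible to downstream pruning and the file can be skipped. A `Not` variant is deliberately left out of the sketch: under SQL three-valued logic, negating a comparison against NULL still does not match, so folding the inner leaf to `AlwaysFalse` and then negating it would be incorrect and needs separate handling.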


