GayathriSrividya opened a new pull request, #3448:
URL: https://github.com/apache/iceberg-python/pull/3448

   Closes #3272
   
   ## What this changes
   
   This PR updates the Arrow scan path in `_task_to_record_batches` to avoid 
redundant filtering when there are no positional deletes.
   
   - Keeps predicate pushdown in `Scanner.from_fragment` as the only filter 
path when `positional_deletes` is absent.
   - Applies `current_batch.filter(pyarrow_filter)` only in the 
positional-delete path, after deletes are applied.
   - Preserves empty-batch handling after both delete application and 
conditional filtering.
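
The branching in the list above can be sketched in plain Python. This is a simplified model, not the code in this PR: batches are lists of dict rows, the predicate is an ordinary callable, and positional deletes are a set of file-level row positions. The name `scan_batches` and its parameters are hypothetical stand-ins for `_task_to_record_batches`, `pyarrow_filter`, and `positional_deletes`.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Set

Row = Dict[str, int]
Batch = List[Row]
Predicate = Callable[[Row], bool]


def scan_batches(
    file_batches: Iterable[Batch],
    predicate: Optional[Predicate] = None,
    positional_deletes: Optional[Set[int]] = None,
) -> Iterator[Batch]:
    """Hypothetical model of the scan flow; not the pyiceberg implementation."""
    # Push the predicate into the (simulated) scanner only when there are no
    # positional deletes, because deletes refer to positions in the
    # unfiltered file.
    pushdown = predicate if not positional_deletes else None

    next_index = 0
    for batch in file_batches:
        start = next_index
        next_index += len(batch)

        # Simulated scanner-side pushdown: the only filter pass when there
        # are no positional deletes.
        if pushdown is not None:
            current = [row for row in batch if pushdown(row)]
        else:
            current = list(batch)

        if positional_deletes:
            # Drop rows whose file-level position is marked as deleted.
            current = [
                row
                for offset, row in enumerate(current)
                if start + offset not in positional_deletes
            ]
            # Skip batches emptied by the deletes.
            if not current:
                continue
            # Apply the predicate only now, after the deletes.
            if predicate is not None:
                current = [row for row in current if predicate(row)]

        # Skip batches emptied by filtering.
        if not current:
            continue
        yield current


rows = [{"id": i} for i in range(6)]
batches = [rows[:3], rows[3:]]

# No positional deletes: the predicate runs once per row, via pushdown only.
no_deletes = list(scan_batches(batches, predicate=lambda r: r["id"] >= 2))

# With positional deletes: rows are deleted first, then filtered.
with_deletes = list(
    scan_batches(batches, predicate=lambda r: r["id"] >= 2, positional_deletes={3, 4, 5})
)
```

In this model the predicate is evaluated at most once per row on either path, which is the property the regression tests below are meant to protect.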
   
   ## Why
   
   The previous flow could apply the predicate a second time at the table 
level even when the scanner had already applied it. This change removes that 
stale workaround path while keeping correct behavior for positional delete 
scenarios.
   
   ## Tests
   
   Added regression coverage in `tests/io/test_pyarrow.py`:
   
   - `test_task_to_record_batches_filter_without_positional_deletes_avoids_table_refilter`
   - `test_task_to_record_batches_filter_with_positional_deletes_handles_empty_batch`
   
   Validated locally:
   
   - `python -m pytest tests/io/test_pyarrow.py -q -k "task_to_record_batches_nanos or filter_without_positional_deletes_avoids_table_refilter or filter_with_positional_deletes_handles_empty_batch"`
   - `make lint`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

