JingsongLi opened a new pull request, #8259:
URL: https://github.com/apache/paimon/pull/8259

   ## Summary
   
   Add Python `delete_by_filter` support for append-only Data Evolution tables.
The API deletes rows matching a `Predicate` by rewriting each affected file into 
the RowID-contiguous segments that remain after the deletion, so the rewritten 
files keep valid Data Evolution row-id ranges.
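
   The splitting step can be illustrated with a minimal, standalone sketch. It is not the code in this PR: `split_remaining_ranges` and its `(first_row_id, row_count)` tuple layout are hypothetical names chosen for illustration. It assumes a file covers the half-open row-id range `[first_row_id, first_row_id + row_count)` and that the matching row ids have already been collected.

```python
from typing import Iterable, List, Tuple


def split_remaining_ranges(
    first_row_id: int,
    row_count: int,
    deleted_row_ids: Iterable[int],
) -> List[Tuple[int, int]]:
    """Return the (first_row_id, row_count) segments that survive a delete.

    Hypothetical helper for illustration only; not part of pypaimon.
    """
    end = first_row_id + row_count  # exclusive upper bound of the file's range
    segments: List[Tuple[int, int]] = []
    cursor = first_row_id  # first row id not yet assigned to a segment

    for row_id in sorted(set(deleted_row_ids)):
        # Explicit validation: a deleted row id must belong to this file.
        if not first_row_id <= row_id < end:
            raise ValueError(
                f"row id {row_id} is outside [{first_row_id}, {end})"
            )
        if row_id > cursor:
            # Rows between the cursor and the deleted row stay contiguous.
            segments.append((cursor, row_id - cursor))
        cursor = row_id + 1

    if cursor < end:
        # Trailing rows after the last deleted row.
        segments.append((cursor, end - cursor))
    return segments


# Split delete: three rows removed from a ten-row file starting at row id 100.
print(split_remaining_ranges(100, 10, [103, 104, 108]))
# Whole-file delete: no segments remain, so the file is only deleted.
print(split_remaining_ranges(100, 3, [100, 101, 102]))
# No-op delete: the original range is returned unchanged.
print(split_remaining_ranges(100, 3, []))
```

   In this sketch an empty result corresponds to a whole-file delete, and a result equal to the original range corresponds to a no-op delete, where no rewrite would be needed.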
   
   ## Changes
   
   - Add batch and stream `TableUpdate.delete_by_filter` APIs.
   - Implement predicate-based deletion for row-tracking Data Evolution tables, 
including explicit `_ROW_ID` validation and file splitting around deleted rows.
   - Extend commit messages and file-store commits to emit manifest delete 
entries for rewritten files.
   - Update row-id conflict detection so a commit may add replacement subranges 
only when it also deletes the matching base file.
   - Add coverage for split deletes, whole-file deletes, no-op deletes, invalid 
RowIDs, post-update cleanup, manifest delete entries, and replacement-range 
conflict checks.
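
   The replacement-range rule in the conflict-detection item can be sketched as follows. This is a simplified illustration, not the check implemented in this PR: `RowIdRange` and `check_replacement_ranges` are hypothetical names, and the sketch deliberately ignores other cases in which Data Evolution files may share a row-id range.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class RowIdRange:
    """Hypothetical view of a data file's row-id range (illustration only)."""

    file_name: str
    first_row_id: int
    row_count: int

    @property
    def end(self) -> int:
        # Exclusive upper bound of the range.
        return self.first_row_id + self.row_count


def check_replacement_ranges(
    existing: Iterable[RowIdRange],
    added: Iterable[RowIdRange],
    deleted_file_names: Set[str],
) -> None:
    """Raise if an added range overlaps a base file the commit does not delete."""
    existing = list(existing)
    for new in added:
        for base in existing:
            overlaps = new.first_row_id < base.end and base.first_row_id < new.end
            if not overlaps:
                continue
            # A replacement must be a subrange of the base file it replaces.
            contained = (
                base.first_row_id <= new.first_row_id and new.end <= base.end
            )
            if base.file_name not in deleted_file_names or not contained:
                raise RuntimeError(
                    f"{new.file_name} overlaps {base.file_name}, "
                    "which this commit does not replace"
                )


base = RowIdRange("base.parquet", 0, 10)
parts = [RowIdRange("part-0.parquet", 0, 3), RowIdRange("part-1.parquet", 5, 5)]

# Accepted: the commit deletes the base file that the subranges replace.
check_replacement_ranges([base], parts, {"base.parquet"})
```

   Calling the same function with an empty `deleted_file_names` set raises, which mirrors the rule that replacement subranges are valid only together with the delete of the matching base file.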
   
   ## Testing
   
   - `python -m unittest pypaimon.tests.table_update_test`
   - `python -m unittest pypaimon.tests.file_store_commit_test pypaimon.tests.write.conflict_detection_test pypaimon.tests.table_commit_test`
   - `python -m compileall -q pypaimon/write pypaimon/tests/table_update_test.py pypaimon/tests/file_store_commit_test.py pypaimon/tests/write/conflict_detection_test.py`
   - `git diff --check`
   
   ## Notes
   
   This support is intentionally limited to append-only Data Evolution tables 
with row tracking enabled. Primary-key tables and tables with dedicated 
blob/vector files are rejected for now.
   

