+1 for this~
Thanks, wang.

Best,
Junhao

> On June 22, 2026, at 14:00, wang <[email protected]> wrote:
> 
> Hi Paimon community,
> 
> I would like to start a discussion about supporting Deletion Vectors (DVs)
> for DataEvolution tables. In many AI scenarios, users frequently need to
> delete data (e.g., removing low-quality samples, deduplication, or
> excluding biased training data). The existing file-level DV in AppendTable
> is incompatible with DataEvolution, and rewriting files for random
> deletions causes small-file explosion and index invalidation.
> 
> I prepared a short design document proposing a range-based DV approach,
> along with solutions for Merge Into, Compaction, and Vector Index
> adaptation:
> https://docs.google.com/document/d/14XHZCgtz_487eKq8k0s_hVfaVA9ETZw4rle19-qN7hY/edit?usp=sharing
> 
> Could you please take a look and share your thoughts?
> 
> Best,
> wang
