avshenuk opened a new issue, #16367:
URL: https://github.com/apache/pinot/issues/16367

   ### What needs to be done:
   Add functionality to PurgeTaskGenerator to automatically identify and delete 
segments with zero total documents during task generation.
   
   ### Why this enhancement is needed:
   Currently, PurgeTask processes all segments, including those with zero 
documents. Empty segments consume metadata resources and processing time 
without providing any value. Deleting them automatically would remove that 
overhead and reduce manual cluster maintenance.
   
   This enhancement is particularly valuable for custom retention management 
scenarios where users implement complex retention logic (beyond Pinot's 
standard retention features) using PurgeTask with custom RecordPurger 
implementations. In such cases, segments can become empty over time as records 
are purged based on custom business rules, and automatic cleanup becomes 
essential for maintaining cluster efficiency.
   
   ### Proposed implementation:
   - Modify PurgeTaskGenerator to detect segments with 
ZKMetadata.totalDocs == 0 during task generation
   - Collect the empty segments in a separate list and delete them using the 
pinotHelixResourceManager.deleteSegments() API
   - Exclude empty segments from regular purge task generation to avoid 
unnecessary processing
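
The steps above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual PurgeTaskGenerator code: `SegmentInfo`, `SegmentDeleter`, and `EmptySegmentFilter` are hypothetical stand-ins for the segment ZK metadata, the resource manager's `deleteSegments()` call, and the generator logic. It also assumes that only an exact count of zero marks a segment as empty, so a segment with an unknown (for example negative) count is left for the regular purge path.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for per-segment ZK metadata; only the fields used here. */
final class SegmentInfo {
  final String name;
  final long totalDocs;

  SegmentInfo(String name, long totalDocs) {
    this.name = name;
    this.totalDocs = totalDocs;
  }
}

/** Hypothetical stand-in for the deleteSegments() call named in the proposal. */
interface SegmentDeleter {
  void deleteSegments(String tableNameWithType, List<String> segmentNames);
}

final class EmptySegmentFilter {
  private EmptySegmentFilter() {
  }

  /**
   * Splits segments into empty and non-empty, deletes the empty ones in a
   * single call, and returns the segments that should still get purge tasks.
   */
  static List<SegmentInfo> deleteEmptyAndReturnRest(
      String tableNameWithType, List<SegmentInfo> segments, SegmentDeleter deleter) {
    List<String> emptySegmentNames = new ArrayList<>();
    List<SegmentInfo> remaining = new ArrayList<>();
    for (SegmentInfo segment : segments) {
      // Assumption: only an exact zero means "empty"; unknown counts are kept.
      if (segment.totalDocs == 0) {
        emptySegmentNames.add(segment.name);
      } else {
        remaining.add(segment);
      }
    }
    // Skip the delete call entirely when there is nothing to delete.
    if (!emptySegmentNames.isEmpty()) {
      deleter.deleteSegments(tableNameWithType, emptySegmentNames);
    }
    return remaining;
  }
}
```

Batching the deletions into one call per table keeps the number of delete requests independent of how many empty segments are found, and returning only the non-empty segments gives the regular task-generation loop nothing extra to filter.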


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

