avshenuk opened a new issue, #16367: URL: https://github.com/apache/pinot/issues/16367
### What needs to be done:

Add functionality to PurgeTaskGenerator to automatically identify and delete segments with zero total documents during task generation.

### Why this enhancement is needed:

Currently, PurgeTask processes all segments including those with zero documents. Empty segments consume metadata resources and processing time without providing any value. Automatic cleanup would improve cluster maintenance and resource utilization.

This enhancement is particularly valuable for custom retention management scenarios where users implement complex retention logic (beyond Pinot's standard retention features) using PurgeTask with custom RecordPurger implementations. In such cases, segments can become empty over time as records are purged based on custom business rules, and automatic cleanup becomes essential for maintaining cluster efficiency.

### Proposed implementation:

- Modify PurgeTaskGenerator to detect segments with ZKMetadata.totalDocs == 0 during task generation
- Collect empty segments in a separate list and delete them using the pinotHelixResourceManager.deleteSegments() API
- Skip empty segments from regular purge task generation to avoid unnecessary processing
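
A minimal sketch of the proposed flow, written against stand-in types so it runs without Pinot on the classpath. `SegmentInfo`, `EmptySegmentSplit`, `EmptySegmentCleanupSketch`, and the `deleter` callback are hypothetical names introduced only for illustration; in the real generator the metadata would come from the segment ZK metadata and the callback would presumably wrap the `pinotHelixResourceManager.deleteSegments()` call mentioned above, whose exact signature is not assumed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the two segment metadata fields this sketch needs.
final class SegmentInfo {
  final String name;
  final long totalDocs;

  SegmentInfo(String name, long totalDocs) {
    this.name = name;
    this.totalDocs = totalDocs;
  }
}

// Hypothetical holder for the result of partitioning a table's segments.
final class EmptySegmentSplit {
  final List<String> emptySegments = new ArrayList<>();
  final List<SegmentInfo> purgeCandidates = new ArrayList<>();
}

final class EmptySegmentCleanupSketch {

  // Separate segments with exactly zero documents from the rest.
  // Only totalDocs == 0 counts as empty; any other value (including a
  // negative one) is left alone, so nothing is deleted on doubtful metadata.
  static EmptySegmentSplit split(List<SegmentInfo> segments) {
    EmptySegmentSplit result = new EmptySegmentSplit();
    for (SegmentInfo segment : segments) {
      if (segment.totalDocs == 0) {
        result.emptySegments.add(segment.name);
      } else {
        result.purgeCandidates.add(segment);
      }
    }
    return result;
  }

  // Delete the empty segments in one batch through the supplied callback
  // (an assumption standing in for the resource manager call) and return
  // the segments that should still get regular purge tasks.
  static List<SegmentInfo> deleteEmptyAndKeepRest(
      List<SegmentInfo> segments, Consumer<List<String>> deleter) {
    EmptySegmentSplit split = split(segments);
    if (!split.emptySegments.isEmpty()) {
      deleter.accept(split.emptySegments);
    }
    return split.purgeCandidates;
  }
}
```

Passing the delete operation in as a callback keeps the partitioning logic testable on its own, and batching the names into one list matches the "collect in a separate list" step of the proposal.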
