tarun11Mavani opened a new issue, #16689: URL: https://github.com/apache/pinot/issues/16689
### Problem

Pinot currently uses TimeRetentionStrategy to remove segments based on their end time (the maximum time value in the segment). This means a segment remains queryable as long as its end time is within retention. If a segment spans multiple days (e.g., day x and x+1), it stays available until x+1 falls out of retention, even though records from x are already expired.

This issue becomes more prominent when small segment merger is enabled, since compacted segments often contain multiple days of data. As a result, queries may return out-of-retention records.

### Proposal

Introduce an option to automatically exclude out-of-retention records at query time. Pinot could apply a time filter based on the configured retention boundary, ensuring only valid data is returned.

This behavior should initially be opt-in, controlled through a query option. For example:

```
SET skipOutOfRetentionValues=true;
SELECT COUNT(*) FROM myTable;
```
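
To make the difference between the two checks concrete, here is a minimal Python sketch. It is only a model of the behavior described above, not Pinot code: `Segment`, `retention_cutoff_ms`, `segment_is_retained`, and `count_records` are hypothetical names, and treating the retention boundary as inclusive is an assumption.

```python
from dataclasses import dataclass

MS_PER_DAY = 24 * 60 * 60 * 1000


@dataclass
class Segment:
    """Hypothetical stand-in for a segment: a name plus record timestamps (epoch ms)."""
    name: str
    record_times_ms: list

    @property
    def end_time_ms(self) -> int:
        # The segment end time is the maximum time value it contains.
        return max(self.record_times_ms)


def retention_cutoff_ms(now_ms: int, retention_days: int) -> int:
    # Oldest timestamp still considered in retention (assumed inclusive).
    return now_ms - retention_days * MS_PER_DAY


def segment_is_retained(segment: Segment, cutoff_ms: int) -> bool:
    # Segment-level check: only the end time is compared to the cutoff,
    # so older records inside the segment ride along.
    return segment.end_time_ms >= cutoff_ms


def count_records(segments, now_ms, retention_days, skip_out_of_retention=False):
    # Models COUNT(*) with and without the proposed opt-in record-level filter.
    cutoff = retention_cutoff_ms(now_ms, retention_days)
    total = 0
    for seg in segments:
        if not segment_is_retained(seg, cutoff):
            continue
        for t in seg.record_times_ms:
            if skip_out_of_retention and t < cutoff:
                continue  # record is older than the retention boundary
            total += 1
    return total


# A merged segment holding one record from day 4 and one from day 5,
# queried on day 10 with 5 days of retention (cutoff = day 5).
merged = Segment("merged", [4 * MS_PER_DAY + 1000, 5 * MS_PER_DAY + 1000])
print(count_records([merged], 10 * MS_PER_DAY, 5))        # 2
print(count_records([merged], 10 * MS_PER_DAY, 5, True))  # 1
```

Without the option, the expired day-4 record is counted because the segment's end time is still in retention; with it, only the day-5 record is returned.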
