tarun11Mavani opened a new issue, #16689:
URL: https://github.com/apache/pinot/issues/16689

   ### Problem
   
   Pinot currently uses `TimeRetentionStrategy` to remove segments based on their 
end time (the maximum time value in the segment). This means a segment remains 
queryable as long as its end time is within retention. If a segment spans 
multiple days (e.g., days x and x+1), the whole segment stays available until 
x+1 falls out of retention, even though the records from day x have already 
expired.
   
   This issue becomes more prominent when the small segment merger is enabled, 
since compacted segments often contain multiple days of data. As a result, 
queries may return out-of-retention records.
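
To make the gap concrete, here is a minimal sketch of an end-time-based retention check. It is not Pinot's actual `TimeRetentionStrategy` code; the class name, method names, and the 7-day retention are assumptions chosen for illustration.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only; not Pinot's implementation.
public class EndTimeRetentionSketch {

    // A segment is purged only once its END time is older than the
    // retention boundary (now - retention), as described above.
    public static boolean isPurgeable(Instant segmentEnd, Instant now, Duration retention) {
        return segmentEnd.isBefore(now.minus(retention));
    }

    // True when the segment is still served although its earliest
    // records are already older than the retention boundary.
    public static boolean servesExpiredRecords(Instant segmentStart, Instant segmentEnd,
                                               Instant now, Duration retention) {
        Instant boundary = now.minus(retention);
        return !isPurgeable(segmentEnd, now, retention) && segmentStart.isBefore(boundary);
    }

    public static void main(String[] args) {
        Duration retention = Duration.ofDays(7); // assumed retention period
        Instant now = Instant.parse("2025-01-10T12:00:00Z");

        // A merged segment spanning day x (Jan 2) and day x+1 (Jan 3).
        Instant start = Instant.parse("2025-01-02T00:00:00Z");
        Instant end = Instant.parse("2025-01-03T23:59:59Z");

        System.out.println(isPurgeable(end, now, retention));                 // prints false
        System.out.println(servesExpiredRecords(start, end, now, retention)); // prints true
    }
}
```

With these values the retention boundary is 2025-01-03T12:00:00Z, so the segment is kept because its end time is still inside retention, while every record from Jan 2 is already past the boundary.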
   
   ### Proposal
   
   Introduce an option to automatically exclude out-of-retention records at 
query time. Pinot could apply a filter on the table's time column based on the 
configured retention boundary (the current time minus the retention period), 
so that only in-retention records are returned.
   
   This behavior should initially be opt-in, controlled through a query option. 
For example:
   
   ```
   SET skipOutOfRetentionValues=true;
   SELECT COUNT(*) FROM myTable;
   ```
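
One way the proposed filter could be derived is sketched below. This is an assumption-laden illustration, not a proposed implementation: the class and method names are hypothetical, the time column `eventTimeMs` is made up, and the column is assumed to store epoch milliseconds.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of deriving a query-time retention filter.
public class RetentionFilterSketch {

    // Retention boundary in epoch millis: records older than this are expired.
    public static long retentionBoundaryMillis(Instant now, Duration retention) {
        return now.minus(retention).toEpochMilli();
    }

    // Predicate text that could be AND-ed into the query's filter.
    // Assumes the time column holds epoch milliseconds.
    public static String retentionPredicate(String timeColumn, Instant now, Duration retention) {
        return timeColumn + " >= " + retentionBoundaryMillis(now, retention);
    }

    // Per-record check equivalent to the predicate above.
    public static boolean isWithinRetention(long recordTimeMillis, Instant now, Duration retention) {
        return recordTimeMillis >= retentionBoundaryMillis(now, retention);
    }

    public static void main(String[] args) {
        Instant now = Instant.ofEpochMilli(1_700_000_000_000L);
        Duration retention = Duration.ofDays(7); // assumed retention period

        // prints: eventTimeMs >= 1699395200000
        System.out.println(retentionPredicate("eventTimeMs", now, retention));
    }
}
```

Under these assumptions, enabling the option for `SELECT COUNT(*) FROM myTable` would behave as if `WHERE eventTimeMs >= 1699395200000` had been added, so a merged segment that is still retained contributes only its in-retention rows.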


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

