vrajat opened a new pull request, #12157:
URL: https://github.com/apache/pinot/pull/12157

   Pinot can go multiple hours between polls of a partition in a Kafka topic. 
One specific example is when Pinot takes a long time to flush a segment to disk. 
In the meantime, messages in Kafka can expire if the message retention time is 
short. 
   If `auto.offset.reset` is set to `smallest`, Kafka silently moves 
the offset to the first available message, which leads to data loss. 
   Before consuming messages from Kafka, check whether any messages have expired 
by comparing the `startOffset` in the `RealtimeSegmentDataManager` to the 
`beginOffset` of the Kafka partition. If `startOffset` < `beginOffset`, 
log the condition and set a gauge to 1. The gauge can be connected to an 
alerting system.
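
   The check described above can be sketched roughly as follows. This is an 
illustrative sketch only, not the code in this PR: the class `OffsetGapCheck`, 
the methods `missedMessages` and `checkAndReport`, and the `dataLossGauge` 
field are hypothetical names, and an `AtomicLong` stands in for whatever 
metrics gauge the real change uses. It assumes both offsets are already 
available as `long` values; with the Kafka Java client, the earliest available 
offset could be obtained from `KafkaConsumer.beginningOffsets`.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper illustrating the check; not the PR's actual implementation.
final class OffsetGapCheck {
    // Stand-in for a metrics gauge: 1 means messages expired before being consumed.
    static final AtomicLong dataLossGauge = new AtomicLong(0);

    // Number of messages that expired before they could be consumed (0 if none).
    static long missedMessages(long startOffset, long beginOffset) {
        return Math.max(0L, beginOffset - startOffset);
    }

    // Logs the condition and sets the gauge to 1 when startOffset < beginOffset.
    // Returns true if data loss was detected.
    static boolean checkAndReport(long startOffset, long beginOffset) {
        long missed = missedMessages(startOffset, beginOffset);
        if (missed > 0) {
            System.err.printf(
                "Possible data loss: startOffset=%d < beginOffset=%d (%d messages expired)%n",
                startOffset, beginOffset, missed);
            dataLossGauge.set(1);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // The segment wants to resume at offset 100, but the partition's earliest
        // retained offset is 250, so offsets 100 through 249 are gone.
        boolean lost = checkAndReport(100L, 250L);
        System.out.println("dataLoss=" + lost + " gauge=" + dataLossGauge.get());
    }
}
```

   In this sketch the gauge is only ever set to 1 and never reset; whether and 
when to clear it is left open here, since the description above does not say.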


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

