akshayrai opened a new pull request #5152: [TE][subscription] update 
subscription watermarks to use anomaly create time instead of end time
URL: https://github.com/apache/incubator-pinot/pull/5152
 
 
   Problem Statement: 
   The current subscription watermarks are designed to notify each anomaly only 
once (even if it is later merged). We enforce this by tracking the end time of 
the last notified anomaly (the watermark). This design assumes that newer 
anomalies are always detected on newer data, that is, that a newer anomaly can 
never have a start time < watermark. The assumption breaks down when data is 
backfilled, and when data is missing and the anomalies for the actual deviation 
are only detected later. This PR removes the restriction by using the anomaly 
create time, instead of the end time, as the watermark.
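
   As a rough illustration of the restriction, here is a minimal sketch of an 
end-time watermark filter. The class and method names (`AnomalySketch`, 
`EndTimeWatermarkSketch`, `selectByEndTime`) and the exact comparison are 
assumptions made for this example, not the actual ThirdEye code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified anomaly used only for illustration.
final class AnomalySketch {
  final long id;
  final long startTime;  // start of the anomalous data window (epoch millis)
  final long endTime;    // end of the anomalous data window (epoch millis)
  final long createTime; // when detection created the anomaly (epoch millis)

  AnomalySketch(long id, long startTime, long endTime, long createTime) {
    this.id = id;
    this.startTime = startTime;
    this.endTime = endTime;
    this.createTime = createTime;
  }
}

final class EndTimeWatermarkSketch {
  // Assumed filter: keep only anomalies whose data window ends after the
  // end time of the last notified anomaly.
  static List<AnomalySketch> selectByEndTime(List<AnomalySketch> anomalies,
                                             long endTimeWatermark) {
    List<AnomalySketch> result = new ArrayList<>();
    for (AnomalySketch anomaly : anomalies) {
      if (anomaly.endTime > endTimeWatermark) {
        result.add(anomaly);
      }
    }
    return result;
  }
}
```

   With a watermark of 200, an anomaly created later on backfilled data with 
the window [50, 90] is filtered out and is never notified, even though it has 
not been sent before.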
   
   Proposed changes:
   * Replace the anomaly end time with the anomaly create time in the vector 
clock.
   * Remove the highWaterMark field from the subscription config. As of today, 
we maintain two watermarks to ensure that each anomaly is notified only once: 
the last notified anomaly ID (highWaterMark) and the anomaly end time watermark 
(vector clocks). The main purpose of the highWaterMark is to filter out merged 
anomalies from the time window. This is no longer required once we start 
relying on the anomaly create time.
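
   The proposed behaviour can be sketched as follows. This is a simplified 
illustration, not the code in this PR: `CreatedAnomaly` and 
`CreateTimeWatermarkSketch` are hypothetical names, a single watermark stands 
in for the vector clock, and the sketch assumes that merging extends an 
existing anomaly's end time without changing its create time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified anomaly used only for illustration.
final class CreatedAnomaly {
  final long id;
  final long endTime;    // end of the anomalous data window (epoch millis)
  final long createTime; // when detection created the anomaly (epoch millis)

  CreatedAnomaly(long id, long endTime, long createTime) {
    this.id = id;
    this.endTime = endTime;
    this.createTime = createTime;
  }
}

final class CreateTimeWatermarkSketch {
  private long createTimeWatermark;

  CreateTimeWatermarkSketch(long initialWatermark) {
    this.createTimeWatermark = initialWatermark;
  }

  // Returns anomalies created after the watermark, then advances the
  // watermark to the latest create time that was selected.
  List<CreatedAnomaly> selectAndAdvance(List<CreatedAnomaly> anomalies) {
    List<CreatedAnomaly> toNotify = new ArrayList<>();
    long newWatermark = createTimeWatermark;
    for (CreatedAnomaly anomaly : anomalies) {
      if (anomaly.createTime > createTimeWatermark) {
        toNotify.add(anomaly);
        newWatermark = Math.max(newWatermark, anomaly.createTime);
      }
    }
    createTimeWatermark = newWatermark;
    return toNotify;
  }

  long watermark() {
    return createTimeWatermark;
  }
}
```

   Under these assumptions, an anomaly created late on old data is still 
selected because its create time is new, while an already notified anomaly 
that has since been merged keeps its old create time and is skipped, so no 
separate ID watermark is needed to filter it out.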

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
