npawar opened a new issue #3642: Increase frequency of ValidationManager to fix 
realtime consumption stopped issues faster than 1 hour
URL: https://github.com/apache/incubator-pinot/issues/3642
 
 
   Currently the ValidationManager runs periodically, every 60 minutes. During 
a run, it checks for any partitions of realtime tables that have stopped 
consuming, and fixes the ideal state and segment metadata so that consumption 
can resume.
   Running the ValidationManager every 60 minutes means that we can only 
guarantee freshness of data up to 60 minutes ago. We also have a check, 
"isTooSoonToCorrect", which skips correcting any segments that were updated 
within the last 10 minutes. A partition that stops just inside that window is 
skipped by one run and only fixed by the next, which effectively pushes our 
worst-case freshness to 1h 10m.
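
   As a rough sketch of the arithmetic above: if a partition stops consuming 
just inside the "too soon" window before a run, that run skips it and the next 
run fixes it, so the worst-case delay is roughly the run interval plus the 
skip window. The class and method names below are hypothetical and are not 
part of Pinot; the 10-minute run interval is the value floated in this issue, 
not a decided setting.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not Pinot code.
final class WorstCaseStaleness {

    // Worst case: the partition stops just inside the "too soon" window,
    // is skipped by the current run, and is fixed one full interval later.
    static Duration worstCaseDelay(Duration runInterval, Duration tooSoonWindow) {
        return runInterval.plus(tooSoonWindow);
    }

    public static void main(String[] args) {
        Duration current = worstCaseDelay(Duration.ofMinutes(60), Duration.ofMinutes(10));
        Duration proposed = worstCaseDelay(Duration.ofMinutes(10), Duration.ofMinutes(10));
        System.out.println("current worst case:  " + current.toMinutes() + " min");
        System.out.println("proposed worst case: " + proposed.toMinutes() + " min");
    }
}
```

   With a 10-minute interval and the same 10-minute skip window, the worst 
case under this simple model drops from 70 minutes to 20 minutes.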
   
   This delay of 1h 10m is not acceptable for certain use cases. 
   
   One way to solve this is:
   1) Make the ValidationManager run more often than every 60 minutes (every 10 
minutes?). We have to check whether doing so would add load on zk. In 
particular, we need to watch for operations that read all segment metadata 
for all tables. We should also ensure that only the realtime correction part 
runs more frequently, and not the other things that the ValidationManager is 
responsible for (e.g. checking offline segments for gaps in time).
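
   One possible shape for this split, as a minimal sketch only: schedule the 
realtime correction and the remaining validation work as two independent 
periodic tasks. This uses the standard java.util.concurrent scheduler and is 
not how the ValidationManager is currently implemented. The class name, the two 
Runnable tasks, and the 10/60 minute periods are assumptions for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; the task bodies stand in for the real validation logic.
final class SplitValidationSchedule {
    // Two threads so a slow full validation does not delay the realtime fix.
    private final ScheduledExecutorService executor = Executors.newScheduledThreadPool(2);

    // Schedules both tasks with their own periods, starting immediately.
    void start(Runnable realtimeFix, long realtimePeriod,
               Runnable fullValidation, long fullPeriod, TimeUnit unit) {
        executor.scheduleWithFixedDelay(guarded(realtimeFix), 0, realtimePeriod, unit);
        executor.scheduleWithFixedDelay(guarded(fullValidation), 0, fullPeriod, unit);
    }

    // An uncaught exception suppresses all later runs of a scheduled task,
    // so failures are caught and reported instead.
    private static Runnable guarded(Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                System.err.println("validation task failed: " + e);
            }
        };
    }

    void stop() {
        executor.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        SplitValidationSchedule schedule = new SplitValidationSchedule();
        schedule.start(
            () -> System.out.println("fix stopped realtime partitions"), 10,
            () -> System.out.println("run remaining validation checks"), 60,
            TimeUnit.MINUTES);
        Thread.sleep(1000); // both tasks run once right away
        schedule.stop();
    }
}
```

   Keeping the two periods separate lets the frequent task be limited to the 
cheaper realtime checks, while the operations that read all segment metadata 
stay on the slower schedule. Whether the realtime part is actually cheap 
enough on zk at that frequency still needs to be measured.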

