> The lease time is set to 1 minute. Would it be ok to check this every minute, from every node?

Adding to that, the default time intervals are:

- asyncDelay = 1 sec: the background operations are performed every
  1 sec per cluster node. If nothing changes, we would fire
  1 query/sec/cluster node to check the head revision.
- cluster lease time = 1 min: this is the time after which a cluster
  lease is renewed.

So we need to decide the time interval for the job that detects the
recovery condition.

Chetan Mehrotra

On Wed, Apr 2, 2014 at 4:31 PM, Amit Jain wrote:
> Hi,
>
>>> 1) a cluster node starts up and sees it didn't shut down properly. I'm not
>>> sure this information is available, but remember we discussed this once.
>
> Yes, this case has been taken care of in the startup.
>
>>> this check could be done in the
>>> background operations thread on a regular basis. probably depending on
>>> the lease interval.
>
> The lease time is set to 1 minute. Would it be ok to check this every
> minute, from every node?
>
> Thanks
> Amit
>
>
> On Wed, Apr 2, 2014 at 4:14 PM, Marcel Reutegger wrote:
>
>> Hi,
>>
>> I think the recovery should be triggered automatically by the system when:
>>
>> 1) a cluster node starts up and sees it didn't shut down properly. I'm not
>> sure this information is available, but remember we discussed this once.
>>
>> 2) a cluster node sees a lease timeout of another cluster node and initiates
>> the recovery for the failed cluster node. this check could be done in the
>> background operations thread on a regular basis. probably depending on
>> the lease interval.
>>
>> In addition it would probably also be useful to have the recovery operation
>> available as a command in oak-run. that way you can manually trigger it from
>> the command line. WDYT?
>>
>> Regards
>> Marcel
>>
>> > How do we expose _lastRev recovery operation? This would need to check all
>> > the cluster nodes info and run recovery for those nodes which need
>> > recovery.
>> >
>> > 1. We either have a scheduled job which checks all the nodes and run the
>> > recovery. What should be the interval to trigger the job?
>> > 2. Or if we want it run only when triggered manually, then expose an
>> > appropriate MBean.
>> >
>> >
>> > Thanks
>> > Amit
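
For illustration only, the lease-timeout check discussed in this thread could look roughly like the sketch below. It is not Oak code: the class name, the method `findRecoveryCandidates`, and the map of cluster id to lease end time are all hypothetical stand-ins for whatever the cluster node info actually provides. The only value taken from the thread is the 1 minute lease time.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not part of Oak.
class LeaseCheckSketch {

    // Default mentioned in the thread: cluster lease time = 1 min.
    static final long LEASE_MILLIS = 60_000L;

    /**
     * Returns the ids of cluster nodes whose lease end time lies before
     * nowMillis, skipping the node that performs the check.
     * The map (cluster id -> lease end time in ms) is an assumed input.
     */
    static List<Integer> findRecoveryCandidates(Map<Integer, Long> leaseEndById,
                                                int ownId, long nowMillis) {
        List<Integer> expired = new ArrayList<>();
        for (Map.Entry<Integer, Long> e : leaseEndById.entrySet()) {
            boolean isOther = e.getKey() != ownId;
            boolean leaseTimedOut = e.getValue() < nowMillis;
            if (isOther && leaseTimedOut) {
                expired.add(e.getKey());
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        Map<Integer, Long> leases = new LinkedHashMap<>();
        leases.put(1, 1_000L); // this node; never a candidate for itself
        leases.put(2, 5_000L); // lease still valid at t = 2000
        leases.put(3, 500L);   // lease timed out at t = 2000
        System.out.println(findRecoveryCandidates(leases, 1, 2_000L)); // prints [3]
    }
}
```

Such a check could be called from the background operations thread, or scheduled with `ScheduledExecutorService.scheduleWithFixedDelay` at an interval derived from `LEASE_MILLIS`. Which interval is appropriate is exactly the open question above.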
