On Thu, 8 Sep 2016 15:55:50 +0900
Digimer <[email protected]> wrote:

> On 08/09/16 03:47 PM, Ulrich Windl wrote:
> >>>> Shermal Fernando <[email protected]> wrote on 08.09.2016 at
> >>>> 06:41 in message
> > <8ce6e8d87f896546b9c65ed80d30a4336578c...@lg-spmb-mbx02.lseg.stockex.local>:
> >> The whole cluster will fail if the DC (crm daemon) is frozen due to CPU
> >> starvation or hanging while trying to perform an I/O operation.
> >> Please share some thoughts on this issue.
> >
> > What is "the whole cluster will fail"? If the DC times out, some recovery
> > will take place.
>
> Yup. The starved node should be declared lost by corosync, the remaining
> nodes reform and, if they're still quorate, the hung node should be
> fenced. Recovery occurs and life goes on.

+1

And fencing can come either from outside the node or from the server itself,
using a watchdog.

--
Jehan-Guillaume (ioguix) de Rorthais
Dalibo
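
For illustration only, the recovery sequence discussed in this thread (node declared lost, survivors check quorum, hung node gets fenced from outside or by its own watchdog) could be sketched roughly as below. This is a toy model, not Pacemaker or corosync code: the function names `is_quorate` and `plan_recovery`, the one-vote-per-node majority rule, and the returned strings are all assumptions made up for this sketch.

```python
from typing import List


def is_quorate(alive_votes: int, expected_votes: int) -> bool:
    """Strict-majority rule, assuming one vote per node (hypothetical helper)."""
    return alive_votes > expected_votes // 2


def plan_recovery(nodes: List[str], hung: str,
                  external_fencing: bool, watchdog: bool) -> str:
    """Describe what the surviving nodes would do about a hung node.

    Simplified model: real clusters have more options (weighted votes,
    two-node mode, fencing topologies) that are ignored here.
    """
    # The hung node is declared lost; everyone else forms the new membership.
    survivors = [n for n in nodes if n != hung]

    # Without quorum the survivors must not fence or take over resources.
    if not is_quorate(len(survivors), len(nodes)):
        return "no quorum: survivors must not fence or recover resources"

    # Fencing from outside (e.g. a power switch or management board).
    if external_fencing:
        return f"fence {hung} via external device, then recover its resources"

    # Fencing from the node itself: its watchdog is expected to reset it.
    if watchdog:
        return f"wait for {hung} to self-fence via watchdog, then recover its resources"

    return f"cannot fence {hung}: recovery is blocked"


if __name__ == "__main__":
    print(plan_recovery(["n1", "n2", "n3"], "n1",
                        external_fencing=True, watchdog=False))
```

In this model a three-node cluster that loses its DC stays quorate (two of three votes) and proceeds to fence, whereas a plain two-node cluster would not, which is why the quorum check comes before any fencing decision.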
