> > I found something rule like this;
> > When the following process was killed, the system would reboot.
> >  * ccm
> >  * cib
> >  * lrmd
> >  * crmd
> >  * pengine
> >  * tengine
> >
> > These processes would be restarted when they are killed.
> >  * FIFO
> >  * media (ex. write/read bcast)
> >  * stonithd
> >  * attrd
> >  * mgmtd
> >  * respawn (ex. pingd)
> >
> > If mcp is killed, Heartbeat2 is going to stop.
> > (but, lrmd and mgmtd might remain...)
> >
> > Is there any policy what process is desired, restart itself or reboot the
> > system?
> > I think it wouldn't be hurt if the death of all process raise a reboot.
> > It's simple.
>
> Heartbeat has always restarted "client" processes until recently, and
> restarted itself when it's own processes died. The reboot action is
> certainly simple, but if the recovery works, then it's certainly more
> gentle.
>
> I recently changed it so that when the "media" processes died that we
> restarted them. Certain kinds of temporary hardware and administrator
> malfunctions most commonly cause them to mess up, and Lars specifically
> asked that they not die in this case.
>
> The FIFO process is certainly easy to restart, so I just added it to
> restart (it's only used in R1 configurations).
>
> If our strategy works (and I think it does) then I think I like
> "soft/safe" recovery when it is not too complicated.
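To keep the observed behaviour in one place, here is a rough sketch of the rules quoted above as a lookup table. This only records what was seen when each process was killed; it is not taken from the Heartbeat source, and the names `Recovery`, `OBSERVED`, and `recovery_for` are made up for illustration.

```python
from enum import Enum


class Recovery(Enum):
    """What was observed after a process was killed (hypothetical names)."""
    REBOOT = "system reboots"
    RESTART = "process is restarted"
    STOP = "Heartbeat2 stops"


# Observed behaviour as reported in this thread, not read from the code.
OBSERVED = {
    # Killing any of these caused a reboot.
    "ccm": Recovery.REBOOT,
    "cib": Recovery.REBOOT,
    "lrmd": Recovery.REBOOT,
    "crmd": Recovery.REBOOT,
    "pengine": Recovery.REBOOT,
    "tengine": Recovery.REBOOT,
    # These were restarted after being killed.
    "FIFO": Recovery.RESTART,
    "media": Recovery.RESTART,      # e.g. write/read bcast
    "stonithd": Recovery.RESTART,
    "attrd": Recovery.RESTART,
    "mgmtd": Recovery.RESTART,
    "respawn": Recovery.RESTART,    # e.g. pingd
    # Killing mcp stopped Heartbeat2 (lrmd and mgmtd might remain).
    "mcp": Recovery.STOP,
}


def recovery_for(process_name: str) -> Recovery:
    """Return the observed recovery action; raises KeyError if not listed."""
    return OBSERVED[process_name]


def processes_with(action: Recovery) -> list[str]:
    """List the processes for which the given action was observed."""
    return [name for name, seen in OBSERVED.items() if seen is action]
```

For example, `processes_with(Recovery.REBOOT)` gives the six processes in the first list, which makes it easy to see what would change if a process such as attrd were moved from one group to the other.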
I see...
It may well be appropriate for heartbeat to restart the FIFO and media
processes when they die.
But I thought that cib and crm kept some important state, such as
transition graphs, so the system rebooted when they died in order to
refresh the cluster.
By that reasoning, attrd seems like it could be included in that group
as well.

Thanks,
Junko
