[Linux-HA] resource group stuck with one resource started, the other stopped

Tim Serong Mon, 12 May 2008 01:27:18 -0700

Greetings,

I have a cluster with one resource group, containing two resources. If the second resource in the group can't start (the start op fails), I end up with one resource started, and the other stopped, i.e. we have the following sequence of events:


  1) res-1 start (succeeds)
  2) res-2 start (fails)
  3) res-2 stop  (succeeds)
  4) res-1 stop  (succeeds)
  5) res-1 start (succeeds)
  6) res-2 start (fails)
  7) ....and now we're stuck with res-1 started and res-2 stopped.
     No further starts/stops occur.

Given that resource groups otherwise behave like a set of co-located resources, I'd have expected that if res-2 continually failed to start, res-1 would also be forced to stop.

I tried a similar test with two separate resources, with ordering & co-location constraints in place, and got the behaviour I expected:


  1) res-1 start (succeeds)
  2) res-2 start (fails)
  3) res-2 stop  (succeeds)
  4) res-1 stop  (succeeds)
  5) res-1 start (succeeds)
  6) res-2 start (fails)
  7) res-2 stop  (succeeds)
  8) res-1 stop  (succeeds)
  9) ....and now everything is stopped, because res-2 can't run.
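
For reference, the two setups being compared might look roughly like the fragments below. This is only a sketch, assuming heartbeat 2.x CIB XML syntax; the ids (grp-1, order-1, colo-1) and the Dummy resource agent are hypothetical placeholders, not the actual configuration from this cluster.

```xml
<!-- Test 1 (assumed layout): both resources in one group.
     Group members are started in the order listed. -->
<group id="grp-1">
  <primitive id="res-1" class="ocf" provider="heartbeat" type="Dummy"/>
  <primitive id="res-2" class="ocf" provider="heartbeat" type="Dummy"/>
</group>

<!-- Test 2 (assumed layout): separate resources tied together
     with explicit ordering and co-location constraints. -->
<primitive id="res-1" class="ocf" provider="heartbeat" type="Dummy"/>
<primitive id="res-2" class="ocf" provider="heartbeat" type="Dummy"/>

<!-- Placed in the constraints section of the CIB:
     start res-2 only after res-1, and keep res-2 on the same node. -->
<rsc_order id="order-1" from="res-2" type="after" to="res-1"/>
<rsc_colocation id="colo-1" from="res-2" to="res-1" score="INFINITY"/>
```

The two tests would be run with one layout or the other, not both at once, since the resource ids overlap.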

I'm using heartbeat 2.1.3. Can anyone shed any light on likely reasons for this discrepancy in behaviour?


Thanks,

Tim
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
