> -----Original Message-----
> From: General Linux-HA mailing list <[email protected]>
> Sent: 15.01.08 10:23:03
> To: General Linux-HA mailing list <[email protected]>
> Subject: Re: [Linux-HA] URGENT: Problem with configuration of STONITH device
> > probably you can remember. We had a discussion about that, because it was
> > said that the stonithd on a node would refuse to call a stonith
> > agent to kill its own node. Is this still true?
> 
> Yes.

Will that change?

> > running NOW on node 2 should be possible to shoot itself.
> > 
> > Dejan, can you remember?
> 
> Yes. One serious problem in this case is that the cluster can
> never know if the stonith operation was successful. Which would
> basically render the cluster unusable.

So the answer to the question above is probably NO, by design?


> 
> > > It is an extra package provided by IBM. A tad heavy though: it's
> > > a Java application.
> > 
> > Yes, it slurps CPU cycles like a Bavarian... er, sorry...
> > Bohemian slurps beer.  ;-)
> > Worse: I regularly got errors while monitoring. Someone
> > posted here that this is related to timing problems.
> 
> I can recall that it worked for me(tm). However, it is very
> demanding in terms of memory/cpu and that's definitely not good
> for stonith.

I had monitoring running once per minute. About once per day I got
an error. The error message was not very helpful, as it was a Java
stack trace.

> 
> Hmm, I probably picked the wrong version then. Buggy in a sense
> that it won't work at all? Can you describe the bug?

I'm sorry, but it's the worst case: it would not stonith, and that's really
the primary goal.  :-(
My tests showed that the first parameter to the start/stop/restart
actions (that is, the second parameter overall, after the action name)
is the name of the node to stonith. My first script checked for exactly
ONE parameter, which is obviously not correct in this case.
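
To illustrate the corrected argument check, here is a minimal sketch in
Python. It is not the attached plugin. The function name and the set of
actions are my assumptions; the action names are simply the ones mentioned
above, not a statement of what the external plugin API defines.

```python
# Actions that, per the observation above, receive the target node name
# as an additional argument. These names come from this message and are
# an assumption, not a definition of the external stonith plugin API.
ACTIONS_WITH_TARGET = {"start", "stop", "restart"}


def parse_plugin_args(argv):
    """Split argv (without the program name) into (action, target).

    target is None for actions that take no node name.
    Raises ValueError if the argument count does not fit the action.
    """
    if not argv:
        raise ValueError("missing action")
    action = argv[0]
    if action in ACTIONS_WITH_TARGET:
        # Action name plus node name: exactly two arguments.
        if len(argv) != 2:
            raise ValueError(f"action {action!r} expects a node name")
        return action, argv[1]
    # Every other action is called with the action name only.
    if len(argv) != 1:
        raise ValueError(f"action {action!r} takes no further arguments")
    return action, None
```

A script would call this with `sys.argv[1:]` and exit with a non-zero
status on ValueError, instead of insisting on exactly one parameter.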

> 
> > So, Dejan, could you please check in the differences?
> > I have the (more) correct version attached.
> 
> Thanks, I'll update the repository.

Thank you. Sorry for the mess.


> > IMPORTANT: 
> > 1) The external stonith API allows an external stonith plugin
> > to be responsible for shooting more than one node; the node is passed
> > as a parameter on the call.
> > My external stonith plugin shoots exactly the one node configured via
> > the CIB. It ignores the parameter. I should probably add a check for that.
> 
> What exactly do you refer to? All parameters are defined by the
> plugin itself. If that is the case, why should you ignore any of
> them :)

See the answer above. That's something I don't understand. I couldn't
find anything about it in the documentation of the external stonith plugin API.
When a stonith action is triggered, the plugin is called with the action name
as first parameter and the name of the node to stonith as second parameter.
I guess this is done so that a configuration can contain lists of hosts,
letting one plugin/CIB-configuration combination shoot more than one node.
But I'm not sure. In my first attempt I assumed that ALL parameters are
passed as environment variables and ONLY the action is passed as a parameter.
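
The check I mentioned (refusing to shoot a node the plugin is not configured
for) could look roughly like the following Python sketch. It assumes that
the configured hosts arrive in an environment variable; the variable name
`hostlist` and both function names are hypothetical, chosen only for
illustration.

```python
import os


def configured_hosts(env=None, var="hostlist"):
    """Return the node names listed in an environment variable.

    The variable name 'hostlist' is a hypothetical example; entries may
    be separated by whitespace or commas.
    """
    env = os.environ if env is None else env
    raw = env.get(var, "")
    return [h for h in raw.replace(",", " ").split() if h]


def may_shoot(target, env=None):
    """True only if the requested target is one of the configured hosts.

    Node names are compared case-insensitively (an assumption).
    """
    hosts = {h.lower() for h in configured_hosts(env)}
    return target.lower() in hosts
```

With only one host configured, this reduces to comparing the node name
passed on the call against the single node from the CIB, rather than
ignoring the parameter.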

But probably... or better, hopefully... you can enlighten me. :-)


> > 2) The RSA board allows only one telnet session at a time. So if someone
> > logs in to e.g. check something and at the same time a monitor cycle is
> > started by HA, the resource gets a monitor failure and will probably be
> > moved.
> 
> Yes, that's one typical problem with this class of devices. But
> there's nothing one can do about that.

Let a node shoot itself. (See answers above.) It's not good, but better than
no attempt at shooting.


Best regards
Andreas Mock
