Re: [Pacemaker] Howto write a STONITH agent

2011-07-23 Thread Prakash Velayutham
Holger Teutsch holger.teutsch@... writes:

 I had the same experience. iLO is _extremely_ slow and unreliable.
 
 Go for external/ipmi.
 
 It works fast and reliably, and is available with iLO 2.x
 firmware.
 
 - holger
 

Hi,

I am at the same point right now: I am trying to do a remote power reset of HP
blades.

Do those work with IPMI too? Which ports need to be opened in the firewall
for this?

Thanks,
Prakash


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] Howto write a STONITH agent

2011-01-15 Thread Holger Teutsch
On Fri, 2011-01-14 at 17:10 +0100, Christoph Herrmann wrote:
 -----Original Message-----
 From: Dejan Muhamedagic deja...@fastmail.fm
 Sent: Fri 14.01.2011 12:31
 To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
 Subject: Re: [Pacemaker] Howto write a STONITH agent
 
  Hi,
  
  On Thu, Jan 13, 2011 at 09:09:38PM +0100, Christoph Herrmann wrote:
   Hi,
   
   I have some brand new HP blades with iLO boards (iLO 2 Standard Blade
   Edition 1.81 ...)
   But I'm not able to connect to them via the external/riloe agent.
   When I try:
   
   stonith -t external/riloe -p hostlist=node1 ilo_hostname=ilo1
   ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0
   ilo_powerdown_method=power -S
  
  Try this:
  
  stonith -t external/riloe hostlist=node1 ilo_hostname=ilo1
  ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0
  ilo_powerdown_method=power -S
 
 That's much better (looks like PEBKAC ;-), thanks! But it is not reliable.
 I tested it about 10 times and it hung 5 times. That's not what I want.

I had the same experience. iLO is _extremely_ slow and unreliable.

Go for external/ipmi.

It works fast and reliably, and is available with iLO 2.x
firmware.

- holger
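
For anyone following this suggestion, here is a minimal sketch of a manual
IPMI check against a blade's management board before configuring
external/ipmi. It assumes ipmitool is installed and that IPMI over LAN is
enabled on the iLO. ilo1, ilouser and ilopass are the placeholder values used
elsewhere in this thread, and the IPMITOOL override is an addition made here
for dry runs. IPMI over LAN normally uses UDP port 623, which is the port to
allow through a firewall.

```shell
#!/bin/sh
# Sketch only. ilo1/ilouser/ilopass are placeholder values from this thread.
# IPMITOOL lets you substitute another command (e.g. for a dry run).
# "-I lanplus" assumes the board speaks IPMI 2.0; use "-I lan" for IPMI 1.5.

# Query the chassis power state of one management board over IPMI-over-LAN.
# $1 = BMC host, $2 = user, $3 = password
ipmi_power_status() {
    ${IPMITOOL:-ipmitool} -I lanplus -H "$1" -U "$2" -P "$3" chassis power status
}
```

Example: ipmi_power_status ilo1 ilouser ilopass. If that answers quickly and
consistently, the same values can go into external/ipmi; running
"stonith -t external/ipmi -n" prints the parameter names the plugin expects.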




Re: [Pacemaker] Howto write a STONITH agent

2011-01-14 Thread Dejan Muhamedagic
Hi,

On Thu, Jan 13, 2011 at 09:09:38PM +0100, Christoph Herrmann wrote:
 Hi,
 
 I have some brand new HP blades with iLO boards (iLO 2 Standard Blade Edition
 1.81 ...)
 But I'm not able to connect to them via the external/riloe agent.
 When I try:
 
 stonith -t external/riloe -p hostlist=node1 ilo_hostname=ilo1  
 ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0 
 ilo_powerdown_method=power -S

Try this:

stonith -t external/riloe hostlist=node1 ilo_hostname=ilo1  ilo_user=ilouser 
ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0 
ilo_powerdown_method=power -S

Thanks,

Dejan

 


Re: [Pacemaker] Howto write a STONITH agent

2011-01-14 Thread Christoph Herrmann
-----Original Message-----
From: Dejan Muhamedagic deja...@fastmail.fm
Sent: Fri 14.01.2011 12:31
To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
Subject: Re: [Pacemaker] Howto write a STONITH agent

 Hi,
 
 On Thu, Jan 13, 2011 at 09:09:38PM +0100, Christoph Herrmann wrote:
  Hi,
  
  I have some brand new HP blades with iLO boards (iLO 2 Standard Blade
  Edition 1.81 ...)
  But I'm not able to connect to them via the external/riloe agent.
  When I try:
  
  stonith -t external/riloe -p hostlist=node1 ilo_hostname=ilo1
  ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0
  ilo_powerdown_method=power -S
 
 Try this:
 
 stonith -t external/riloe hostlist=node1 ilo_hostname=ilo1 ilo_user=ilouser
 ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0
 ilo_powerdown_method=power -S

That's much better (looks like PEBKAC ;-), thanks! But it is not reliable.
I tested it about 10 times and it hung 5 times. That's not what I want.
In the end I will use my own ssh-ilo agent. It's very simple (KISS) and
reliable. The external/riloe agent did not look too simple.

So my questions still remain. Is there a HOWTO for writing stonith agents?
Is it useful to write (to run) a stonith agent as a cloned resource?
What should the status check do with a cloned stonith resource? Is it useful
in any way? (Given that I have 4 different nodes with 4 different iLO boards.)

 
Cheers,


  Christoph :-)
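
To put a number on the "5 out of 10" observation, one could repeat the status
check under a time limit and count the failures. This is a rough sketch that
assumes the timeout command from GNU coreutils is available; count_failures
is a made-up helper name, not part of any stonith tooling.

```shell
#!/bin/sh
# Run a command $1 times with a per-run limit of $2 seconds and report how
# many runs failed or timed out (timeout exits with status 124 on a timeout).
# Usage: count_failures RUNS SECONDS COMMAND [ARGS...]
count_failures() {
    runs=$1; limit=$2; shift 2
    failed=0; i=0
    while [ "$i" -lt "$runs" ]; do
        timeout "$limit" "$@" >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed of $runs runs failed or timed out"
}
```

For example, "count_failures 10 60 stonith -t external/riloe hostlist=node1
ilo_hostname=ilo1 ... -S" would repeat the status check from this thread ten
times with a 60 second limit per run. The limit is an arbitrary choice.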


-- 
Vorstand/Board of Management:
Dr. Bernd Finkbeiner, Dr. Roland Niemeier, 
Dr. Arno Steitz, Dr. Ingrid Zech
Vorsitzender des Aufsichtsrats/
Chairman of the Supervisory Board:
Michel Lepert
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196 





Re: [Pacemaker] Howto write a STONITH agent

2011-01-14 Thread Dejan Muhamedagic
On Fri, Jan 14, 2011 at 05:10:17PM +0100, Christoph Herrmann wrote:
 -----Original Message-----
 From: Dejan Muhamedagic deja...@fastmail.fm
 Sent: Fri 14.01.2011 12:31
 To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
 Subject: Re: [Pacemaker] Howto write a STONITH agent
 
  Hi,
  
  On Thu, Jan 13, 2011 at 09:09:38PM +0100, Christoph Herrmann wrote:
   Hi,
   
   I have some brand new HP blades with iLO boards (iLO 2 Standard Blade
   Edition 1.81 ...)
   But I'm not able to connect to them via the external/riloe agent.
   When I try:
   
   stonith -t external/riloe -p hostlist=node1 ilo_hostname=ilo1
   ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0
   ilo_powerdown_method=power -S
  
  Try this:
  
  stonith -t external/riloe hostlist=node1 ilo_hostname=ilo1
  ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0
  ilo_powerdown_method=power -S
 
 That's much better (looks like PEBKAC ;-), thanks! But it is not reliable.
 I tested it about 10 times and it hung 5 times. That's not what I want.

Did you try to find out why it hung?

 In the end I will use my own ssh-ilo agent. It's very simple (KISS) and
 reliable. The external/riloe agent did not look too simple.

Right. Let's all roll our own ;-)

 So my questions still remain. Is there a HOWTO for writing stonith agents?

No.

 Is it useful to write (to run) a stonith agent as a cloned resource?

Sometimes. There are quite a few resources. You can take a look
at clusterlabs.org.

 What should the status check do with a cloned stonith resource? Is it useful
 in any way? (Given that I have 4 different nodes with 4 different iLO boards.)

The status check should report the status of the device, not of the nodes.

Thanks,

Dejan
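
A rough sketch of how a status action that checks the device rather than the
nodes might look in an ssh-based plugin. It follows the calling convention the
shipped external plugins appear to use: the action arrives as the first
argument, configuration arrives as environment variables, and exit status 0
means success while anything else means failure. The parameter names
ilo_hostname, ilo_user and ilo_sshkey, the ilo_status_cmd default of "show",
and the SSH override are all assumptions made for this illustration. Only
"reset system1" comes from this thread. A complete plugin would also need the
on, off and getinfo-* actions.

```shell
#!/bin/sh
# Hypothetical ssh-ilo plugin sketch, not a complete or tested agent.
# Assumed environment: hostlist, ilo_hostname, ilo_user, ilo_sshkey.
# SSH can be set to another command for dry runs.

# Run one command on the management board.
ilo_ssh() {
    ${SSH:-ssh} -i "$ilo_sshkey" "$ilo_user@$ilo_hostname" "$@"
}

# Dispatch on the action name; the return status is the plugin's result.
ssh_ilo_agent() {
    case $1 in
        gethosts)
            # Nodes this device can fence, one per line.
            printf '%s\n' $hostlist ;;
        status)
            # Device check only: can we log in to the board and run a
            # harmless command? The nodes themselves are not probed.
            ilo_ssh "${ilo_status_cmd:-show}" >/dev/null ;;
        reset)
            ilo_ssh reset system1 ;;
        getconfignames)
            printf '%s\n' hostlist ilo_hostname ilo_user ilo_sshkey ;;
        *)
            # Unsupported action: report failure.
            return 1 ;;
    esac
}
```

With one board per node, each instance only ever talks to its own iLO, so
status says nothing about whether the node behind it is up.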

 
  


[Pacemaker] Howto write a STONITH agent

2011-01-13 Thread Christoph Herrmann
Hi,

I have some brand new HP blades with iLO boards (iLO 2 Standard Blade Edition
1.81 ...)
But I'm not able to connect to them via the external/riloe agent.
When I try:

stonith -t external/riloe -p hostlist=node1 ilo_hostname=ilo1  
ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0 
ilo_powerdown_method=power -S

I get the following answer:

external/riloe[14317]: ERROR: unknown power method %s, setting to power
external/riloe[14317]: ERROR: [Errno -2] Name or service not known, while talking to ilo_hostname=ilo1

** (process:14315): CRITICAL **: external_run_cmd: Calling '/usr/lib64/stonith/plugins/external/riloe status' returned 1

** (process:14315): CRITICAL **: external_status: 'riloe status' failed with rc 1
stonith: external/riloe device not accessible.


But I can access ilo1 with http, https and ssh. The easiest way to reset a node 
is to run:

ssh -i ilo-sshkey ilouser@ilo1 reset system1 

I thought it would be easier to write a new ssh-ilo agent (I'm almost done :-)
than to debug the existing one. But I'm looking for a short howto. I've read
some STONITH agents, but they are not completely self-explanatory and I have
some questions. Is there a short manual on how to write a stonith agent that
Google and I were not able to find?
Or should I post all my questions to the list?
Here we go:

1. (and most important): What does the status check do if you have an agent
that runs as a cloned resource (my ssh-ilo agent should run as a cloned
resource)? Does it check all nodes? Is it possible to check the status of a
single node?
2. What are the expected return codes?

more to follow ;-)




regards


   Christoph :-)


Re: [Pacemaker] Howto write a STONITH agent

2011-01-13 Thread Bob Haxo
Hi Christoph,

Have you taken a look in /usr/lib64/stonith/plugins/external?

The ipmi plugin might serve as a coding example/template. Or maybe the
drac5 plugin. At first glance, drac5 appears to be using ssh.

Bob Haxo
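
A small sketch for browsing that directory: it prints the names of the
executable plugin scripts so you can pick one to read as a template. The
default path is the one mentioned above; list_plugins is a made-up helper
name. "stonith -L" should give a similar list covering all device types.

```shell
#!/bin/sh
# List executable plugin scripts in a directory, one name per line.
# $1 = directory (defaults to the external plugin path from this thread)
list_plugins() {
    dir=${1:-/usr/lib64/stonith/plugins/external}
    for f in "$dir"/*; do
        # Skip subdirectories and non-executable files.
        [ -f "$f" ] && [ -x "$f" ] && basename "$f"
    done
    return 0
}
```

Example: run list_plugins, then open a plugin such as ipmi in a pager.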

