Re: [Linux-HA] STONITH for Amazon EC2 - fence_ec2

2012-10-09 Thread Kevin F. La Barre
Andreas,

Here is something interesting for you.  While attempting to define the
configuration via crm configure, I received an error stating that the
parameter was invalid:

primitive ec2-fencing stonith:fence_ec2 \
params ec2-home=ec2_location pcmk_host_check=static-list \
pcmk_host_list="node1 node2 node3" \
op monitor interval=600s timeout=300s \
op start start-delay=30s interval=0

ERROR: ec2-fencing: parameter ec2-home does not exist


I then issued a request for the metadata via stonith_admin:

# stonith_admin --metadata --agent=fence_ec2


... and received the following.  I see the action parameter but none of
the others.  fence_ec2 is located in /usr/sbin, is owned by root, and has
mode 755 like the other agents.  All other agents return XML with several
parameters.  Shouldn't we be seeing all the parameters in the XML output?
I'm thinking of port, ec2-home, tag, etc.

<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="fence_ec2">
  <version>1.0</version>
  <longdesc lang="en">
&lt;!-- no value --&gt;
  </longdesc>
  <shortdesc lang="en">&lt;!-- no value --&gt;</shortdesc>
  <parameters>
    <parameter name="action">
      <getopt mixed="-o" />
      <content type="string" default="reboot" />
      <shortdesc lang="en">Fencing action (null, off, on, [reboot], status, hostlist, devstatus)</shortdesc>
    </parameter>
  </parameters>
  <actions>
    <action name="start"   timeout="20" />
    <action name="stop"    timeout="15" />
    <action name="status"  timeout="20" />
    <action name="monitor" timeout="20" interval="3600" />
    <action name="meta-data"  timeout="15" />
  </actions>
  <special tag="heartbeat">
    <version>2.0</version>
  </special>
</resource-agent>
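
For anyone wanting to compare what an agent advertises against the params
they intend to set, here is a minimal Python sketch.  It assumes the metadata
has been saved as text; the METADATA string is a trimmed copy of the output
above, and the function names are made up for illustration.  It only reads
the XML and does not reproduce how crm itself validates parameters.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the metadata above (XML declaration and DOCTYPE omitted).
METADATA = """
<resource-agent name="fence_ec2">
  <parameters>
    <parameter name="action">
      <content type="string" default="reboot" />
    </parameter>
  </parameters>
</resource-agent>
""".strip()


def advertised_parameters(metadata_xml):
    """Return the parameter names declared in agent metadata XML."""
    root = ET.fromstring(metadata_xml)
    return [p.get("name") for p in root.findall("./parameters/parameter")]


def missing_parameters(metadata_xml, wanted):
    """Return the names in `wanted` that the metadata does not declare."""
    declared = set(advertised_parameters(metadata_xml))
    return [name for name in wanted if name not in declared]


# Hypothetical check: which of the params we want to set are not declared?
wanted = ["action", "port", "ec2-home", "tag"]
print(advertised_parameters(METADATA))
print(missing_parameters(METADATA, wanted))
```

With the metadata above, only action is declared, so port, ec2-home and tag
are reported as missing, which matches the error from crm configure.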


-Kevin



On Mon, Oct 8, 2012 at 2:10 PM, Andreas Kurz andr...@hastexo.com wrote:

 On 10/06/2012 08:45 AM, Kevin F. La Barre wrote:
  I'm trying to get the fence_ec2 agent (link below) working and am a bit
  confused about how it should be configured.  I have modified the agent with
  the EC2 key and cert, region, etc.  The part I'm confused about is the
  port argument and how it's supposed to work.  Am I supposed to hardcode
  the uname into the port variable, or is this somehow passed into the
  script as an argument?  If I hardcode it, I don't understand how Pacemaker
  passes on the information as to which node to kill.  Versions and config
  details follow.
 
  I apologize if this has been vague.  Please let me know if you need more
  information.
 
  Fencing agent:
 
 https://github.com/beekhof/fence_ec2/blob/392a146b232fbf2bf2f75605b1e92baef4be4a01/fence_ec2
 
  crm configure primitive ec2-fencing stonith::fence_ec2 \
  params action=reboot \
  op monitor interval=60s

 Try something like:

 primitive stonith_my-ec2-nodes stonith:fence_ec2 \
 params ec2-home=/root/.ec2 pcmk_host_check=static-list \
 pcmk_host_list="myec2-01 myec2-02" \
 op monitor interval=600s timeout=300s \
 op start start-delay=30s interval=0


 ... where the node names are sent as the port parameter.
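
 To illustrate the point about port: fence agents conventionally receive
 their options as name=value lines on standard input, so the node to fence
 arrives as port=<nodename> rather than being hardcoded.  The Python sketch
 below shows that parsing step only.  It is not the fence_ec2 code; the
 function name and the sample lines are assumptions for illustration.

```python
def parse_fence_options(lines):
    """Parse name=value option lines into a dict.

    Blank lines, '#' comments, and lines without '=' are skipped.
    """
    options = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # ignore malformed lines without '='
        options[name.strip()] = value.strip()
    return options


# Hypothetical input, as the cluster might send it when fencing myec2-02.
sample = ["action=reboot\n", "port=myec2-02\n", "ec2-home=/root/.ec2\n"]
opts = parse_fence_options(sample)
print(opts["action"], opts["port"])
```

 A real agent would read the lines from sys.stdin instead of a list and
 then act on opts["port"] with the requested action.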

 Regards,
 Andreas

 --
 Need help with Pacemaker?
 http://www.hastexo.com/now

 
  Corosync v1.4.1
  Pacemaker v1.1.7
  CentOS 6.2
 
  -Kevin
  ___
  Linux-HA mailing list
  Linux-HA@lists.linux-ha.org
  http://lists.linux-ha.org/mailman/listinfo/linux-ha
  See also: http://linux-ha.org/ReportingProblems
 







Re: [Linux-HA] Heartbeat not starting when both nodes are down

2012-10-09 Thread Nicolás
On 08/10/2012 20:56, Andreas Kurz wrote:
 On 10/08/2012 09:42 PM, Nicolás wrote:
 On 28/09/2012 20:42, Nicolás wrote:
 Hi all!

 I'm new to this list, I've been looking to get some info about this but
 I haven't seen anything, so I'm trying this way.

 I've successfully configured a 2-node cluster with DRBD + Heartbeat +
 Pacemaker. It works as expected.

 The problem comes when both nodes are down. In that case, after powering
 on one of the nodes, I can see it configuring the network, but after that
 I never see the console for this machine. So I try to connect via SSH
 and find that Heartbeat is not running. After I start it manually I can
 see the console for this node. This only happens when BOTH nodes are
 down. When just one is down, everything works, as Heartbeat starts
 automatically on the node being powered on.

 I see nothing relevant in logs, my conf is as follows:

 root@cluster1:~# cat /etc/ha.d/ha.cf | grep -e ^[^#]
 logfacility local0
 ucast eth1 192.168.0.91
 ucast eth0 192.168.20.51
 auto_failback on
 node cluster1.gamez.es cluster2.gamez.es
 use_logd yes
 crm  on
 autojoin none

 Any ideas on what I am doing wrong?
 [...]

 For a new cluster, use Corosync rather than Heartbeat. Disable the DRBD init
 script and configure DRBD as a Pacemaker master-slave resource.


Thanks for this! Once I disabled the DRBD init script, it worked as it should.

Regards,

Nicolás
