[Linux-HA] Antw: Re: Pacemaker - Resource dont get started on the standby node.

2013-06-17 Thread Ulrich Windl
Hi!

The problem seems to be this:
Jun 15 20:16:27 prod-hb-nmn-002 pengine: [4401]: WARN: unpack_rsc_op:
Processing failed op apache_start_0 on prod-hb-nmn-002: unknown error (1)

Check why apache won't start.
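For reference, a minimal sketch of commands to clear the recorded failure and re-test the start by hand (assuming standard Pacemaker and Apache tooling; apachectl may be apache2ctl on Ubuntu, and paths can differ per distribution):

   crm_resource --resource apache --cleanup   # clear the INFINITY failcount
   crm_mon -1                                 # re-check cluster and resource state
   apachectl configtest                       # verify the httpd configuration parses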

Regards,
Ulrich

 Parkirat parkiratba...@gmail.com wrote on 15.06.2013 at 22:24 in 
 message
1371327853299-14687.p...@n3.nabble.com:
 Also adding the log while it tries to do the failover from master node to
 slave node:
 
 
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: notice:
 crmd_ha_status_callback: Status update: Node prod-hb-nmn-001 now has status
 [dead] (DC=true)
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: crm_update_peer_proc:
 prod-hb-nmn-001.ais is now offline
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: WARN: match_down_event: No
 match for shutdown action on 7910c4de-718d-45d7-b4da-24b3b65b9855
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: te_update_diff:
 Stonith/shutdown of 7910c4de-718d-45d7-b4da-24b3b65b9855 not matched
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: abort_transition_graph:
 te_update_diff:191 - Triggered transition abort (complete=1, tag=node_state,
 id=7910c4de-718d-45d7-b4da-24b3b65b9855, magic=NA, cib=0.80.18) : Node
 failure
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: do_state_transition:
 State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC
 cause=C_FSA_INTERNAL origin=abort_transition_graph ]
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: do_state_transition: All
 1 cluster nodes are eligible to run resources.
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: do_pe_invoke: Query 163:
 Requesting the current CIB: S_POLICY_ENGINE
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: do_pe_invoke_callback:
 Invoking the PE: query=163, ref=pe_calc-dc-1371327387-113, seq=5, quorate=1
 Jun 15 20:16:27 prod-hb-nmn-002 pengine: [4401]: notice: unpack_config: On
 loss of CCM Quorum: Ignore
 Jun 15 20:16:27 prod-hb-nmn-002 pengine: [4401]: info: unpack_config: Node
 scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
 Jun 15 20:16:27 prod-hb-nmn-002 pengine: [4401]: info:
 determine_online_status: Node prod-hb-nmn-002 is online
 Jun 15 20:16:27 prod-hb-nmn-002 pengine: [4401]: WARN: unpack_rsc_op:
 Processing failed op apache_start_0 on prod-hb-nmn-002: unknown error (1)
 Jun 15 20:16:27 prod-hb-nmn-002 pengine: [4401]: notice: native_print:
 apache#011(ocf::heartbeat:apache):#011Stopped
 Jun 15 20:16:27 prod-hb-nmn-002 pengine: [4401]: info: get_failcount: apache
 has failed INFINITY times on prod-hb-nmn-002
 Jun 15 20:16:27 prod-hb-nmn-002 pengine: [4401]: WARN:
 common_apply_stickiness: Forcing apache away from prod-hb-nmn-002 after
 100 failures (max=100)
 Jun 15 20:16:27 prod-hb-nmn-002 pengine: [4401]: info: native_color:
 Resource apache cannot run anywhere
 Jun 15 20:16:27 prod-hb-nmn-002 pengine: [4401]: notice: LogActions: Leave
 resource apache#011(Stopped)
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: do_state_transition:
 State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
 cause=C_IPC_MESSAGE origin=handle_response ]
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: unpack_graph: Unpacked
 transition 48: 0 actions in 0 synapses
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: do_te_invoke: Processing
 graph 48 (ref=pe_calc-dc-1371327387-113) derived from
 /var/lib/pengine/pe-input-452.bz2
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: run_graph:
 
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: notice: run_graph: Transition
 48 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
 Source=/var/lib/pengine/pe-input-452.bz2): Complete
 Jun 15 20:16:27 prod-hb-nmn-002 pengine: [4401]: info: process_pe_message:
 Transition 48: PEngine Input stored in: /var/lib/pengine/pe-input-452.bz2
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: te_graph_trigger:
 Transition 48 is now complete
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: notify_crmd: Transition
 48 status: done - null
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: do_state_transition:
 State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
 cause=C_FSA_INTERNAL origin=notify_crmd ]
 Jun 15 20:16:27 prod-hb-nmn-002 crmd: [4395]: info: do_state_transition:
 Starting PEngine Recheck Timer
 Jun 15 20:17:17 prod-hb-nmn-002 cibadmin: [5427]: info: Invoked: cibadmin
 -Ql
 Jun 15 20:17:17 prod-hb-nmn-002 cibadmin: [5428]: info: Invoked: cibadmin
 -Ql
 Jun 15 20:17:17 prod-hb-nmn-002 crm_shadow: [5437]: info: Invoked:
 crm_shadow -c __crmshell.5404
 Jun 15 20:17:17 prod-hb-nmn-002 cibadmin: [5438]: info: Invoked: cibadmin -p
 -R -o crm_config
 Jun 15 20:17:17 prod-hb-nmn-002 crm_shadow: [5440]: info: Invoked:
 crm_shadow -C __crmshell.5404 --force
 Jun 15 20:17:17 prod-hb-nmn-002 cib: [4391]: info: cib_process_request:
 Operation complete: op cib_replace for section 'all'

Re: [Linux-HA] Antw: ocf HA_RSCTMP directory location

2013-06-17 Thread Ulrich Windl
 David Vossel dvos...@redhat.com wrote on 14.06.2013 at 16:21 in 
 message
206418282.11638415.1371219712940.javamail.r...@redhat.com:
[...]
 I think RAs should not rely on the fact that temp directories are clean when
 a resource is going to be started.
 
 The resource tmp directory has to get cleaned out on startup; if it doesn't, 
 I don't think there is a good way for resource agents to distinguish a stale 
 pid file from a current one.  Nearly all the agents depend on this tmp 
 directory to get reinitialized.  If we decided not to depend on this logic, 
 every agent would have to be altered to account for this.  This would mean 
 adding a layer of complexity to the agents that should otherwise be 
 unnecessary.
[...]

But you only have to do it right once (in a procedure/function):
If the PID file exists
then
    if the PID file is newer than the time of reboot
    then
        if there is a process with this pid
        then
            if the process having that pid matches a given pattern
            then
                the process is alive
            else
                another process has this PID; remove stale PID file
            fi
        else
            remove stale pid file
        fi
    else
        remove stale pid file
    fi
fi
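
A minimal shell sketch of that procedure (the PID file path and the process pattern are illustrative parameters; the boot time is derived from /proc/uptime, which assumes Linux and GNU stat):

   pid_alive() {
       pidfile=$1 pattern=$2
       [ -f "$pidfile" ] || return 1                        # no PID file at all
       boot=$(( $(date +%s) - $(cut -d. -f1 /proc/uptime) ))
       if [ "$(stat -c %Y "$pidfile")" -gt "$boot" ]; then  # written after reboot?
           pid=$(cat "$pidfile")
           if [ -d "/proc/$pid" ]; then                     # a process with this pid exists
               if ps -p "$pid" -o args= | grep -q "$pattern"; then
                   return 0                                 # the process is alive
               fi
               # a different process reuses the pid: fall through and clean up
           fi
       fi
       rm -f "$pidfile"                                     # stale PID file in every other case
       return 1
   }

Usage would then be along the lines of: pid_alive /var/run/foo.pid foo || start_foo (both names hypothetical).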

Everything else is just wrong IMHO.

Regards,
Ulrich

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Antw: Re: Pacemaker - Resource dont get started on the standby node.

2013-06-17 Thread Parkirat
Thanks Ulrich,

I have figured out the problem. 
The actual problem was in the configuration file for the httpd resource. It
was correct on the master node but missing on the standby node, which
prevented it from starting.

Regards,
Parkirat Singh Bagga.



--
View this message in context: 
http://linux-ha.996297.n3.nabble.com/Pacemaker-Resource-dont-get-started-on-the-standby-node-tp14686p14695.html
Sent from the Linux-HA mailing list archive at Nabble.com.
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Antw: Re: Pacemaker - Resource dont get started on the standby node.

2013-06-17 Thread emmanuel segura
Hello Parkirat

Thank you very much


2013/6/17 Parkirat parkiratba...@gmail.com

 Thanks Ulrich,

 I have figured out the problem.
 The actual problem was in the configuration file for the httpd resource. It
 was correct on the master node but missing on the standby node, which
 prevented it from starting.

 Regards,
 Parkirat Singh Bagga.



 --
 View this message in context:
 http://linux-ha.996297.n3.nabble.com/Pacemaker-Resource-dont-get-started-on-the-standby-node-tp14686p14695.html
 Sent from the Linux-HA mailing list archive at Nabble.com.
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems




-- 
this is my life and I live it for as long as God wills
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] Heartbeat haresources with IPv6

2013-06-17 Thread listas

Hi,


I'm using Ubuntu 12.04 + Heartbeat 3.0.5-3ubuntu2 to provide high availability 
for some IP addresses.
I want to configure an IPv6 address on my haresources. I did this:

File /etc/heartbeat/haresources:

server.domain.com \
    192.168.2.62/32/eth1 \
    192.168.2.64/32/eth1 \
    192.168.2.72/32/eth1 \
    IPv6addr::2001:db8:38a5:8::2006/48/eth1 \
    MailTo::a...@domain.com

The IPv4 addresses work fine, but I'm not having any success with the IPv6 address.
My logs show these messages:
ResourceManager[22129]: info: Running /etc/ha.d/resource.d/IPv6addr 
2001:db8:38a5:8 2006/48/eth1 start
ResourceManager[22129]: CRIT: Giving up resources due to failure of 
IPv6addr::2001:db8:38a5:8::2006/48/eth1
ResourceManager[22129]: info: Running /etc/ha.d/resource.d/IPv6addr 
2001:db8:38a5:8 2006/48/eth1 stop
ResourceManager[22129]: info: Retrying failed stop operation 
[IPv6addr::2001:db8:38a5:8::2006/48/eth1]

Apparently there is a conflict between the '::' inside the IPv6 address and the '::' separator used in haresources. But I would prefer not to have to expand the IPv6 address. 


Does anyone know a way to avoid this conflict?

Thanks!
--
Thiago Henrique
www.adminlinux.com.br







___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Heartbeat haresources with IPv6

2013-06-17 Thread Digimer

Hi Thiago,

  Heartbeat is deprecated and has not been developed in some time. 
There are no plans to restart development, either. It is _strongly_ 
advised that new setups use corosync + pacemaker. You can use the IPv6 
resource agents with it, too.


  The best place to look is clusterlabs.org's Cluster from Scratch 
tutorial. Its first example covers setting up an (IPv4) virtual IP 
address, which should be easy to adapt to your IPv6 implementation. 
You will see two versions: one for crmsh and one for pcs. I would 
recommend the crmsh version for Ubuntu.
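
For illustration, a minimal crmsh sketch of such an IPv6 virtual IP, reusing the address from this thread (untested; it assumes the ocf:heartbeat:IPv6addr agent with its ipv6addr/cidr_netmask/nic parameters):

   crm configure primitive ip6_vip ocf:heartbeat:IPv6addr \
       params ipv6addr=2001:db8:38a5:8::2006 cidr_netmask=48 nic=eth1 \
       op monitor interval=30s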


Cheers

On 06/17/2013 11:35 AM, lis...@adminlinux.com.br wrote:

Hi,


I'm using Ubuntu 12.04 + Heartbeat 3.0.5-3ubuntu2 to provide high
availability for some IP addresses.
I want to configure an IPv6 address on my haresources. I did this:

File /etc/heartbeat/haresources:

server.domain.com \
    192.168.2.62/32/eth1 \
    192.168.2.64/32/eth1 \
    192.168.2.72/32/eth1 \
    IPv6addr::2001:db8:38a5:8::2006/48/eth1 \
    MailTo::a...@domain.com

The IPv4 addresses work fine, but I'm not having any success with the IPv6
address.
My logs show these messages:
ResourceManager[22129]: info: Running /etc/ha.d/resource.d/IPv6addr
2001:db8:38a5:8 2006/48/eth1 start
ResourceManager[22129]: CRIT: Giving up resources due to failure of
IPv6addr::2001:db8:38a5:8::2006/48/eth1
ResourceManager[22129]: info: Running /etc/ha.d/resource.d/IPv6addr
2001:db8:38a5:8 2006/48/eth1 stop
ResourceManager[22129]: info: Retrying failed stop operation
[IPv6addr::2001:db8:38a5:8::2006/48/eth1]

Apparently there is a conflict between the '::' inside the
IPv6 address and the '::' separator used in haresources. But I would
prefer not to have to expand the IPv6 address.
Does anyone know a way to avoid this conflict?

Thanks!
--
Thiago Henrique
www.adminlinux.com.br







___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems



--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] Resource Collocation v/s Resource Groups

2013-06-17 Thread Parkirat
Hi,

Is there any difference between Resource Collocation and Resource Groups?

I grouped 2 resources, both having migration_threshold=2 and
monitor_interval=60s. When I stopped one of the resources in the group, it
did not restart. However, when I configured the resource outside the
group, it started again after I stopped it manually.

Also, is there any way to order the sequence of the resources in a group?

Regards,
Parkirat Singh Bagga.



--
View this message in context: 
http://linux-ha.996297.n3.nabble.com/Resource-Collocation-v-s-Resource-Groups-tp14699.html
Sent from the Linux-HA mailing list archive at Nabble.com.
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] Why cman is started at rc0.d and rc6.d

2013-06-17 Thread Su Chen
Hi All,

I am very new to pacemaker, corosync and cman. I installed the packages on an 
Ubuntu machine. (aptitude install pacemaker cman fence-agents)
To my surprise, cman has links under rc0.d and rc6.d. Why does cman need to be 
started while the system is shutting down?

root@SuTH3:/etc# ls -l /etc/rc0.d/S05cman /etc/rc6.d/S05cman
lrwxrwxrwx 1 root root 14 May 18 23:55 /etc/rc0.d/S05cman -> ../init.d/cman
lrwxrwxrwx 1 root root 14 May 18 23:55 /etc/rc6.d/S05cman -> ../init.d/cman

Thanks,
Su
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Why cman is started at rc0.d and rc6.d

2013-06-17 Thread Su Chen

And another thing: it never gets killed. Should it be stopped when the system is 
halting?

root@SuTH3:/etc# ls /etc/rc* | grep cman
S05cman
S05cman
S61cman
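
One way to inspect what the packaging actually set up (a sketch; it assumes the init script carries a standard LSB header):

   ls -l /etc/rc?.d/*cman*                                               # every start/stop link per runlevel
   sed -n '/### BEGIN INIT INFO/,/### END INIT INFO/p' /etc/init.d/cman  # runlevels the script declares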

Thanks,
Su

From: Su Chen
Sent: Monday, June 17, 2013 11:17 AM
To: 'General Linux-HA mailing list'
Subject: Why cman is started at rc0.d and rc6.d

Hi All,

I am very new to pacemaker, corosync and cman. I installed the packages on an 
Ubuntu machine. (aptitude install pacemaker cman fence-agents)
To my surprise, cman has links under rc0.d and rc6.d. Why does cman need to be 
started while the system is shutting down?

root@SuTH3:/etc# ls -l /etc/rc0.d/S05cman /etc/rc6.d/S05cman
lrwxrwxrwx 1 root root 14 May 18 23:55 /etc/rc0.d/S05cman -> ../init.d/cman
lrwxrwxrwx 1 root root 14 May 18 23:55 /etc/rc6.d/S05cman -> ../init.d/cman

Thanks,
Su
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Resource Collocation v/s Resource Groups

2013-06-17 Thread Sven Arnold

Hi Parkirat,


Is there any difference between Resource Collocation and Resource Groups?


Resources inside a resource group are colocated _and_ ordered. See 
http://clusterlabs.org/doc/Ordering_Explained.pdf for more details.
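
A rough crmsh sketch of that equivalence, with hypothetical resources A and B (in a group the order is simply the listing sequence, which also answers the ordering question below):

   # as a group: B is colocated with A, starts after A and stops before A
   crm configure group grp_AB A B

   # roughly the same thing expressed as explicit constraints
   crm configure colocation col_B_with_A inf: B A
   crm configure order ord_A_before_B inf: A B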



I grouped 2 resources, both having migration_threshold=2 and
monitor_interval=60s. When I stopped one of the resources in the group, it
did not restart. However, when I configured the resource outside the
group, it started again after I stopped it manually.

Also, is there any way to order the sequence of the resources in a group?


Best regards,

Sven
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Resource Collocation v/s Resource Groups

2013-06-17 Thread Parkirat
Hi Sven,

Thanks for the reply. I am now colocating the resources, but the same problem
persists: the resource does not get started after I stop it manually, even
though migration_threshold=2 and it is the first time I have brought the
resource down after doing a cleanup and waiting for the failure-timeout on
that node.

Note: It behaves properly when my resources are not collocated or grouped.
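
For what it's worth, a crmsh sketch of how these attributes are usually set and inspected (resource and node names are placeholders; Pacemaker spells them migration-threshold and failure-timeout):

   crm configure primitive web ocf:heartbeat:apache \
       op monitor interval=60s \
       meta migration-threshold=2 failure-timeout=60s

   crm resource failcount web show node1    # inspect the failcount blocking a restart
   crm resource cleanup web                 # clear it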

Regards,
Parkirat Singh Bagga.



--
View this message in context: 
http://linux-ha.996297.n3.nabble.com/Resource-Collocation-v-s-Resource-Groups-tp14699p14703.html
Sent from the Linux-HA mailing list archive at Nabble.com.
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] LVM Resource agent, exclusive activation

2013-06-17 Thread David Vossel




- Original Message -
 From: David Vossel dvos...@redhat.com
 To: General Linux-HA mailing list linux-ha@lists.linux-ha.org
 Sent: Tuesday, June 4, 2013 4:41:06 PM
 Subject: Re: [Linux-HA] LVM Resource agent, exclusive activation
 
 - Original Message -
  From: David Vossel dvos...@redhat.com
  To: General Linux-HA mailing list linux-ha@lists.linux-ha.org
  Sent: Monday, June 3, 2013 10:50:01 AM
  Subject: Re: [Linux-HA] LVM Resource agent, exclusive activation
  
  - Original Message -
   From: Lars Ellenberg lars.ellenb...@linbit.com
   To: linux-ha@lists.linux-ha.org
   Sent: Tuesday, May 21, 2013 5:58:05 PM
   Subject: Re: [Linux-HA] LVM Resource agent, exclusive activation
   
   On Tue, May 21, 2013 at 05:52:39PM -0400, David Vossel wrote:
- Original Message -
 From: Lars Ellenberg lars.ellenb...@linbit.com
 To: Brassow Jonathan jbras...@redhat.com
 Cc: General Linux-HA mailing list linux-ha@lists.linux-ha.org,
 Lars
 Marowsky-Bree l...@suse.com, Fabio M. Di
 Nitto fdini...@redhat.com
 Sent: Monday, May 20, 2013 3:50:49 PM
 Subject: Re: [Linux-HA] LVM Resource agent, exclusive activation
 
 On Fri, May 17, 2013 at 02:00:48PM -0500, Brassow Jonathan wrote:
  
  On May 17, 2013, at 10:14 AM, Lars Ellenberg wrote:
  
   On Thu, May 16, 2013 at 10:42:30AM -0400, David Vossel wrote:
   
   The use of 'auto_activation_volume_list' depends on updates
   to
   the
   LVM
   initscripts - ensuring that they use '-aay' in order to
   activate
   logical
   volumes.  That has been checked in upstream.  I'm sure it
   will
   go
   into
   RHEL7 and I think (but would need to check on) RHEL6.
   
   Only that this is upstream here, so it better work with
   debian oldstale, gentoo or archlinux as well ;-)
   
   
   Would this be good enough:
   
   vgchange --addtag pacemaker $VG
   and NOT mention the pacemaker tag anywhere in lvm.conf ...
   then, in the agent start action,
   vgchange -ay --config tags { pacemaker {} } $VG
   
   (or have the to be used tag as an additional parameter)
   
   No retagging necessary.
   
   How far back do the lvm tools understand the --config ...
   option?
  
  --config option goes back years and years - not sure of the exact
  date,
  but
  could probably tell with 'git bisect' if you wanted me to.
  
  The above would not quite be sufficient.
  You would still have to change the 'volume_list' field in lvm.conf
  (and
  update the initrd).
 
 You have to do that anyways if you want to make use of tags in this
 way?
 
  What you are proposing would simplify things in that you would not
  need different 'volume_list's on each machine - you could copy
  configs
  between machines.
 
 I thought volume_list = [ ... , @* ] in lvm.conf,
 assuming that works on all relevant distributions as well,
 and a command line --config tag would also propagate into that @*.
 It did so for me.
 
  But yes, volume_list = [ ... , pacemaker ] would be fine as well.

wait, did we just go around in a circle.  If we add pacemaker to the
volume list, and use that in every cluster node's config, then we've
by-passed the exclusive activation part have we not?!
   
   No.  I suggested to NOT set that pacemaker tag in the config
   (lvm.conf),
   but only ever explicitly set that tag from the command line as used from
   the resource agent ( --config tags { pacemaker {} } )
   
   That would also mean to either override volume_list with the same
   command line, or to have the tag mentioned in the volume_list in
   lvm.conf (but not set it in the tags {} section).
   
Also, we're not happy with the auto_activate list because it won't
work with old distros?!  It's a new feature, why do we have to work
with old distros that don't support it?
   
   You are right, we only have to make sure we don't break existing setup
   by rolling out a new version of the RA.  So if the resource agent
   won't accidentally use a code path where support of a new feature
   (of LVM) would be required, that's good enough compatibility.
   
   Still it won't hurt to pick the most compatible implementation
   of several possible equivalent ones (RA-feature wise).
   
   I think the proposed --config tags { pacemaker {} }
   is simpler (no retagging, no re-writing of lvm meta data),
   and will work for any setup that knows about tags.
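
A command-line sketch of that tag-based approach (the volume group name and the lvm.conf excerpt are illustrative; shell quoting of the --config argument may vary):

   # once: tag the VG, but do NOT mention the tag in lvm.conf's tags {} section
   vgchange --addtag pacemaker myvg

   # lvm.conf: keep normal activation restricted, but let host tags through, e.g.
   #   volume_list = [ "rootvg", "@*" ]

   # resource agent start action: activate only when the tag is supplied explicitly
   vgchange -ay --config 'tags { pacemaker {} }' myvg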
  
  I've had a good talk with Jonathan about the --config tags { pacemaker {}
  }
  approach.  This was originally complicated for us because we were using the
  --config option for a device filter during activation in certain
  situations... using the --config option twice caused problems which made
  adding the tag in the config difficult.
  
  We've