Re: [Linux-ha-dev] crmsh error : cib-bootstrap-options already exist

2013-08-29 Thread Kristoffer Grönlund
Hi Takatoshi-san, On Thu, 29 Aug 2013 10:25:14 +0900 Takatoshi MATSUO matsuo@gmail.com wrote: BTW, why doesn't crm site support up command ? Only end is official ? Thank you for noticing this problem, the code which sets up command aliases like up was missing for the site command. This

Re: [Linux-ha-dev] crmsh error : cib-bootstrap-options already exist

2013-08-29 Thread Takatoshi MATSUO
Hi Kristoffer Thank you. I confirmed it. Regards, Takatoshi MATSUO 2013/8/29 Kristoffer Grönlund kgronl...@suse.com: Hi Takatoshi-san, On Thu, 29 Aug 2013 10:25:14 +0900 Takatoshi MATSUO matsuo@gmail.com wrote: BTW, why doesn't crm site support up command ? Only end is official ?

Re: [Linux-ha-dev] crmsh error : cib-bootstrap-options already exist

2013-08-29 Thread Lars Marowsky-Bree
On 2013-08-28T20:13:43, Dejan Muhamedagic de...@suse.de wrote: A new RC has been released today. It contains both fixes. It doesn't do atomic updates anymore, because cibadmin or something cannot stomach comments. Couldn't find the upstream bug report :-( Can you give me the pacemaker bugid,

Re: [Linux-ha-dev] crmsh error : cib-bootstrap-options already exist

2013-08-29 Thread Dejan Muhamedagic
Hi Lars, On Thu, Aug 29, 2013 at 10:49:33AM +0200, Lars Marowsky-Bree wrote: On 2013-08-28T20:13:43, Dejan Muhamedagic de...@suse.de wrote: A new RC has been released today. It contains both fixes. It doesn't do atomic updates anymore, because cibadmin or something cannot stomach

[Linux-HA] Antw: A couple of questions regarding STONITH fencing ...

2013-08-29 Thread Ulrich Windl
Hi! After some short thinking I find that using ssh as STONITH is probably the wrong thing to do, because it can never STONITH if the target is down already. (Is it documented that a STONITH operation should work independent from the target being up or down?) Maybe some shared storage and a

Re: [Linux-HA] Antw: A couple of questions regarding STONITH fencing ...

2013-08-29 Thread Alex Sudakar
On Thu, Aug 29, 2013 at 4:18 PM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: After some short thinking I find that using ssh as STONITH is probably the wrong thing to do, because it can never STONITH if the target is down already. Maybe some shared storage and a mechanism like sbd

Re: [Linux-HA] Rare issue with exportfs RA

2013-08-29 Thread Dejan Muhamedagic
Hi, On Wed, Aug 07, 2013 at 05:27:14PM +0200, Caspar Smit wrote: Hi all, Every couple of months I'm being hit by a very annoying exportfs issue: The symptoms are almost identical (except a few log messages) to this previous thread:

[Linux-HA] Antw: Re: Rare issue with exportfs RA

2013-08-29 Thread Ulrich Windl
Hi! exportfs monitor timeouts can be a network or name(server) lookup issue. Also make sure you export using FQHNs. Regards, Ulrich Dejan Muhamedagic deja...@fastmail.fm wrote on 29.08.2013 at 15:29 in message 20130829132922.GA4442@walrus.homenet: Hi, On Wed, Aug 07, 2013 at

[Linux-HA] error: te_connect_stonith: Sign-in failed: triggered a retry

2013-08-29 Thread Tom Parker
Hello Since my upgrade last night I am also seeing this message in the logs on my servers. error: te_connect_stonith: Sign-in failed: triggered a retry Old mailing lists seem to imply that this is an issue with heartbeat which I don't think I am running. My software stack is this at the

Re: [Linux-HA] error: te_connect_stonith: Sign-in failed: triggered a retry

2013-08-29 Thread Andrew Beekhof
On 30/08/2013, at 5:51 AM, Tom Parker tpar...@cbnco.com wrote: Hello Since my upgrade last night I am also seeing this message in the logs on my servers. error: te_connect_stonith: Sign-in failed: triggered a retry Old mailing lists seem to imply that this is an issue with heartbeat

Re: [Linux-HA] Pacemaker 1.19 cannot manage more than 127 resources

2013-08-29 Thread Andrew Beekhof
On 30/08/2013, at 5:49 AM, Tom Parker tpar...@cbnco.com wrote: Hello. Last night I updated my SLES 11 servers to HAE-SP3 which contains the following versions of software: cluster-glue-1.0.11-0.15.28 libcorosync4-1.4.5-0.18.15 corosync-1.4.5-0.18.15 pacemaker-mgmt-2.1.2-0.7.40

Re: [Linux-HA] error: te_connect_stonith: Sign-in failed: triggered a retry

2013-08-29 Thread Tom Parker
This is happening when I am using the really large CIB and no. There doesn't seem to be anything else. 3 of my 6 nodes were showing this error. Now that I have deleted and recreated my CIB this log message seems to have gone away. On 08/29/2013 10:16 PM, Andrew Beekhof wrote: On 30/08/2013,

Re: [Linux-HA] Pacemaker 1.19 cannot manage more than 127 resources

2013-08-29 Thread Tom Parker
My pacemaker config contains the following settings: LRMD_MAX_CHILDREN=8 export PCMK_ipc_buffer=3172882 This is what I had today to get to 127 Resources defined. I am not sure what I should choose for the PCMK_ipc_type. Do you have any suggestions for large clusters? Thanks Tom On

Re: [Linux-HA] Pacemaker 1.19 cannot manage more than 127 resources

2013-08-29 Thread Andrew Beekhof
On 30/08/2013, at 1:42 PM, Tom Parker tpar...@cbnco.com wrote: My pacemaker config contains the following settings: LRMD_MAX_CHILDREN=8 export PCMK_ipc_buffer=3172882 perhaps go higher This is what I had today to get to 127 Resources defined. I am not sure what I should choose for

Re: [Linux-HA] Pacemaker 1.19 cannot manage more than 127 resources

2013-08-29 Thread Tom Parker
Do you know if this has changed significantly from the older versions? This cluster was working fine before the upgrade. On Fri 30 Aug 2013 12:16:35 AM EDT, Andrew Beekhof wrote: On 30/08/2013, at 1:42 PM, Tom Parker tpar...@cbnco.com wrote: My pacemaker config contains the following

Re: [Linux-HA] Pacemaker 1.19 cannot manage more than 127 resources

2013-08-29 Thread Tom Parker
Thanks for your help. I think I have it solved. The trick is that the crm tools also need to know what the Pacemaker IPC buffer size is. I have set: /etc/sysconfig/pacemaker #export LRMD_MAX_CHILDREN=8 # Force use of a particular class of IPC connection #
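
A minimal sketch of the idea in the message above, assuming the crm tools read PCMK_ipc_buffer from the environment of the shell that starts them, as the message suggests. The value is the one quoted earlier in the thread; the rest of the poster's file is cut off in this listing and is not reproduced here.

```shell
# Value quoted earlier in the thread; the right size depends on the CIB.
PCMK_ipc_buffer=3172882
export PCMK_ipc_buffer

# Confirm that a child process (such as a crm shell session) inherits it.
sh -c 'echo "PCMK_ipc_buffer=${PCMK_ipc_buffer:-unset}"'
```

Exporting the variable in the administrator's login shell as well as in /etc/sysconfig/pacemaker is one way to keep the daemons and the command-line tools on the same buffer size.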