Re: [Pacemaker] Monitor ops do not get cancelled

2010-09-30 Thread Andrew Beekhof
On Tue, Sep 28, 2010 at 2:55 PM, Phil Armstrong p...@sgi.com wrote: From Andrew Beekhof 1.1.3 came out the other day. which distro are you using? I'm not sure if this answers your question: novell/sles/updates/SLE11-HAE-SP1-Updates/sle-11-ia64 hmm, that doesn't tell me much about what's in

Re: [Pacemaker] stonith-ng message in /var/log/messages

2010-09-30 Thread Andrew Beekhof
On Wed, Sep 29, 2010 at 11:57 PM, Andrew Daugherity adaugher...@tamu.edu wrote: Ron Kerry rke...@... writes: I am seeing the following sequence of messages with every monitor interval for my stonith resource. Sep 28 10:44:01 genesis stonith-ng: [9493]: ERROR: run_stonith_agent: No timeout

Re: [Pacemaker] starting a xen-domU depending on available hardware-resources using SysInfo-RA

2010-09-30 Thread Sascha Reimann
Hi Dejan, it's working fine with the amount of free RAM as the score and a bigger default-resource-stickiness:
primitive v01 ocf:heartbeat:Xen \
    params xmfile=/etc/xen/conf.d/v01.cfg \
    op monitor interval=30s timeout=30s \
    op start interval=0 timeout=60s \
    op

Re: [Pacemaker] stop resource during promote

2010-09-30 Thread Andrew Beekhof
On Thu, Sep 30, 2010 at 12:39 AM, Mark Horton m...@nostromo.net wrote: Is it ok to stop/start a resource during a promote? I'm setting up a master/slave set of resources.  When a slave is promoted to master, I need to stop the resource, change a config file, then start it up in master mode.

Re: [Pacemaker] cib

2010-09-30 Thread Andrew Beekhof
On Tue, Sep 28, 2010 at 11:47 AM, Andrew Beekhof and...@beekhof.net wrote: On Mon, Sep 27, 2010 at 6:26 AM, Shravan Mishra shravan.mis...@gmail.com wrote: Thanks Raoul for the response. Changing the permission to hacluster:haclient did stop that error. Now I'm hitting another problem

Re: [Pacemaker] promote a ms resource to a node

2010-09-30 Thread Andrew Beekhof
A resource location constraint with role=Master would do it. Not sure about the shell syntax though. On Tue, Sep 28, 2010 at 3:51 PM, Pavlos Parissis pavlos.paris...@gmail.com wrote: Hi, Let's say that I have manually demoted a ms resource and have the following situation crm(live)resource#

Re: [Pacemaker] [Problem]Lost fail-count.

2010-09-30 Thread Andrew Beekhof
I see you've created a bug for this, I'll follow up there. On Wed, Sep 29, 2010 at 10:15 AM, renayama19661...@ybb.ne.jp wrote: Hi, We tested a resource failure that occurs while the cluster is partitioned, followed by recovery of the cluster. However, at the time of cluster recovery, the

Re: [Pacemaker] stop problem and crm node delete nodename is bug?

2010-09-30 Thread Andrew Beekhof
On Wed, Sep 29, 2010 at 9:46 AM, jiaju liu liujiaj...@yahoo.com.cn wrote: Date: Tue, 28 Sep 2010 12:27:47 +0200 From: Andrew Beekhof and...@beekhof.net To: The Pacemaker cluster resource manager

Re: [Pacemaker] syslog-ng as resource / how to make sure it gets restarted

2010-09-30 Thread Koch, Sebastian
Hello, I have now edited the syslog-ng Debian init script to fit the LSB spec, so it should be cluster ready. See the new script version below. Feedback is appreciated and I hope it helps somebody.
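For readers who want to check an init script the same way, here is a minimal sketch of an exit-code check, assuming the usual LSB conventions (0 on success, 3 from "status" when the service is stopped, and repeated start/stop still succeeding). The function name check_lsb and the sequence table are hypothetical and are not taken from the posted script.

```python
import subprocess

# (action, expected exit code) pairs, run in this order. The codes follow
# the LSB init-script conventions; the table itself is an illustration.
LSB_SEQUENCE = [
    ("start", 0),   # starting a stopped service succeeds
    ("status", 0),  # a running service reports 0
    ("start", 0),   # starting an already running service still succeeds
    ("stop", 0),    # stopping a running service succeeds
    ("status", 3),  # a stopped service reports 3
    ("stop", 0),    # stopping an already stopped service still succeeds
]


def check_lsb(script, sequence=LSB_SEQUENCE):
    """Run each action against the init script and collect mismatches.

    Returns a list of (action, expected, actual) tuples. An empty list
    means every step returned the expected exit code.
    """
    failures = []
    for action, expected in sequence:
        result = subprocess.run(
            [script, action],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != expected:
            failures.append((action, expected, result.returncode))
    return failures
```

Calling check_lsb("/etc/init.d/syslog-ng") (the path is an assumption) really starts and stops the service, so it is best run on a test node and outside cluster control.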

Re: [Pacemaker] Does bond0 network interface work with corosync/pacemaker

2010-09-30 Thread Mike A Meyer
Pavlos, Thanks for helping out on this. We are running on RHEL 5.5 running on the iron and not a VM. We don't have SELinux turned on and the firewall is disabled. Here is information in the /etc/modprobe.conf file. alias eth0 bnx2 alias eth1 bnx2 alias scsi_hostadapter cciss alias

Re: [Pacemaker] Does bond0 network interface work with corosync/pacemaker

2010-09-30 Thread Mike A Meyer
Also, this is a two node setup in an active/passive type role. Mike From: Pavlos Parissis pavlos.paris...@gmail.com To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Date: 09/29/2010 04:01 PM Subject: Re: [Pacemaker] Does bond0 network interface work with

Re: [Pacemaker] Does bond0 network interface work with corosync/pacemaker

2010-09-30 Thread Pavlos Parissis
On 30 September 2010 15:23, Mike A Meyer mme...@cds-global.com wrote: Pavlos, Thanks for helping out on this. We are running on RHEL 5.5 running on the iron and not a VM. We don't have SELinux turned on and the firewall is disabled. Here is information in the /etc/modprobe.conf file.

Re: [Pacemaker] stop resource during promote

2010-09-30 Thread Mark Horton
On Thu, Sep 30, 2010 at 5:20 AM, Andrew Beekhof and...@beekhof.net wrote: On Thu, Sep 30, 2010 at 12:39 AM, Mark Horton m...@nostromo.net wrote: Is it ok to stop/start a resource during a promote? I'm setting up a master/slave set of resources.  When a slave is promoted to master, I need to

Re: [Pacemaker] starting a xen-domU depending on available hardware-resources using SysInfo-RA

2010-09-30 Thread Andrew Beekhof
On Thu, Sep 30, 2010 at 2:52 PM, Vadym Chepkov vchep...@gmail.com wrote: On Sep 30, 2010, at 2:35 AM, Sascha Reimann wrote: Hi Dejan, it's working fine with the amount of free ram as the score and a bigger default-resource-stickiness: primitive v01 ocf:heartbeat:Xen \       params

Re: [Pacemaker] crm and sudo

2010-09-30 Thread Justin Burket
Yes, I can recall this. There was a mismatch on where the shadow files are created between the crm shell and the underlying tools. It's been fixed recently (in August), more details here: https://bugzilla.novell.com/show_bug.cgi?id=626638 I went through all the Novell hoops for registration

Re: [Pacemaker] Does bond0 network interface work with corosync/pacemaker

2010-09-30 Thread Mike A Meyer
Hello Pavlos, We figured out our problem. Turns out one of our switches had a firewall setting that was blocking our multicast. All is working now. Thanks for your help. Mike From: Pavlos Parissis pavlos.paris...@gmail.com To: The Pacemaker cluster resource manager

[Pacemaker] Missing lrm_opstatus

2010-09-30 Thread Ron Kerry
Folks - I am seeing the following message sequence that results in a bogus declaration of monitor failures for two resources very quickly after a failover completes (from hendrix to genesis) with all resources coming up. The scenario is the same for both resources. CXFS resource monitor

Re: [Pacemaker] [Problem or Enhancement]When attrd reboots, a fail count is initialized.

2010-09-30 Thread renayama19661014
Hi Andrew, Thank you for the comment. During crmd startup, one could read all the values from attrd into the hashtable. So the hashtable would only do something if only attrd went down. If attrd communicates with crmd at the time of start and reads the data of the hash table, the problem seems

[Pacemaker] [GUI]Compatibility issues of Python.

2010-09-30 Thread renayama19661014
Hi Yan, I ran the latest GUI while working on its Japanese localization. However, on RHEL 5.5 the GUI raises an error when operating on a resource. The cause seems to be a difference in Python versions. (snip) def on_rsc_action(self, action) : (cur_type, cur_name) =