Re: crm_master patch to eliminate do-nothing attribute updates - WAS Re: [Linux-ha-dev] Ordering of OCF Start, Stop and Monitor actions

2007-04-10 Thread Andrew Beekhof
On Apr 5, 2007, at 4:48 PM, Alan Robertson wrote: Alan Robertson wrote: Lars Marowsky-Bree wrote: On 2007-04-04T11:41:44, Doug Knight [EMAIL PROTECTED] wrote: The key word in my question was thinks. It would be useful to the RA if it could know what state the CRM thought it was in, so in

Re: crm_master patch to eliminate do-nothing attribute updates - WAS Re: [Linux-ha-dev] Ordering of OCF Start, Stop and Monitor actions

2007-04-10 Thread Alan Robertson
Andrew Beekhof wrote: On Apr 5, 2007, at 4:48 PM, Alan Robertson wrote: Alan Robertson wrote: Lars Marowsky-Bree wrote: On 2007-04-04T11:41:44, Doug Knight [EMAIL PROTECTED] wrote: The key word in my question was thinks. It would be useful to the RA if it could know what state the CRM

Re: [Linux-ha-dev] Ordering of OCF Start, Stop and Monitor actions

2007-04-10 Thread Alan Robertson
Lars Marowsky-Bree wrote: On 2007-04-05T07:40:34, Alan Robertson [EMAIL PROTECTED] wrote: That is why I'd suggest to only call it in start or post-notify; calling it in post-notify basically implies it'll be called after every state change. But, for DRBD for example, the ability to become

Re: [Linux-ha-dev] Fwd: Bug#418210: heartbeat-2: /etc/ha.d/authkeys should not determine which nodes are in the cluster

2007-04-10 Thread Alan Robertson
Simon Horman wrote: [ Reposting as I sent it to linux-ha-devel instead of linux-ha-devel the first time around ] This seems to be a bit of an easy trap to fall into. Are there any fixes floating around? I was thinking that perhaps a cluster id of some sort would be a good idea. But I'm

Re: [Linux-ha-dev] Ordering of OCF Start, Stop and Monitor actions

2007-04-10 Thread Lars Marowsky-Bree
On 2007-04-10T07:09:44, Alan Robertson [EMAIL PROTECTED] wrote: As even calling crm_master and having it do a compare and update-if-modified, or filtering it in the CIB directly requires to at least contact and query the CIB, I'd probably still track the state in the RA somewhere. (As to
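The "track the state in the RA" idea in this message can be sketched as follows. This is a minimal illustration, not code from the thread: it assumes the resource agent keeps the last master score it pushed in a small local file and skips the CIB round trip when the value is unchanged. `set_master_score`, `state_file` and the `push_update` hook (which might wrap a call to crm_master) are hypothetical names, and Python is used only for brevity.

```python
import os
import tempfile
from typing import Callable


def set_master_score(score: int, state_file: str,
                     push_update: Callable[[int], None]) -> bool:
    """Push `score` only if it differs from the last value recorded locally.

    Returns True if an update was pushed, False if it was skipped.
    """
    try:
        with open(state_file) as fh:
            last = fh.read().strip()
    except FileNotFoundError:
        last = None  # nothing recorded yet, so an update is required

    if last == str(score):
        return False  # unchanged: skip the round trip to the CIB

    push_update(score)  # hypothetical hook, e.g. a wrapper around crm_master

    # Record the new value via rename so a crash cannot leave a torn file.
    directory = os.path.dirname(state_file) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        fh.write(str(score))
    os.replace(tmp, state_file)
    return True
```

The trade-off of this approach is that the local record can drift from what the CIB actually holds (for example after the attribute is cleared elsewhere), so a real agent would need some way to invalidate the file.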

Re: [Linux-ha-dev] Ordering of OCF Start, Stop and Monitor actions

2007-04-10 Thread Alan Robertson
Lars Marowsky-Bree wrote: On 2007-04-10T07:09:44, Alan Robertson [EMAIL PROTECTED] wrote: As even calling crm_master and having it do a compare and update-if-modified, or filtering it in the CIB directly requires to at least contact and query the CIB, I'd probably still track the state in

Re: [Linux-HA] crm_verfify cib.xml verification error

2007-04-10 Thread Andrew Beekhof
pretty sure i commented on this recently i'll patch it today On Apr 6, 2007, at 2:40 PM, Alan Robertson wrote: kisalay wrote: Hi, I recently migrated from 2.0.7 to 2.0.8. when I run my old ( 2.0.7 ) cib.xml through crm_verify now, I receive following warns / errors: element cib: validity

Re: [Linux-HA] 2.0.7 Failover Behavior Question

2007-04-10 Thread Andrew Beekhof
On 3/29/07, Mohler, Eric (EMOHLER) [EMAIL PROTECTED] wrote: Andrew, Thanks for your reply. Please refer to --'s below. The resulting behavior is that the app only restarts on the same node, never ping-pong. ** i assume ON and OFF

Re: [Linux-HA] Heartbeat stop hangs

2007-04-10 Thread Andrew Beekhof
On 4/9/07, Kevin Jamieson [EMAIL PROTECTED] wrote: kisalay wrote: I have a 2 node 2.0.8 Linux HA setup. I have observed that when stop is issued on my setup, as soon as the start returns, the stop hangs indefinitely, and the only way to stop heartbeat is to do killall. or wait for the

Re: [Linux-HA] OCF_RESKEY_interval

2007-04-10 Thread Bernd Schubert
On Tuesday 10 April 2007 10:15:03 Lars Marowsky-Bree wrote: However, it still gets passed in - just as OCF_RESKEY_CRM_meta_interval, to show the distinction to an instance parameter. # on probe (== exclusive) always report process not running ql_log warn OCF_RESKEY_interval =

Re: [Linux-HA] Annouce: IPAddr2 RA v1.30 alpha

2007-04-10 Thread Lars Marowsky-Bree
On 2007-04-10T11:12:11, Michael Schwartzkopff [EMAIL PROTECTED] wrote: The old script tried to auto-detect that it was run as a clone and then automatically enabled this, which I think is still preferable. If the CRM_meta_clone{,_max} show up in the environment, it should switch into this

Re: [Linux-HA] OCF_RESKEY_interval

2007-04-10 Thread Lars Marowsky-Bree
On 2007-04-10T11:08:56, Bernd Schubert [EMAIL PROTECTED] wrote: Ugh. Even probe shouldn't always return not running, but the actual state. This seems like a weird work-around for an otherwise broken monitor action, or am I missing something ...? Well, once OCF_RESKEY_interval was set, it
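The point made in this thread, that a probe should report the actual state rather than always "not running", might look like the sketch below. It is an illustration under stated assumptions: the interval is read from `OCF_RESKEY_CRM_meta_interval` as described earlier in the thread, a probe is assumed to be a monitor call with interval 0, and `is_probe` and `monitor` are hypothetical helper names. Python is used only for brevity.

```python
import os

# Exit codes as defined by the OCF resource agent API.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


def is_probe(action: str, env=os.environ) -> bool:
    """Assumption: a probe is a monitor call whose interval is 0."""
    interval = env.get("OCF_RESKEY_CRM_meta_interval", "0")
    return action == "monitor" and interval == "0"


def monitor(process_alive: bool, healthy: bool) -> int:
    """Report the real state; the result does not depend on is_probe()."""
    if not process_alive:
        return OCF_NOT_RUNNING
    return OCF_SUCCESS if healthy else OCF_ERR_GENERIC
```

Here `is_probe` would only be used for things like logging; the monitor result itself is the same for probes and recurring checks.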

Re: [Linux-HA] Error in compiling LinxuHA.

2007-04-10 Thread Athrun Zara
Dear Alan Robertson, Thank you very much for the fast reply and sorry for my late one. After following your suggestion of adding --disable-fatal-warnings, the source can be compiled. FYI: my configure command is: **./env \ CFLAGS=-I/opt/include -Wl,--rpath=/opt/lib \ LDFLAGS=-L/opt/lib

[Linux-HA] HA problems

2007-04-10 Thread Angelo Venera
Hi at all, i'm new about this list and about HA. I'm trying to build a HA Active/Passive for this service: amavisd clamd.amavisd dhcpd dovecot httpd mysqld named postfix smb spamassassin squid On start the heartbeat run this service and became primary. But when i try the command nmap on my

Re: [Linux-HA] Annouce: IPAddr2 RA v1.30 alpha

2007-04-10 Thread Lars Marowsky-Bree
On 2007-04-10T11:56:59, Michael Schwartzkopff [EMAIL PROTECTED] wrote: At the moment I would like to have that parameter since interop between LVS and CLUSTERIP is not tested at all. After these tests we can drop it. One can't simply drop a parameter once introduced. Why not? My script

Re: [Linux-HA] Can a RA know if a clone resource is ordered or interleave is true?

2007-04-10 Thread Alan Robertson
Lars Marowsky-Bree wrote: On 2007-04-05T08:46:40, Alan Robertson [EMAIL PROTECTED] wrote: My only comment on this is that if having two copies of your resource agent running at once causes serious problems, you need to _strongly_ consider re-writing you agent to have sufficient locking /

Re: [Linux-HA] Heartbeat stop hangs

2007-04-10 Thread Alan Robertson
kisalay wrote: Hi, I have a 2 node 2.0.8 Linux HA setup. I have observed that when stop is issued on my setup, as soon as the start returns, the stop hangs indefinitely, and the only way to stop heartbeat is to do killall. I dug a little deeper into the problem. First, the problem is

Re: [Linux-HA] HA problems

2007-04-10 Thread Alan Robertson
Angelo Venera wrote: Hi at all, i'm new about this list and about HA. I'm trying to build a HA Active/Passive for this service: amavisd clamd.amavisd dhcpd dovecot httpd mysqld named postfix smb spamassassin squid On start the heartbeat run this service and became primary. But when i

Re: [Linux-HA] OCF_RESKEY_interval

2007-04-10 Thread Andrew Beekhof
On 4/10/07, Bernd Schubert [EMAIL PROTECTED] wrote: On Tuesday 10 April 2007 14:07:54 Lars Marowsky-Bree wrote: On 2007-04-10T12:02:30, Peter Kruse [EMAIL PROTECTED] wrote: But when you return the proper status - running, failed, not running -, heartbeat should do the right thing

[Linux-HA] Status (rc.d/status)

2007-04-10 Thread Mark Frasa
Hello, For an active/passive configuration environment I want to know the status of heartbeat on the local machine. I have found a script: /etc/ha.d/rc.d/status But this outputs: /etc/ha.d/rc.d/status: line 3: .: filename argument required .: usage: . filename The problem is line 3 in

Re: [Linux-HA] Getting the status of the node

2007-04-10 Thread Alan Robertson
Mark Eisenblaetter wrote: Hi, sorry, i don't find that script. Only some confusing mails about that script. do you know were i can find that script? It's not a script. What version are you running? -- Alan Robertson [EMAIL PROTECTED] Openness is the foundation and preservative of

Re: [Linux-HA] Getting the status of the node

2007-04-10 Thread Alan Robertson
Alan Robertson wrote: Mark Eisenblaetter wrote: Hi, sorry, i don't find that script. Only some confusing mails about that script. Did you read the web page? On my machine it's located in /usr/bin/cl_status. Where it is on yours depends on how you have things configured. -- Alan

Re: [Linux-HA] Annouce: IPAddr2 RA v1.30 alpha

2007-04-10 Thread Lars Marowsky-Bree
On 2007-04-10T14:39:56, Michael Schwartzkopff [EMAIL PROTECTED] wrote: Uhm. Of course. Machine 3 has just died. Of course there's no connectivity until it is restarted. Not really. Not only resource 3 is not available during failover, but ALSO all other resources! That is the problem. Uh?

Re: [Linux-HA] heartbeat does not start when the stonith device is not available

2007-04-10 Thread Alan Robertson
Martin wrote: Hello ! Today I have noticed that the heartbeat startup script does not start when the APC PDU (my stonith device) is configured in ha.cf but not available. IMHO it creates single point of failure. All the services that should be highly available are blocked by a simple

Re: [Linux-HA] bringing up IP address outside default netblock

2007-04-10 Thread Alan Robertson
Dale Yamamoto wrote: Running 2.0.5 on Debian Sarge, having an issue bringing up an IP address. These servers have plenty of IP addresses controlled by heartbeat where those IPs are in the same netblock as the server's own IP address. Our ISP has allocated us a second netblock that's not

Re: [Linux-HA] OCF_RESKEY_interval

2007-04-10 Thread Alan Robertson
Hi Bernd, Thanks for your continuing vigilance! Bernd Schubert wrote: On Thursday 05 April 2007 20:11:51 Alan Robertson wrote: This particular document had a couple of other errors too, which I believe I've corrected. See what you think. Thanks for improving the documentation, but I

[Linux-HA] Interest in Linux-HA at LinuxWorld San Francisco?

2007-04-10 Thread Alan Robertson
Hi, I'll be speaking at LinuxWorld in San Francisco August 6-9 this year. So, I'll be there. Are others from the list coming? I'll be giving a tutorial at LinuxWorld San Francisco, and I got a note which offers two things: 1) Birds of a Feather session -- Is there interest in this? 2) A