Re: [Pacemaker] Route OCF RA and Failover IP

2009-11-30 Thread Billy Guthrie

Florian,

Thanks for the input; for the time being I have it running.
I have commented out the validation for source for now. On the standby node,
the gateway is reachable, as there is an IP address on eth0 in the same
subnet as the VIP (eth0:0, which is not active yet), so it is not
failing on the gateway check. I have specified a source address that is
not yet active on the standby node, and that is where it is failing.



    # If a source address has been configured, is it available on this system?
#   if [ -n "${OCF_RESKEY_source}" ]; then
#       if ! ip address show | grep -w "${OCF_RESKEY_source}" >/dev/null 2>&1; then
#           ocf_log error "Source address ${OCF_RESKEY_source} appears not to be available on this system."
#           # same reason as with _device:
#           return $OCF_ERR_INSTALLED
#       fi
#   fi
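
The availability test that the commented-out block performs can also be tried outside the agent. What follows is only a sketch and not the agent's code: the function name address_present and the sample address are made up for illustration, and the function expects "ip -o address show" style output on stdin, comparing the whole address field.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Route RA): succeed if the given
# address appears in "ip -o address show" output read from stdin.
address_present() {
    wanted=$1
    # Field 4 of "ip -o address show" is ADDRESS/PREFIXLEN; strip the prefix
    # length and compare the complete address.
    awk -v want="$wanted" '
        { split($4, a, "/"); if (a[1] == want) found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Example use on a live system (address is a placeholder):
#   ip -o address show | address_present 192.168.1.10 && echo "source is up"
```

Run on the standby node before the VIP is started, a check like this would be expected to fail, which matches the behaviour described above.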

Thank you for your time on this matter


Billy



___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker


[Pacemaker] Contraining clones per node

2009-11-30 Thread Jens . Braeuer
Hi everyone,

I have been into heartbeat 2 and pacemaker for some time now and wonder
whether I can use it in more than just the normal HA situation. However,
going through the excellent "Pacemaker Configuration Explained" or the
Linux HA Cluster book by Michael Schwartzkopff, I still have no idea how to
configure pacemaker for my scenario.

My environment consists of multiple servers (~40), each with one or more
CPU cores. I have two application types called A and B (services like, e.g.,
apache) that each use one CPU core. A is mission critical, B is optional.
So what I want to express is that there should be 20 A's, and the remaining
CPU cores may be used by B's. When a node executing A's fails, it is perfectly
OK to shut down B's to make CPU cores available for A's to be started.

Any idea how to do this?

Going through the various examples in the book and PDF, I found examples
of how to use instance attributes for one resource. That means things like
"start apache only on a host with more than XY RAM or MN CPU speed".
However, in my scenario I think I need constraints that involve the number
of resources on the host. An example would be "the sum of A's and B's
started on a node must be less than or equal to the number of CPU cores".
But even going through the parameters supplied to the OCF agent (page 72 in
the Pacemaker Explained PDF), I am unable to figure out how many
clones a node currently runs.

Is pacemaker able to handle such constraints? Is there some workaround
(e.g. with score values) to emulate such behavior?

Any ideas/hints/comments are very welcome.
best regards,

Jens Bräuer


Re: [Pacemaker] Contraining clones per node

2009-11-30 Thread Michael Schwartzkopff
On Monday, 30 November 2009 14:07:23, jens.brae...@rohde-schwarz.com wrote:
> Hi everyone,
>
> I have been into heartbeat 2 and pacemaker for some time now and wonder
> whether I can use it in more than just the normal HA situation. However,
> going through the excellent "Pacemaker Configuration Explained" or the
> Linux HA Cluster book by Michael Schwartzkopff, I still have no idea how to
> configure pacemaker for my scenario.

Thanks ;-)

> My environment consists of multiple servers (~40), each with one or more
> CPU cores. I have two application types called A and B (services like, e.g.,
> apache) that each use one CPU core. A is mission critical, B is optional.
> So what I want to express is that there should be 20 A's, and the remaining
> CPU cores may be used by B's. When a node executing A's fails, it is perfectly
> OK to shut down B's to make CPU cores available for A's to be started.
>
> Any idea how to do this?

In pacemaker, resources have a meta attribute "priority". If there are not
enough nodes available to run all resources, the resources with higher
priority are run.

So make a clone that starts A 20 times, and give resource A a priority of 20.
Then make a clone of B, with B having a priority of 10.
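
A minimal sketch of that suggestion in crm shell syntax might look as follows. Only the two priorities and the count of 20 come from the text above; the resource names, the clone names, the use of the Dummy agent and the clone-max value for B are placeholders. It does not count CPU cores per node, it only expresses which resource wins when there is not enough room.

```shell
# Sketch only: names, agent and clone sizes are assumptions for illustration.
# A is the critical service: 20 instances, higher priority.
crm configure primitive A ocf:heartbeat:Dummy meta priority=20
crm configure clone cl_A A meta clone-max=20

# B is optional: lower priority, so it is stopped first when capacity runs out.
crm configure primitive B ocf:heartbeat:Dummy meta priority=10
crm configure clone cl_B B meta clone-max=20

# Note: clone-node-max defaults to 1. Running several instances of one clone
# on the same node would additionally need clone-node-max > 1 together with
# globally-unique=true.
```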

> Going through the various examples in the book and PDF, I found examples
> of how to use instance attributes for one resource. That means things like
> "start apache only on a host with more than XY RAM or MN CPU speed".
> However, in my scenario I think I need constraints that involve the number
> of resources on the host. An example would be "the sum of A's and B's
> started on a node must be less than or equal to the number of CPU cores".
> But even going through the parameters supplied to the OCF agent (page 72 in
> the Pacemaker Explained PDF), I am unable to figure out how many
> clones a node currently runs.

Resource allocation is a feature of the next version. As far as I know, it does
not work yet. At least it is not well tested.

> Is pacemaker able to handle such constraints? Is there some workaround
> (e.g. with score values) to emulate such behavior?

See prio above.

> Any ideas/hints/comments are very welcome.
> best regards,
>
> Jens Bräuer

Greetings to Munich.

-- 
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Addresse: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75

mail: mi...@multinet.de
web: www.multinet.de

Sitz der Gesellschaft: 85630 Grasbrunn
Registergericht: Amtsgericht München HRB 114375
Geschäftsführer: Günter Jurgeneit, Hubert Martens

---

PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42



Re: [Pacemaker] Pacemaker shutdown issue

2009-11-30 Thread Dejan Muhamedagic
Hi,

On Mon, Nov 30, 2009 at 12:04:00AM -0500, Tony Bunce wrote:
> Hi Everyone,
>
> I'm having an issue with pacemaker and was hoping someone could point me in
> the right direction.
>
> I'm using pacemaker with openais on a set of NFS servers.  Every time I
> reboot the primary I get a split brain in DRBD.
>
> From what I can tell, when openais is shutting down it doesn't stop the
> services it is controlling, so as far as DRBD is concerned it is the same as a
> hard shutdown.
>
> I can reproduce the problem by stopping OpenAIS (service openais stop or
> /etc/init.d/openais stop) and see that the controlled services (DRBD, file
> systems, nfs, etc.) are still running.
>
> I think this is the same exact problem:
> http://www.gossamer-threads.com/lists/linuxha/pacemaker/59384
>
> Version Info:
> CentOS 5.4 x64
> drbd83-8.3.2-6.el5_3
> openais-0.80.6-8.el5_4.1
> pacemaker-1.0.5-4.1
>
> Is there something special that needs to be configured so that when openais
> stops it stops all of the resources?

No. The sequence of events is that openais tells crmd that shutdown
is pending, then crmd will try to stop all resources which are running
on the node. It may happen, usually with resources which are broken for
whatever reason, that the shutdown is escalated and that crmd gives up
on waiting for resources to stop. At any rate, if you don't see log
messages of the form lrmd.*stop.*rsc then there is probably a bug.
Please make a hb_report and file a bugzilla.
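
To check for the stop messages mentioned above, the pattern can be fed straight to grep. This is a rough sketch: the function name is made up, and the log file location is an assumption that depends on the distribution and the logging setup.

```shell
#!/bin/sh
# Hypothetical helper: count log lines matching the pattern from the reply
# above ("lrmd.*stop.*rsc").
# $1: path to the cluster log file (location varies; assumption)
count_stop_messages() {
    grep -c 'lrmd.*stop.*rsc' "$1"
}

# Example (path is a placeholder):
#   count_stop_messages /var/log/messages
```

A count of 0 after "service openais stop" would point towards the bug case described above, in which case hb_report is the next step.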

Thanks,

Dejan

> Thanks for the help!
>
> -Tony



Re: [Pacemaker] logging related information- pacemaker

2009-11-30 Thread Dejan Muhamedagic
Hi,

On Sat, Nov 28, 2009 at 08:45:26PM -0500, Shravan Mishra wrote:
> Hi,
>
> I'm using pacemaker and trying to configure logging for various
> subsystems like pengine, attrd, crmd etc.
>
> On starting corosync the only logs I see are for e.g.
>
[...]
>
> Nothing related to stonithd or crmd etc.
>
> I have started /usr/lib64/heartbeat/ha_logd -d.
>
> Under /etc/ha.d/shellfuncs
>
> I see variables which I have exported on the command line:
>
> HA_LOGD=yes
> HA_LOGFILE=/tmp/corosync.log
>
> Am I taking a completely wrong path, am I supposed to configure just
> using corosync.conf and use logger_subsys for the above mentioned
> subsystems?

In corosync.conf, you should set

use_logd:  yes

in the service section, then specify the syslog facility
in /etc/logd.cf. A bit confusing, but openais/corosync and
ha_logd have different configuration files.
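
As an illustration of the two places involved, the fragments could look roughly like this. Only "use_logd: yes" is taken from the reply above; the "name: pacemaker" and "ver: 0" lines and the "daemon" facility are assumed example values, and the function merely prints the fragments for comparison with the real files.

```shell
#!/bin/sh
# Print example fragments for both files. Nothing under /etc is modified.
print_logging_fragments() {
    cat <<'EOF'
# --- corosync.conf (service section) ---
service {
    name: pacemaker
    ver: 0
    use_logd: yes
}

# --- /etc/logd.cf ---
logfacility daemon
EOF
}

print_logging_fragments
```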

Thanks,

Dejan

 
> Sincerely
> Shravan
 


Re: [Pacemaker] is ptest 1.06 working correctly?

2009-11-30 Thread Rasto Levrinc

On Mon, November 30, 2009 5:00 pm, Frank DiMeo wrote:
> I ran the command:
>
> ptest -live-check - -save-graph tmp.graph -save-dotfile tmp.dot

you need -- instead of - in your long option names. Only - should have
one -
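
With the double dashes in place, the invocation can be wrapped as in the sketch below. The three long options are the ones from the command quoted above; the wrapper function and the PTEST override variable are made up here so the command line can be inspected without a running cluster.

```shell
#!/bin/sh
# The ptest binary can be overridden for a dry run, e.g. PTEST=echo.
PTEST=${PTEST:-ptest}

# Save the transition graph computed from the live CIB.
# $1: graph output file, $2: dot output file
save_live_graph() {
    "$PTEST" --live-check --save-graph "$1" --save-dotfile "$2"
}

# Example: save_live_graph tmp.graph tmp.dot
```

Setting PTEST=echo before calling save_live_graph prints the arguments instead of running ptest, which makes a collapsed dash easy to spot.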

Rasto

-- 
: Dipl-Ing Rastislav Levrinc
: DRBD-MC http://www.drbd.org/mc/management-console/
: DRBD/HA support and consulting http://www.linbit.com/
DRBD(R) and LINBIT(R) are registered trademarks of LINBIT, Austria.





Re: [Pacemaker] crm_mon not refreshing

2009-11-30 Thread Joseph, Lester

No. I was using the command line utility from the terminal.

I've been told that crm_mon is now event driven and will only refresh on events.
Wouldn't that make the interval option pointless? That option is not working either.

From an operations perspective, I wanted to have a dedicated terminal window
with crm_mon's output displayed on screen. I would have liked the screen to
refresh as it used to.

Kind Regards
---
Lester Joseph
Linux Systems Administrator

From: Frank DiMeo [mailto:frank.di...@bigbandnet.com]
Sent: Monday, November 30, 2009 3:46 PM
To: Joseph, Lester
Subject: RE: [Pacemaker] crm_mon not refreshing

Are you using the web interface to crm_mon?

-Frank

From: Joseph, Lester [mailto:lester.jos...@galacoral.com]
Sent: Monday, November 30, 2009 10:10 AM
To: 'pacemaker@oss.clusterlabs.org'
Subject: [Pacemaker] crm_mon not refreshing

Hi,

I have pacemaker 1.0.6 running with heartbeat 3.0.1.
I noticed that crm_mon is not refreshing anymore, even when I specify the
interval.

Has this been removed?
Please advise.

Kind Regards

Lester Joseph
Linux Systems Administrator
-
-
Gala Coral E-Commerce
Eurobet House
10-24 Church Street West
Woking
Surrey
GU21 6HT

T: +44 (0)1483  766766
M: +44 (0)7867 554267
F: +44 (0)1483 722141
E:  lester.jos...@galacoral.com



This email has been sent from Gala Coral Group Limited (GCG) or a subsidiary 
or associated company. GCG is registered in England with company number 
4639005. Registered office address: 71 Queensway, London W2 4QH, United 
Kingdom; website: www.galacoral.com.

This e-mail message (and any attachments) is confidential and may contain 
privileged and/or proprietorial information protected by legal rules. It is for 
use by the intended addressee only. If you believe you are not the intended 
recipient or that the sender is not authorised to send you the email, please 
return it to the sender (and please copy it to h...@galacoral.com) and then 
delete it from your computer. You should not otherwise copy or disclose its 
contents to anyone.

Except where this email is sent in the usual course of business, the views 
expressed are those of the sender and not necessarily ours. We reserve the 
right to monitor all emails sent to and from our businesses, to protect the 
businesses and to ensure compliance with internal policies.

Emails are not secure and cannot be guaranteed to be error-free, as they can be 
intercepted, amended, lost or destroyed, and may contain viruses; anyone who 
communicates with us by email is taken to accept these risks. GCG accepts no 
liability for any loss or damage which may be caused by software viruses.




Re: [Pacemaker] is ptest 1.06 working correctly?

2009-11-30 Thread Frank DiMeo
I actually did use -- on the long options; for some reason the cut/paste in
MS Outlook collapsed them.  As you can see from the enclosed files in my previous
posting, the files are actually generated; there's just not much in them.

-Frank



Re: [Pacemaker] is ptest 1.06 working correctly?

2009-11-30 Thread Rasto Levrinc

On Mon, November 30, 2009 5:21 pm, Frank DiMeo wrote:
> I actually did use -- on the long options, for some reason the
> cut/paste in MS outlook collapsed them.  As you see from the enclosed
> files in my previous posting, the files are actually generated, there's
> just not much in them.


Oh, I see. It is because you don't have any transitions in live cib. It
works correctly as far as I can tell.

Rasto


-- 
: Dipl-Ing Rastislav Levrinc
: DRBD-MC http://www.drbd.org/mc/management-console/
: DRBD/HA support and consulting http://www.linbit.com/
DRBD(R) and LINBIT(R) are registered trademarks of LINBIT, Austria.





Re: [Pacemaker] is ptest 1.06 working correctly?

2009-11-30 Thread Frank DiMeo
So ptest can analyze transitions that have already happened on a live node?  I 
thought it could analyze the configuration and predict behavior.  I suppose 
that's not correct?

-Frank



Re: [Pacemaker] is ptest 1.06 working correctly?

2009-11-30 Thread Frank DiMeo
Actually, I don't know what you mean by the phrase "you don't have any
transitions in live cib".  Shouldn't ptest generate a graphical representation
of the actions to be carried out on resources?

-Frank



Re: [Pacemaker] Pacemaker shutdown issue

2009-11-30 Thread Tony Bunce
> The upgrade should really be transparent. What problems did you
> encounter with nfsserver?

Whenever one of the nodes takes over the nfs resource, it doesn't start up the
first time and gives this error:

nfs_server_monitor_0 (node=nfs1, call=11, rc=2, status=complete): invalid parameter

If I run this command it starts up instantly and doesn't have any problems 
until the service gets migrated again:
crm_resource -C -r nfs_server


Here is that resource from my config:
primitive nfs_server ocf:heartbeat:nfsserver \
        params nfs_init_script=/etc/init.d/nfs \
        params nfs_notify_cmd=/sbin/rpc.statd \
        params nfs_shared_infodir=/var/lib/nfs \
        params nfs_ip=10.1.1.150 \
        op monitor interval=30s


I haven't tested yet but was going to switch from ocf:heartbeat:nfsserver to 
lsb:nfs to see if that fixes the problem.


I also had something like this in my config:
primitive drbd_r0 ocf:heartbeat:drbd \
        params drbd_resource=r0 \
        op monitor=30s

That also gave me an error (I think it was "action monitor_0 does not exist").

I think that needs to be switched to this:
primitive drbd_r0 ocf:linbit:drbd \
        params drbd_resource=r0 \
        op monitor interval=29s role=Master timeout=30s \
        op monitor interval=30s role=Slave timeout=30s



Re: [Pacemaker] is ptest 1.06 working correctly?

2009-11-30 Thread Darren.Mansell
I've never really understood the correct time to do the ptest graphs. I
initiated a failover once and did the graph very quickly while it was in
a transitional state, but I've always wondered if there is an easier way,
i.e. "show me a graph of the migration plan if such and such were to
happen".



Re: [Pacemaker] is ptest 1.06 working correctly?

2009-11-30 Thread Darren.Mansell
This sounds very interesting. I look forward to trying it :)

(sorry for Outlook-affliction)

-Original Message-
From: Dejan Muhamedagic [mailto:deja...@fastmail.fm] 
Sent: 30 November 2009 17:28
To: pacemaker@oss.clusterlabs.org
Subject: Re: [Pacemaker] is ptest 1.06 working correctly?

Hi,

On Mon, Nov 30, 2009 at 05:04:35PM -, darren.mans...@opengi.co.uk wrote:
> I've never really understood the correct time to do the ptest graphs. I
> initiated a failover once and did the graph very quickly while it was in
> a transitional state but I've always wondered if there is an easier way
> i.e. show me a graph of the migration plan if such and such were to
> happen.

There's a fairly new feature in the crm shell with which it
is possible to edit the status section, e.g. to simulate a
resource failure or a node-lost event. Then you can try
the ptest command (in configure) and it will show you what
would happen. This feature was not complete at the time
1.0.6 was released and may still change.

Also, if you change the configuration and run ptest _before_
commit, it will display the graph of what would
happen if the new configuration were committed.

Thanks,

Dejan



[Pacemaker] Vmware Stonith device plugin which uses VMware VC

2009-11-30 Thread Joseph, Lester

Hi,

Has anyone written or come across a stonith plugin for VMware that supports
Virtual Center?

I have a few nodes which are all VMware virtual machines in a clustered
environment.
I have painfully searched for a stonith plugin to use with these nodes.

I have been researching the possibility of creating my own using the
community-provided Perl scripts that are included in the
VMware-vSphere-SDK-for-Perl toolkit. The script vmcontrol.pl in this toolkit
looks like it can do what we want; however, we need to incorporate it in a
stonith plugin.

Operation of the vmcontrol.pl script:

Operation to be performed. One of the following:

  poweron (power on one or more virtual machines),
  poweroff (power off one or more virtual machines),
  suspend (suspend one or more virtual machines),
  reboot (reboot one or more guests),
  reset (reset one or more virtual machines),
  shutdown (shutdown one or more guests),
  standby (set one or more guests to standby mode).
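
One small building block for such a plugin would be a mapping from the fencing action to one of the operations listed above. The sketch below assumes the plugin receives an action word such as on, off or reset as its first argument (that convention and the function name are assumptions here, not taken from the thread), and it deliberately leaves out the actual vmcontrol.pl call, since its command-line options are not shown above.

```shell
#!/bin/sh
# Hypothetical mapping from a fencing action word to a vmcontrol.pl
# operation name from the list above.
# Prints the operation on stdout; returns 1 for actions it does not map.
stonith_action_to_vm_operation() {
    case "$1" in
        on)    echo poweron ;;
        off)   echo poweroff ;;
        reset) echo reset ;;
        *)     return 1 ;;
    esac
}

# Example: op=$(stonith_action_to_vm_operation off)   # op is "poweroff"
```

The hard operations (poweroff, reset) are chosen over shutdown and reboot because fencing should not depend on the guest operating system cooperating.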


Can anyone advise?

Kind Regards

Lester Joseph
Linux Systems Administrator
-
-
Gala Coral E-Commerce
Eurobet House
10-24 Church Street West
Woking
Surrey
GU21 6HT

T: +44 (0)1483  766766
M: +44 (0)7867 554267
F: +44 (0)1483 722141
E:  lester.jos...@galacoral.com





Re: [Pacemaker] Debian packages, OCFS2, high CPU load

2009-11-30 Thread Stefan Förster
* Stefan Förster cite+pacema...@incertum.net:
> * Dejan Muhamedagic deja...@fastmail.fm:
>> On Fri, Nov 27, 2009 at 01:05:41PM +0100, Stefan Förster wrote:
>>> With Debian, apart from some minor glitches (path to controld.pcmk,
>>> old udev, old kernel) everything went well, but as soon as I commit
>>> the configuration containing the O2CB resources, both nodes become
>>> unresponsive, cluster communication fails and corosync (which was
>>> started as aisexec) is at about 100% CPU.
>>
>> corosync runs as corosync. aisexec is from the older openais
>> (0.8x).
>
> With the Debian packages from http://people.debian.org/~madkiss/ha/,
> openais contains /usr/sbin/aisexec, which is a shell script calling:
>
> export COROSYNC_DEFAULT_CONFIG_IFACE="openaisserviceenableexperimental:corosync_parser"
> corosync "$@"
>
> The Debian openais package also contains /usr/lib/lcrso/service_ckpt.lcrso
> which isn't loaded without the above environment settings. Amongst
> others, it contains:
>
> /usr/lib/lcrso/service_msg.lcrso
> /usr/lib/lcrso/service_lck.lcrso
> /usr/lib/lcrso/service_clm.lcrso
> /usr/lib/lcrso/service_evt.lcrso
> /usr/lib/lcrso/openaisserviceenable.lcrso
> /usr/lib/lcrso/service_ckpt.lcrso
> /usr/lib/lcrso/service_amf.lcrso
> /usr/lib/lcrso/service_tmr.lcrso
>
>> Otherwise, perhaps you found a bug. See if it's reproducible
>> without o2cb.
>
> I'm unsure on how to do this. Perhaps simply using another service
> which relies on CKPT would trigger that bug?

I could reproduce the problem: the behaviour arises as soon as
Pacemaker stops DLM for the first time; it seems it's not related to
o2cb at all. As soon as the DLM resource is stopped, the CPU usage of
corosync is at 100%.

Anything else I can do to aid in debugging this?


Cheers
Stefan



Re: [Pacemaker] Pacemaker shutdown issue

2009-11-30 Thread Tony Bunce
> That looks like a problem in the resource agent. Most probably
> you hit the bug 2219 which has been fixed on November 9.

I applied the patch and that appears to have fixed the problem!  I haven't 
tried a reboot yet but I can migrate between nodes without any issue.


> This has most probably been from the crm shell. It has been relaxed
> in the meantime (see bugzilla ).

That's exactly the problem I was seeing.  I switched to the correct monitor
commands (including the role) and that fixed the problem.
Both clusterlabs.org and drbd.org show the syntax without a role specified.

Thanks again for the help.  It looks like there is all kinds of good info in
bugzilla.  I'll be sure to check there first when I run into a problem.  (It
doesn't look like Google or Bing index the bugzilla site.)

-Tony



[Pacemaker] Possible bug: Strange behaviour of cibadmin -D

2009-11-30 Thread Michael Schwartzkopff
Hi,

I start with a fresh cluster, no CIB.

1) Add a #health attribute and verify that it is in the CIB:
# attrd_updater -n "#health-smart" -U red -d 1s
# cibadmin -Q | grep health
<nvpair id="status-..." name="#health-smart" value="red"/>

2) So far so good. I delete the attribute. Since this is a virtual machine
with limited access I have to do the following:
# cibadmin -Q | grep health > health.cib
# cibadmin -D -x health.cib
# cibadmin -Q | grep health
- nothing, entry gone. So far so good.

3) Now I want to write my attribute again:
# attrd_updater -n "#health-smart" -U red -d 1s
# cibadmin -Q | grep health
- nothing. This is NOT OK.

Somehow the CIB does not accept any #health-smart attribute any more.

4) Strange, but OK. I try to delete my whole CIB to be able to start again:
# cibadmin -E --force
# cibadmin -Q | grep health
<nvpair id="status-..." name="#health-smart" value="red"/>

Here is my attribute again! After an erasure of the CIB. How could this be?

Is this a bug? Should I file it? Or am I just too stupid to use the command
line?
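
One shell detail matters when retyping the commands in the steps above: an unquoted # at the start of a word begins a comment, so the attribute name has to be quoted (or escaped) to reach the tool at all. A small sketch, in which show_args is a made-up stand-in for attrd_updater that just prints what it receives:

```shell
#!/bin/sh
# Hypothetical stand-in for attrd_updater: print each received argument on
# its own line.
show_args() { printf '%s\n' "$@"; }

# Unquoted: the shell treats "#health-smart -U red -d 1s" as a comment,
# so only "-n" reaches the command.
show_args -n #health-smart -U red -d 1s

# Quoted: all six arguments reach the command.
show_args -n "#health-smart" -U red -d 1s
```

This does not explain the behaviour in steps 3 and 4; it only rules out one way of accidentally passing a truncated command line.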

Greetings,
-- 
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Addresse: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75

mail: mi...@multinet.de
web: www.multinet.de

Sitz der Gesellschaft: 85630 Grasbrunn
Registergericht: Amtsgericht München HRB 114375
Geschäftsführer: Günter Jurgeneit, Hubert Martens

---

PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42



Re: [Pacemaker] logging related information- pacemaker

2009-11-30 Thread Shravan Mishra
Thanks a lot.

On Mon, Nov 30, 2009 at 10:42 AM, Dejan Muhamedagic deja...@fastmail.fm wrote:
 Hi,

 On Sat, Nov 28, 2009 at 08:45:26PM -0500, Shravan Mishra wrote:
 Hi,

 I'm using pacemaker and trying to configure logging for various
 subsystems like pengine, attrd, crmd etc.

 On starting corosync the only logs I see are for e.g

 [...]

 Nothing related to stonithd or crmd etc.

 I have started /usr/lib64/heartbeat/ha_logd -d.

 Under /etc/ha.d/shellfuncs

 I see variables which I have exported on the command line:

 HA_LOGD=yes
 HA_LOGFILE=/tmp/corosync.log

 Am I taking a completely wrong path, am I supposed to configure just
 using corosync.conf and use logger_subsys for the above mentioned
 subsystems?

 In corosync.conf, you should set

 use_logd:  yes

 in the service section, then specify the syslog facility
 in /etc/logd.cf. A bit confusing, but openais/corosync and
 ha_logd have different configuration files.
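
Put together, the two files might look like the sketch below. Only the
"use_logd: yes" line comes from the reply above; the name and ver lines assume
the usual Pacemaker plugin stanza for corosync, and the facility and log file
path in logd.cf are illustrative assumptions.

```
# /etc/corosync/corosync.conf (excerpt)
service {
        name: pacemaker
        ver: 0
        use_logd: yes
}

# /etc/logd.cf
# syslog facility used by ha_logd
logfacility daemon
# optional: also write to a file (example path)
logfile /var/log/ha-log
```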

 Thanks,

 Dejan


 Sincerely
 Shravan



Re: [Pacemaker] Node crash when 'ifdown eth0'

2009-11-30 Thread Tim Serong
On 12/1/2009 at 11:05 AM, hj lee kerd...@gmail.com wrote:
 On Fri, Nov 27, 2009 at 3:05 PM, Steven Dake sd...@redhat.com wrote:
  On Fri, 2009-11-27 at 11:32 -0200, Mark Horton wrote:
   I'm using pacemaker 1.0.6 and corosync 1.1.2 (not using openais) with
   centos 5.4.  The packages are from here:
   http://www.clusterlabs.org/rpm/epel-5/

   Mark

   On Fri, Nov 27, 2009 at 9:01 AM, Oscar Remírez de Ganuza Satrústegui
   oscar...@unav.es wrote:
    Good morning,

    We are testing a cluster configuration on RHEL5 (x86_64) with pacemaker
    1.0.5 and openais (0.80.5).
    Two node cluster, active-passive, with the following resources:
    Mysql service resource and a NFS filesystem resource (shared storage in
    a SAN).

    In our tests, when we bring down the network interface (ifdown eth0),
    the

  What is the use case for ifdown eth0 (ie what are you trying to verify)?

 I have the same test case. In my case, when the two-node cluster is
 disconnected, I want to see split-brain. Then I want to see the split-brain
 handler reset one of the nodes. What I want to verify is that the cluster
 will recover from a network disconnection and split-brain situation.

Try this, on one node:

  # iptables -A INPUT -s ip.of.other.node -j DROP
  # iptables -A OUTPUT -d ip.of.other.node -j DROP
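
The same rules have to be removed again before the nodes can rejoin. A minimal
sketch of a helper that prints both commands for either direction is shown
below; the function name partition_cmds and the address 192.0.2.10 are
hypothetical placeholders. It relies only on the standard iptables -A (append)
and -D (delete matching rule) options.

```shell
#!/bin/sh
# Print the two iptables commands that drop (or stop dropping) all
# traffic exchanged with the peer node.
#   $1: -A to add the rules, -D to delete them again
#   $2: IP address of the other cluster node (placeholder value in examples)
partition_cmds() {
    printf 'iptables %s INPUT -s %s -j DROP\n' "$1" "$2"
    printf 'iptables %s OUTPUT -d %s -j DROP\n' "$1" "$2"
}
```

Run as root, "partition_cmds -A 192.0.2.10 | sh" would isolate the node and
"partition_cmds -D 192.0.2.10 | sh" would restore connectivity afterwards.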

HTH,

Tim


-- 
Tim Serong tser...@novell.com
Senior Clustering Engineer, Novell Inc.





Re: [Pacemaker] Node crash when 'ifdown eth0'

2009-11-30 Thread Steven Dake
On Mon, 2009-11-30 at 17:05 -0700, hj lee wrote:
 On Fri, Nov 27, 2009 at 3:05 PM, Steven Dake sd...@redhat.com wrote:
  On Fri, 2009-11-27 at 11:32 -0200, Mark Horton wrote:
   I'm using pacemaker 1.0.6 and corosync 1.1.2 (not using openais) with
   centos 5.4.  The packages are from here:
   http://www.clusterlabs.org/rpm/epel-5/

   Mark

   On Fri, Nov 27, 2009 at 9:01 AM, Oscar Remírez de Ganuza Satrústegui
   oscar...@unav.es wrote:
    Good morning,

    We are testing a cluster configuration on RHEL5 (x86_64) with pacemaker
    1.0.5 and openais (0.80.5).
    Two node cluster, active-passive, with the following resources:
    Mysql service resource and a NFS filesystem resource (shared storage in
    a SAN).

    In our tests, when we bring down the network interface (ifdown eth0),
    the

  What is the use case for ifdown eth0 (ie what are you trying to verify)?

 I have the same test case. In my case, when the two-node cluster is
 disconnected, I want to see split-brain. Then I want to see the split-brain
 handler reset one of the nodes. What I want to verify is that the cluster
 will recover from a network disconnection and split-brain situation.
 

ifconfig eth0 down is totally different from testing a node
disconnection.  When corosync detects eth0 being taken down, it
binds to the interface 127.0.0.1.  This is probably not what you had in
mind when you wanted to test split brain.  Keep in mind that an interface
taken out of service is different from an interface failing, from a POSIX
API perspective.

What you really want to test is pulling the network cable between the
machines.

Regards
-steve

 Thanks
 hj
 
 




Re: [Pacemaker] Node crash when 'ifdown eth0'

2009-11-30 Thread Oscar Remí­rez de Ganuza Satrústegui

Good morning,

Dejan Muhamedagic wrote:
 Hi,

 On Fri, Nov 27, 2009 at 12:01:17PM +0100, Oscar Remírez de Ganuza Satrústegui
 wrote:
  In our tests, when we bring down the network interface (ifdown
  eth0), the openais service (aisexec process) and other processes

 Yes, openais gets nervous if the network interface disappears. I
 think you'll find a core dump in /var/lib/openais. At any rate,
 better make sure that the interface stays up. And don't use dhcp
 but static addresses.

Ok, we were just checking different conditions.
We use static addresses anyway.

  (stonithd, cib, attrd and crmd) crash, and just some processes are
  still running:
  [r...@herculespre ~]# ps -fea |grep "ais\|heartbeat"
  root  2343  2335  0 Nov26 pts/0    00:00:18 /usr/lib64/heartbeat/lrmd
  102   2345  2335  0 Nov26 pts/0    00:00:01 /usr/lib64/heartbeat/pengine

 Processes which are not talking to aisexec.

 Thanks,

 Dejan

Thank you very much for the information!
I will also test our configuration with the RPMs that Mark pointed us to:

http://www.clusterlabs.org/rpm/epel-5/


Thanks again!
Regards,

---
Oscar Remírez de Ganuza
Servicios Informáticos
Universidad de Navarra
Ed. de Derecho, Campus Universitario
31080 Pamplona (Navarra), Spain
tfno: +34 948 425600 Ext. 3130
http://www.unav.es/SI






Re: [Pacemaker] Node crash when 'ifdown eth0'

2009-11-30 Thread Oscar Remí­rez de Ganuza Satrústegui

Hi,

Steven Dake wrote:
 On Mon, 2009-11-30 at 17:05 -0700, hj lee wrote:
  On Fri, Nov 27, 2009 at 3:05 PM, Steven Dake sd...@redhat.com wrote:
   On Fri, Nov 27, 2009 at 9:01 AM, Oscar Remírez de Ganuza Satrústegui
   oscar...@unav.es wrote:
    In our tests, when we bring down the network interface (ifdown eth0),
    the

   What is the use case for ifdown eth0 (ie what are you trying to verify)?

  I have the same test case. In my case, when the two-node cluster is
  disconnected, I want to see split-brain. Then I want to see the split-brain
  handler reset one of the nodes. What I want to verify is that the cluster
  will recover from a network disconnection and split-brain situation.

 ifconfig eth0 down is totally different from testing a node
 disconnection.  When corosync detects eth0 being taken down, it
 binds to the interface 127.0.0.1.  This is probably not what you had in
 mind when you wanted to test split brain.  Keep in mind that an interface
 taken out of service is different from an interface failing, from a POSIX
 API perspective.

 What you really want to test is pulling the network cable between the
 machines.

I wanted to test the split-brain situation too, and the recovery from it.
I also wanted to test a pingd resource and a location constraint that we
have configured, to see if the node puts down the resources correctly when
it detects no connection to the gateway.

Anyway, I have tested this situation and configuration successfully by
pulling the network cable from virtualcenter, but I got worried on
finding out that openais crashed and could not recover when the network
interface goes down.


Thanks!

---
Oscar Remírez de Ganuza
Servicios Informáticos
Universidad de Navarra
Ed. de Derecho, Campus Universitario
31080 Pamplona (Navarra), Spain
tfno: +34 948 425600 Ext. 3130
http://www.unav.es/SI


