Re: [Pacemaker] Remote monitor ?

2011-04-05 Thread Andrew Beekhof
Not really, unless you have the monitor op ssh to the other machine to
run the command.
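Andrew's suggestion can be sketched as a small wrapper (a hedged example, not a Pacemaker feature; the peer node, agent path, and return-code mapping below are assumptions):

```shell
#!/bin/sh
# Sketch: run a resource agent's "monitor" action on a peer node over
# ssh. The Dummy agent path and the rc mapping are assumptions.

remote_monitor() {
    # BatchMode makes ssh fail fast instead of prompting for a password;
    # ssh propagates the remote command's exit status back to the caller.
    ssh -o BatchMode=yes "$1" \
        "/usr/lib/ocf/resource.d/heartbeat/Dummy monitor" >/dev/null 2>&1
}

map_rc() {
    # ssh itself exits 255 on transport failure, which is not a valid
    # OCF code; fold that into a generic error (1). By OCF convention
    # 0 = running and 7 = not running.
    case "$1" in
        255) echo 1 ;;
        *)   echo "$1" ;;
    esac
}

if [ -n "$REMOTE_NODE" ]; then   # probe only when a peer is configured
    remote_monitor "$REMOTE_NODE"
    echo "remote monitor rc=$(map_rc $?)"
fi
```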

On Tue, Apr 5, 2011 at 4:01 PM, Carlos G Mendioroz  wrote:
> Is there a way to let Pacemaker get info on the performance of
> a resource from another node's point of view?
>
> --
> Carlos G Mendioroz    LW7 EQI  Argentina
>

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] [Problem]Reboot by the error of the clone resource influences the resource of other nodes.

2011-04-05 Thread Vladislav Bogdanov
Hi Hideo-san,

thank you very much for information.
Will try it asap.

Best,
Vladislav

06.04.2011 04:54, renayama19661...@ybb.ne.jp wrote:
> Hi Vladislav,
> 
> I confirmed that the problem is fixed with Andrew's patch.
> Please try the patch in the environment where your problem occurred.
> 
>  * http://developerbugs.linux-foundation.org/show_bug.cgi?id=2574
> 
> Best Regards,
> Hideo Yamauchi.
> 
> 
> --- On Fri, 2011/4/1, Vladislav Bogdanov  wrote:
> 
>> 01.04.2011 11:10, Andrew Beekhof wrote:
>>> On Fri, Apr 1, 2011 at 9:58 AM, Vladislav Bogdanov  
>>> wrote:
 01.04.2011 10:20, Andrew Beekhof wrote:
> The clone instance numbers for anonymous clones are an implementation
> detail and nothing should be inferred from them.
> Did anything actually get moved or just the numbers changed?
>

 Main inconvenience is that all dependent resources are forcibly restarted.
>>>
>>> Ok, then that's a bug.  Is there an hb_report somewhere for this scenario?
>>
>> Sent privately (just a note for ML).
>>
> 




Re: [Pacemaker] [Problem]Reboot by the error of the clone resource influences the resource of other nodes.

2011-04-05 Thread renayama19661014
Hi Vladislav,

I confirmed that the problem is fixed with Andrew's patch.
Please try the patch in the environment where your problem occurred.

 * http://developerbugs.linux-foundation.org/show_bug.cgi?id=2574

Best Regards,
Hideo Yamauchi.


--- On Fri, 2011/4/1, Vladislav Bogdanov  wrote:

> 01.04.2011 11:10, Andrew Beekhof wrote:
> > On Fri, Apr 1, 2011 at 9:58 AM, Vladislav Bogdanov  
> > wrote:
> >> 01.04.2011 10:20, Andrew Beekhof wrote:
> >>> The clone instance numbers for anonymous clones are an implementation
> >>> detail and nothing should be inferred from them.
> >>> Did anything actually get moved or just the numbers changed?
> >>>
> >>
> >> Main inconvenience is that all dependent resources are forcibly restarted.
> > 
> > Ok, then that's a bug.  Is there an hb_report somewhere for this scenario?
> 
> Sent privately (just a note for ML).
> 



Re: [Pacemaker] How to prevent locked I/O using Pacemaker with Primary/Primary DRBD/OCFS2 (Ubuntu 10.10)

2011-04-05 Thread Jean-Francois Malouin
Hi,

I don't want to hijack this thread so feel free to change the Subject
line if you feel like it.

* Lars Ellenberg  [20110404 16:56]:
> On Mon, Apr 04, 2011 at 01:34:48PM -0600, Mike Reid wrote:
> > All,
> > 
> > I am running a two-node web cluster on OCFS2 (v1.5.0) via DRBD
> > Primary/Primary (v8.3.8) and Pacemaker. Everything seems to be working
> 
> If you want to stay with 8.3.8, make sure you are using 8.3.8.1 (note
> the trailing .1), or you can run into stalled resyncs.
> Or upgrade to "most recent".


Just curious, I'm running 8.3.8 but not sure about the trailing '.1'. 
Am I safe with:

~# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by root@puck,
2010-11-29 18:13:54
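For what it's worth, the version string can be pulled out of that output programmatically (a sketch; note it cannot distinguish 8.3.8 from 8.3.8.1, since the point release only differs in the GIT-hash line, which is what would have to be compared):

```shell
# Parse the version number from a /proc/drbd-style "version:" line.
line='version: 8.3.8 (api:88/proto:86-94)'
ver=$(printf '%s\n' "$line" | sed -n 's/^version: \([0-9.]*\).*/\1/p')
echo "$ver"   # prints 8.3.8
```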

cheers,
jf



Re: [Pacemaker] Immediate fs errors on iscsi connection problem

2011-04-05 Thread Vladislav Bogdanov
05.04.2011 18:24, ruslan usifov wrote:
> GREAT THANKS!!!
> 
> It works.

Good. That RA may not be perfect; at least I saw some minor errors from it.

> But I have a little question: how can I get the same effect on
> Windows? The Windows iSCSI initiator (Win 2008 Server) breaks connections
> and never restarts them.

I can't say anything about its timeouts. Connection re-establishment (with
the iptables hack on the server side) worked for me. It would be great if you
shared your findings.
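The iptables hack mentioned above might look roughly like this (a hedged sketch: the iSCSI port 3260 and the restart step are assumptions; it echoes the rules by default and only applies them when MODE=apply):

```shell
#!/bin/sh
# Block all iSCSI target traffic, restart ietd, then unblock, so
# initiators see a dead connection rather than a reset during restart.

fw() {
    # Dry-run by default so the sequence can be inspected; set
    # MODE=apply (and run as root) to execute for real.
    if [ "$MODE" = apply ]; then iptables "$@"; else echo "iptables $*"; fi
}

block_target() {
    fw -I INPUT  -p tcp --dport 3260 -j DROP
    fw -I OUTPUT -p tcp --sport 3260 -j DROP
}

unblock_target() {
    fw -D INPUT  -p tcp --dport 3260 -j DROP
    fw -D OUTPUT -p tcp --sport 3260 -j DROP
}

block_target
# the ietd restart (e.g. an init script) would go here
unblock_target
```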

> 
> On 4 April 2011 at 08:24, Vladislav Bogdanov
> <bub...@hoster-ok.com> wrote:
> 
> Hi,
> 
> 03.04.2011 22:42, ruslan usifov wrote:
> > You need some tuning from both sides.
> > First, (at least some versions of) ietd needs to be blocked
> (-j DROP)
> > with iptables on restarts. That means, you should block all
> incoming and
> > outgoing packets (later is more important) before ietd stop
> and unblock
> > all after it starts. I use home-brew stateful RA for this,
> which blocks
> > (DROP) all traffic to/from VIP in slave mode and passes it to
> a later
> > decision (no -j) in master mode.
> >
> >
> >
> > Thanks for the reply, but how do I implement this from Pacemaker? Or
> > must I modify the init.d scripts?
> 
> You can try attached.
> This was my first attempt at writing an RA... It may contain some wrong logic,
> especially with scores, but it works for me.
> It is intended to be colocated with IP-address resource:
> colocation col1 inf: FW:Master ip
> order ord1 inf: ip:start FW:promote
> 
> Vladislav
> 
> 
> 
> 
> 




Re: [Pacemaker] [PATCH]Bug 2567 - crm resource migrate should support an optional "role" parameter

2011-04-05 Thread Holger Teutsch
Hi Dejan,

On Tue, 2011-04-05 at 13:40 +0200, Dejan Muhamedagic wrote:
> Hi Holger,
> 
> On Tue, Apr 05, 2011 at 01:19:56PM +0200, Holger Teutsch wrote:
> > Hi Dejan,
> > 
> > On Tue, 2011-04-05 at 12:27 +0200, Dejan Muhamedagic wrote:
> > > On Tue, Apr 05, 2011 at 12:10:48PM +0200, Holger Teutsch wrote:
> > > > Hi Dejan,
> > > > 
> > > > On Tue, 2011-04-05 at 11:48 +0200, Dejan Muhamedagic wrote:
> > > > > Hi Holger,
> > > > > 
> > > > > On Mon, Apr 04, 2011 at 09:31:02PM +0200, Holger Teutsch wrote:
> > > > > > On Mon, 2011-04-04 at 15:24 +0200, Andrew Beekhof wrote:
> > > > > [...]
> > > > > > 
> > > > > > crm_resource --move-off --resource myClone --node C
> > > > > >-> I want the instance moved off C, regardless where it is moved 
> > > > > > on
> > > > > 
> > > > > What is the difference between move-off and unmigrate (-U)?
> > > > 
> > > > --move-off -> create a constraint that a resource should *not* run on
> > > > the specific node (partly as before --move without --node)
> > > > 
> > > > -U: zap all migration constraints (as before) 
> > > 
> > > Ah, right, sorry, wanted to ask about the difference between
> > > move-off and move. The description looks the same as for move. Is
> > > it that in this case it is for clones so crm_resource needs an
> > > extra node parameter? You wrote in the doc:
> > > 
> > >   +Migrate a resource (-instance for clones/masters) off the specified 
> > > node.
> > > 
> > > The '-instance' looks somewhat funny. Why not say "Move/migrate a
> > > clone or master/slave instance away from the specified node"?
> > 
> > Moving away works for all kinds of resources so the text now looks like:
> > 
> > diff -r b4f456380f60 doc/crm_cli.txt
> > --- a/doc/crm_cli.txt   Thu Mar 17 09:41:25 2011 +0100
> > +++ b/doc/crm_cli.txt   Tue Apr 05 13:08:10 2011 +0200
> > @@ -818,10 +818,25 @@
> >  running on the current node. Additionally, you may specify a
> >  lifetime for the constraint---once it expires, the location
> >  constraint will no longer be active.
> > +For a master resource specify :master to move the master role.
> >  
> >  Usage:
> >  ...
>> -migrate <rsc> [<node>] [<lifetime>] [force]
>> +migrate <rsc>[:master] [<node>] [<lifetime>] [force]
> > +...
> > +
> > +[[cmdhelp_resource_migrateoff,migrate a resource off the specified
> > node]]
> > + `migrateoff` (`moveoff`)
> > +
> > +Migrate a resource away from the specified node. 
> > +The resource is migrated by creating a constraint which prevents it
> > from
> > +running on the specified node. Additionally, you may specify a
> > +lifetime for the constraint---once it expires, the location
> > +constraint will no longer be active.
> > +
> > +Usage:
> > +...
>> +migrateoff <rsc> <node> [<lifetime>] [force]
> >  ...
> >  
> >  [[cmdhelp_resource_unmigrate,unmigrate a resource to another node]]
> > 
> > > 
> > > I must say that I still find all this quite confusing, i.e. now
> > > we have "move", "unmove", and "move-off", but it's probably just me :)
> > 
> > Think of "move" == "move-to" then it is simpler 8-)
> > 
> > ... keeping in mind that for backward compatibility
> > 
> > crm_resource --move --resource myResource
> > 
>> is equivalent to
> > 
> > crm_resource --move-off --resource myResource --node $(current node)
> > 
> > But as there is no "current node" for clones / masters the old
> > implementation did some random movements...
> 
> OK. Thanks for the clarification. I'd like to revise my previous
> comment about restricting use of certain constructs. For
> instance, in this case, if the command would result in a random
> movement then the shell should at least issue a warning about it.
> Or perhaps refuse to do that completely. I didn't take a look yet
> at the code, perhaps you've already done that.
> 
> Thanks,
> 
> Dejan
> 
> 

I admit you have to specify more verbosely what you want to achieve, but
then the patched versions (based on the patches I submitted today around
10:01) execute consistently and without surprises - at least for my test
cases.

Regards
Holger





Re: [Pacemaker] Immediate fs errors on iscsi connection problem

2011-04-05 Thread ruslan usifov
GREAT THANKS!!!

It works. But I have a little question: how can I get the same effect on
Windows? The Windows iSCSI initiator (Win 2008 Server) breaks connections and
never restarts them.

On 4 April 2011 at 08:24, Vladislav Bogdanov
wrote:

> Hi,
>
> 03.04.2011 22:42, ruslan usifov wrote:
> > You need some tuning from both sides.
> > First, (at least some versions of) ietd needs to be blocked (-j DROP)
> > with iptables on restarts. That means you should block all incoming and
> > outgoing packets (the latter is more important) before ietd stops and
> > unblock all after it starts. I use a home-brew stateful RA for this,
> > which blocks (DROP) all traffic to/from the VIP in slave mode and passes
> > it to a later decision (no -j) in master mode.
> >
> >
> >
> > Thanks for the reply, but how do I implement this from Pacemaker? Or must I
> > modify the init.d scripts?
>
> You can try attached.
> This was my first attempt at writing an RA... It may contain some wrong logic,
> especially with scores, but it works for me.
> It is intended to be colocated with IP-address resource:
> colocation col1 inf: FW:Master ip
> order ord1 inf: ip:start FW:promote
>
> Vladislav
>


Re: [Pacemaker] large number of pe-input-* files

2011-04-05 Thread Raoul Bhatia [IPAX]
On 04/05/2011 04:35 PM, Shravan Mishra wrote:
> Hi guys,
> 
> The pengine process creates a large number of files under /var/lib/pengine.
> 
> We are using HA on a very high-performance box which is processing a large
> amount of data fed from an external source.
> A large number of file creations and I/O is taking place.
> 
> We ran out of inodes because there were something like 1500 files
> under the mentioned directory:
> 
> 
> 
> ls /var/lib/pengine/ | wc -l
> 1492
> 
> 
> 
> Is there a way to clean up and/or reduce the number of these files?

Pacemaker can do this by itself:

> property $id="cib-bootstrap-options" \
> ...
> pe-error-series-max="100" \
> pe-warn-series-max="100" \
> pe-input-series-max="100" \
> ...

You can read about this in the Pacemaker documentation.
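As crm shell commands, the above would look like this (a sketch; 100 is just the value from the example, not a recommendation):

```shell
crm configure property pe-error-series-max=100
crm configure property pe-warn-series-max=100
crm configure property pe-input-series-max=100
```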

cheers,
raoul
-- 

DI (FH) Raoul Bhatia M.Sc.  email.  r.bha...@ipax.at
Technischer Leiter

IPAX - Aloy Bhatia Hava OG  web.  http://www.ipax.at
Barawitzkagasse 10/2/2/11   email.off...@ipax.at
1190 Wien   tel.   +43 1 3670030
FN 277995t HG Wien  fax.+43 1 3670030 15




[Pacemaker] large number of pe-input-* files

2011-04-05 Thread Shravan Mishra
Hi guys,

The pengine process creates a large number of files under /var/lib/pengine.

We are using HA on a very high-performance box which is processing a large
amount of data fed from an external source.
A large number of file creations and I/O is taking place.

We ran out of inodes because there were something like 1500 files
under the mentioned directory:



ls /var/lib/pengine/ | wc -l
1492



Is there a way to clean up and/or reduce the number of these files?


Sincerely
Shravan



[Pacemaker] Remote monitor ?

2011-04-05 Thread Carlos G Mendioroz

Is there a way to let Pacemaker get info on the performance of
a resource from another node's point of view?

--
Carlos G Mendioroz    LW7 EQI  Argentina



Re: [Pacemaker] [PATCH]Bug 2567 - crm resource migrate should support an optional "role" parameter

2011-04-05 Thread Dejan Muhamedagic
Hi Holger,

On Tue, Apr 05, 2011 at 01:19:56PM +0200, Holger Teutsch wrote:
> Hi Dejan,
> 
> On Tue, 2011-04-05 at 12:27 +0200, Dejan Muhamedagic wrote:
> > On Tue, Apr 05, 2011 at 12:10:48PM +0200, Holger Teutsch wrote:
> > > Hi Dejan,
> > > 
> > > On Tue, 2011-04-05 at 11:48 +0200, Dejan Muhamedagic wrote:
> > > > Hi Holger,
> > > > 
> > > > On Mon, Apr 04, 2011 at 09:31:02PM +0200, Holger Teutsch wrote:
> > > > > On Mon, 2011-04-04 at 15:24 +0200, Andrew Beekhof wrote:
> > > > [...]
> > > > > 
> > > > > crm_resource --move-off --resource myClone --node C
> > > > >-> I want the instance moved off C, regardless where it is moved on
> > > > 
> > > > What is the difference between move-off and unmigrate (-U)?
> > > 
> > > --move-off -> create a constraint that a resource should *not* run on
> > > the specific node (partly as before --move without --node)
> > > 
> > > -U: zap all migration constraints (as before) 
> > 
> > Ah, right, sorry, wanted to ask about the difference between
> > move-off and move. The description looks the same as for move. Is
> > it that in this case it is for clones so crm_resource needs an
> > extra node parameter? You wrote in the doc:
> > 
> > +Migrate a resource (-instance for clones/masters) off the specified 
> > node.
> > 
> > The '-instance' looks somewhat funny. Why not say "Move/migrate a
> > clone or master/slave instance away from the specified node"?
> 
> Moving away works for all kinds of resources so the text now looks like:
> 
> diff -r b4f456380f60 doc/crm_cli.txt
> --- a/doc/crm_cli.txt Thu Mar 17 09:41:25 2011 +0100
> +++ b/doc/crm_cli.txt Tue Apr 05 13:08:10 2011 +0200
> @@ -818,10 +818,25 @@
>  running on the current node. Additionally, you may specify a
>  lifetime for the constraint---once it expires, the location
>  constraint will no longer be active.
> +For a master resource specify :master to move the master role.
>  
>  Usage:
>  ...
> -migrate <rsc> [<node>] [<lifetime>] [force]
> +migrate <rsc>[:master] [<node>] [<lifetime>] [force]
> +...
> +
> +[[cmdhelp_resource_migrateoff,migrate a resource off the specified
> node]]
> + `migrateoff` (`moveoff`)
> +
> +Migrate a resource away from the specified node. 
> +The resource is migrated by creating a constraint which prevents it
> from
> +running on the specified node. Additionally, you may specify a
> +lifetime for the constraint---once it expires, the location
> +constraint will no longer be active.
> +
> +Usage:
> +...
> +migrateoff <rsc> <node> [<lifetime>] [force]
>  ...
>  
>  [[cmdhelp_resource_unmigrate,unmigrate a resource to another node]]
> 
> > 
> > I must say that I still find all this quite confusing, i.e. now
> > we have "move", "unmove", and "move-off", but it's probably just me :)
> 
> Think of "move" == "move-to" then it is simpler 8-)
> 
> ... keeping in mind that for backward compatibility
> 
> crm_resource --move --resource myResource
> 
> is equivalent to
> 
> crm_resource --move-off --resource myResource --node $(current node)
> 
> But as there is no "current node" for clones / masters the old
> implementation did some random movements...

OK. Thanks for the clarification. I'd like to revise my previous
comment about restricting use of certain constructs. For
instance, in this case, if the command would result in a random
movement then the shell should at least issue a warning about it.
Or perhaps refuse to do that completely. I didn't take a look yet
at the code, perhaps you've already done that.

Thanks,

Dejan


> Regards
> Holger
> 
> > 
> > Cheers,
> > 
> > Dejan
> > 
> 
> 
> 



Re: [Pacemaker] [PATCH]Bug 2567 - crm resource migrate should support an optional "role" parameter

2011-04-05 Thread Holger Teutsch
Hi Dejan,

On Tue, 2011-04-05 at 12:27 +0200, Dejan Muhamedagic wrote:
> On Tue, Apr 05, 2011 at 12:10:48PM +0200, Holger Teutsch wrote:
> > Hi Dejan,
> > 
> > On Tue, 2011-04-05 at 11:48 +0200, Dejan Muhamedagic wrote:
> > > Hi Holger,
> > > 
> > > On Mon, Apr 04, 2011 at 09:31:02PM +0200, Holger Teutsch wrote:
> > > > On Mon, 2011-04-04 at 15:24 +0200, Andrew Beekhof wrote:
> > > [...]
> > > > 
> > > > crm_resource --move-off --resource myClone --node C
> > > >-> I want the instance moved off C, regardless where it is moved on
> > > 
> > > What is the difference between move-off and unmigrate (-U)?
> > 
> > --move-off -> create a constraint that a resource should *not* run on
> > the specific node (partly as before --move without --node)
> > 
> > -U: zap all migration constraints (as before) 
> 
> Ah, right, sorry, wanted to ask about the difference between
> move-off and move. The description looks the same as for move. Is
> it that in this case it is for clones so crm_resource needs an
> extra node parameter? You wrote in the doc:
> 
>   +Migrate a resource (-instance for clones/masters) off the specified 
> node.
> 
> The '-instance' looks somewhat funny. Why not say "Move/migrate a
> clone or master/slave instance away from the specified node"?

Moving away works for all kinds of resources so the text now looks like:

diff -r b4f456380f60 doc/crm_cli.txt
--- a/doc/crm_cli.txt   Thu Mar 17 09:41:25 2011 +0100
+++ b/doc/crm_cli.txt   Tue Apr 05 13:08:10 2011 +0200
@@ -818,10 +818,25 @@
 running on the current node. Additionally, you may specify a
 lifetime for the constraint---once it expires, the location
 constraint will no longer be active.
+For a master resource specify :master to move the master role.
 
 Usage:
 ...
-migrate <rsc> [<node>] [<lifetime>] [force]
+migrate <rsc>[:master] [<node>] [<lifetime>] [force]
+...
+
+[[cmdhelp_resource_migrateoff,migrate a resource off the specified
node]]
+ `migrateoff` (`moveoff`)
+
+Migrate a resource away from the specified node. 
+The resource is migrated by creating a constraint which prevents it
from
+running on the specified node. Additionally, you may specify a
+lifetime for the constraint---once it expires, the location
+constraint will no longer be active.
+
+Usage:
+...
+migrateoff <rsc> <node> [<lifetime>] [force]
 ...
 
 [[cmdhelp_resource_unmigrate,unmigrate a resource to another node]]

> 
> I must say that I still find all this quite confusing, i.e. now
> we have "move", "unmove", and "move-off", but it's probably just me :)

Think of "move" == "move-to" then it is simpler 8-)

... keeping in mind that for backward compatibility

crm_resource --move --resource myResource

is equivalent to

crm_resource --move-off --resource myResource --node $(current node)

But as there is no "current node" for clones / masters the old
implementation did some random movements...
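In other words, the proposed --move-off conceptually boils down to a -INFINITY location constraint; a hedged crm shell sketch of the equivalent, with placeholder constraint and node names:

```shell
# Ban myResource from node C (what "move off C" amounts to):
crm configure location cli-ban-myResource-on-C myResource -inf: C
# "unmigrate" (-U) then simply removes such constraints again:
crm configure delete cli-ban-myResource-on-C
```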

Regards
Holger

> 
> Cheers,
> 
> Dejan
> 





Re: [Pacemaker] [PATCH]Bug 2567 - crm resource migrate should support an optional "role" parameter

2011-04-05 Thread Dejan Muhamedagic
On Tue, Apr 05, 2011 at 12:10:48PM +0200, Holger Teutsch wrote:
> Hi Dejan,
> 
> On Tue, 2011-04-05 at 11:48 +0200, Dejan Muhamedagic wrote:
> > Hi Holger,
> > 
> > On Mon, Apr 04, 2011 at 09:31:02PM +0200, Holger Teutsch wrote:
> > > On Mon, 2011-04-04 at 15:24 +0200, Andrew Beekhof wrote:
> > [...]
> > > 
> > > crm_resource --move-off --resource myClone --node C
> > >-> I want the instance moved off C, regardless where it is moved on
> > 
> > What is the difference between move-off and unmigrate (-U)?
> 
> --move-off -> create a constraint that a resource should *not* run on
> the specific node (partly as before --move without --node)
> 
> -U: zap all migration constraints (as before) 

Ah, right, sorry, wanted to ask about the difference between
move-off and move. The description looks the same as for move. Is
it that in this case it is for clones so crm_resource needs an
extra node parameter? You wrote in the doc:

+Migrate a resource (-instance for clones/masters) off the specified 
node.

The '-instance' looks somewhat funny. Why not say "Move/migrate a
clone or master/slave instance away from the specified node"?

I must say that I still find all this quite confusing, i.e. now
we have "move", "unmove", and "move-off", but it's probably just me :)

Cheers,

Dejan


> Regards
> Holger
> > 
> > Cheers,
> > 
> > Dejan
> > 
> 
> 
> 



Re: [Pacemaker] [PATCH]Bug 2567 - crm resource migrate should support an optional "role" parameter

2011-04-05 Thread Holger Teutsch
Hi Dejan,

On Tue, 2011-04-05 at 11:48 +0200, Dejan Muhamedagic wrote:
> Hi Holger,
> 
> On Mon, Apr 04, 2011 at 09:31:02PM +0200, Holger Teutsch wrote:
> > On Mon, 2011-04-04 at 15:24 +0200, Andrew Beekhof wrote:
> [...]
> > 
> > crm_resource --move-off --resource myClone --node C
> >-> I want the instance moved off C, regardless where it is moved on
> 
> What is the difference between move-off and unmigrate (-U)?

--move-off -> create a constraint that a resource should *not* run on
the specific node (partly as before --move without --node)

-U: zap all migration constraints (as before) 

Regards
Holger
> 
> Cheers,
> 
> Dejan
> 





Re: [Pacemaker] double stonith device

2011-04-05 Thread Dejan Muhamedagic
Hi,

On Tue, Apr 05, 2011 at 10:28:13AM +0200, Christian Zoffoli wrote:
> Il 04/04/2011 20:03, Andrew Daugherity ha scritto:
> [cut]
> > The APC PDUs do support outlet groups spanning several PDUs, using 
> > multicast I
> > believe.  There's even a note about it in README.rackpdu in the cluster-glue
> > package:
> > 
> > In case your nodes are equipped with multiple power supplies, the
> > PDU supports synchronous operation on multiple outlets on up to
> > four Switched Rack PDUs. See the User's Guide for more
> > information on how to setup outlet groups.
> > 
> > 
> > What I'm not clear about is how to tell the apcmaster stonith plugin which
> > outlet group to control.  The only parameters mentioned in its built-in 
> > help are
> > ipaddr, login, and password.

I guess you mean apcmastersnmp. Just name outlets after nodes.
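A minimal stonith primitive for such a PDU might look like the following sketch, using only the three parameters the plugin's built-in help reportedly lists (the IP address and credentials are placeholders):

```shell
crm configure primitive st-apc stonith:apcmaster \
    params ipaddr=10.0.0.5 login=apc password=secret
```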

> Thank you very much... I hadn't seen that feature before... I'll run
> some tests.

There was some discussion about this rackpdu feature on the
linux-ha ML and, apparently, there's no guarantee that all
outlets are actually reset (or turned off). So, you should do
thorough testing. I never tested it myself and I'd appreciate it if
you shared your findings.

Cheers,

Dejan


> Christian
> 



Re: [Pacemaker] [PATCH]Bug 2567 - crm resource migrate should support an optional "role" parameter

2011-04-05 Thread Dejan Muhamedagic
Hi Holger,

On Mon, Apr 04, 2011 at 09:31:02PM +0200, Holger Teutsch wrote:
> On Mon, 2011-04-04 at 15:24 +0200, Andrew Beekhof wrote:
[...]
> 
> crm_resource --move-off --resource myClone --node C
>-> I want the instance moved off C, regardless of where it is moved on

What is the difference between move-off and unmigrate (-U)?

Cheers,

Dejan



Re: [Pacemaker] double stonith device

2011-04-05 Thread Christian Zoffoli
Il 04/04/2011 17:05, Dejan Muhamedagic ha scritto:
[cut]
> How comes? AFAIK, many setups rely on IPMI for fencing. Just
> update the firmware and do proper testing.

bad experience with some DRAC5 IPMIs

>> probably the best way would be to create a new stonith module to manage
>> such setup
> 
> That's also possible. We accept patches too :)

if I choose to go that way ...you will see patches :)


Christian



Re: [Pacemaker] double stonith device

2011-04-05 Thread Christian Zoffoli
On 04/04/2011 20:03, Andrew Daugherity wrote:
[cut]
> The APC PDUs do support outlet groups spanning several PDUs, using multicast I
> believe.  There's even a note about it in README.rackpdu in the cluster-glue
> package:
> 
> In case your nodes are equipped with multiple power supplies, the
> PDU supports synchronous operation on multiple outlets on up to
> four Switched Rack PDUs. See the User's Guide for more
> information on how to setup outlet groups.
> 
> 
> What I'm not clear about is how to tell the apcmaster stonith plugin which
> outlet group to control.  The only parameters mentioned in its built-in help
> are ipaddr, login, and password.


thank you very much ...I've not seen such feature before ...I'll make
some tests.


Christian



Re: [Pacemaker] double stonith device

2011-04-05 Thread Christian Zoffoli
On 05/04/2011 01:21, Lars Marowsky-Bree wrote:
> On 2011-04-04T15:34:16, Christian Zoffoli  wrote:
> 
>> I cannot connect both PSUs to a single PDU without losing power source
>> redundancy (I have dual UPS, dual power lines and so on).
> 
> Yes. Because the alternative is to lose the ability to fence the node if
> you lose one of the fencing devices. (I assume you'd be unable to
> confirm the fence if one of them was without power?)

if the APC fence device is not available I can check the status using
IPMI ...but considering only the APC switched PDU ...you are absolutely
right

> You need to look at the larger picture: even though each node is only
> on one power supply, the whole cluster is on two.

...there is a side effect ...sometimes we need to bring down one power
circuit ...for maintenance or whatever ...so we need two separate power
circuits so as not to impact availability

> Alternatively, use a mechanism like SBD which doesn't fence via the
> power device, but uses the shared storage.

...interesting ...I'll also have a look at SBD
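For the record, a minimal SBD setup would look roughly like this. This is a hedged sketch: the device path is a placeholder, and the exact command options and plugin parameter names should be checked against the cluster-glue/sbd documentation for your version.

```
# Initialize the shared-disk messaging slots (placeholder device path):
sbd -d /dev/disk/by-id/shared-sbd-part1 create

# Run the watcher daemon on every node (normally started via init script):
sbd -d /dev/disk/by-id/shared-sbd-part1 -W watch

# crm shell: fencing resource using the external/sbd plugin
primitive stonith-sbd stonith:external/sbd \
    params sbd_device="/dev/disk/by-id/shared-sbd-part1"
```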

Thank you

Christian



Re: [Pacemaker] [PATCH]Bug 2567 - crm resource migrate should support an optional "role" parameter

2011-04-05 Thread Holger Teutsch
On Mon, 2011-04-04 at 21:31 +0200, Holger Teutsch wrote:
> On Mon, 2011-04-04 at 15:24 +0200, Andrew Beekhof wrote:
> > On Mon, Apr 4, 2011 at 2:43 PM, Holger Teutsch  
> > wrote:
> > > On Mon, 2011-04-04 at 11:05 +0200, Andrew Beekhof wrote:
> > >> On Sat, Mar 19, 2011 at 11:55 AM, Holger Teutsch  
> > >> wrote:
> > >> > Hi Dejan,
> > >> >
> > >> > On Fri, 2011-03-18 at 14:24 +0100, Dejan Muhamedagic wrote:
> > >> >> Hi,
> > >> >>
> > >> >> On Fri, Mar 18, 2011 at 12:21:40PM +0100, Holger Teutsch wrote:
> > >> >> > Hi,
> > >> >> > I would like to submit 2 patches of an initial implementation for
> > >> >> > discussion.
> > >> > ..
> > >> >> > To recall:
> > >> >> >
> > >> >> > crm_resource --move resource
> > >> >> > creates a "standby" rule that moves the resource off the currently
> > >> >> > active node
> > >> >> >
> > >> >> > while
> > >> >> >
> > >> >> > crm_resource --move resource --node newnode
> > >> >> > creates a "prefer" rule that moves the resource to the new node.
> > >> >> >
> > >> >> > When dealing with clones and masters the behavior was random as the 
> > >> >> > code
> > >> >> > only considers the node where the first instance of the clone was
> > >> >> > started.
> > >> >> >
> > >> >> > The new code behaves consistently for the master role of an m/s
> > >> >> > resource. The options "--master" and "rsc:master" are somewhat 
> > >> >> > redundant
> > >> >> > as a "slave" move is not supported. Currently it's more an
> > >> >> > acknowledgement of the user.
> > >> >> >
> > >> >> > On the other hand it is desirable (and was requested several times 
> > >> >> > on
> > >> >> > the ML) to stop a single resource instance of a clone or master on a
> > >> >> > specific node.
> > >> >> >
> > >> >> > Should that be implemented by something like
> > >> >> >
> > >> >> > "crm_resource --move-off --resource myresource --node devel2" ?
> > >> >> >
> > >> >> > or should
> > >> >> >
> > >> >> > crm_resource refuse to work on clones
> > >> >> >
> > >> >> > and/or should moving the master role be the default for m/s 
> > >> >> > resources
> > >> >> > and the "--master" option discarded ?
> > >> >>
> > >> >> I think that we also need to consider the case when clone-max is
> > >> >> less than the number of nodes. If I understood correctly what you
> > >> >> were saying. So, all of move slave and move master and move clone
> > >> >> should be possible.
> > >> >>
> > >> >
> > >> > I think the following use cases cover what can be done with such kind 
> > >> > of
> > >> > interface:
> > >> >
> > >> > crm_resource --moveoff --resource myresource --node mynode
> > >> >   -> all resource variants: check whether active on mynode, then 
> > >> > create standby constraint
> > >> >
> > >> > crm_resource --move --resource myresource
> > >> >   -> primitive/group: convert to --moveoff --node `current_node`
> > >> >   -> clone/master: refused
> > >> >
> > >> > crm_resource --move --resource myresource --node mynode
> > >> >  -> primitive/group: create prefer constraint
> > >> >  -> clone/master: refused
> > >>
> > >> Not sure this needs to be refused.
> > >
> > > I see the problem that the node where the resource instance should be
> > > moved off had to be specified as well to get predictable behavior.
> > >
> > > Consider a a 2 way clone on a 3 node cluster.
> > > If the clone is active on A and B what should
> > >
> > > crm_resource --move --resource myClone --node C
> > >
> > > do ?
> > 
> > I would expect it to create the +inf constraint for C but no
> > contraint(s) for the current location(s)
> 
> You are right. These are different and valid use cases.
> 
> crm_resource --move --resource myClone --node C
> -> I want an instance on C, regardless of where it is moved from
> 
> crm_resource --move-off --resource myClone --node C
> -> I want the instance moved off C, regardless of where it is moved to
> 
> I tried them out with a reimplementation of the patch on a 3 node
> cluster with a resource with clone-max=2. The behavior appears logical
> (at least to me 8-) ).
> 
> > 
> > > This would require an additional --from-node or similar.
> > >
> > >> Other than that the proposal looks sane.
> > >>
> > >> My first thought was to make --move behave like --move-off if the
> > >> resource is a clone or m/s, but since the semantics are the exact
> > >> opposite, that might introduce more problems than it solves.
> > >
> > > That was my perception as well.
> > >
> > >>
> > >> Does the original crm_resource patch implement this?
> > >
> > > No, I will submit an updated version later this week.
> > >
> > > - holger

Hi,
I submit revised patches for review.
Summarizing preceding discussions the following functionality is
implemented:

crm_resource --move-off --resource myresource --node mynode
   -> all resource variants: check whether active on mynode, then create standby constraint

crm_resource --move --resource myresource
   -> primitive/group: convert to --move-off --node `current_node`
   -> clone/master: refused
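In crm shell terms, the two operations summarized above boil down to location constraints roughly like the following. This is illustrative only; the cli-* IDs are placeholders for whatever IDs crm_resource actually generates.

```
# --move --resource myresource --node mynode  (prefer rule):
location cli-prefer-myresource myresource \
    rule $id="cli-prefer-rule-myresource" inf: #uname eq mynode

# --move-off --resource myresource --node mynode  (standby rule):
location cli-standby-myresource myresource \
    rule $id="cli-standby-rule-myresource" -inf: #uname eq mynode
```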

Re: [Pacemaker] How to prevent locked I/O using Pacemaker with Primary/Primary DRBD/OCFS2 (Ubuntu 10.10)

2011-04-05 Thread Raoul Bhatia [IPAX]
On 04/04/11 22:55, Lars Ellenberg wrote:
>> * > > ro2="Unknown" ds1="UpToDate" ds2="Outdated" />
>  Why keep people using this pseudo xml output?
>  where does that come from? we should un-document this.
>  This is to be consumed by other programs (like the LINBIT DRBD-MC).
>  This is not to be consumed by humans.

when one is used to "crm status", "drbd status" will be typed
automatically without thinking ;)

> # drbdadm status
> 
> 
>  ds1="UpToDate" ds2="UpToDate" />
> 
> 

imho, that's why...
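Since that XML-style output is meant to be consumed by programs rather than humans, a minimal consumer might look like this. Note the assumptions: only the ds1/ds2 attribute names are taken from the thread above; the surrounding tag and attribute names are made up, because the archive stripped the real tags from the quoted output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of the status XML (tag names are illustrative; the
# ds1/ds2 disk-state attributes are the ones quoted in the thread above):
SAMPLE = """\
<drbd-status>
  <resources>
    <resource name="r0" cs="Connected" ro1="Primary" ro2="Primary"
              ds1="UpToDate" ds2="Outdated" />
  </resources>
</drbd-status>
"""

def resources_with_stale_peer(xml_text):
    """Return names of resources whose peer disk state (ds2) is not UpToDate."""
    root = ET.fromstring(xml_text)
    return [r.get("name")
            for r in root.iter("resource")
            if r.get("ds2") != "UpToDate"]

print(resources_with_stale_peer(SAMPLE))  # ['r0']
```

A monitoring script could alert on a non-empty result instead of scraping the human-readable /proc/drbd format.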

cheers,
raoul
-- 

DI (FH) Raoul Bhatia M.Sc.  email.  r.bha...@ipax.at
Technischer Leiter

IPAX - Aloy Bhatia Hava OG  web.  http://www.ipax.at
Barawitzkagasse 10/2/2/11   email.off...@ipax.at
1190 Wien   tel.   +43 1 3670030
FN 277995t HG Wien  fax.+43 1 3670030 15

