Re: [ClusterLabs] Antw: Re: Set a node attribute for multiple nodes with one command

2016-11-22 Thread Kostiantyn Ponomarenko
Ken,
Thank you for the explanation.
I will try this low-level way of shadow cib creation tomorrow.
PS: I will sleep much better with this excellent news/idea. =)

Thank you,
Kostia

On Tue, Nov 22, 2016 at 10:53 PM, Ken Gaillot  wrote:

> On 11/22/2016 04:39 AM, Kostiantyn Ponomarenko wrote:
> > Using "shadow cib" in crmsh looks like a good idea, but it doesn't work
> > with node attributes set into "status" section of Pacemaker config.
> > I wonder it it is possible to make it work that way.
>
> Forgot to mention -- the shadow CIB is probably the best way to do this.
> I don't know if there's a way to do it in crmsh, but you can use it with
> the low-level commands crm_shadow and crm_attribute --lifetime=reboot.
>
> > Ken,
> >>> start dampening timer
> > Could you please elaborate more on this. I don't get how I can set this
> > timer.
> > Do I need to set this timer for each node?
> >
> >
> > Thank you,
> > Kostia
> >
> > On Mon, Nov 21, 2016 at 9:30 AM, Ulrich Windl
> > <ulrich.wi...@rz.uni-regensburg.de> wrote:
> >
> > >>> Ken Gaillot <kgail...@redhat.com> wrote on 18.11.2016 at 16:17 in
> > message:
> > > On 11/18/2016 08:55 AM, Kostiantyn Ponomarenko wrote:
> > >> Hi folks,
> > >>
> > >> Is there a way to set a node attribute in the "status" section for a
> > >> few nodes at the same time?
> > >>
> > >> In my case there is a node attribute which allows some resources to
> > >> start in the cluster if it is set.
> > >> If I set this node attribute for, say, two nodes one after the other,
> > >> then these resources are not distributed equally between these two
> > >> nodes. That's because Pacemaker picks the first node on which this
> > >> attribute is set and immediately starts all allowed resources on it.
> > >> And this is not the behavior I would like to get.
> > >>
> > >> Thank you,
> > >> Kostia
> > >
> > > Not that I know of, but it would be a good feature to add to
> > > attrd_updater and/or crm_attribute.
> >
> > With crm (shell) you don't have transactions for node attributes,
> > but for the configuration. So if you add a location restriction
> > preventing any resources on your nodes, then enable the nodes, and
> > then delete the location restrictions in one transaction, you might
> > get what you want. It's not elegant, but it will do.
> >
> > To the crm shell maintainer: Is it difficult to build transactions
> > for node status changes? The problem I see is this: For configuration
> > you always have transactions (requiring "commit"), but for nodes you
> > traditionally have none (effects are immediate). So you'd need a
> > thing like "start transaction" which requires a "commit" or some
> > kind of abort later.
> >
> > I also don't know whether a "shadow CIB" would help for the original
> > problem.
> >
> > Ulrich
> >
> > >
> > > You can probably hack it with a dampening value of a few seconds. If
> > > your rule checks for a particular value of the attribute, set all the
> > > nodes to a different value first, which will write that value and
> > > start the dampening timer. Then set all the attributes to the desired
> > > value, and they will get written out together when the timer expires.
>
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: Set a node attribute for multiple nodes with one command

2016-11-22 Thread Ken Gaillot
On 11/22/2016 04:39 AM, Kostiantyn Ponomarenko wrote:
> Using "shadow cib" in crmsh looks like a good idea, but it doesn't work
> with node attributes set into "status" section of Pacemaker config.
> I wonder it it is possible to make it work that way.

Forgot to mention -- the shadow CIB is probably the best way to do this.
I don't know if there's a way to do it in crmsh, but you can use it with
the low-level commands crm_shadow and crm_attribute --lifetime=reboot.
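
Roughly, the low-level sequence would look something like this (the node and
attribute names below are just examples, and I haven't verified whether the
status-section changes really get applied in one go on commit):

  # work on a shadow copy of the CIB instead of the live one
  crm_shadow --create throwaway --batch
  export CIB_shadow=throwaway

  # set the reboot-lifetime (status section) attribute on each node
  crm_attribute --node node1 --name my-attr --update true --lifetime=reboot
  crm_attribute --node node2 --name my-attr --update true --lifetime=reboot

  # push everything back to the live cluster in one update, then clean up
  crm_shadow --commit throwaway
  unset CIB_shadow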

> Ken,
>>> start dampening timer
> Could you please elaborate more on this. I don't get how I can set this
> timer.
> Do I need to set this timer for each node?
> 
> 
> Thank you,
> Kostia
> 
> On Mon, Nov 21, 2016 at 9:30 AM, Ulrich Windl
> <ulrich.wi...@rz.uni-regensburg.de> wrote:
> 
> >>> Ken Gaillot <kgail...@redhat.com> wrote on 18.11.2016 at 16:17 in
> message:
> > On 11/18/2016 08:55 AM, Kostiantyn Ponomarenko wrote:
> >> Hi folks,
> >>
> >> Is there a way to set a node attribute in the "status" section for a few
> >> nodes at the same time?
> >>
> >> In my case there is a node attribute which allows some resources to
> >> start in the cluster if it is set.
> >> If I set this node attribute for, say, two nodes one after the other,
> >> then these resources are not distributed equally between these two
> >> nodes. That's because Pacemaker picks the first node on which this
> >> attribute is set and immediately starts all allowed resources on it. And
> >> this is not the behavior I would like to get.
> >>
> >> Thank you,
> >> Kostia
> >
> > Not that I know of, but it would be a good feature to add to
> > attrd_updater and/or crm_attribute.
> 
> With crm (shell) you don't have transactions for node attributes,
> but for the configuration. So if you add a location restriction
> preventing any resources on your nodes, then enable the nodes, and
> then delete the location restrictions in one transaction, you might
> get what you want. It's not elegant, but it will do.
> 
> To the crm shell maintainer: Is it difficult to build transactions
> for node status changes? The problem I see is this: For configuration
> you always have transactions (requiring "commit"), but for nodes you
> traditionally have none (effects are immediate). So you'd need a
> thing like "start transaction" which requires a "commit" or some
> kind of abort later.
> 
> I also don't know whether a "shadow CIB" would help for the original
> problem.
> 
> Ulrich
> 
> >
> > You can probably hack it with a dampening value of a few seconds. If
> > your rule checks for a particular value of the attribute, set all the
> > nodes to a different value first, which will write that value and
> > start the dampening timer. Then set all the attributes to the desired value,
> > and they will get written out together when the timer expires.

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: Set a node attribute for multiple nodes with one command

2016-11-22 Thread Ken Gaillot
On 11/22/2016 04:39 AM, Kostiantyn Ponomarenko wrote:
> Using "shadow cib" in crmsh looks like a good idea, but it doesn't work
> with node attributes set into "status" section of Pacemaker config.
> I wonder it it is possible to make it work that way.
> 
> Ken,
>>> start dampening timer
> Could you please elaborate more on this. I don't get how I can set this
> timer.
> Do I need to set this timer for each node?

Dampening is per attribute, so it applies to all nodes. You can set it
when you first create the attribute:

  attrd_updater -n $NAME --update $VALUE --delay $SECONDS

With dampening (delay), the attribute daemon will wait that long between
writes to the CIB. The goal is to reduce I/O activity for frequently
changing attributes, but it could also be handy here.


The --delay will be ignored if the above command is run after the
attribute already exists. You can change it for an already existing
attribute with

  attrd_updater -n $NAME --update-delay --delay $SECONDS

or

  attrd_updater -n $NAME --update-both $VALUE --delay $SECONDS


It's intentionally more trouble to set it on an already-created
attribute, because repeatedly changing the delay will make it useless
(each delay change requires an immediate write). Having a separate
command makes it less likely to happen by accident.
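
As a concrete sketch of the two-step trick from the earlier reply quoted
below (the attribute name, values and node names here are made up, and -N
needs a reasonably recent attrd_updater; otherwise run each command locally
on the node in question):

  # first write: creates the attribute with a 10s delay, using a placeholder
  # value that the location rule does NOT match
  attrd_updater -n allow-rsc -U staging -d 10 -N node1
  attrd_updater -n allow-rsc -U staging -N node2

  # second write: set the value the rule actually checks for; both changes
  # get flushed to the CIB together when the dampening timer expires
  attrd_updater -n allow-rsc -U true -N node1
  attrd_updater -n allow-rsc -U true -N node2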

> 
> 
> Thank you,
> Kostia
> 
> On Mon, Nov 21, 2016 at 9:30 AM, Ulrich Windl
> <ulrich.wi...@rz.uni-regensburg.de> wrote:
> 
> >>> Ken Gaillot <kgail...@redhat.com> wrote on 18.11.2016 at 16:17 in
> message:
> > On 11/18/2016 08:55 AM, Kostiantyn Ponomarenko wrote:
> >> Hi folks,
> >>
> >> Is there a way to set a node attribute in the "status" section for a few
> >> nodes at the same time?
> >>
> >> In my case there is a node attribute which allows some resources to
> >> start in the cluster if it is set.
> >> If I set this node attribute for, say, two nodes one after the other,
> >> then these resources are not distributed equally between these two
> >> nodes. That's because Pacemaker picks the first node on which this
> >> attribute is set and immediately starts all allowed resources on it. And
> >> this is not the behavior I would like to get.
> >>
> >> Thank you,
> >> Kostia
> >
> > Not that I know of, but it would be a good feature to add to
> > attrd_updater and/or crm_attribute.
> 
> With crm (shell) you don't have transactions for node attributes,
> but for the configuration. So if you add a location restriction
> preventing any resources on your nodes, then enable the nodes, and
> then delete the location restrictions in one transaction, you might
> get what you want. It's not elegant, but it will do.
> 
> To the crm shell maintainer: Is it difficult to build transactions
> for node status changes? The problem I see is this: For configuration
> you always have transactions (requiring "commit"), but for nodes you
> traditionally have none (effects are immediate). So you'd need a
> thing like "start transaction" which requires a "commit" or some
> kind of abort later.
> 
> I also don't know whether a "shadow CIB" would help for the original
> problem.
> 
> Ulrich
> 
> >
> > You can probably hack it with a dampening value of a few seconds. If
> > your rule checks for a particular value of the attribute, set all the
> > nodes to a different value first, which will write that value and
> > start the dampening timer. Then set all the attributes to the desired value,
> > and they will get written out together when the timer expires.

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Reliable check for "is starting" state of a resource

2016-11-22 Thread Ken Gaillot
On 11/22/2016 10:53 AM, Kostiantyn Ponomarenko wrote:
> Hi folks,
> 
> I am looking for a good way of checking if a resource is in "starting"
> state.
> The thing is - I need to issue a command and I don't want to issue that
> command when this particular resource is starting. This resource start
> can take up to a few minutes.
> As a note, I am OK with issuing that command if a resource is stopped
> (before it starts).
> 
> The best I can think of is to check for an ongoing state transition of
> a resource in a loop with the "crm_simulate -Ls" command.
> But with this approach I need to put the command into a loop, cut the
> "Transition Summary" part, and then grep for the needed resource.
> And even then I wonder whether this approach is reliable.
> 
> Maybe there is a better way of achieving the same result.
> 
> Thank you,
> Kostia

Probably the cleanest way is to set record-pending=true (either as an
operation default, or just on the start operation you're interested in).
Then your command (or a wrapper) could check the CIB for a pending start.

A simpler approach would be "crm_resource --wait", which blocks until
the cluster is idle. The downside is you might wait for other actions
that you don't care about.
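
For example (a sketch -- "big_rsc" is a placeholder resource name, and the
exact pending-state output varies between Pacemaker versions):

  # record pending actions in the CIB, e.g. as an operation default via crmsh
  crm configure op_defaults record-pending=true

  # a start that is still in flight then shows up as pending in the status;
  # check for it before issuing your command
  crm_mon -1 --pending | grep -q 'big_rsc.*Starting' && echo "start in progress"

  # or the simpler, broader alternative: block until the cluster is idle
  crm_resource --wait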

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Locate resource with functioning member of clone set?

2016-11-22 Thread Israel Brewster
On Nov 17, 2016, at 4:04 PM, Ken Gaillot  wrote:
> 
> On 11/17/2016 11:37 AM, Israel Brewster wrote:
>> I have a resource that is set up as a clone set across my cluster,
>> partly for pseudo-load balancing (If someone wants to perform an action
>> that will take a lot of resources, I can have them do it on a different
>> node than the primary one), but also simply because the resource can
>> take several seconds to start, and by having it already running as a
>> clone set, I can failover in the time it takes to move an IP resource -
>> essentially zero down time.
>> 
>> This is all well and good, but I ran into a problem the other day where
>> the process on one of the nodes stopped working properly. Pacemaker
>> caught the issue, and tried to fix it by restarting the resource, but
>> was unable to because the old instance hadn't actually exited completely
>> and was still tying up the TCP port, thereby preventing the new instance
>> that pacemaker launched from being able to start.
>> 
>> So this leaves me with two questions: 
>> 
>> 1) is there a way to set up a "kill script", such that before trying to
>> launch a new copy of a process, pacemaker will run this script, which
>> would be responsible for making sure that there are no other instances
>> of the process running?
> 
> Sure, it's called a resource agent :)
> 
> When recovering a failed resource, Pacemaker will call the resource
> agent's stop action first, then start. The stop should make sure the
> service has exited completely. If it doesn't, the agent should be fixed
> to do so.

Ah, gotcha. I wasn't thinking along those lines in this case because the 
resource in question doesn't have a dedicated resource agent - it's a basic 
system service type resource. So then the proper approach would be to modify 
the init.d script such that when "stop" is called, it makes sure to completely 
clean up any associated processes - even if the PID file disappears or gets 
changed.
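
For example, a more thorough stop path along these lines might do it (a
sketch only -- the service name, PID file and port are placeholders):

  # assumes /etc/init.d/functions is sourced for killproc
  stop() {
      # normal PID-file based shutdown first
      killproc -p /var/run/myservice.pid myservice
      # then clean up anything left over, even if the PID file was stale
      pkill -f /usr/sbin/myservice 2>/dev/null
      # and wait until the TCP port is actually released
      while fuser -n tcp 8080 >/dev/null 2>&1; do
          sleep 1
      done
      return 0
  }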

> 
>> 2) Even in the above situation, where pacemaker couldn't launch a good
>> copy of the resource on the one node, the situation could have been
>> easily "resolved" by pacemaker moving the virtual IP resource to another
>> node where the cloned resource was running correctly, and notifying me
>> of the problem. I know how to make colocation constraints in general,
>> but how do I do a colocation constraint with a cloned resource where I
>> just need the virtual IP running on *any* node where there clone is
>> working properly? Or is it the same as any other colocation resource,
>> and pacemaker is simply smart enough to both try to restart the failed
>> resource and move the virtual IP resource at the same time?
> 
> Correct, a simple colocation constraint of "resource R with clone C"
> will make sure R runs with a working instance of C.
> 
> There is a catch: if *any* instance of C restarts, R will also restart
> (even if it stays in the same place), because it depends on the clone as
> a whole. Also, in the case you described, pacemaker would first try to
> restart both C and R on the same node, rather than move R to another
> node (although you could set on-fail=stop on C to force R to move).

It *looked* like Pacemaker was continually trying to restart the cloned 
resource in this case - I think the issue is that from Pacemaker's 
perspective the service *did* start successfully; it just failed again moments 
later (when it tried to bind to the port and, being unable to, bailed out). So 
under the "default" configuration, Pacemaker would try restarting the service 
for quite a while before marking it as failed on that node. As such, it sounds 
like under the current configuration, the IP resource would never move (at 
least not in a reasonable time frame), as Pacemaker would simply continue to 
try restarting on the same node.

So to get around this, I'm thinking I could set the migration-threshold 
property on the clustered resource to something low, like two or three, perhaps 
combined with a failure-timeout so occasional successful restarts won't prevent 
the service from running on a node - only if it can't restart and stay running. 
Does that sound right?
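
Something like this is what I have in mind (the clone name and values are 
placeholders; crmsh has an equivalent "meta" command if you're not using pcs):

  # fail over after 2 consecutive failures, and forget old failures after 10 minutes
  pcs resource meta myservice-clone migration-threshold=2 failure-timeout=10min
  pcs resource show myservice-clone   # verify the meta attributes took effect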

> 
> If that's not sufficient, you could try some magic with node attributes
> and rules. The new ocf:pacemaker:attribute resource in 1.1.16 could help
> there.

Unfortunately, as I am running CentOS 6.8, the newest version available to me 
is 1.1.14. I haven't yet developed an implementation plan for moving to CentOS 
7, so unless I build from source or someone has made packages of a later 
release available for CentOS 6, I'm stuck at the moment. That said, between 
this and the alerts mentioned below, it might be worth spending more time 
looking into upgrading.

Thanks for the info!

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---

> 
>> As an addendum

Re: [ClusterLabs] iSCSI on ZFS on DRBD

2016-11-22 Thread Jason A Ramsey
The way that Pacemaker interacts with services is using resource agents. These 
resource agents are bash scripts that you can modify to your heart’s content to 
do the things you want to do. Having worked with the ocf:heartbeat:iSCSITarget 
and ocf:heartbeat:iSCSILogicalUnit quite a lot in the last several months, I 
can tell you that they only support iet, tgt, lio, and lio-t implementations of 
the standard out of the box. I’m sure you could make modifications to them (I 
have to support very specific use cases on my NAS cluster, and I’m definitely 
not a code monkey) as needed. Just take a peek at the resource agent files 
relevant to what you’re doing and go from there (on my systems they are located 
at /usr/lib/ocf/resource.d). Good luck!
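
For example, to see what the stock agents support before modifying them (the 
paths are the usual defaults and may differ on your distribution):

  # where the shipped agents live
  ls /usr/lib/ocf/resource.d/heartbeat/ | grep -i iscsi

  # which target implementations the LogicalUnit agent knows about
  grep -n 'iet\|tgt\|lio' /usr/lib/ocf/resource.d/heartbeat/iSCSILogicalUnit | head

  # full parameter list as the cluster sees it
  OCF_ROOT=/usr/lib/ocf /usr/lib/ocf/resource.d/heartbeat/iSCSILogicalUnit meta-data | less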

--

[ jR ]

  there is no path to greatness; greatness is the path

From: Mark Adams 
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed 

Date: Tuesday, November 22, 2016 at 11:59 AM
To: "Users@clusterlabs.org" 
Subject: [ClusterLabs] iSCSI on ZFS on DRBD

Hi All,

Looking for some opinions on this, if anyone has any. I'm looking at this 
solution to be for proxmox vm nodes using zfsoniscsi.

Just as background for people that haven't looked at proxmox before: it logs on 
to the iscsi server via ssh and creates a zfs dataset, then adds iscsi config to 
/etc/ietd.conf so that dataset is available as a LUN. This works fine when 
you've got a single iscsi host, but I haven't figured out a way to use it with 
pacemaker/corosync.

Is there any way to have ISCSILogicalUnit read its LUNs from a config file 
instead of specifying each one in the cluster config? or is there any other 
resource agents that might be more suitable for this job? I could write my own 
"watcher" script I guess, but does anyone think this is a dangerous idea?

Is the only sensible thing really to make proxmox zfsonlinux pacemaker/corosync 
"aware" so that it's scripts can create the luns through pcs instead of adding 
the config to ietd.conf?

Is anyone using zfs/iscsi/drbd in some other configuration and had success?

Looking forward to all ideas!

Regards,
Mark
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] OS Patching Process

2016-11-22 Thread Jason A Ramsey
I’ve done the opposite:

lvm on top of drbd -> iscsi lun

but I’m not trying to resize anything. I just want to patch the OS of the nodes 
and reboot them in sequence without breaking things (and, preferably, without 
taking the cluster offline).
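
Concretely, the per-node sequence I have in mind is something like this (a 
sketch only -- node names are placeholders, it assumes fencing and quorum are 
set up so the surviving node keeps the services running, and whether it is 
actually safe is exactly my question):

  pcs cluster standby node-a    # drain resources; DRBD should demote to Secondary here
  pcs status                    # confirm everything is now running on the peer
  drbd-overview                 # confirm this node is Secondary and still UpToDate
  yum update -y && reboot
  # after the node is back and DRBD shows UpToDate/UpToDate again:
  pcs cluster unstandby node-a
  # then repeat the same steps on the other node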

--
 
[ jR ]

  there is no path to greatness; greatness is the path

On 11/22/16, 11:47 AM, "emmanuel segura"  wrote:

I have been using this layout: iscsi_disks -> lvm volume ->
drbd_on_top_of_lvm -> filesystem

To resize: first add one iSCSI device to every cluster node, then add that
device to the volume group on every cluster node, then resize the logical
volume on every cluster node. At that point every cluster node has the same
logical volume size, and you can resize DRBD and the filesystem on the
active node.

2016-11-22 17:35 GMT+01:00 Jason A Ramsey :
> Can anyone recommend a bulletproof process for OS patching a pacemaker
> cluster that manages a drbd mirror (with LVM on top of the drbd and luns
> defined for an iscsi target cluster if that matters)? Any time I’ve tried to
> mess with the cluster, it seems like I manage to corrupt my drbd filesystem,
> and now that I have actual data on the thing, that’s kind of a scary
> proposition. Thanks in advance!
>
>
>
> --
>
>
>
> [ jR ]
>
>
>
>   there is no path to greatness; greatness is the path
>
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>



-- 
  .~.
  /V\
 //  \\
/(   )\
^`~'^

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] iSCSI on ZFS on DRBD

2016-11-22 Thread Mark Adams
Just to clarify, I'm referring to "ZFS over iSCSI" in regards to proxmox.

https://pve.proxmox.com/wiki/Storage:_ZFS_over_iSCSI

Regards,
Mark

On 22 November 2016 at 16:59, Mark Adams  wrote:

> Hi All,
>
> Looking for some opinions on this, if anyone has any. I'm looking at this
> solution to be for proxmox vm nodes using zfsoniscsi.
>
> Just as background for people that haven't looked at proxmox before: it
> logs on to the iscsi server via ssh and creates a zfs dataset then adds
> iscsi config to /etc/ietd.conf so that dataset is available as a LUN. This
> works fine when you've got a single iscsi host, but I haven't figured out a
> way to use it with pacemaker/corosync.
>
> Is there any way to have ISCSILogicalUnit read its LUNs from a config
> file instead of specifying each one in the cluster config? or is there any
> other resource agents that might be more suitable for this job? I could
> write my own "watcher" script I guess, but does anyone think this is a
> dangerous idea?
>
> Is the only sensible thing really to make proxmox zfsonlinux
> pacemaker/corosync "aware" so that it's scripts can create the luns through
> pcs instead of adding the config to ietd.conf?
>
> Is anyone using zfs/iscsi/drbd in some other configuration and had success?
>
> Looking forward to all ideas!
>
> Regards,
> Mark
>



-- 
Mark Adams
Director
--
Open Virtualisation Solutions Ltd.
Registered in England and Wales number: 07709887
Office Address: 274 Verdant Lane, London, SE6 1TW
Office: +44 (0)333 355 0160
Mobile: +44 (0)750 800 1289
Site: http://www.openvs.co.uk
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] iSCSI on ZFS on DRBD

2016-11-22 Thread Mark Adams
Hi All,

Looking for some opinions on this, if anyone has any. I'm looking at this
solution to be for proxmox vm nodes using zfsoniscsi.

Just as background for people that haven't looked at proxmox before: it logs
on to the iscsi server via ssh and creates a zfs dataset, then adds iscsi
config to /etc/ietd.conf so that dataset is available as a LUN. This works
fine when you've got a single iscsi host, but I haven't figured out a way
to use it with pacemaker/corosync.

Is there any way to have ISCSILogicalUnit read its LUNs from a config file
instead of specifying each one in the cluster config? or is there any other
resource agents that might be more suitable for this job? I could write my
own "watcher" script I guess, but does anyone think this is a dangerous
idea?

Is the only sensible thing really to make proxmox zfsonlinux
pacemaker/corosync "aware" so that it's scripts can create the luns through
pcs instead of adding the config to ietd.conf?

Is anyone using zfs/iscsi/drbd in some other configuration and had success?

Looking forward to all ideas!

Regards,
Mark
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Reliable check for "is starting" state of a resource

2016-11-22 Thread Kostiantyn Ponomarenko
Hi folks,

I am looking for a good way of checking if a resource is in "starting"
state.
The thing is - I need to issue a command and I don't want to issue that
command when this particular resource is starting. This resource start can
take up to a few minutes.
As a note, I am OK with issuing that command if a resource is stopped
(before it starts).

The best I can think of is to check for an ongoing state transition of a
resource in a loop with the "crm_simulate -Ls" command.
But with this approach I need to put the command into a loop, cut the
"Transition Summary" part, and then grep for the needed resource.
And even then I wonder whether this approach is reliable.

Maybe there is a better way of achieving the same result.

Thank you,
Kostia
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] OS Patching Process

2016-11-22 Thread emmanuel segura
I have been using this layout: iscsi_disks -> lvm volume ->
drbd_on_top_of_lvm -> filesystem

To resize: first add one iSCSI device to every cluster node, then add that
device to the volume group on every cluster node, then resize the logical
volume on every cluster node. At that point every cluster node has the same
logical volume size, and you can resize DRBD and the filesystem on the
active node.
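
As a concrete sketch of that sequence (the device, VG/LV and DRBD resource
names are examples only):

  # on EVERY cluster node, after the new iSCSI disk (/dev/sdX) is visible:
  pvcreate /dev/sdX
  vgextend vg_data /dev/sdX
  lvextend -L +100G /dev/vg_data/lv_drbd

  # on the ACTIVE (Primary) node only, once all backing devices have grown:
  drbdadm resize r0
  resize2fs /dev/drbd0    # or the grow tool for your filesystem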

2016-11-22 17:35 GMT+01:00 Jason A Ramsey :
> Can anyone recommend a bulletproof process for OS patching a pacemaker
> cluster that manages a drbd mirror (with LVM on top of the drbd and luns
> defined for an iscsi target cluster if that matters)? Any time I’ve tried to
> mess with the cluster, it seems like I manage to corrupt my drbd filesystem,
> and now that I have actual data on the thing, that’s kind of a scary
> proposition. Thanks in advance!
>
>
>
> --
>
>
>
> [ jR ]
>
>
>
>   there is no path to greatness; greatness is the path
>
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>



-- 
  .~.
  /V\
 //  \\
/(   )\
^`~'^

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] OS Patching Process

2016-11-22 Thread Dmitri Maziuk

On 2016-11-22 10:35, Jason A Ramsey wrote:

Can anyone recommend a bulletproof process for OS patching a pacemaker
cluster that manages a drbd mirror (with LVM on top of the drbd and luns
defined for an iscsi target cluster if that matters)? Any time I’ve
tried to mess with the cluster, it seems like I manage to corrupt my
drbd filesystem, and now that I have actual data on the thing, that’s
kind of a scary proposition. Thanks in advance!


+1

I managed to cleanly standby/unstandby mine a few times initially -- 
otherwise I wouldn't have put it in production -- but on the last 
several reboots the DRBD filesystem just wouldn't unmount. It never corrupted 
anything, but it's still a serious PITA. Especially with a couple of 
haresources pairs right next to it switching over perfectly every time.

To add insult to injury, the RA starts spewing "can't unmount, somebody's 
holding it open" messages to the console at such a rate that it is impossible 
to log in and try lsof, fuser, or anything.


Dima


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] OS Patching Process

2016-11-22 Thread Jason A Ramsey
Can anyone recommend a bulletproof process for OS patching a pacemaker cluster 
that manages a drbd mirror (with LVM on top of the drbd and luns defined for an 
iscsi target cluster if that matters)? Any time I’ve tried to mess with the 
cluster, it seems like I manage to corrupt my drbd filesystem, and now that I 
have actual data on the thing, that’s kind of a scary proposition. Thanks in 
advance!

--

[ jR ]

  there is no path to greatness; greatness is the path
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] DRBD Insufficient Privileges Error

2016-11-22 Thread Jason A Ramsey
Did you install the drbd-pacemaker package? That’s the package that contains 
the resource agent.
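
A quick way to check on node-2 (a sketch; the path assumes the default OCF root):

  # is the agent installed where Pacemaker expects it?
  ls -l /usr/lib/ocf/resource.d/linbit/drbd

  # can it produce metadata? (that is what the "Failed to retrieve meta-data" error is about)
  OCF_ROOT=/usr/lib/ocf /usr/lib/ocf/resource.d/linbit/drbd meta-data | head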

--

[ jR ]

  there is no path to greatness; greatness is the path

From: Jasim Alam 
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed 

Date: Sunday, November 20, 2016 at 2:58 PM
To: "Users@clusterlabs.org" 
Subject: [ClusterLabs] DRBD Insufficient Privileges Error

Hi,

I am trying to set up a two-node H/A cluster with DRBD. Following is my 
configuration:

[root@node-1 ~]# pcs config
Cluster Name: Cluster-1
Corosync Nodes:
node-1 node-2
Pacemaker Nodes:
node-1 node-2

Resources:
 Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=103.9.185.211 cidr_netmask=32
  Operations: start interval=0s timeout=20s (vip-start-interval-0s)
  stop interval=0s timeout=20s (vip-stop-interval-0s)
  monitor interval=30s (vip-monitor-interval-30s)
Resource: apache (class=ocf provider=heartbeat type=apache)
  Attributes: configfile=/etc/httpd/conf/httpd.conf 
statusurl=http://localhost/server-status
  Operations: start interval=0s timeout=40s (apache-start-interval-0s)
  stop interval=0s timeout=60s (apache-stop-interval-0s)
  monitor interval=1min (apache-monitor-interval-1min)
Master: StorageClone
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 
notify=true
  Resource: storage (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=drbd0
   Operations: start interval=0s timeout=240 (storage-start-interval-0s)
   promote interval=0s timeout=90 (storage-promote-interval-0s)
   demote interval=0s timeout=90 (storage-demote-interval-0s)
   stop interval=0s timeout=100 (storage-stop-interval-0s)
   monitor interval=60s (storage-monitor-interval-60s)

Stonith Devices:
Fencing Levels:

Location Constraints:
  Resource: apache
Enabled on: node-1 (score:50) (id:location-apache-node-1-50)
Ordering Constraints:
  start vip then start apache (kind:Mandatory) (id:order-vip-apache-mandatory)
Colocation Constraints:
  vip with apache (score:INFINITY) (id:colocation-vip-apache-INFINITY)

Resources Defaults:
No defaults set
Operations Defaults:
No defaults set

Cluster Properties:
cluster-infrastructure: corosync
cluster-name: Cluster-1
dc-version: 1.1.13-10.el7_2.4-44eb2dd
have-watchdog: false
no-quorum-policy: ignore
stonith-enabled: false

The problem is I am getting insufficient privilege error on second node

[root@node-1 ~]# pcs status
Cluster name: PSD-1
Last updated: Mon Nov 21 01:44:52 2016  Last change: Mon Nov 21 
01:19:17 2016 by root via cibadmin on node-1
Stack: corosync
Current DC: node-1 (version 1.1.13-10.el7_2.4-44eb2dd) - partition with quorum
2 nodes and 4 resources configured

Online: [ node-1 node-2 ]

Full list of resources:

vip(ocf::heartbeat:IPaddr2):   Started node-1
apache (ocf::heartbeat:apache):Started node-1
Master/Slave Set: StorageClone [storage]
 storage(ocf::linbit:drbd): FAILED node-2 (unmanaged)
 Masters: [ node-1 ]

Failed Actions:
* storage_stop_0 on node-2 'insufficient privileges' (4): call=16, 
status=complete, exitreason='none',
last-rc-change='Mon Nov 21 01:19:17 2016', queued=0ms, exec=2ms


PCSD Status:
  node-1: Online
  node-2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

but DRBD seems okay for both nodes

  [root@node-1 ~]# drbd-overview
 0:drbd0/0  Connected Primary/Secondary UpToDate/UpToDate
 [root@node-2 ~]# drbd-overview
 0:drbd0/0  Connected Secondary/Primary UpToDate/UpToDate

Log of node2

[root@node-2 ~]# tail -n 10 /var/log/messages
Nov 21 01:19:17 node-2 crmd[4060]:  notice: State transition S_NOT_DC -> 
S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
Nov 21 01:19:17 node-2 crmd[4060]:  notice: State transition S_PENDING -> 
S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond 
]
Nov 21 01:19:17 node-2 crmd[4060]:   error: Failed to retrieve meta-data for 
ocf:linbit:drbd
Nov 21 01:19:17 node-2 crmd[4060]: warning: No metadata found for 
drbd::ocf:linbit: Input/output error (-5)
Nov 21 01:19:17 node-2 crmd[4060]:   error: No metadata for linbit::ocf:drbd
Nov 21 01:19:17 node-2 crmd[4060]:  notice: Operation storage_monitor_0: 
insufficient privileges (node=node-2, call=14, rc=4, cib-update=17, 
confirmed=true)
Nov 21 01:19:17 node-2 crmd[4060]:  notice: Operation storage_notify_0: ok 
(node=node-2, call=15, rc=0, cib-update=0, confirmed=true)
Nov 21 01:19:17 node-2 crmd[4060]:  notice: Operation storage_stop_0: 
insufficient privileges (node=node-2, call=16, rc=4, cib-update=18, 
confirmed=true)
Nov 21 01:20:31 node-2 systemd-logind: Removed session 3.
Nov 21 01:22:58 node-2 systemd-logind: Removed session 2.

Would appreciate any way out of this.

Thanks,
Jasim
___
Users mailing list: Users@cl

Re: [ClusterLabs] Antw: Re: Set a node attribute for multiple nodes with one command

2016-11-22 Thread Kostiantyn Ponomarenko
Using "shadow cib" in crmsh looks like a good idea, but it doesn't work
with node attributes set into "status" section of Pacemaker config.
I wonder it it is possible to make it work that way.

Ken,
>> start dampening timer
Could you please elaborate more on this. I don't get how I can set this
timer.
Do I need to set this timer for each node?


Thank you,
Kostia

On Mon, Nov 21, 2016 at 9:30 AM, Ulrich Windl <
ulrich.wi...@rz.uni-regensburg.de> wrote:

> >>> Ken Gaillot wrote on 18.11.2016 at 16:17 in
> message:
> > On 11/18/2016 08:55 AM, Kostiantyn Ponomarenko wrote:
> >> Hi folks,
> >>
> > >> Is there a way to set a node attribute in the "status" section for a few
> > >> nodes at the same time?
> > >>
> > >> In my case there is a node attribute which allows some resources to
> > >> start in the cluster if it is set.
> > >> If I set this node attribute for, say, two nodes one after the other,
> > >> then these resources are not distributed equally between these two
> > >> nodes. That's because Pacemaker picks the first node on which this
> > >> attribute is set and immediately starts all allowed resources on it. And
> > >> this is not the behavior I would like to get.
> >>
> >> Thank you,
> >> Kostia
> >
> > Not that I know of, but it would be a good feature to add to
> > attrd_updater and/or crm_attribute.
>
> With crm (shell) you don't have transactions for node attributes, but for
> the configuration. So if you add a location restriction preventing any
> resources on your nodes, then enable the nodes, and then delete the
> location restrictions in one transaction, you might get what you want. It's
> not elegant, but it will do.
>
> To the crm shell maintainer: Is it difficult to build transactions for node
> status changes? The problem I see is this: For configuration you always
> have transactions (requiring "commit"), but for nodes you traditionally have
> none (effects are immediate). So you'd need a thing like "start transaction"
> which requires a "commit" or some kind of abort later.
>
> I also don't know whether a "shadow CIB" would help for the original
> problem.
>
> Ulrich
>
> >
> > You can probably hack it with a dampening value of a few seconds. If
> > your rule checks for a particular value of the attribute, set all the
> > nodes to a different value first, which will write that value and start
> > the dampening timer. Then set all the attributes to the desired value,
> > and they will get written out together when the timer expires.
> >
> > ___
> > Users mailing list: Users@clusterlabs.org
> > http://clusterlabs.org/mailman/listinfo/users
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
>
>
>
>
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org