Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: Unexpected Resource movement after failover

2016-10-17 Thread Nikhil Utane
Yes Ulrich, somehow I missed following up on that. I will do both: configure stickiness to INFINITY and use utilization attributes. This should probably take care of it. Thanks, Nikhil. On Tue, Oct 18, 2016 at 11:45 AM, Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote: > >>> Nikhil Utane

[ClusterLabs] Antw: Re: Antw: Re: Can't do anything right; how do I start over?

2016-10-17 Thread Ulrich Windl
>>> Dmitri Maziuk wrote on 17.10.2016 at 16:51 in message <0d370bc7-c74e-4250-5848-b9c7de941...@gmail.com>: > On 2016-10-17 02:12, Ulrich Windl wrote: > >> Have you tried a proper variant of "lsof" before? So maybe you know > which process might block the device. I also think if you have LVM

[ClusterLabs] Antw: Re: Antw: Re: Antw: Unexpected Resource movement after failover

2016-10-17 Thread Ulrich Windl
>>> Nikhil Utane wrote on 17.10.2016 at 16:46 in message: > This is driving me insane. Why don't you try the utilization approach? > > This is how the resources were started. Redund_CU1_WB30 was the DC which I > rebooted. > cu_4 (ocf::redundancy:RedundancyRA): Started Redund_CU1_WB30 >

Re: [ClusterLabs] Antw: Re: Antw: Unexpected Resource movement after failover

2016-10-17 Thread Nikhil Utane
Thanks Ken. I will give it a shot. http://oss.clusterlabs.org/pipermail/pacemaker/2011-August/011271.html On this thread, if I interpret it correctly, his problem was solved when he swapped the anti-location constraint From (mapping to my example) cu_2 with cu_4 (score:-INFINITY) cu_3 with cu_4

Re: [ClusterLabs] set start-failure-is-fatal per resource?

2016-10-17 Thread Ken Gaillot
On 10/17/2016 12:42 PM, Israel Brewster wrote: > I have one resource agent (redis, to be exact) that sometimes apparently > fails to start on the first attempt. In every case, simply running a > 'pcs resource cleanup' such that pacemaker tries to start it again > successfully starts the process. No

Re: [ClusterLabs] Antw: Re: Antw: Unexpected Resource movement after failover

2016-10-17 Thread Ken Gaillot
On 10/17/2016 09:55 AM, Nikhil Utane wrote: > I see these prints. > > pengine: info: rsc_merge_weights:cu_4: Rolling back scores from cu_3 > pengine:debug: native_assign_node:Assigning Redun_CU4_Wb30 to cu_4 > pengine: info: rsc_merge_weights:cu_3: Rolling back scores from cu_2

[ClusterLabs] set start-failure-is-fatal per resource?

2016-10-17 Thread Israel Brewster
I have one resource agent (redis, to be exact) that sometimes apparently fails to start on the first attempt. In every case, simply running a 'pcs resource cleanup' such that pacemaker tries to start it again successfully starts the process. Now, obviously, the proper thing to do is to figure out w

Re: [ClusterLabs] Antw: Re: Antw: Unexpected Resource movement after failover

2016-10-17 Thread Nikhil Utane
I see these prints. pengine: info: rsc_merge_weights: cu_4: Rolling back scores from cu_3 pengine:debug: native_assign_node: Assigning Redun_CU4_Wb30 to cu_4 pengine: info: rsc_merge_weights: cu_3: Rolling back scores from cu_2 pengine:debug: native_assign_node: Assigning Redund_CU

Re: [ClusterLabs] Antw: Re: Can't do anything right; how do I start over?

2016-10-17 Thread Dmitri Maziuk
On 2016-10-17 02:12, Ulrich Windl wrote: Have you tried a proper variant of "lsof" before? So maybe you know which process might block the device. I also think if you have LVM on top of DRBD, you must deactivate the VG before trying to unmount. No LVM here: AFAIMC these days it's another solut

Re: [ClusterLabs] Antw: Re: Antw: Unexpected Resource movement after failover

2016-10-17 Thread Nikhil Utane
This is driving me insane. This is how the resources were started. Redund_CU1_WB30 was the DC which I rebooted. cu_4 (ocf::redundancy:RedundancyRA): Started Redund_CU1_WB30 cu_2 (ocf::redundancy:RedundancyRA): Started Redund_CU5_WB30 cu_3 (ocf::redundancy:RedundancyRA): Started Redun_CU4_Wb30

[ClusterLabs] Antw: Re: Can't do anything right; how do I start over?

2016-10-17 Thread Ulrich Windl
>>> Dimitri Maziuk wrote on 15.10.2016 at 23:13 in message <750d030a-ae3b-2c91-4275-8695c1a4c...@bmrb.wisc.edu>: > On 10/15/2016 12:27 PM, Dmitri Maziuk wrote: >> On 2016-10-15 01:56, Jay Scott wrote: >> >>> So, what's wrong? (I'm a newbie, of course.) >> >> Here's what worked for me on cen