On 10/07/2013, at 3:37 PM, Vladislav Bogdanov <[email protected]> wrote:
> 10.07.2013 08:13, Andrew Beekhof wrote:
>>
>> On 10/07/2013, at 2:15 PM, Vladislav Bogdanov <[email protected]> wrote:
>>
>>> 10.07.2013 07:05, Andrew Beekhof wrote:
>>>>
>>>> On 10/07/2013, at 2:04 PM, Vladislav Bogdanov <[email protected]> wrote:
>>>>
>>>>> 10.07.2013 03:39, Andrew Beekhof wrote:
>>>>>>
>>>>>> On 10/07/2013, at 1:51 AM, Vladislav Bogdanov <[email protected]> wrote:
>>>>>>
>>>>>>> 03.07.2013 19:31, Dejan Muhamedagic wrote:
>>>>>>>> On Tue, Jul 02, 2013 at 07:53:52AM +0300, Vladislav Bogdanov wrote:
>>>>>>>>> 01.07.2013 18:29, Dejan Muhamedagic wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 01, 2013 at 05:29:31PM +0300, Vladislav Bogdanov wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I'm trying to look if it is now safe to delete non-running nodes
>>>>>>>>>>> (corosync 2.3, pacemaker HEAD, crmsh tip).
>>>>>>>>>>>
>>>>>>>>>>> # crm node delete v02-d
>>>>>>>>>>> WARNING: 2: crm_node bad format: 7 v02-c
>>>>>>>>>>> WARNING: 2: crm_node bad format: 8 v02-d
>>>>>>>>>>> WARNING: 2: crm_node bad format: 5 v02-a
>>>>>>>>>>> WARNING: 2: crm_node bad format: 6 v02-b
>>>>>>>>>>> INFO: 2: node v02-d not found by crm_node
>>>>>>>>>>> INFO: 2: node v02-d deleted
>>>>>>>>>>> #
>>>>>>>>>>>
>>>>>>>>>>> So, I expect that crmsh still doesn't follow latest changes to
>>>>>>>>>>> 'crm_node -l'. Although node seems to be deleted correctly.
>>>>>>>>>>>
>>>>>>>>>>> For reference, output of crm_node -l is:
>>>>>>>>>>> 7 v02-c
>>>>>>>>>>> 8 v02-d
>>>>>>>>>>> 5 v02-a
>>>>>>>>>>> 6 v02-b
>>>>>>>>>>
>>>>>>>>>> This time the node state was empty. Or it's missing altogether.
>>>>>>>>>> I'm not sure how's that supposed to be interpreted. We test the
>>>>>>>>>> output of crm_node -l just to make sure that the node is not
>>>>>>>>>> online. Perhaps we need to use some other command.
>>>>>>>>>
>>>>>>>>> Likely it shows everything from a corosync nodelist.
>>>>>>>>> After I deleted the node from everywhere except corosync, list is
>>>>>>>>> still the same.
>>>>>>>>
>>>>>>>> OK. This patch changes the interface to crm_node to use the
>>>>>>>> "list partition" option (-p). Could you please test it?
>>>>>>>
>>>>>>> Nope. Not enough. Even worse than before. I tested today's tip as it
>>>>>>> includes that patch with merge of Andrew's public and private master
>>>>>>> heads.
>>>>>>> =========
>>>>>>> [root@v02-b ~]# crm node show
>>>>>>> v02-a(5): normal
>>>>>>>     standby: off
>>>>>>>     virtualization: true
>>>>>>>     $id: nodes-5
>>>>>>> v02-b(6): normal
>>>>>>>     standby: off
>>>>>>>     virtualization: true
>>>>>>> v02-c(7): normal
>>>>>>>     standby: off
>>>>>>>     virtualization: true
>>>>>>> v02-d(8): normal(offline)
>>>>>>>     standby: off
>>>>>>>     virtualization: true
>>>>>>> [root@v02-b ~]# crm node delete v02-d
>>>>>>> ERROR: according to crm_node, node v02-d is still active
>>>>>>> [root@v02-b ~]# crm_node -p
>>>>>>> v02-c v02-d v02-a v02-b
>>>>>>> [root@v02-b ~]# crm_node -l
>>>>>>> 7 v02-c
>>>>>>> 8 v02-d
>>>>>>> 5 v02-a
>>>>>>> 6 v02-b
>>>>>>> [root@v02-b ~]#
>>>>>>> =========
>>>>>>>
>>>>>>> That is after I stopped node, lowered votequorum expected_votes (with
>>>>>>> corosync-quorumtool) and deleted v02-d from a cmap nodelist.
>>>>>>>
>>>>>>> corosync-cmapctl still shows runtime info about deleted node as well:
>>>>>>> runtime.totem.pg.mrp.srp.members.8.config_version (u64) = 0
>>>>>>> runtime.totem.pg.mrp.srp.members.8.ip (str) = r(0) ip(10.5.4.55)
>>>>>>> runtime.totem.pg.mrp.srp.members.8.join_count (u32) = 1
>>>>>>> runtime.totem.pg.mrp.srp.members.8.status (str) = left
>>>>>>> And it is not allowed to delete those keys.
>>>>>>>
>>>>>>> crm_node -R did the job (nothing left in the CIB), but v02-d still
>>>>>>> appears in its output for both -p and -l.
>>>>>>>
>>>>>>> Andrew, I copy you directly because above is probably to you.
>>>>>>> Shouldn't crm_node somehow show that a stopped node is deleted from a
>>>>>>> corosync nodelist?
>>>>>>
>>>>>> Which stack is this?
>>>>>
>>>>> corosync 2.3 with nodelist and udpu.
>>>>
>>>> I assume it's possible, but crm_node isn't smart enough to do that yet.
>>>> Feel like writing a patch? :)
>>>
>>> Shouldn't it just skip offline nodes for -p?
>>>
>>
>> Worse. It appears to be asking pacemakerd instead of corosync or crmd.
>>
>
> Hm. I do not believe I'm able to refactor it then...
>

Yeah, I'm looking at it.
The hard part is that going to corosync directly only gives you a nodeid :-(

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
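
For the "crm_node bad format" warnings quoted at the start of the thread, here
is a minimal sketch of a parser that tolerates the two-column output shown
above. It is not crmsh's actual code. It assumes the older output had three
columns (id, name, state) and that the newer lines simply omit the state;
NodeEntry and parse_crm_node_list are hypothetical names used only for
illustration.

```python
from typing import List, NamedTuple, Optional


class NodeEntry(NamedTuple):
    node_id: str
    name: str
    state: Optional[str]  # None when no state column was printed


def parse_crm_node_list(output: str) -> List[NodeEntry]:
    """Parse `crm_node -l` style text, accepting 2 or 3 columns per line.

    Assumption: a 3-column line is "id name state" and a 2-column line is
    "id name" with the state left out, as in the output quoted in this thread.
    """
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) == 2:
            entries.append(NodeEntry(fields[0], fields[1], None))
        elif len(fields) == 3:
            entries.append(NodeEntry(fields[0], fields[1], fields[2]))
        else:
            raise ValueError("unexpected crm_node line: %r" % line)
    return entries


if __name__ == "__main__":
    sample = "7 v02-c\n8 v02-d\n5 v02-a\n6 v02-b\n"
    for entry in parse_crm_node_list(sample):
        print(entry)
```

A missing state is kept as None rather than guessed, so a caller cannot
mistake "state unknown" for "node offline" when deciding whether a delete
is safe.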
