Hi Nils et al., see inline:
Nils Goroll wrote:
> Hi Thorsten,
>
>>>>> ## node0 booted outside cluster (-x)
>>>> Why are you booting the node out of the cluster?
>>>
>>> I am trying to work out a procedure to restore a failed cluster node
>>> on different hardware, in which case I cannot assume that the
>>> interconnect will come up as the CLI interfaces might have changed.
>>
>> Now I am confused. So let me add some more context and see if this is
>> what you are doing.
>>
>> The starting point is a working two node cluster (let's call them
>> node-a and node-b).
>>
>> A diskset gets configured for both nodes.
>>
>> One node fails and is no longer available. Let's assume this is node-b.
>>
>> You should still be able to boot node-a in cluster mode.
>
> Correct.
>
> > If you then determine node-b to be non-repairable/restorable, you should
> > be able to remove node-b from the diskset by using:
> >
> > root@node-a# metaset -s <disksetname> -df -h node-b
>
> which is exactly what I am trying to do. In the case I posted, the
> failed node is node0 and I am trying to run on node1 (booted in cluster):
>
> root@pub2-node1:~# time metaset -s pub2-node0 -d -f -h pub2-node0
> Proxy command to: pub2-node0
> 172.16.4.1: RPC: Rpcbind failure - RPC: Timed out
> rpc failure
> real    1m0.110s
> user    0m0.068s
> sys     0m0.026s

The RPC timeout is to be expected. What I find strange is that it prints
172.16.4.1 - I would expect "pub2-node0" at this point.

As it happens, I have a cluster in my lab that indeed has a failed node (the
Ultra 10 motherboard died). Granted, that is still an S10 11/06 and SC 3.2 FCS
cluster, but I was able to perform the following (lab-u10-1 is the still
active node, lab-u10-2 is the genuinely dead node):

root@lab-u10-1 # metaset
[...]
Set name = sge_ds, Set number = 2

Host                Owner
  lab-u10-1          Yes
  lab-u10-2

Mediator Host(s)    Aliases
  lab-u10-1
  lab-u10-2

Driv Dbase

d3   Yes
[...]

root@lab-u10-1 # metaset -s sge_ds -df -h lab-u10-2
Oct 14 16:05:05 lab-u10-1 md: WARNING: rpc.metamedd on host lab-u10-2 not responding
lab-u10-2: RPC: Rpcbind failure - RPC: Timed out

root@lab-u10-1 # metaset
[...]
Set name = sge_ds, Set number = 2

Host                Owner
  lab-u10-1          Yes

Mediator Host(s)    Aliases
  lab-u10-1
  lab-u10-2

Driv Dbase

d3   Yes
[...]

So as you can see, the RPC failure is to be expected: the cluster tries to
contact the other node and fails because it is dead, but it ultimately
removes the node from the diskset. Note that the diskset must be online on
the remaining node (here lab-u10-1).

The obvious difference from your output is that mine uses the correct node
name. As I remember, SVM is very picky about that host name - it must be
exactly the name under which the host was registered in the diskset. Your
output shows the host names pub2-node0 and pub2-node1, but the RPC error is
for 172.16.4.1 - I would have expected "pub2-node0" there instead. That IP
looks like the cluster interconnect. Is it possible that your host name / IP
resolution is somehow broken?
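In case it helps, a quick sketch of how that could be checked on pub2-node1
(these just show what the hosts lookup returns; the output from your system
will of course differ):

# what does the hosts lookup return for the diskset host name,
# and what does the interconnect address map back to?
root@pub2-node1:~# getent hosts pub2-node0
root@pub2-node1:~# getent hosts 172.16.4.1

# which sources are consulted for host lookups, and in which order?
root@pub2-node1:~# grep '^hosts' /etc/nsswitch.conf

If 172.16.4.1 does not map back to the expected private hostname, or
pub2-node0 resolves to something unexpected, that might be why metaset
prints the IP instead of the host name.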
> root@pub2-node1:~# metaset
>
> Set name = nfs-dg, Set number = 1
>
> Host                Owner
>   pub2-node0
>   pub2-node1         Yes
>
> Driv Dbase
>
> d1   Yes
>
> d2   Yes
>
> I've tried the same on s10 with sc32 and didn't succeed either.

Was the correct host name displayed on that cluster?

> Regarding one side aspect:
>
>> Thus I am not sure what kind of interconnect or CLI interface issues
>> you expect.
>
> I was referring to the fact that in the recovery scenario I am trying to
> solve it might not be possible to form a cluster, because the failed node
> might be restored on different hardware, so the (restored) cluster config
> would still contain adapters which don't exist on the (changed) hardware.

In that case you should not try to add that node under the same name as the
failed one. Instead, remove the failed one and add the restored hardware as
a completely new node.

>> I would assume that you need to remove the node from other things like
>> resource groups, quorum device, etc., before you actually perform the
>> "clnode clear -F node-b" from node-a (again being in cluster mode).
>>
>> "clnode remove" would only be used if the node you want to remove is
>> still bootable into non-cluster mode.
>
> Please let me come back to this point later, I currently can't access my
> development environment :-(
>
>> Or are you trying to remove a dead node, and then later add a
>> different new node?
>
> This is the plan. For the purpose of the development environment, the
> removed and added node are the same, but this is just a simplification.

As recommended above: unless you have exactly the same hardware and can
restore the exact state of the failed node from a backup, I would remove the
failed node and add the restored one to the cluster as a new node.

Regards

   Thorsten
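PS: for completeness, a rough sketch of the sequence discussed above, run on
the surviving node and using the names from your metaset listing. Treat it as
a sketch rather than a tested procedure - on a two-node cluster the quorum
handling needs extra care, so please check the SC 3.2 documentation for the
exact steps:

# the set must be online/owned on the surviving node - your metaset output
# shows pub2-node1 already is the owner; then drop the dead host from it
root@pub2-node1:~# metaset -s nfs-dg -d -f -h pub2-node0

# see what still references the dead node before clearing it
root@pub2-node1:~# clresourcegroup status
root@pub2-node1:~# clquorum status

# once resource groups and quorum no longer reference pub2-node0
root@pub2-node1:~# clnode clear -F pub2-node0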