Hi Thorsten,
>>>> ## node0 booted outside cluster (-x)
>>>
>>> Why are you booting the node out of the cluster?
>>
>> I am trying to work out a procedure to restore a failed cluster node
>> on different hardware, in which case I cannot assume that the
>> interconnect will come up, as the CLI interfaces might have changed.
>
> Now I am confused. So let me add some more context and see if this is
> what you are doing.
>
> The starting point is a working two-node cluster (let's call them
> node-a and node-b).
>
> A diskset gets configured for both nodes.
>
> One node fails and is no longer available. Let's assume this is node-b.
>
> You should still be able to boot node-a in cluster mode.

Correct.

> If you then determine node-b to be non-repairable/restorable, you
> should be able to remove node-b from the diskset by using:
>
> root@node-a# metaset -s <disksetname> -df -h node-b

That is exactly what I am trying to do. In the case I posted, the
failed node is node0, and I am trying to run the command on node1
(booted in cluster mode):

root@pub2-node1:~# time metaset -s pub2-node0 -d -f -h pub2-node0
Proxy command to: pub2-node0
172.16.4.1: RPC: Rpcbind failure - RPC: Timed out
rpc failure

real    1m0.110s
user    0m0.068s
sys     0m0.026s

root@pub2-node1:~# metaset

Set name = nfs-dg, Set number = 1

Host                Owner
  pub2-node0
  pub2-node1        Yes

Driv Dbase

d1   Yes

d2   Yes

I've tried the same on Solaris 10 with Sun Cluster 3.2 and didn't
succeed either.

Regarding one side aspect:

> Thus I am not sure what kind of interconnect or CLI interface issues
> you expect.

I was referring to the fact that in the recovery scenario I am trying
to solve, it might not be possible to form a cluster: the failed node
could be restored on different hardware, so the (restored) cluster
configuration would still contain adapters which don't exist on the
(changed) hardware.

> I would assume that you need to remove the node from other things like
> resource groups, quorum device, etc., before you actually perform the
> "clnode clear -F node-b" from node-a (again being in cluster mode).
>
> "clnode remove" would only be used if the node you want to remove is
> still bootable into non-cluster mode.

Please let me come back to this point later; I currently can't access
my development environment :-(

> Or are you trying to remove a dead node, and then later add a different
> new node?

This is the plan. For the purposes of my development environment, the
removed and the added node are the same, but that is just a
simplification.

Again, thank you very much for taking the time to discuss these issues.

Nils
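
P.S. For the archive, here is the overall removal sequence as I
understand it from your description, as a sketch only. It assumes a
surviving node-a and a dead node-b; "rg1" and "d4" are placeholder
names for a resource group and a quorum device, not values from my
setup.

# All commands run on node-a, booted in cluster mode.

# Remove dead node-b from the diskset (the step that currently fails
# for me with the RPC timeout shown above):
root@node-a# metaset -s <disksetname> -d -f -h node-b

# Take node-b out of the node list of each resource group
# (rg1 is a placeholder name):
root@node-a# clresourcegroup remove-node -n node-b rg1

# Remove any quorum device node-b was connected to
# (d4 is a placeholder device):
root@node-a# clquorum remove d4

# Finally, clear the dead node from the cluster configuration:
root@node-a# clnode clear -F node-b

If I get the metaset step to work, this is the order in which I would
try the remaining steps.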