On Mar 21, 2007, at 2:17 PM, Max Hofer wrote:

On Wednesday 21 March 2007 12:55, Alan Robertson wrote:
Andrew Beekhof wrote:

On 3/20/07, Alan Robertson <[EMAIL PROTECTED]> wrote:


Max Hofer wrote:



OK,

I lost a day just trying to figure out how to replace a cluster node with
a spare part. I thought someone else might need this info, or maybe knows
a better way than the one I used.

Situation:
- cluster with 2 nodes (routing1, routing2)
- routing2 should be replaced with a spare part
- routing1 and routing2 use a file system on a drbd device to share
  common data

Precondition:
- routing2 crashed and hb_uuid is not recoverable

FYI: It's in the CIB, and also in the hb_uuid files on every machine.

- spare part is configured to not start heartbeat after power-on

Steps I did:
* replaced crashed routing2 with spare part (cabling etc.)
* powered on routing2
* on routing2 invalidate data on drbd device (---> sync from routing1
  to routing2)
* on routing1 delete routing2 (I found a bug: pingd resets to 0 when
  hb_delnode is called ---> see bug #1535)
  # /usr/lib/heartbeat/hb_delnode routing2 && killall pingd
  (!!!NOTE: if your cluster configuration triggers a failover on a pingd
  failure, set the cluster to unmanaged mode, stop pingd, delete the
  node, then restart pingd and set the cluster to managed mode again)
* on routing1 delete the removed-host cache (I'm not sure if this step is
  necessary, but someone on the mailing list explained it has to be done)
  # rm /var/lib/heartbeat/delhostcache
* on routing1 add routing2 again
  # /usr/lib/heartbeat/hb_addnode routing2
* start heartbeat on routing2

Finished .....
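
The routing1-side steps above could be collected into a small bash script.
This is only a sketch: the `run` and `readd_node` helpers and the DRY_RUN
switch are made up for illustration, `rm -f` is used instead of the plain
`rm` so that a missing cache file is not an error, and the drbd and
heartbeat-start steps are left out because they run on routing2. It does
not cover the unmanaged-mode case from the NOTE.

```shell
#!/bin/bash
# Sketch of the steps run on the surviving node (routing1).
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: delete a node from the cluster and add it back,
# using the commands quoted in the post.
readd_node() {
    local node="$1"
    # Stop here if the node could not be deleted.
    run /usr/lib/heartbeat/hb_delnode "$node" || return 1
    # Workaround from the post for the pingd reset (bug #1535).
    run killall pingd
    # The post is unsure whether this step is required.
    run rm -f /var/lib/heartbeat/delhostcache
    run /usr/lib/heartbeat/hb_addnode "$node"
}

readd_node routing2
```

With the default DRY_RUN=1 the script only lists what it would do, which
makes it easy to review the commands before running them with DRY_RUN=0.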







What I really find stupid about the whole procedure:
* the assumption that the UUID file (/var/lib/heartbeat/hb_uuid) can be
  reused on the spare part probably never holds (unless you perform a
  planned replacement ...)

See note above...

* this assumption does not work well if the spare part is installed to
  be a replacement for different cluster nodes. The UUID is created on
  the very first install of heartbeat (and thus is not part of my
  configuration data). It would be configuration hell to "save all
  UUIDs of all clusters after cluster activation" on a system with a
  couple of nodes





It's already saved for you - in two places on every machine...

What's missing is the conversion from ASCII to binary. Could you make a
bugzilla for that and assign it to me?






been there done that:

 crm_uuid -w
This command returns the ASCII UUID of the currently running node.

are you using 2.0.8 or the development version?

this feature wasn't in 2.0.8


What I need is a command which returns the binary UUID of the node that
has to be replaced.

Example: two nodes, N1 and N2. N2 is replaced (because of an HD crash).

So I need to create the binary UUID for N2 on N1 - something like

crm_uuid -b N2 > hb_uuid_N2
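
Until such an option exists (the `crm_uuid -b` call above is a wish, not an
existing flag), the ASCII-to-binary conversion itself could be sketched in
bash. This assumes that hb_uuid holds nothing but the 16 raw bytes of the
UUID, which should be checked against a working node first. The function
name `uuid_to_binary` and the sample UUID are invented for the example.

```shell
#!/bin/bash
# Convert an ASCII UUID (8-4-4-4-12 hex digits) into 16 raw bytes on stdout.
# Assumption: this raw form is what the hb_uuid file contains.
uuid_to_binary() {
    local hex i
    # Drop the dashes so only the hex digits remain.
    hex=$(printf '%s' "$1" | tr -d '-')
    # Reject anything that is not exactly 32 hex digits.
    case "$hex" in
        *[!0-9a-fA-F]*) echo "not a UUID: $1" >&2; return 1 ;;
    esac
    [ "${#hex}" -eq 32 ] || { echo "not a UUID: $1" >&2; return 1; }
    for ((i = 0; i < 32; i += 2)); do
        # One byte per hex pair: hex -> octal escape -> raw byte.
        printf "\\$(printf '%03o' "0x${hex:i:2}")"
    done
}

# Demo with a made-up UUID; od shows the resulting bytes as hex.
uuid_to_binary 00112233-4455-6677-8899-aabbccddeeff | od -An -tx1
```

On N1 the output would be redirected into a file, for example
`uuid_to_binary "$ascii_uuid_of_N2" > hb_uuid_N2`, with the ASCII UUID
taken from one of the places mentioned earlier in the thread.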


Andrew: Is there a man page or other documentation outside the command
for this?


--
Max Hofer
APUS Software G.m.b.H.
A-8074 Raaba, Bahnhofstraße 1/1
T| +43 316 401629 11
F| +43 316 401629 9
W| www.apus.co.at
E| [EMAIL PROTECTED]
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
