On Mar 21, 2007, at 2:17 PM, Max Hofer wrote:
On Wednesday 21 March 2007 12:55, Alan Robertson wrote:
Andrew Beekhof wrote:
On 3/20/07, Alan Robertson <[EMAIL PROTECTED]> wrote:
Max Hofer wrote:
OK,
I lost a day just trying to figure out how to replace a cluster node
with a spare part. I thought someone else might need this info, or
maybe knows a better way than the one I used.
Situation:
- cluster with 2 nodes (routing1, routing2)
- routing2 should be replaced with a spare part
- routing1 and routing2 use a file system on a drbd to share common data
Precondition:
- routing2 crashed and hb_uuid is not recoverable
FYI: It's in the CIB, and also in the hb_uuid files on every machine.
- spare part is configured to not start heartbeat after power-on
Steps I did:
* replaced crashed routing2 with spare part (cabling etc.)
* powered on routing2
* on routing2 invalidate data on drbd device (---> sync from routing1
to routing2)
* on routing1 delete routing2 (I found a bug that pingd resets to 0
when calling hb_delnode ---> see bug #1535)
# /usr/lib/heartbeat/hb_delnode routing2 && killall pingd
(!!!NOTE: if your cluster configuration triggers a failover on a pingd
failure, set the cluster to unmanaged mode, stop pingd, delete the
node, then restart pingd and set the cluster back to managed mode)
* on routing1 delete the cache of removed hosts (I'm not sure if this
step is necessary, but someone on the mailing list explained it has to
be done)
# rm /var/lib/heartbeat/delhostcache
* on routing1 add routing2 again
# /usr/lib/heartbeat/hb_addnode routing2
* start heartbeat on routing2
Finished .....
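
The routing1-side steps above can be collected into a small script. This
is only a sketch built from the commands quoted in this post: the
function names replace_node and run and the DRY_RUN switch are made up
for illustration, and rm -f is used instead of plain rm so that a
missing delhostcache file is not treated as an error. It prints the
commands by default and only executes them when DRY_RUN=0. Switching the
cluster to unmanaged mode and the drbd invalidate on routing2 are not
covered.

```shell
#!/bin/sh
# Sketch of the routing1-side steps from the list above.
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"
HB_LIB="/usr/lib/heartbeat"                 # path as quoted in the post
DELCACHE="/var/lib/heartbeat/delhostcache"  # path as quoted in the post

run() {
    # Print the command, and execute it only when not in dry-run mode.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

replace_node() {
    # Hypothetical helper: $1 is the name of the node being replaced.
    node="$1"
    if [ -z "$node" ]; then
        echo "usage: replace_node <node-name>" >&2
        return 1
    fi
    # Delete the old node entry; stop here if that fails.
    run "$HB_LIB/hb_delnode" "$node" || return 1
    # Workaround from the post for the pingd reset (bug #1535).
    run killall pingd
    # Remove the cache of deleted hosts (possibly optional, see above).
    run rm -f "$DELCACHE"
    # Add the node again so the spare part can join.
    run "$HB_LIB/hb_addnode" "$node"
}
```

Calling replace_node routing2 with the default settings just lists the
four commands, which makes it easy to review them before running the
real thing with DRY_RUN=0.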
What I really find stupid about the whole procedure:
* the assumption that the UUID file (/var/lib/heartbeat/hb_uuid) can
be reused on the spare part probably never holds (unless you
perform a planned replacement ... )
See note above...
* this assumption does not work well if the spare part is installed to
be a replacement for different cluster nodes. The UUID is created
on the very first install of heartbeat (and thus is not part of my
configuration data). It would be configuration hell to "save all
UUIDs of all clusters after cluster activation" on a system with a
couple of nodes
It's already saved for you - in two places on every machine...
What's missing is the conversion from ASCII to binary. Could you make a
bugzilla for that and assign it to me?
been there done that:
crm_uuid -w
This command returns the ASCII UUID of the currently running node.
are you using 2.0.8 or the development version?
this feature wasn't in 2.0.8
What I need is
a command which returns the binary version of the UUID of the node
which has to be replaced.
Example: two nodes N1 and N2. N2 is replaced (because of HD crash)
So I need to create the binary UUID for N2 on N1 - something like
crm_uuid -b N2 > hb_uuid_N2
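
Until a command like that exists, the ASCII-to-binary conversion
discussed here could be sketched in plain shell. This rests on an
assumption that is not confirmed anywhere in this thread: that hb_uuid
simply holds the 16 raw bytes of the UUID, in the same order as the hex
digits of its text form. The function name uuid_to_binary is made up
for illustration, and the ASCII UUID of N2 has to be looked up by hand
first (for example from its node entry in the CIB).

```shell
#!/bin/sh
# Convert a textual UUID (8-4-4-4-12 hex digits) into its 16 raw bytes.
# Assumption: the target file format is just those 16 bytes, in text order.
uuid_to_binary() {
    # Drop the dashes and normalise to lower-case hex.
    hex=$(printf '%s' "$1" | sed 's/-//g' | tr 'A-F' 'a-f')
    # Reject anything that is not made up of hex digits only.
    case "$hex" in
        *[!0-9a-f]*) echo "not a hex UUID: $1" >&2; return 1 ;;
    esac
    # A UUID is 128 bits, i.e. exactly 32 hex digits.
    if [ "${#hex}" -ne 32 ]; then
        echo "expected 32 hex digits, got ${#hex}" >&2
        return 1
    fi
    # Emit each pair of hex digits as one byte via an octal printf escape.
    while [ -n "$hex" ]; do
        pair=$(printf '%s' "$hex" | cut -c1-2)
        hex=$(printf '%s' "$hex" | cut -c3-)
        printf "\\$(printf '%03o' "0x$pair")"
    done
}
```

On N1 this would be used as uuid_to_binary "<ASCII UUID of N2>" >
hb_uuid_N2. Compare the result against a known-good hb_uuid file (with
od -An -tx1, for instance) before copying it to the spare part.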
Andrew: Is there a man page or other documentation outside the command
for this?
--
Max Hofer
APUS Software G.m.b.H.
A-8074 Raaba, Bahnhofstraße 1/1
T| +43 316 401629 11
F| +43 316 401629 9
W| www.apus.co.at
E| [EMAIL PROTECTED]
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems