Hello everyone,
For a project I plan to present in class, I am experimenting with DRBD and
heartbeat to learn the basics of high availability.
I am a beginner with Linux, and even more so with Debian (I started with
Ubuntu).
I followed several tutorials to set up DRBD and heartbeat.
DRBD works without problems.
heartbeat, on the other hand, looks simple, but nothing works.
I am using the V1-style configuration, so it is easy to read.
When I start heartbeat via:
/etc/init.d/heartbeat restart
I get the following message:
Stopping High-Availability services:
Done.
Waiting to allow resource takeover to complete:
Done.
Starting High-Availability services:
2010/05/31_22:12:33 INFO: Resource is stopped
Done.
And the IP alias is not created.
My setup looks like this:
LAN0 on eth0: 192.168.0.0/24 # User LAN + heartbeat
LAN1 on eth1: 192.168.1.0/30 # DRBD LAN: working.
LAN2 on eth2: 192.168.2.0/28 # Application servers LAN: not yet used.
Heartbeat has 2 nodes: frontal1 and frontal2
Frontal1| eth1------DRBD------eth1 | frontal2
------------ ------------
eth0 _____________________eth0
| |
|______________________ |
frontal1 eth0: 192.168.0.2
frontal2 eth0: 192.168.0.3
heartbeat ip alias eth0:0: 192.168.0.1
frontal1 eth1: 192.168.1.1
frontal2 eth1: 192.168.1.2
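As a quick sanity check on the addressing plan above, the prefix lengths determine how many hosts each LAN can hold. The sketch below is only illustrative arithmetic; the helper name `usable_hosts` is made up for this example. It shows that the /30 used for the DRBD link leaves exactly two usable addresses, which matches 192.168.1.1 and 192.168.1.2.

```shell
# Usable host addresses in an IPv4 subnet with the given prefix length.
# Network and broadcast addresses are subtracted (meaningful for /1 to /30).
# usable_hosts is a hypothetical helper, not part of heartbeat or iproute2.
usable_hosts() {
    prefix=$1
    echo $(( (1 << (32 - prefix)) - 2 ))
}

# Prefix lengths taken from the LAN plan in this post.
for p in 24 30 28; do
    echo "/$p -> $(usable_hosts "$p") usable hosts"
done
```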
My software is Debian 5.0 Lenny and heartbeat 2.1.3-6lenny4.
I followed these guides without any success:
http://howtoforge.net/highly-available-nfs-server-using-drbd-and-heartbeat-on-debian-5.0-lenny
http://doc.ubuntu-fr.org/tutoriel/mirroring_sur_deux_serveurs
http://www.drbd.org/users-guide/ch-heartbeat.html
http://www.linux-ha.org/doc/
Below are my logs and command results.
When I run BasicSanityCheck (2), I see a problem with IPaddr.
But when I manually launch the IPaddr or IPaddr2 script, the IP alias is created
and available on the network.
I looked through a few forums on the subject and did not find any solution to my
problem.
Thanks for your help.
(1) vim /etc/ha.d/ha.cf
Code:
autojoin none
mcast eth0 239.0.0.43 694 1 0
warntime 5
deadtime 5
initdead 15
keepalive 2
node frontal1
node frontal2
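One detail in the timing directives above may be worth a second look: warntime and deadtime are both 5, so a "late heartbeat" warning can never arrive before the peer is declared dead. This is probably unrelated to the missing alias, but it is easy to check. The sketch below is a hypothetical helper (`check_timing` is not a heartbeat tool) and assumes the values are plain seconds with no `ms` suffix.

```shell
# Read an ha.cf-style file on stdin and compare warntime with deadtime.
# Prints "ok" when warntime < deadtime, otherwise a short warning.
# Assumption: values are whole seconds (no "ms" suffix handling).
check_timing() {
    awk '
        $1 == "warntime" { warn = $2 }
        $1 == "deadtime" { dead = $2 }
        END {
            if (warn == "" || dead == "") { print "missing warntime or deadtime"; exit }
            if (warn + 0 < dead + 0) print "ok"
            else print "warntime (" warn ") is not below deadtime (" dead ")"
        }
    '
}

# Values copied from the ha.cf in this post.
printf 'warntime 5\ndeadtime 5\n' | check_timing
```

On a node, the real file could be fed in with `check_timing < /etc/ha.d/ha.cf`.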
(2) sh /usr/share/heartbeat/BasicSanityCheck
Code:
RTNETLINK answers: Network is unreachable
Using interface: eth0
Should not run tests with heartbeat already running.
Starting base64 and md5 algorithm tests
base64 and md5 algorithm tests succeeded.
Starting Resource Agent tests
Testing RA: Dummy
Testing RA: IPaddr
ERROR: IPaddr RA failed
Starting IPC tests
That's weird. Heartbeat seems to be running...
Stopping heartbeat
Stopping High-Availability services:
Done.
Starting heartbeat
Starting High-Availability services:
2010/05/31_22:16:04 INFO: Resource is stopped
Done.
Does not look like we ARPed the address
Looks like monitor operation failed
Reloading heartbeat
Reloading heartbeat
Stopping heartbeat
Stopping High-Availability services:
Done.
Checking STONITH basic sanity.
Performing apphbd success case tests
Performing apphbd failure case tests
Starting LRM tests
Starting heartbeat
Starting High-Availability services:
2010/05/31_22:18:25 INFO: Resource is stopped
Done.
(3) sh /usr/share/heartbeat/ResourceManager listkeys frontal1
192.168.0.1
(4) sh /usr/share/heartbeat/ResourceManager listkeys frontal2
(5) ip addr show
Code:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 00:0c:29:cb:86:45 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.2/24 brd 192.168.0.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 00:0c:29:cb:86:4f brd ff:ff:ff:ff:ff:ff
inet 192.168.1.1/30 brd 192.168.1.3 scope global eth1
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:0c:29:cb:86:59 brd ff:ff:ff:ff:ff:ff
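Checking for the service address is easier to script than to read by eye. Note that IPaddr2 adds the address with `ip addr add` and no label (its log line further down in this post shows the exact command), so it would appear as a second `inet` line on eth0 in `ip addr show` rather than as an `eth0:0` entry in `ifconfig`. The sketch below assumes the one-line output format of `ip -o -4 addr show`. The helper name `has_addr` and the sample text are made up for illustration and are not output captured from these nodes.

```shell
# Succeeds if the given IPv4 address appears in `ip -o -4 addr show`
# output supplied on stdin. has_addr is a hypothetical helper.
has_addr() {
    # -F: match the address literally, so dots are not regex wildcards.
    grep -qF "inet $1/"
}

# On a live node one would pipe the real command in:
#   ip -o -4 addr show dev eth0 | has_addr 192.168.0.1 && echo present
# Illustrative sample only (not captured from frontal1):
sample='2: eth0    inet 192.168.0.2/24 brd 192.168.0.255 scope global eth0
2: eth0    inet 192.168.0.1/24 brd 192.168.0.255 scope global secondary eth0'

if printf '%s\n' "$sample" | has_addr 192.168.0.1; then
    echo present
else
    echo absent
fi
```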
(6) /etc/ha.d/resource.d/IPaddr 192.168.0.1 start
Code:
2010/05/31_22:30:37 INFO: Success
(7) /etc/ha.d/resource.d/IPaddr2 192.168.0.1 start
Code:
2010/05/31_22:30:24 INFO: Using calculated nic for 192.168.0.1: eth0
2010/05/31_22:30:24 INFO: Using calculated netmask for 192.168.0.1: 255.255.255.0
2010/05/31_22:30:25 INFO: eval ifconfig eth0:0 192.168.0.1 netmask 255.255.255.0 broadcast 192.168.0.255
2010/05/31_22:30:25 INFO: Success
INFO: Success
(8) cat /etc/ha.d/haresources
frontal1 IPaddr2::192.168.0.1/24/eth0/192.168.0.255
OR frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server
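To make the haresources syntax easier to follow: each entry after the node name is a script name followed by its arguments, separated by `::`, and heartbeat calls that script with the arguments plus `start` or `stop` (the "Running ..." lines in the log in (9) show this). The sketch below only mimics that expansion as a reading aid. `resource_cmd` is a hypothetical helper, not a heartbeat command, and it does not resolve the script directory (`/etc/ha.d/resource.d/` or `/etc/init.d/`).

```shell
# Turn one haresources entry ("script::arg1::arg2") into the command
# line that would be run for a given action (start or stop).
# resource_cmd is a hypothetical helper for illustration only.
resource_cmd() {
    entry=$1
    action=$2
    # Replace every "::" separator with a single space, then append the action.
    printf '%s %s\n' "$(printf '%s' "$entry" | sed 's/::/ /g')" "$action"
}

# Entries copied from the haresources lines in this post.
resource_cmd 'IPaddr2::192.168.0.1/24/eth0/192.168.0.255' start
resource_cmd 'Filesystem::/dev/drbd1::/serveur::ext3' start
```

Running a printed command by hand, prefixed with the script directory, should reproduce what heartbeat does for that single resource.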
(9) cat /var/log/heartbeat/log
Code:
heartbeat[7179]: 2010/05/31_22:38:40 info: Version 2 support: false
heartbeat[7179]: 2010/05/31_22:38:40 WARN: Deprecated 'legacy' auto_failback option selected.
heartbeat[7179]: 2010/05/31_22:38:40 WARN: Please convert to 'auto_failback on'.
heartbeat[7179]: 2010/05/31_22:38:40 WARN: See documentation for conversion details.
heartbeat[7179]: 2010/05/31_22:38:40 WARN: Logging daemon is disabled --enabling logging daemon is recommended
heartbeat[7179]: 2010/05/31_22:38:40 info: **************************
heartbeat[7179]: 2010/05/31_22:38:40 info: Configuration validated. Starting heartbeat 2.1.3
heartbeat[7180]: 2010/05/31_22:38:40 info: heartbeat: version 2.1.3
heartbeat[7180]: 2010/05/31_22:38:40 info: Heartbeat generation: 1275221613
heartbeat[7180]: 2010/05/31_22:38:40 info: glib: UDP multicast heartbeat started for group 239.0.0.43 port 694 interface eth0 (ttl=1 loop=0)
heartbeat[7180]: 2010/05/31_22:38:40 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[7180]: 2010/05/31_22:38:40 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[7180]: 2010/05/31_22:38:40 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[7180]: 2010/05/31_22:38:40 info: Local status now set to: 'up'
heartbeat[7180]: 2010/05/31_22:38:41 info: Link frontal2:eth0 up.
heartbeat[7180]: 2010/05/31_22:38:41 info: Status update for node frontal2: status active
harc[7188]: 2010/05/31_22:38:41 info: Running /etc/ha.d/rc.d/status status
heartbeat[7180]: 2010/05/31_22:38:42 info: Comm_now_up(): updating status to active
heartbeat[7180]: 2010/05/31_22:38:42 info: Local status now set to: 'active'
IPaddr2[7242]: 2010/05/31_22:38:42 INFO: Resource is stopped
heartbeat[7204]: 2010/05/31_22:38:42 info: Local Resource acquisition completed.
harc[7337]: 2010/05/31_22:39:06 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[7337]: 2010/05/31_22:39:06 received ip-request-resp IPaddr2::192.168.0.1/24/eth0/192.168.0.255 OK no
ResourceManager[7356]: 2010/05/31_22:39:06 info: Acquiring resource group: frontal1 IPaddr2::192.168.0.1/24/eth0/192.168.0.255
IPaddr2[7382]: 2010/05/31_22:39:06 INFO: Resource is stopped
ResourceManager[7356]: 2010/05/31_22:39:06 info: Running /etc/ha.d/resource.d/IPaddr2 192.168.0.1/24/eth0/192.168.0.255 start
IPaddr2[7491]: 2010/05/31_22:39:07 INFO: ip -f inet addr add 192.168.0.1/24 brd 192.168.0.255 dev eth0
IPaddr2[7491]: 2010/05/31_22:39:07 INFO: ip link set eth0 up
IPaddr2[7491]: 2010/05/31_22:39:07 INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.0.1 eth0 192.168.0.1 auto not_used not_used
IPaddr2[7462]: 2010/05/31_22:39:07 INFO: Success
heartbeat[7180]: 2010/05/31_22:39:07 info: Initial resource acquisition complete (ip-request-resp)
harc[7549]: 2010/05/31_22:39:07 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[7549]: 2010/05/31_22:39:07 received ip-request-resp drbddisk::r0 OK no
ResourceManager[7568]: 2010/05/31_22:39:07 info: Acquiring resource group: frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:07 info: Running /etc/ha.d/resource.d/drbddisk r0 start
Filesystem[7633]: 2010/05/31_22:39:07 INFO: Resource is stopped
ResourceManager[7568]: 2010/05/31_22:39:07 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /serveur ext3 start
Filesystem[7711]: 2010/05/31_22:39:07 INFO: Running start for /dev/drbd1 on /serveur
Filesystem[7700]: 2010/05/31_22:39:07 INFO: Success
ResourceManager[7568]: 2010/05/31_22:39:07 info: Running /etc/init.d/dhcp3-server start
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/tftpd-hpa start
ResourceManager[7568]: 2010/05/31_22:39:09 ERROR: Return code 71 from /etc/init.d/tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 CRIT: Giving up resources due to failure of tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 info: Releasing resource group: frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/tftpd-hpa stop
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/dhcp3-server stop
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /serveur ext3 stop
Filesystem[7898]: 2010/05/31_22:39:09 INFO: Running stop for /dev/drbd1 on /serveur
Filesystem[7898]: 2010/05/31_22:39:09 INFO: Trying to unmount /serveur
Filesystem[7898]: 2010/05/31_22:39:10 INFO: unmounted /serveur successfully
Filesystem[7887]: 2010/05/31_22:39:10 INFO: Success
ResourceManager[7568]: 2010/05/31_22:39:10 info: Running /etc/ha.d/resource.d/drbddisk r0 sto
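Since the log is long, filtering it down to the failure entries makes the sequence easier to see. In the excerpt above the IPaddr2 group reports Success, while the drbddisk group is released after tftpd-hpa returns code 71. The sketch below is one possible filter. `log_failures` is a hypothetical helper name, and it assumes the severity tag appears as ` ERROR: ` or ` CRIT: ` as in this log.

```shell
# Print only ERROR and CRIT entries from heartbeat log text on stdin.
# log_failures is a hypothetical helper for illustration only.
log_failures() {
    grep -E ' (ERROR|CRIT): '
}

# Usage on a node (log path taken from this post):
#   log_failures < /var/log/heartbeat/log
# Demonstration with two lines copied from the log above:
printf '%s\n' \
    'ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/tftpd-hpa start' \
    'ResourceManager[7568]: 2010/05/31_22:39:09 ERROR: Return code 71 from /etc/init.d/tftpd-hpa' \
    | log_failures
```

If the log was captured through a pager with wrapped lines, only the first part of each matching entry is printed.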