Jeronimo, I noticed from your HARESOURCES file that you are setting up a
load-balancing environment using ZEO/Zope.
Here at the research group I belong to, we set up a similar
environment.
As for your error, I think it is related to the UDP transmission
used by Heartbeat; try changing the port.
Ok
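
Before changing the port, it may help to confirm that UDP datagrams can be delivered on the candidate port at all. The sketch below is only a local loopback check, not part of Heartbeat: the function name `udp_roundtrip` and its parameters are made up for illustration. The port itself is set in ha.cf with the `udpport` directive (the log below shows the default, 694).

```python
import socket


def udp_roundtrip(port=0, host="127.0.0.1", timeout=2.0):
    """Send one datagram to a local listener and report whether it arrived.

    port=0 lets the OS pick a free port; binding to ports below 1024
    (such as Heartbeat's default 694) normally requires root.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        try:
            receiver.bind((host, port))
            receiver.settimeout(timeout)
            # Send to whatever address/port the receiver actually got.
            sender.sendto(b"ping", receiver.getsockname())
            data, _addr = receiver.recvfrom(1024)
        except OSError:
            # Covers bind failures (port in use, no permission) and timeouts.
            return False
        return data == b"ping"


if __name__ == "__main__":
    print("UDP delivery OK" if udp_roundtrip() else "UDP delivery failed")
```

To test the real link, the receiving half would run on one node and the sending half on the other, using the eth1 addresses; a firewall rule dropping UDP on that port would show up as a timeout.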
2007/2/15, Jeronimo Zucco <[EMAIL PROTECTED]>:
Hello everyone.
I am having a problem with heartbeat. Node 2 simply goes down
after a few seconds, with no apparent explanation. Below is
the log from node 2:
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: Enabling logging daemon
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: logfile and debug file
are those specified in logd config file (default /etc/logd.cf)
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: Syntax: apiauth client
[uid=uidlist] [gid=gidlist]
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: Where uidlist is a
comma-separated list of uids,
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: and gidlist is a
comma-separated list of gids
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: One or the other must be
specified.
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: Syntax: apiauth client
[uid=uidlist] [gid=gidlist]
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: Where uidlist is a
comma-separated list of uids,
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: and gidlist is a
comma-separated list of gids
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: One or the other must be
specified.
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: AUTH: i=1: key =
0x80eed98, auth=0xb7b6cccc, authname=crc
Feb 15 10:48:06 odin2 heartbeat: [3883]: WARN: Core dumps could be lost
if multiple dumps occur
Feb 15 10:48:06 odin2 heartbeat: [3883]: WARN: Consider setting
/proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum
supportability
Feb 15 10:48:06 odin2 heartbeat: [3883]: WARN: logd is enabled but
logfile/debugfile/logfacility is still configured in ha.cf
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: **************************
Feb 15 10:48:06 odin2 heartbeat: [3883]: info: Configuration validated.
Starting heartbeat 2.0.8
Feb 15 10:48:06 odin2 heartbeat: [3884]: info: heartbeat: version 2.0.8
Feb 15 10:48:06 odin2 heartbeat: [3884]: info: Heartbeat generation: 17
Feb 15 10:48:06 odin2 heartbeat: [3884]: info:
G_main_add_TriggerHandler: Added signal manual handler
Feb 15 10:48:06 odin2 heartbeat: [3884]: info:
G_main_add_TriggerHandler: Added signal manual handler
Feb 15 10:48:06 odin2 heartbeat: [3884]: info: Removing
/var/run/heartbeat/rsctmp failed, recreating.
Feb 15 10:48:06 odin2 heartbeat: [3884]: info: glib: ucast: write socket
priority set to IPTOS_LOWDELAY on eth1
Feb 15 10:48:06 odin2 heartbeat: [3884]: info: glib: ucast: bound send
socket to device: eth1
Feb 15 10:48:06 odin2 heartbeat: [3884]: info: glib: ucast: bound
receive socket to device: eth1
Feb 15 10:48:06 odin2 heartbeat: [3884]: info: glib: ucast: started on
port 694 interface eth1 to 10.100.100.1
Feb 15 10:48:06 odin2 heartbeat: [3884]: info: G_main_add_SignalHandler:
Added signal handler for signal 17
Feb 15 10:48:06 odin2 heartbeat: [3884]: info: Local status now set to:
'up'
Feb 15 10:48:07 odin2 heartbeat: [3884]: info: Link odin1:eth1 up.
Feb 15 10:48:07 odin2 heartbeat: [3884]: info: Status update for node
odin1: status active
Feb 15 10:48:07 odin2 harc[3896]: [3899]: info: Running
/etc/ha.d/rc.d/status status
Feb 15 10:48:07 odin2 heartbeat: [3884]: info: Exiting status process
3896 returned rc 0.
Feb 15 10:48:08 odin2 heartbeat: [3884]: info: Comm_now_up(): updating
status to active
Feb 15 10:48:08 odin2 heartbeat: [3884]: info: Local status now set to:
'active'
It stays like that, but heartbeat status shows me that everything is stopped.
On node 1, which is the primary, it detects that node 2 went down:
Feb 15 10:48:07 odin1 ipfail: [6343]: info: Link Status update: Link
odin2/eth1 now has status up
Feb 15 10:48:07 odin1 heartbeat: [6330]: info: Heartbeat restart on node
odin2
Feb 15 10:48:07 odin1 heartbeat: [6330]: info: Link odin2:eth1 up.
Feb 15 10:48:07 odin1 heartbeat: [6330]: info: Status update for node
odin2: status init
Feb 15 10:48:07 odin1 ipfail: [6343]: info: Status update: Node odin2
now has status init
Feb 15 10:48:07 odin1 heartbeat: [6330]: info: Status update for node
odin2: status up
Feb 15 10:48:07 odin1 ipfail: [6343]: info: Status update: Node odin2
now has status up
Feb 15 10:48:07 odin1 harc[6745]: [6747]: info: Running
/etc/ha.d/rc.d/status status
Feb 15 10:48:07 odin1 harc[6749]: [6751]: info: Running
/etc/ha.d/rc.d/status status
Feb 15 10:48:08 odin1 heartbeat: [6330]: info: all clients are now paused
Feb 15 10:48:08 odin1 heartbeat: [6330]: info: all clients are now resumed
Feb 15 10:48:08 odin1 heartbeat: [6330]: info: Status update for node
odin2: status active
Feb 15 10:48:08 odin1 ipfail: [6343]: info: Status update: Node odin2
now has status active
Feb 15 10:48:08 odin1 harc[6753]: [6755]: info: Running
/etc/ha.d/rc.d/status status
Feb 15 10:48:41 odin1 ipfail: [6343]: info: Status update: Node odin2
now has status dead
Feb 15 10:48:41 odin1 heartbeat: [6330]: WARN: node odin2: is dead
Feb 15 10:48:41 odin1 heartbeat: [6330]: info: Dead node odin2 gave up
resources.
Feb 15 10:48:41 odin1 heartbeat: [6330]: info: Link odin2:eth1 dead.
Feb 15 10:48:41 odin1 ipfail: [6343]: info: NS: We are dead. :<
Feb 15 10:48:42 odin1 ipfail: [6343]: info: Link Status update: Link
odin2/eth1 now has status dead
Feb 15 10:48:42 odin1 ipfail: [6343]: info: We are dead. :<
Feb 15 10:48:42 odin1 ipfail: [6343]: info: Asking other side for ping
node count.
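
To see how long node 1 considered odin2 alive, the ipfail status lines above can be turned into a small timeline. This is a rough sketch: the regular expression matches only the ipfail "Status update: Node ..." lines as they appear in this post (with the mail wrapping undone), the year 2007 is taken from the date of the message, and the names `status_timeline` and `seconds_between` are illustrative.

```python
import re
from datetime import datetime

# ipfail lines copied from the node 1 log, rejoined onto single lines.
LOG = """\
Feb 15 10:48:07 odin1 ipfail: [6343]: info: Status update: Node odin2 now has status init
Feb 15 10:48:07 odin1 ipfail: [6343]: info: Status update: Node odin2 now has status up
Feb 15 10:48:08 odin1 ipfail: [6343]: info: Status update: Node odin2 now has status active
Feb 15 10:48:41 odin1 ipfail: [6343]: info: Status update: Node odin2 now has status dead
"""

YEAR = 2007  # assumption: syslog omits the year; taken from the e-mail date

PATTERN = re.compile(
    r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ ipfail: \[\d+\]: info: "
    r"Status update: Node (\S+) now has status (\w+)$"
)


def status_timeline(log_text, node):
    """Return [(timestamp, status), ...] for the given node, in log order."""
    events = []
    for line in log_text.splitlines():
        match = PATTERN.match(line)
        if match and match.group(2) == node:
            when = datetime.strptime(
                f"{YEAR} {match.group(1)}", "%Y %b %d %H:%M:%S"
            )
            events.append((when, match.group(3)))
    return events


def seconds_between(events, first, second):
    """Seconds elapsed from the status `first` to the status `second`."""
    times = {status: when for when, status in events}
    return (times[second] - times[first]).total_seconds()


if __name__ == "__main__":
    timeline = status_timeline(LOG, "odin2")
    for when, status in timeline:
        print(when.time(), status)
    print(seconds_between(timeline, "active", "dead"), "seconds active")
```

For the excerpt above this gives 33 seconds between "active" and "dead", which can be compared with the timing values configured in ha.cf.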
My haresources on both machines contains:
odin1 IPaddr::X.X.X.X/26/eth0 drbddisk::r0
Filesystem::/dev/drbd0::/zeo::ext3 zeo-instancias
If anyone can offer a tip on how to solve this problem, I would appreciate it.
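
For readers less familiar with the haresources format: the first field is the preferred node, and each following field is a resource script whose arguments are separated by `::`; resources are started from left to right. The sketch below splits the entry above into those parts. It assumes the two lines in the post are one logical entry wrapped by the mail client, and `parse_haresources_line` is an illustrative helper, not something shipped with Heartbeat.

```python
def parse_haresources_line(line):
    """Split one haresources entry into (node, [(script, [args...]), ...])."""
    fields = line.split()
    node, resources = fields[0], fields[1:]
    parsed = []
    for item in resources:
        # "Filesystem::/dev/drbd0::/zeo::ext3" -> script plus three arguments
        script, *args = item.split("::")
        parsed.append((script, args))
    return node, parsed


if __name__ == "__main__":
    entry = (
        "odin1 IPaddr::X.X.X.X/26/eth0 drbddisk::r0 "
        "Filesystem::/dev/drbd0::/zeo::ext3 zeo-instancias"
    )
    node, resources = parse_haresources_line(entry)
    print("preferred node:", node)
    for script, args in resources:
        print(f"  {script}: {args}")
```

Read this way, the entry asks odin1 to bring up the service IP on eth0, promote the DRBD resource r0, mount /dev/drbd0 on /zeo as ext3, and then run the zeo-instancias script.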
--
Jeronimo Zucco
LPIC-1 Linux Professional Institute Certified
Núcleo de Processamento de Dados
Universidade de Caxias do Sul
http://jczucco.blogspot.com
_______________________________________________
Linux-HA mailing list
[email protected]
http://listas.linuxchix.org.br/mailman/listinfo/linux-ha
--
Ronaldo Amaral Santos
Software Development Technology program, 6th semester, evening class
Núcleo de Pesquisa em Sistemas de Informação – NSI
Cefet-Campos
-------------------------
Linux User #437600