Indeed, the problem is that keepalived returns 1 when it is started
without having been stopped first; that is, on the second consecutive
start it begins returning 1... But why is heartbeat starting the
services again if they are already running?
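
One possible workaround, sketched here under assumptions, is a small
wrapper resource script that makes "start" idempotent. In the log quoted
below, the resources that reported "Running OK" (IPaddr, Filesystem) were
not started again, while the others were, so the sketch also prints
"running" on status; whether heartbeat really keys on that text is an
assumption. The names keepalived_safe and is_running, the INIT_SCRIPT
variable, and the use of pidof are hypothetical and are not part of
keepalived or heartbeat.

```shell
#!/bin/sh
# Hypothetical wrapper around the keepalived init script (a sketch, not a
# tested production script). INIT_SCRIPT defaults to the path seen in the log.
INIT_SCRIPT="${INIT_SCRIPT:-/etc/init.d/keepalived}"

# Assumption: a running keepalived can be detected by process name.
is_running() {
    pidof keepalived >/dev/null 2>&1
}

keepalived_safe() {
    case "$1" in
        start)
            # Already running: report success instead of calling the init
            # script a second time (which is what returned 1 in the log).
            if is_running; then
                return 0
            fi
            "$INIT_SCRIPT" start
            ;;
        stop)
            # Already stopped: nothing to do, report success.
            if is_running; then
                "$INIT_SCRIPT" stop
            else
                return 0
            fi
            ;;
        status)
            # Assumption: heartbeat looks for "running" or "OK" in the output.
            if is_running; then
                echo "running"
            else
                echo "stopped"
            fi
            ;;
        *)
            echo "usage: keepalived_safe {start|stop|status}" >&2
            return 1
            ;;
    esac
}

# Dispatch only when called with an argument, as heartbeat would do.
if [ "$#" -gt 0 ]; then
    keepalived_safe "$@"
fi
```

If saved under a hypothetical name such as
/etc/ha.d/resource.d/keepalived_safe, it would take the place of the
keepalived entry at the end of the haresources line.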

Thanks

2006/5/15, Flavio Menezes Reis <[EMAIL PROTECTED]>:
> To complicate things further... here is the log of what happens when I
> shut down the secondary. It looks like the problem is that heartbeat
> once again starts everything listed in haresources when the secondary
> is shut down:
> heartbeat[6047]: 2006/05/15_11:30:45 info: Received shutdown notice
> from 'sempron2200_2'.
> heartbeat[6047]: 2006/05/15_11:30:45 info: Resources being acquired
> from sempron2200_2.
> heartbeat[7102]: 2006/05/15_11:30:45 info: acquire local HA resources 
> (standby).
> ResourceManager[7122]:  2006/05/15_11:30:45 info: Acquiring resource
> group: athlon2400 IPaddr::192.168.1.50/24/eth0 drbddisk
> Filesystem::/dev/drbd0::/mnt/data postgresql-8.1 cluster0 keepalived
> IPaddr[7147]:   2006/05/15_11:30:45 INFO: IPaddr Running OK
> heartbeat[7103]: 2006/05/15_11:30:45 info: Local Resource acquisition 
> completed.
> IPaddr[7166]:   2006/05/15_11:30:45 INFO: IPaddr Running OK
> ResourceManager[7122]:  2006/05/15_11:30:45 info: Running
> /etc/ha.d/resource.d/drbddisk  start
> Filesystem[7485]:       2006/05/15_11:30:45 INFO: /mnt/data is mounted 
> (running)
> Filesystem[7421]:       2006/05/15_11:30:45 INFO: Filesystem Running OK
> ResourceManager[7122]:  2006/05/15_11:30:45 info: Running
> /etc/init.d/postgresql-8.1  start
> ResourceManager[7122]:  2006/05/15_11:30:46 info: Running
> /etc/ha.d/resource.d/cluster0  start
> ResourceManager[7122]:  2006/05/15_11:30:47 info: Running
> /etc/init.d/keepalived  start
> ResourceManager[7122]:  2006/05/15_11:30:47 ERROR: Return code 1 from
> /etc/init.d/keepalived
> ResourceManager[7122]:  2006/05/15_11:30:47 CRIT: Giving up resources
> due to failure of keepalived
> ResourceManager[7122]:  2006/05/15_11:30:47 info: Releasing resource
> group: athlon2400 IPaddr::192.168.1.50/24/eth0 drbddisk
> Filesystem::/dev/drbd0::/mnt/data postgresql-8.1 cluster0 keepalived
> ResourceManager[7122]:  2006/05/15_11:30:47 info: Running
> /etc/init.d/keepalived  stop
> ResourceManager[7122]:  2006/05/15_11:30:47 info: Running
> /etc/ha.d/resource.d/cluster0  stop
> ResourceManager[7122]:  2006/05/15_11:30:49 info: Running
> /etc/init.d/postgresql-8.1  stop
> ResourceManager[7122]:  2006/05/15_11:30:51 info: Running
> /etc/ha.d/resource.d/Filesystem /dev/drbd0 /mnt/data stop
> Filesystem[7755]:       2006/05/15_11:30:52 INFO: Filesystem Success
> ResourceManager[7122]:  2006/05/15_11:30:52 info: Running
> /etc/ha.d/resource.d/drbddisk  stop
> ResourceManager[7122]:  2006/05/15_11:30:53 info: Running
> /etc/ha.d/resource.d/IPaddr 192.168.1.50/24/eth0 stop
> IPaddr[7988]:   2006/05/15_11:30:54 INFO: /sbin/route -n del -host 
> 192.168.1.50
> IPaddr[7988]:   2006/05/15_11:30:54 INFO: /sbin/ifconfig eth0:0
> 192.168.1.50 down
> IPaddr[7988]:   2006/05/15_11:30:54 INFO: IP Address 192.168.1.50 released
> IPaddr[7883]:   2006/05/15_11:30:54 INFO: IPaddr Success
> heartbeat[7102]: 2006/05/15_11:30:54 info: local HA resource
> acquisition completed (standby).
> heartbeat[6047]: 2006/05/15_11:30:54 info: Standby resource
> acquisition done [foreign].
> harc[8041]:     2006/05/15_11:30:54 info: Running /etc/ha.d/rc.d/status status
> mach_down[8050]:        2006/05/15_11:30:54 info:
> /usr/lib/heartbeat/mach_down: nice_failback: foreign resources
> acquired
> mach_down[8050]:        2006/05/15_11:30:54 info: mach_down takeover
> complete for node sempron2200_2.
> heartbeat[6047]: 2006/05/15_11:30:54 info: mach_down takeover complete.
>
>
> 2006/5/15, Flavio Menezes Reis <[EMAIL PROTECTED]>:
> > Hello!
> >
> > Almost everything was ready here on the cluster, with auto failback
> > on and all. The problem is that I had not tried shutting down the
> > secondary and, surprisingly, when I do, keepalived comes back with a
> > return code of 1 (RC=1), which makes heartbeat start stopping every
> > script in the haresources file, and that ends up bringing down the
> > server... Why does keepalived return this RC=1???
> >
> > Thanks, everyone. Shortly I will send a full write-up of how I built
> > this cluster, to exchange experiences with you all.
> >
> > Regards
> >
> > --
> > Flávio Menezes dos Reis
> > Undergraduate in Information Systems - Ulbra - Torres - RS
> > [EMAIL PROTECTED]
> >
>
>
> --
> Flávio Menezes dos Reis
> Undergraduate in Information Systems - Ulbra - Torres - RS
> [EMAIL PROTECTED]
>


-- 
Flávio Menezes dos Reis
Undergraduate in Information Systems - Ulbra - Torres - RS
[EMAIL PROTECTED]
_______________________________________________
Linux-HA mailing list
[email protected]
http://listas.linuxchix.org.br/mailman/listinfo/linux-ha
