On Mon, Jul 6, 2009 at 10:08 AM, Armanet Stephane <[email protected]> wrote:
> Hello list
>
> I'm trying to setup a 3 nodes Cluster with 2 failover Domain for an HA
> mail solution.
> I want 1 run active for the Imap server in the Imap Failover domain , 1
> node active for the Smtp in the Smtp Failover domain and the 3rd in the
> 2 failover domain as a backup node.
>
> I run Centos 5.3
> My fence device is a wti power switch
>
> My cluster.conf is in attachement
>
> My SMTP service is composed of:
> 1 IP
> 1 amavisd scritp
> 1 postfix script
> 2 NFS mount for postfix and amavis
>
> If I manually kill the postfix master process (to simulate a crash), my
> node is not fence and the logs said:
>
> Jul 6 10:00:40 centos-smtp1 clurgmgrd: [4228]: <info> Executing /etc/init.d/postfix status
> Jul 6 10:00:40 centos-smtp1 clurgmgrd: [4228]: <err> script:postfix: status of /etc/init.d/postfix failed (returned 3)
> Jul 6 10:00:40 centos-smtp1 clurgmgrd[4228]: <notice> status on script "postfix" returned 1 (generic error)
> Jul 6 10:00:40 centos-smtp1 clurgmgrd[4228]: <notice> Stopping service service:Postfix
> Jul 6 10:00:40 centos-smtp1 clurgmgrd: [4228]: <info> Executing /etc/init.d/amavisd stop
> Jul 6 10:00:40 centos-smtp1 kernel: do_vfs_lock: VFS is out of sync with lock manager!
> Jul 6 10:00:40 centos-smtp1 last message repeated 8 times
> Jul 6 10:00:41 centos-smtp1 clurgmgrd: [4228]: <info> Executing /etc/init.d/postfix stop
> Jul 6 10:00:41 centos-smtp1 clurgmgrd: [4228]: <err> script:postfix: stop of /etc/init.d/postfix failed (returned 1)
> Jul 6 10:00:41 centos-smtp1 clurgmgrd[4228]: <notice> stop on script "postfix" returned 1 (generic error)
> Jul 6 10:00:41 centos-smtp1 clurgmgrd: [4228]: <info> Removing IPv4 address 195.83.126.201/24 from bond0
> Jul 6 10:00:41 centos-smtp1 avahi-daemon[3552]: Withdrawing address record for 195.83.126.201 on bond0.
> Jul 6 10:00:51 centos-smtp1 clurgmgrd: [4228]: <info> unmounting /var/lib/amavis
> Jul 6 10:00:51 centos-smtp1 clurgmgrd: [4228]: <info> unmounting /var/spool/postfix
> Jul 6 10:00:51 centos-smtp1 clurgmgrd[4228]: <crit> #12: RG service:Postfix failed to stop; intervention required
> Jul 6 10:00:51 centos-smtp1 clurgmgrd[4228]: <notice> Service service:Postfix is failed
> Jul 6 10:00:52 centos-smtp1 ntpd[3322]: synchronized to 195.83.126.119, stratum 1
>
> Clustat said:
>
> Cluster Status for cluster-test @ Mon Jul 6 10:02:39 2009
> Member Status: Quorate
>
>  Member Name                                                    ID   Status
>  ------ ----                                                    ---- ------
>  centos-imap1.ill.fr                                            1    Online, Local, rgmanager
>  centos-imap2.ill.fr                                            2    Online, rgmanager
>  centos-smtp1.ill.fr                                            3    Online, rgmanager
>  /dev/disk/by-id/scsi-360a98000567247514634507447594661-part1   0    Online, Quorum Disk
>
>  Service Name       Owner (Last)             State
>  ------- ----       ----- ------             -----
>  service:Imap       centos-imap2.ill.fr      started
>  service:Postfix    (centos-smtp1.ill.fr)    failed
>
> So I have to disable the Postfix servcie with:
> clusvcadm -d Postfix
> and re-enable
> clusvcadm -e Postfix
>
> Could you explain my why my original smtp node is not fenced and why my
> service is not start on the 2nd node ???
>

Nodes are fenced only when they lose communication with the other nodes,
not when a service fails.

You should check that the init scripts work correctly outside the
cluster; their return values matter. I think in your case it is failing
because you killed postfix in a way that deleted the .pid file, and that
made the init script fail. Your log shows the stop of
/etc/init.d/postfix returning 1, after which rgmanager reports that the
service failed to stop and marks it as failed.

BTW, you should configure the service with recovery="relocate" if you
want it to be started on a different node.

Greetings,
Juanra

> Is there a way to force the fencing ???
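
Coming back to the init script return values mentioned above: a quick way
to see what rgmanager sees is to run the script's status and stop actions
by hand on the node and look at the exit codes. This is only a minimal
sketch, assuming an LSB-style init script where stopping an already
stopped service should still return 0; check_init_script is a name made
up for illustration, not something shipped with rgmanager.

```shell
#!/bin/sh
# Sketch: run an init script's "status" and "stop" actions by hand and
# report their exit codes.
# check_init_script is a hypothetical helper, not part of rgmanager.

check_init_script() {
    script="$1"

    # LSB convention for status: 0 = running, 3 = not running.
    "$script" status >/dev/null 2>&1
    rc=$?
    echo "status returned $rc"

    "$script" stop >/dev/null 2>&1
    rc=$?
    echo "stop returned $rc"

    # Assumption: an LSB-style script returns 0 from "stop" even when
    # the service is already stopped.
    if [ "$rc" -ne 0 ]; then
        echo "stop failed with a non-zero exit code"
        return 1
    fi
    return 0
}

# Usage (on the node, after killing the postfix master process):
#   check_init_script /etc/init.d/postfix
```

If stop returns non-zero while postfix is already dead, that matches the
"stop of /etc/init.d/postfix failed (returned 1)" line in your log.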
>
>
> --
> ARMANET Stephane
> Division Projet Technique
> Service Informatique
> Groupe Infrastructure
>
> Institut Laue langevin
>
> --
> Linux-cluster mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
