Ciao Cristina,

On Thu, Apr 16, 2009 at 09:28:59AM +0200, Cristina Bulfon wrote:
> Ciao,
>
> I've also solved the problem with V2 style. The problem was that the AFS
> script was starting/stopping continuously.
> The solution was to add a check of the daemon (the "status" option) to
> the script; this way the script should be LSB compliant.

Good.

> Instead, the problem with the killing of HBREAD etc. still remains.
> Does anybody have a clue?

Looks like a normal shutdown to me. Or is it that heartbeat shuts
down by itself?

Thanks,

Dejan

> Thanks
>
> cristina
>
>
> On Apr 15, 2009, at 4:06 PM, Cristina Bulfon wrote:
>
>> Ciao,
>>
>> it seems to be solved:
>>
>> it was my fault: I tried to mount an xfs filesystem instead of ext3. I
>> corrected the typo and did the filesystem check.
>> Everything seems to be working and I no longer get the "umount .. busy"
>> error, but when I tried to simulate a failure of the active node and
>> switch to the passive one, I got the following in the ha-debug log
>> file on the active node:
>>
>> ResourceManager[10006]: 2009/04/15_15:59:45 info: Running /etc/ha.d/resource.d/Filesystem /dev/AFS/sda3 /vicepa/ xfs stop
>> Filesystem[10267]: 2009/04/15_15:59:45 INFO: Running stop for /dev/AFS/sda3 on /vicepa
>> Filesystem[10267]: 2009/04/15_15:59:45 INFO: Trying to unmount /vicepa
>> Filesystem[10267]: 2009/04/15_15:59:45 INFO: unmounted /vicepa successfully
>> Filesystem[10256]: 2009/04/15_15:59:45 INFO: Success
>> INFO: Success
>> ResourceManager[10006]: 2009/04/15_15:59:45 info: Running /etc/ha.d/resource.d/IPaddr 141.108.26.31/24/eth0 stop
>> In IP Stop
>> SIOCDELRT: No such process
>> IPaddr[10374]: 2009/04/15_15:59:45 INFO: ifconfig eth0:0 down
>> IPaddr[10345]: 2009/04/15_15:59:45 INFO: Success
>> INFO: Success
>> heartbeat[9993]: 2009/04/15_15:59:45 info: All HA resources relinquished.
>> heartbeat[8025]: 2009/04/15_15:59:45 WARN: 1 lost packet(s) for [afsitfs4.roma1.infn.it] [50:52]
>> heartbeat[8025]: 2009/04/15_15:59:45 info: No pkts missing from afsitfs4.roma1.infn.it!
>> ...
>> heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBREAD process 8030 with signal 15
>> heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBWRITE process 8031 with signal 15
>> heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBREAD process 8032 with signal 15
>> heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBWRITE process 8033 with signal 15
>> heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBREAD process 8034 with signal 15
>> heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBFIFO process 8028 with signal 15
>> heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBWRITE process 8029 with signal 15
>> heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8029 exited. 7 remaining
>> heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8028 exited. 6 remaining
>> heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8031 exited. 5 remaining
>> heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8030 exited. 4 remaining
>> heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8032 exited. 3 remaining
>> heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8034 exited. 2 remaining
>> heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8033 exited. 1 remaining
>> heartbeat[8025]: 2009/04/15_15:59:47 info: afsitfs3.roma1.infn.it Heartbeat shutdown complete.
>>
>> Thanks
>>
>> cristina
>>
>>
>> On Apr 15, 2009, at 3:16 PM, Dejan Muhamedagic wrote:
>>
>>> Ciao,
>>>
>>> On Wed, Apr 15, 2009 at 01:37:40PM +0200, Cristina Bulfon wrote:
>>>>
>>>> On Apr 15, 2009, at 1:18 PM, Dejan Muhamedagic wrote:
>>>>
>>>>> Ciao,
>>>>>
>>>>> On Wed, Apr 15, 2009 at 12:53:41PM +0200, Cristina Bulfon wrote:
>>>>>> Ciao Dejan,
>>>>>>
>>>>>> I am going back and forth on this item :-)
>>>>>> I moved to the 2.14. version and back to V1 style... I don't use
>>>>>> DRBD anymore, just the mount.
>>>>>
>>>>> Do you need drbd?
>>>>
>>>> No..
>>>> when I first started using heartbeat I couldn't manage the
>>>> filesystem mount with heartbeat, so I used DRBD as a workaround.
>>>> I don't need it since my devices are visible through the SAN.
>>>
>>> OK. Make sure that you also configure fencing/stonith!
>>>
>>>>>> So the haresources file is as follows:
>>>>>>
>>>>>> afsitfs3.roma1.infn.it IPaddr::141.108.26.31/24/eth0
>>>>>> afsitfs3.roma1.infn.it Filesystem::/dev/AFS/sda3::/vicepa::xfs
>>>>>> afsitfs3.roma1.infn.it Filesystem::/dev/AFS/sda1::/usr/afs::ext3
>>>>>> afsitfs3.roma1.infn.it 141.108.26.31 afs
>>>>>>
>>>>>> When I put the master node in stand_by or stop heartbeat, the
>>>>>> following happens:
>>>>>>
>>>>>> - it tries to umount the filesystems before stopping "afs"..
>>>>>
>>>>> Isn't it afs stop before filesystem?
>>>>
>>>> That is the problem, I don't understand why .. it seems that
>>>> the stop is performed in the same order as the "start".
>>>
>>> That can't be. Really. Can't recall anymore how v1 works, perhaps
>>> it looks at the status before deciding whether to stop a
>>> resource.
>>>
>>>>>> umount: /vicepa: device is busy
>>>>>> umount: /vicepa: device is busy
>>>>>> Filesystem[3427]: 2009/04/14_09:16:52 ERROR: Couldn't unmount /vicepa; trying cleanup with SIGTERM
>>>>>> /vicepa:
>>>>>
>>>>> This may be normal, i.e. there could be processes using the
>>>>> filesystem, though typically there are only applications which
>>>>> depend on the filesystem (in this case afs) which should be
>>>>> doing something there. If this is a concern, you should check
>>>>> which processes have files open over there (fuser,lsof).
>>>>>
>>>>>> With the 2.1.3 version I didn't see any messages of that kind;
>>>>>> everything in V1 style was fine.
>>>>>
>>>>> I suspect that the afs RA is not working correctly, in particular
>>>>> the status operation.
>>>> I will take a look
>>>>
>>>> thanks cristina
>>>
>>> Thanks,
>>>
>>> Dejan
>>> _______________________________________________
>>> Linux-HA mailing list
>>> [email protected]
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
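
For readers hitting the same start/stop loop, here is a minimal sketch of the kind of "status" check discussed in this thread. It is not the actual AFS script: it assumes the daemon records its PID in a file, and the path /var/run/afs-example.pid and the function names lsb_status and status_action are hypothetical. The exit codes follow the LSB init script convention for "status" (0 = running, 1 = dead but PID file exists, 3 = not running).

```shell
#!/bin/sh
# Hypothetical PID file location; not taken from the real AFS init script.
PIDFILE="${PIDFILE:-/var/run/afs-example.pid}"

# lsb_status PIDFILE
# Returns LSB "status" exit codes:
#   0 = program is running
#   1 = program is dead but the PID file exists
#   3 = program is not running
lsb_status() {
    pidfile="$1"
    [ -f "$pidfile" ] || return 3
    pid=$(cat "$pidfile" 2>/dev/null)
    # kill -0 only tests whether the process can be signalled. When run
    # as root (as init scripts normally are), a failure means the
    # process is gone.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        return 0
    fi
    return 1
}

# status_action: print a human-readable state and pass the code on.
status_action() {
    rc=0
    lsb_status "$PIDFILE" || rc=$?
    case "$rc" in
        0) echo "running" ;;
        1) echo "dead but PID file exists" ;;
        *) echo "stopped" ;;
    esac
    return "$rc"
}
```

In an init script, a `status)` branch of the main `case "$1"` would call status_action and exit with its return code. The start and stop branches can consult the same check, so that starting an already running service or stopping an already stopped one reports success, which LSB also expects.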
