Ciao,

It seems to be solved.

It was my fault: I tried to mount an xfs filesystem instead of ext3. I corrected the typo and ran a filesystem check.
Everything seems to be working and I no longer get the "umount .. busy" error, but
when I simulated the active node going down and a switch to the passive one,
I got the following in the ha-debug log file on the active node:

ResourceManager[10006]: 2009/04/15_15:59:45 info: Running /etc/ha.d/resource.d/Filesystem /dev/AFS/sda3 /vicepa/ xfs stop
Filesystem[10267]: 2009/04/15_15:59:45 INFO: Running stop for /dev/AFS/sda3 on /vicepa
Filesystem[10267]: 2009/04/15_15:59:45 INFO: Trying to unmount /vicepa
Filesystem[10267]: 2009/04/15_15:59:45 INFO: unmounted /vicepa successfully
Filesystem[10256]:      2009/04/15_15:59:45 INFO:  Success
INFO:  Success
ResourceManager[10006]: 2009/04/15_15:59:45 info: Running /etc/ha.d/resource.d/IPaddr 141.108.26.31/24/eth0 stop
In IP Stop
SIOCDELRT: No such process
IPaddr[10374]:  2009/04/15_15:59:45 INFO: ifconfig eth0:0 down
IPaddr[10345]:  2009/04/15_15:59:45 INFO:  Success
INFO:  Success
heartbeat[9993]: 2009/04/15_15:59:45 info: All HA resources relinquished.
heartbeat[8025]: 2009/04/15_15:59:45 WARN: 1 lost packet(s) for [afsitfs4.roma1.infn.it] [50:52]
heartbeat[8025]: 2009/04/15_15:59:45 info: No pkts missing from afsitfs4.roma1.infn.it!
...
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBREAD process 8030 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBWRITE process 8031 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBREAD process 8032 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBWRITE process 8033 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBREAD process 8034 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBFIFO process 8028 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBWRITE process 8029 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8029 exited. 7 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8028 exited. 6 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8031 exited. 5 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8030 exited. 4 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8032 exited. 3 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8034 exited. 2 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8033 exited. 1 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: afsitfs3.roma1.infn.it Heartbeat shutdown complete.
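
To see at a glance in which order the resources were stopped (and whether an "afs" stop shows up at all), the "Running ... stop" lines can be pulled out of ha-debug. What follows is only a rough sketch: the line format is assumed from the excerpt above and nothing else, and the names resource_actions and SAMPLE_LOG are made up for the illustration.

```python
import re

# Matches ResourceManager lines such as:
#   ResourceManager[10006]: 2009/04/15_15:59:45 info: Running <script> <args> stop
# The pattern is an assumption based on the log excerpt only.
RUN_LINE = re.compile(
    r"ResourceManager\[\d+\]:\s+(\S+)\s+info: Running (\S+)\s+(?:(.*?)\s+)?(start|stop)\s*$"
)

# Sample lines copied from the excerpt (hypothetical variable name).
SAMPLE_LOG = """\
ResourceManager[10006]: 2009/04/15_15:59:45 info: Running /etc/ha.d/resource.d/Filesystem /dev/AFS/sda3 /vicepa/ xfs stop
Filesystem[10267]: 2009/04/15_15:59:45 INFO: Running stop for /dev/AFS/sda3 on /vicepa
ResourceManager[10006]: 2009/04/15_15:59:45 info: Running /etc/ha.d/resource.d/IPaddr 141.108.26.31/24/eth0 stop
heartbeat[9993]: 2009/04/15_15:59:45 info: All HA resources relinquished.
"""


def resource_actions(log_text):
    """Return (timestamp, resource, args, action) tuples in log order."""
    actions = []
    for line in log_text.splitlines():
        match = RUN_LINE.search(line)
        if match is None:
            continue  # not a ResourceManager "Running ..." line
        timestamp, script, args, action = match.groups()
        resource = script.rsplit("/", 1)[-1]  # keep only the script name
        actions.append((timestamp, resource, args or "", action))
    return actions


for timestamp, resource, args, action in resource_actions(SAMPLE_LOG):
    print(timestamp, action, resource, args)
```

Running it over the whole ha-debug file instead of the sample should list every start and stop in the order heartbeat issued them.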

Thanks

cristina


On Apr 15, 2009, at 3:16 PM, Dejan Muhamedagic wrote:

Ciao,

On Wed, Apr 15, 2009 at 01:37:40PM +0200, Cristina Bulfon wrote:

On Apr 15, 2009, at 1:18 PM, Dejan Muhamedagic wrote:

Ciao,

On Wed, Apr 15, 2009 at 12:53:41PM +0200, Cristina Bulfon wrote:
Ciao Dejan,

I keep going back and forth on this item :-)
I moved to version 2.1.4 and back to V1 style... I no longer use
DRBD, just the mount.

Do you need drbd?

No. When I first started using heartbeat I couldn't manage the
filesystem mount with heartbeat,
so I used DRBD as a workaround. I don't need it since my devices are visible
through the SAN.

OK. Make sure that you also configure fencing/stonith!

So the haresources file is as follows:

afsitfs3.roma1.infn.it  IPaddr::141.108.26.31/24/eth0
afsitfs3.roma1.infn.it   Filesystem::/dev/AFS/sda3::/vicepa::xfs
afsitfs3.roma1.infn.it   Filesystem::/dev/AFS/sda1::/usr/afs::ext3
afsitfs3.roma1.infn.it  141.108.26.31   afs

When I put the master node in standby or stop heartbeat, the
following happens:

- it tries to umount the filesystems before stopping "afs"..

Isn't it afs stop before filesystem?

That is the problem, and I don't understand why. It seems that
the stop is performed in the same order as the start.

That can't be. Really. Can't recall anymore how v1 works, perhaps
it looks at the status before deciding whether to stop a
resource.

umount: /vicepa: device is busy
umount: /vicepa: device is busy
Filesystem[3427]:       2009/04/14_09:16:52 ERROR: Couldn't unmount
/vicepa; trying cleanup with SIGTERM
/vicepa:

This may be normal, i.e. there could be processes using the
filesystem, though typically there are only applications which
depend on the filesystem (in this case afs) which should be
doing something there. If this is a concern, you should check
which processes have files open over there (fuser,lsof).
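
If neither tool is at hand, roughly the same information can be read from /proc on Linux. This is a simplified sketch, not a replacement for fuser or lsof: it only looks at each process's working directory and open file descriptors (memory-mapped files are ignored), it needs root to see other users' processes, and the function name processes_using and its proc_root parameter are made up for the example.

```python
import os


def processes_using(mount_point, proc_root="/proc"):
    """Map pid -> paths under mount_point held via cwd or an open fd."""
    mount_point = os.path.abspath(mount_point)
    prefix = mount_point.rstrip("/") + "/"
    users = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are processes
        pid_dir = os.path.join(proc_root, entry)
        links = [os.path.join(pid_dir, "cwd")]
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited meanwhile, or permission denied
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue  # not readable or already gone
            if target == mount_point or target.startswith(prefix):
                users.setdefault(int(entry), []).append(target)
    return users
```

Calling processes_using("/vicepa") just before the failover test should return an empty dict if nothing is holding the filesystem; any pid that does show up is a candidate for the "device is busy" error.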

With version 2.1.3 I didn't see any messages of that kind; everything in
V1 style was fine.

I suspect that the afs RA is not working correctly, in particular
the status operation.
I will take a look

thanks cristina

Thanks,

Dejan
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


