Ciao.
It happens when I shut down heartbeat manually. The strange thing
is that it happens only on the master node. But if you say it's
normal, that's fine with me.
Now I am going to configure the resource monitor, and if I need some
help I will post an email with a different subject.
Thanks a lot for helping me,
cristina
On Apr 16, 2009, at 4:43 PM, Dejan Muhamedagic wrote:
Ciao Cristina,
On Thu, Apr 16, 2009 at 09:28:59AM +0200, Cristina Bulfon wrote:
Ciao,
I've also solved the problem with V2 style. The problem was that
the AFS script was starting/stopping continuously.
The solution was to add a check of the daemon (the "status" option)
to the script; this way the script should be LSB compliant.
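For readers hitting the same start/stop loop: the LSB init script
convention expects the "status" action to exit with 0 when the service
is running and 3 when it is not (1 means dead, but the pid file still
exists). The fragment below is only a minimal sketch of such a check,
not the AFS script from this thread; the function name lsb_status and
the pid file path are hypothetical.

```shell
#!/bin/sh
# Minimal sketch of an LSB-style "status" check.
# lsb_status and the pid file path below are hypothetical names,
# not taken from the real AFS init script.

# Return codes follow the LSB init script convention:
#   0 = running, 1 = dead but pid file exists, 3 = not running
lsb_status() {
    pidfile=$1
    # No pid file at all: treat the service as not running.
    [ -f "$pidfile" ] || return 3
    pid=$(cat "$pidfile" 2>/dev/null)
    # kill -0 sends no signal; it only tests whether the process exists
    # (and whether we are allowed to signal it, so run this as root).
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        return 0
    fi
    return 1
}

# Example dispatch as it could appear at the end of an init script:
# case "$1" in
#     status) lsb_status /var/run/afs-example.pid; exit $? ;;
# esac
```

If "status" always reports success or always reports failure, the
cluster manager cannot tell whether the resource is really running,
which would be consistent with the continuous start/stop described
above.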
Good.
What still remains is the problem with killing HBREAD etc.
Does anybody have a clue?
Looks like a normal shutdown to me. Or is it that heartbeat shuts
down by itself?
Thanks,
Dejan
Thanks
cristina
On Apr 15, 2009, at 4:06 PM, Cristina Bulfon wrote:
Ciao,
it seems to be solved:
it was my fault: I tried to mount an xfs filesystem instead of
ext3. I corrected the typo and ran the filesystem check.
Everything seems to be working and I no longer get the
"umount .. busy" error, but when I tried to simulate the active node
going down and a switch to the passive one, I got the following in
the ha-debug log file on the active node:
ResourceManager[10006]: 2009/04/15_15:59:45 info: Running /etc/ha.d/resource.d/Filesystem /dev/AFS/sda3 /vicepa/ xfs stop
Filesystem[10267]: 2009/04/15_15:59:45 INFO: Running stop for /dev/AFS/sda3 on /vicepa
Filesystem[10267]: 2009/04/15_15:59:45 INFO: Trying to unmount /vicepa
Filesystem[10267]: 2009/04/15_15:59:45 INFO: unmounted /vicepa successfully
Filesystem[10256]: 2009/04/15_15:59:45 INFO: Success
INFO: Success
ResourceManager[10006]: 2009/04/15_15:59:45 info: Running /etc/ha.d/resource.d/IPaddr 141.108.26.31/24/eth0 stop
In IP Stop
SIOCDELRT: No such process
IPaddr[10374]: 2009/04/15_15:59:45 INFO: ifconfig eth0:0 down
IPaddr[10345]: 2009/04/15_15:59:45 INFO: Success
INFO: Success
heartbeat[9993]: 2009/04/15_15:59:45 info: All HA resources relinquished.
heartbeat[8025]: 2009/04/15_15:59:45 WARN: 1 lost packet(s) for [afsitfs4.roma1.infn.it] [50:52]
heartbeat[8025]: 2009/04/15_15:59:45 info: No pkts missing from afsitfs4.roma1.infn.it!
...
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBREAD process 8030 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBWRITE process 8031 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBREAD process 8032 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBWRITE process 8033 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBREAD process 8034 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBFIFO process 8028 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: killing HBWRITE process 8029 with signal 15
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8029 exited. 7 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8028 exited. 6 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8031 exited. 5 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8030 exited. 4 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8032 exited. 3 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8034 exited. 2 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: Core process 8033 exited. 1 remaining
heartbeat[8025]: 2009/04/15_15:59:47 info: afsitfs3.roma1.infn.it Heartbeat shutdown complete.
Thanks
cristina
On Apr 15, 2009, at 3:16 PM, Dejan Muhamedagic wrote:
Ciao,
On Wed, Apr 15, 2009 at 01:37:40PM +0200, Cristina Bulfon wrote:
On Apr 15, 2009, at 1:18 PM, Dejan Muhamedagic wrote:
Ciao,
On Wed, Apr 15, 2009 at 12:53:41PM +0200, Cristina Bulfon wrote:
Ciao Dejan,
I am going back and forth on this item :-)
I moved to the 2.14. version and back to V1 style. I don't use
DRBD anymore, just the mount.
Do you need drbd?
No. When I first started using heartbeat I couldn't manage the
filesystem mount with heartbeat, so I used DRBD as a workaround.
I don't need it, since my devices are visible through the SAN.
OK. Make sure that you also configure fencing/stonith!
So the haresources file is as follows:
afsitfs3.roma1.infn.it IPaddr::141.108.26.31/24/eth0
afsitfs3.roma1.infn.it Filesystem::/dev/AFS/sda3::/vicepa::xfs
afsitfs3.roma1.infn.it Filesystem::/dev/AFS/sda1::/usr/afs::ext3
afsitfs3.roma1.infn.it 141.108.26.31 afs
When I put the master node in standby or stop heartbeat, the
following happens:
- it tries to umount the filesystems before stopping "afs"
Isn't afs stopped before the filesystem?
That is the problem, and I don't understand why. It seems that
the stop is performed in the same order as the start.
That can't be. Really. Can't recall anymore how v1 works, perhaps
it looks at the status before deciding whether to stop a
resource.
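A note on ordering for readers: to the best of my knowledge, in v1
(haresources) mode the resources listed on a single line form one
group that is started left to right and stopped right to left.
Whether that explains the behaviour described above was not confirmed
in this thread, and I would not rely on any ordering between separate
lines. The sketch below only illustrates the right-to-left rule for
one line; stop_order is a hypothetical helper, not part of heartbeat,
and the single-line layout is an assumption, not the poster's file.

```shell
#!/bin/sh
# Sketch: print the order in which the resources of ONE haresources
# line would be stopped, assuming v1 semantics (start left to right,
# stop right to left). stop_order is a hypothetical helper.
stop_order() {
    # Split the line into words: the first word is the node name,
    # the remaining words are the resources.
    set -- $1
    [ $# -gt 0 ] && shift
    reversed=""
    for res in "$@"; do
        # Prepend each resource so the last one started comes first.
        reversed="$res${reversed:+ $reversed}"
    done
    printf '%s\n' "$reversed"
}

# Hypothetical single-line layout: IP first, then filesystem, then afs.
stop_order "afsitfs3.roma1.infn.it IPaddr::141.108.26.31/24/eth0 Filesystem::/dev/AFS/sda3::/vicepa::xfs afs"
# prints: afs Filesystem::/dev/AFS/sda3::/vicepa::xfs IPaddr::141.108.26.31/24/eth0
```

With the resources on one line like this, afs would be stopped before
its filesystem is unmounted.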
umount: /vicepa: device is busy
umount: /vicepa: device is busy
Filesystem[3427]: 2009/04/14_09:16:52 ERROR: Couldn't unmount /vicepa; trying cleanup with SIGTERM
/vicepa:
This may be normal, i.e. there could be processes using the
filesystem, though typically there are only applications which
depend on the filesystem (in this case afs) which should be
doing something there. If this is a concern, you should check
which processes have files open over there (fuser,lsof).
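The check suggested above can be sketched as follows. This is only an
illustration: who_uses_mount is a hypothetical helper name, and it
simply wraps whichever of fuser or lsof is installed.

```shell
#!/bin/sh
# Sketch: show which processes keep a mount point busy.
# who_uses_mount is a hypothetical helper name.
who_uses_mount() {
    if [ $# -ne 1 ]; then
        echo "usage: who_uses_mount <mountpoint>" >&2
        return 2
    fi
    if command -v fuser >/dev/null 2>&1; then
        # -m: all processes using the filesystem mounted there
        # -v: verbose listing (user, pid, access type, command)
        fuser -vm "$1"
    elif command -v lsof >/dev/null 2>&1; then
        # Given a mount point, lsof lists open files on that filesystem.
        lsof "$1"
    else
        echo "neither fuser nor lsof is installed" >&2
        return 127
    fi
}

# Example: who_uses_mount /vicepa
```

Run it as root just before stopping heartbeat; any process listed
besides the afs daemons is a candidate for the "device is busy" error.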
With version 2.1.3 I didn't see any messages of that kind;
everything in V1 style was fine.
I suspect that the afs RA is not working correctly, in particular
the status operation.
I will take a look
thanks cristina
Thanks,
Dejan
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems