Hi,

We are seeing some strange behavior from pvfs-2.8.1 clients running
on some SLES 10 SP1 nodes.

The pvfs2 clients can mount the pvfs2 file system with no problems. We
then start an MPI job that runs on a small number of nodes. The problem
happens when we try to kill the MPI job: as soon as we send the kill
signal, the pvfs2-client-core daemon dies on several of our pvfs2 client
nodes. Here is the process list and the client log from one of them:

hpcp6671:~ # ps -ef |grep pvfs
root     25767     1  0 12:21 ?        00:00:00 /bphpc5/vol0/salmr0/opt/pvfs-2.8.1/x86_64/sles10sp1/sbin/pvfs2-client -p /bphpc5/vol0/salmr0/opt/pvfs-2.8.1/x86_64/sles10sp1/sbin/pvfs2-client-core
root     16117 25767  0 15:02 ?        00:00:00 [pvfs2-client-co]



hpcp6671:~ # cat /tmp/pvfs2-client.log 
[E 12:21:35.567169] PVFS Client Daemon Started.  Version 2.8.1
[D 12:21:35.567434] [INFO]: Mapping pointer 0x2acdf7aa3000 for I/O.
[D 12:21:35.579256] [INFO]: Mapping pointer 0x2acdf8ea5000 for I/O.
[E 15:02:54.988860] PVFS2 client: signal 11, faulty address is 0x41d5, from 0x408d81
[E 15:02:54.989282] [bt] pvfs2-client-core [0x408d81]
[E 15:02:54.989294] [bt] pvfs2-client-core [0x408d81]
[E 15:02:54.989302] [bt] pvfs2-client-core(main+0xbc3) [0x40a173]
[E 15:02:54.989309] [bt] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2acdf788b154]
[E 15:02:54.989315] [bt] pvfs2-client-core [0x403519]
[E 15:02:54.991351] Child process with pid 25768 was killed by an uncaught signal 6
[E 15:02:54.993980] PVFS Client Daemon Started.  Version 2.8.1
[D 15:02:54.994242] [INFO]: Mapping pointer 0x2b94619a2000 for I/O.
[D 15:02:55.008318] [INFO]: Mapping pointer 0x2b9462da4000 for I/O.
[E 15:02:55.312456] Got an unrecognized/unimplemented vfs operation of type ff000000.
[E 15:02:55.312497] Post of op: PVFS_VFS_OP_INVALID failed!


Any ideas?

Thanks,
Rene
