Hi again:

Today I got a trouble report form a user of the cluster, and when I
investigated it, it turns out that the mother superior for his job
rebooted during some untarr'ing operations in preparation for his job.
 I've seen this behavior on the head node of the cluster as well
(although not yet since the upgrade, but only have about 24 hours on
the upgrade so far).

Any ideas?  I've suffered from this problem since the 2.6.x series
when I started using pvfs, although this is the first time I've caught
a compute node.  I believe its pvfs, as normally all that's going on
when it reboots is I/O to the pvfs volume (normally something like
tarring, g/bzip'ing something, or a lot of copy/moves).  I've been on
our headnode when this has happened in the past; there's no kernel
panic information visible; just one second everything appears to be
working, and the next its as if someone hit the reset button on the
computer.

--Jim
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users

Reply via email to