Rebooting the I/O servers has now cleared up this failure. There
are still some inconsistencies in the filesystem when running pvfs2-fsck,
but I believe that I can clean these up.
Thanks,
Bill
Michael Moore wrote:
Hi Bill,
Sorry for the delay. The difference appears to be between what the
management iterate handles call returns and what statfs returns to fsck.
I'm looking now to get a better understanding of how statfs and the trove
ledger stuff get their counts versus how iterate handles counts them.
In the meantime, have the server processes been restarted since this
behavior started occurring? If not, is that a possibility?
Sorry again for the delay in getting back with you on this issue.
Thanks,
Michael
On Tue, Jan 04, 2011 at 01:48:20PM -0500, Bill Wichser wrote:
I've deleted those files with the native pvfs2-rm command, which told
me to run pvfs2-fsck. Running pvfs2-validate turned up several more,
which I removed. So there is nothing left to run pvfs2-viewdist on.
FWIW I'm running a metadata server on the head node and the I/O servers
on 16 compute nodes, version 2.8.2
[root@della3 bill]# pvfs2-stat /scratch/pvfs2
-------------------------------------------------------
File Name : /scratch/pvfs2
Relative Name : /
fs ID : 1922795883
Handle : 1048576
Mask : 504000177
Permissions : 777
Type : Directory
Size : 4096
Owner : 0 (root)
Group : 0 (root)
atime : 1294130281 (Tue Jan 4 03:38:01 2011)
mtime : 1293499466 (Mon Dec 27 20:24:26 2010)
ctime : 1293499462 (Mon Dec 27 20:24:22 2010)
dir entries : 6
[root@della3 bill]# pvfs2-validate -d /scratch/pvfs2/
pvfs2-validate starting validation at object [/scratch/pvfs2]
pvfs2-validate done validating object tree at [/scratch/pvfs2]
[root@della3 bill]# pvfs2-fsck -p -m /scratch/pvfs2
# Current FSID is 1922795883.
Ugh! Server 1, Received 64789 total handles instead of 64792
So the total handles have changed, as expected because of the removals,
but the mismatch remains (off by 3 now, versus 4 before). Now to be honest, when I made that
filesystem, I didn't run an fsck so it could be a remnant from last
month. I don't know. But we have a bunch of Genomics people wreaking
havoc with those strange files in kernel space. I was able to do a
pvfs2-ls on them (user space) but didn't really pursue it, hoping instead
to just make the problem go away!
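For comparing fsck runs, the gap can be pulled out of that "Ugh!" message with a short script. This is only a sketch: the regular expression is assumed from the two output lines quoted in this thread, not taken from the pvfs2-fsck source, and handle_gap is a made-up helper name.

```python
import re

# Pattern assumed from the pvfs2-fsck lines quoted in this thread;
# it is not taken from the pvfs2-fsck source.
HANDLE_MISMATCH = re.compile(
    r"Server (\d+), Received (\d+) total handles instead of (\d+)"
)


def handle_gap(line):
    """Return (server, received, expected, gap), or None if the line does not match."""
    match = HANDLE_MISMATCH.search(line)
    if match is None:
        return None
    server, received, expected = (int(group) for group in match.groups())
    return server, received, expected, expected - received


if __name__ == "__main__":
    # The two fsck results quoted in this thread, oldest first.
    runs = [
        "Ugh! Server 1, Received 64796 total handles instead of 64800",
        "Ugh! Server 1, Received 64789 total handles instead of 64792",
    ]
    for line in runs:
        print(handle_gap(line))
```

Feeding it the fsck output after each cleanup pass shows whether the gap for a given server is shrinking or just moving with the totals.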
Thanks,
Bill
Michael Moore wrote:
Hi Bill,
Can you provide the output of pvfs2-stat on the parent directory
and affected files and 'pvfs2-viewdist -f <path>' on the affected files?
Do you see any complaints in the server logs related to accessing these
files?
Michael
On Mon, Jan 03, 2011 at 08:04:02AM -0500, Bill Wichser wrote:
Having some trouble with my filesystem. There are a few files which did
not get written correctly by one of the users and some corruption looks
to be present.
# ls -lR
./3689_old:
total 0
?--------- ? ? ? ? ? clusmax.out
./3764_old:
total 0
?--------- ? ? ? ? ? traj.xtc
These cannot be removed. In the past, a run of pvfs2-fsck seemed to
correct these types of problems but this time all I get is the following
message and the fsck terminates. I'm not sure how to correct this.
Googling leads me to the source code. Anyone have any suggestions?
# pvfs2-fsck -p -v -m /scratch/pvfs2
# Current FSID is 1922795883.
Ugh! Server 1, Received 64796 total handles instead of 64800
Thanks, and Happy New Year to all!
Bill
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users