I'm glad that cleaned up the problem. From what I saw, the statfs calls use handle counts that are maintained in memory. On a restart of the servers, the handle lists in memory are regenerated using the same calls the iterate-handles management calls use.
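To make the distinction concrete, here is a toy Python model of a server that answers statfs-style queries from a cached in-memory count while answering iterate-handles-style queries by enumerating what is actually stored. This is not PVFS2 code: the class, the method names, and the skipped cache update are all made up for illustration, and the skipped update is only an assumed way to produce drift.

```python
# Illustrative model only, not PVFS2 code. All names are hypothetical.

class HandleStore:
    """Toy server that tracks its handles in two ways."""

    def __init__(self, handles):
        self._handles = set(handles)              # what is actually stored
        self._cached_count = len(self._handles)   # in-memory count (statfs-style)

    def statfs_count(self):
        # Fast path: report the cached in-memory count without enumerating.
        return self._cached_count

    def iterate_handles(self):
        # Slow path: enumerate the handles that are really present.
        return sorted(self._handles)

    def remove(self, handle, update_cache=True):
        # update_cache=False simulates an assumed bug where the cached
        # count is not adjusted when a handle goes away.
        if handle in self._handles:
            self._handles.remove(handle)
            if update_cache:
                self._cached_count -= 1

    def restart(self):
        # On restart, rebuild the cached count from the same enumeration
        # that iterate_handles() uses, so the two views agree again.
        self._cached_count = len(self.iterate_handles())


def check(store):
    """Return (received, expected) the way an fsck-like comparison sees them."""
    return len(store.iterate_handles()), store.statfs_count()


if __name__ == "__main__":
    store = HandleStore(range(10))
    store.remove(3, update_cache=False)   # cached count is now stale
    print("before restart:", check(store))
    store.restart()
    print("after restart: ", check(store))
```

In this model a restart recounts from the enumeration, which is why the two numbers agree again afterwards.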
I don't have an answer for how the in-memory handle counts got out of sync. If you run into other fsck issues, feel free to ping the list.

Thanks,
Michael

On Mon, Jan 17, 2011 at 09:24:48AM -0500, Bill Wichser wrote:
> The reboot of the I/O servers has now cleared up this failure. There
> is still some inconsistency in the filesystem while running
> pvfs2-fsck but I believe that I can clean these up.
>
> Thanks,
> Bill
>
> Michael Moore wrote:
> > Hi Bill,
> >
> > Sorry for the delay. The difference appears to be between what the
> > management iterate handles call returns and what statfs returns to fsck.
> > I'm looking now to get a better understanding of how statfs and the trove
> > ledger stuff gets its counts versus how iterate handles counts them.
> >
> > In the meantime, have the server processes been restarted since this
> > behavior started occurring? If not, is that a possibility?
> >
> > Sorry again for the delay in getting back with you on this issue.
> >
> > Thanks,
> > Michael
> >
> > On Tue, Jan 04, 2011 at 01:48:20PM -0500, Bill Wichser wrote:
> >
> >> I've deleted those files with the native pvfs2-rm command which informed
> >> me to run pvfs2-fsck. Running pvfs2-validate turned up a number more
> >> which I removed. So there is nothing to pvfs2-viewdist on.
> >>
> >> FWIW I'm running a meta on the head and the I/O servers on 16 compute
> >> nodes, version 2.8.2
> >>
> >> [root@della3 bill]# pvfs2-stat /scratch/pvfs2
> >> -------------------------------------------------------
> >> File Name : /scratch/pvfs2
> >> Relative Name : /
> >> fs ID : 1922795883
> >> Handle : 1048576
> >> Mask : 504000177
> >> Permissions : 777
> >> Type : Directory
> >> Size : 4096
> >> Owner : 0 (root)
> >> Group : 0 (root)
> >> atime : 1294130281 (Tue Jan 4 03:38:01 2011)
> >> mtime : 1293499466 (Mon Dec 27 20:24:26 2010)
> >> ctime : 1293499462 (Mon Dec 27 20:24:22 2010)
> >> dir entries : 6
> >>
> >> [root@della3 bill]# pvfs2-validate -d /scratch/pvfs2/
> >> pvfs2-validate starting validation at object [/scratch/pvfs2]
> >> pvfs2-validate done validating object tree at [/scratch/pvfs2]
> >>
> >> [root@della3 bill]# pvfs2-fsck -p -m /scratch/pvfs2
> >> # Current FSID is 1922795883.
> >> Ugh! Server 1, Received 64789 total handles instead of 64792
> >>
> >> So the total handles have changed, as expected because of the removals,
> >> but the difference is the same. Now to be honest, when I made that
> >> filesystem, I didn't run an fsck so it could be a remnant from last
> >> month. I don't know. But we have a bunch of Genomics people wreaking
> >> havoc with those strange files in kernel space. I was able to do a
> >> pvfs2-ls on them (user space) but didn't really pursue, hoping instead
> >> to just make the problem go away!
> >>
> >> Thanks,
> >> Bill
> >>
> >> Michael Moore wrote:
> >>
> >>> Hi Bill,
> >>>
> >>> Can you provide the output of pvfs2-stat on the parent directory
> >>> and affected files and 'pvfs2-viewdist -f <path>' on the affected files?
> >>>
> >>> Do you see any complaints in the server logs related to accessing these
> >>> files?
> >>>
> >>> Michael
> >>>
> >>> On Mon, Jan 03, 2011 at 08:04:02AM -0500, Bill Wichser wrote:
> >>>
> >>>> Having some trouble with my filesystem. There are a few files which did
> >>>> not get written correctly by one of the users and some corruption looks
> >>>> to be present.
> >>>>
> >>>> # ls -lR
> >>>> ./3689_old:
> >>>> total 0
> >>>> ?--------- ? ? ? ? ? clusmax.out
> >>>>
> >>>> ./3764_old:
> >>>> total 0
> >>>> ?--------- ? ? ? ? ? traj.xtc
> >>>>
> >>>> These cannot be removed. In the past, a run of pvfs2-fsck seemed to
> >>>> correct these types of problems but this time all I get is the following
> >>>> message and the fsck terminates. I'm not sure how to correct this.
> >>>> Googling leads me to the source code. Anyone have any suggestions?
> >>>>
> >>>> # pvfs2-fsck -p -v -m /scratch/pvfs2
> >>>> # Current FSID is 1922795883.
> >>>> Ugh! Server 1, Received 64796 total handles instead of 64800
> >>>>
> >>>>
> >>>> Thanks, and Happy New Year to all!
> >>>> Bill
> >>>>
> >>>> _______________________________________________
> >>>> Pvfs2-users mailing list
> >>>> [email protected]
> >>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
