This was a bug in the pvfs fsck app. I've attached a patch that fixes the problem.

-sam

Attachment: fsck-fix.patch
Description: Binary data




On Apr 25, 2008, at 5:02 PM, Andrew Pochinsky wrote:

Hi All,
I'm playing with 2.7.1 and have encountered the following problem. First, the setup. There are 3 I/O servers, all of which are also metadata servers. Each server has 14TB of disk to host pvfs2. The servers are running openSUSE 10.3 with a 2.6.25 kernel on x86-64 hardware (4-way Xeon) with 8GB of memory. The underlying file system for pvfs2 storage is 3ware RAID 6 -> LVM -> reiserfs (the only way I could find to make all 14TB visible as one file system). One client talks to this setup with kmod.

During the tests, some data was moved to pvfs2, directories were created, and files were moved and deleted; in a word, the usual random file operations one would expect a user to perform. At some point the client was untarring a bunch of files to a mounted pvfs2, and was rebooted and/or the tar stream was interrupted. Since then, pvfs2 complains about an inconsistent state. If I run pvfs2-fsck, it tries to repair the file system but fails on an assertion:

test:~ # pvfs2-fsck -p -y -m /data/pvfs-2.7.1-test
....
# third pass: moving orphaned sub trees and files to lost+found.
# trying to salvage Metafile 4611686018427387386.
* creating new reference to lostfile.4611686018427387386 (4611686018427387386) in 1537228672809129301.
PVFS_mgmt_create_dirent: No such file or directory (error class: 0)
pvfs2-fsck: src/apps/admin/pvfs2-fsck.c:915: fill_lost_and_found: Assertion `ret == 0' failed.
Aborted
test:~ #

Needless to say, there is a lost+found on it:

test:~ # ls -la /data/pvfs-2.7.1-test3
total 4308896
drwxr-xr-t 1 root root       4096 Aug 15  2007 .
drwxr-xr-x 4 root root        104 Apr 18 16:58 ..
drwxr-xr-x 1 3334 3334       4096 Dec 14 17:55 avp
drwxrwxrwx 1 root root       4096 Apr 18 15:58 lost+found
drwxr-xr-x 1 root root       4096 Apr 18 18:37 test-1
drwxr-xr-x 1 root root       4096 Apr 24 02:19 test-2
drwxr-xr-x 1 root root       4096 Apr 24 12:25 test-P0
-rw-r--r-- 1 root root 4363677696 Apr 24 02:14 x4
-rw-r--r-- 1 root root   48601689 Apr 24 02:20 y

I'm wondering if anyone has seen a similar thing. The file system has about 114GB of test data on it and could be reformatted, but if its current state is useful to the developers for any forensics, I'd be glad to share it.

Thanks,
--andrew

_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users



