Hello Folks,
        We have a Perl script that performs our purges based on the atime 
returned from a stat() call.  Over the weekend, it appears, the script got 
back millions of corrupted or misreported atimes and, lucky for us, unlinked a 
whole bunch of files (on the order of 45 TB).  The only indication that 
anything had happened was the following in the dmesg output of the Lustre 
client that runs the purge script:
*snip*
LustreError: 27061:0:(file.c:3312:ll_inode_revalidate_fini()) failure -2 inode 181017071
LustreError: 27061:0:(file.c:3312:ll_inode_revalidate_fini()) Skipped 19 previous similar messages
LustreError: 13381:0:(file.c:3312:ll_inode_revalidate_fini()) failure -2 inode 161973341
LustreError: 13381:0:(file.c:3312:ll_inode_revalidate_fini()) Skipped 25 previous similar messages
LustreError: 27061:0:(file.c:3312:ll_inode_revalidate_fini()) failure -2 inode 162433196
LustreError: 27061:0:(file.c:3312:ll_inode_revalidate_fini()) Skipped 32 previous similar messages
LustreError: 27061:0:(file.c:3312:ll_inode_revalidate_fini()) failure -2 inode 174530765
LustreError: 27061:0:(file.c:3312:ll_inode_revalidate_fini()) Skipped 33 previous similar messages
*snip*
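
For illustration only, here is a minimal sketch of the kind of guard an atime-based purge could apply before unlinking anything. It is written in Python rather than Perl and is not taken from the script described above. The function names, the one-day clock-skew allowance, and the rule of using the newer of atime and mtime are all assumptions made for the example:

```python
import os
import time

DAY = 86400  # seconds per day


def timestamps_allow_purge(atime, mtime, now, max_age_days):
    """Hypothetical check: True only if both timestamps look sane and old."""
    # A zero or negative timestamp is treated as misreported, not as "very old".
    if atime <= 0 or mtime <= 0:
        return False
    # A timestamp more than a day in the future is also treated as bogus
    # (the one-day allowance for clock skew is an arbitrary assumption).
    if atime > now + DAY or mtime > now + DAY:
        return False
    # Be conservative: a recent mtime overrides an old-looking atime.
    newest = max(atime, mtime)
    return now - newest > max_age_days * DAY


def should_purge(path, max_age_days, now=None):
    """Stat the path and decide; any stat failure means "do not purge"."""
    now = time.time() if now is None else now
    try:
        st = os.lstat(path)
    except OSError:
        # Includes ENOENT (errno 2): never act on a failed stat().
        return False
    return timestamps_allow_purge(st.st_atime, st.st_mtime, now, max_age_days)
```

With a check like this, a file whose atime comes back as zero or otherwise implausible is skipped instead of being treated as long expired.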

Any ideas about, or experience with, misreported atimes under Lustre?


----------------
John White
High Performance Computing Services (HPCS)
(510) 486-7307
One Cyclotron Rd, MS: 50B-3209C
Lawrence Berkeley National Lab
Berkeley, CA 94720

