Sam,
I'm not sure whether the other problems were PVFS2-related. The machine
has a local 3ware RAID disk, and it went into read-only mode at some
point (the disk is 2TB and runs LVM+reiserfs). After unmounting that
disk the kernel refused to mount it again complaining about an
unknown fs type. At that point the machine had to be manually power
cycled. At the next boot it replayed a lot of reiserfs logs. We
noticed the problem when some user processes died for no apparent
reason, but they used both PVFS2 and reiserfs. My wild guess would be that
some race condition (maybe in pvfs2) started bit rot in the kernel
fs code that propagated with time, but I'll be the first to question
speculations like that.
Thanks,
--andrew
On Jun 3, 2008, at 6:41 PM, Sam Lang wrote:
Hi Andrew,
Were you seeing any other problems with the PVFS volume before the
unmount? Did a directory listing hang or anything like that?
I've included a description of the problem in case
other PVFS developers have some ideas. The bug message is saying
that the current directory (.) has a reference count of 1. It's a
weird place to see that error though. We just added the current
(.) and parent (..) dirs to a readdir call through filldir. Also,
the kernel code still cleans up the dentry anyway after reporting
that message, and the pvfs2 volume gets unmounted, so I'm curious
as to how this causes the deterioration over time. Also, we don't
ever increment the d_count of a dentry ourselves, so somehow the
interaction with the kernel interfaces is causing the d_count to
get above 1 (it starts out at 1), and then get decremented later.
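
To make the accounting concrete, here is a toy userspace model in C of
the behaviour described above. It is only a sketch: it is not kernel or
PVFS2 code, and every name in it (toy_dentry, toy_get, toy_put,
toy_check_at_umount and the two demo functions) is hypothetical. The
assumption is simply that a count starts at 1, each extra reference
must be paired with a release, and anything left over at unmount gets
reported.

```c
#include <stdio.h>

/* Toy stand-in for a dentry; NOT the kernel's struct dentry. */
struct toy_dentry {
    const char *name;
    int count;              /* plays the role of d_count */
};

/* The creator holds one reference, so the count starts out at 1. */
static struct toy_dentry toy_alloc(const char *name)
{
    struct toy_dentry d = { name, 1 };
    return d;
}

static void toy_get(struct toy_dentry *d) { d->count++; }
static void toy_put(struct toy_dentry *d) { d->count--; }

/* At unmount every reference should be gone; report and return leftovers. */
static int toy_check_at_umount(const struct toy_dentry *d)
{
    if (d->count != 0)
        printf("dentry %s still in use (%d)\n", d->name, d->count);
    return d->count;
}

/* Every get is paired with a put, and the creator drops its reference. */
int toy_demo_balanced(void)
{
    struct toy_dentry d = toy_alloc("balanced");
    toy_get(&d);            /* count 1 -> 2 */
    toy_put(&d);            /* count 2 -> 1 */
    toy_put(&d);            /* creator's reference: 1 -> 0 */
    return toy_check_at_umount(&d);
}

/* The count goes above 1 and comes back down, but one put never happens. */
int toy_demo_leaked(void)
{
    struct toy_dentry d = toy_alloc(".");
    toy_get(&d);            /* count 1 -> 2 */
    toy_put(&d);            /* count 2 -> 1 */
    /* missing toy_put(): one reference is never released */
    return toy_check_at_umount(&d);
}
```

In this model toy_demo_balanced() returns 0 and toy_demo_leaked()
returns 1, which is the shape of the symptom: a single unmatched
reference is enough to leave the count at 1 at unmount. It says nothing
about where in the real code path such an imbalance would come from.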
-sam
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users