Sam,
I'm not sure whether the other problems were PVFS2-related. The machine has a local 3ware RAID disk (2TB, running LVM+reiserfs), and at some point it went into read-only mode. After that disk was unmounted, the kernel refused to mount it again, complaining about an unknown fs type. At that point the machine had to be manually power cycled. At the next boot it replayed a lot of reiserfs logs. We noticed the problem when some user processes died for no apparent reason, but they used both PVFS2 and reiserfs. My wild guess would be that some race condition (maybe in pvfs2) started bit rot in the kernel fs code that propagated with time, but I'll be the first to question speculation like that.
Thanks,
--andrew

On Jun 3, 2008, at 6:41 PM, Sam Lang wrote:

Hi Andrew,

Were you seeing any other problems with the PVFS volume before the unmount? Did a directory listing hang or anything like that?

I've included a description of the problem in case other PVFS developers have some ideas. The bug message is saying that the current directory (.) has a reference count of 1. It's a weird place to see that error, though. We just added the current (.) and parent (..) dirs to a readdir call through filldir. Also, the kernel code still cleans up the dentry after reporting that message, and the pvfs2 volume gets unmounted, so I'm curious as to how this causes the deterioration over time. Finally, we never increment the d_count of a dentry ourselves, so somehow the interaction with the kernel interfaces is causing the d_count to get above 1 (it starts out at 1) and then get decremented later.

-sam

_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
