I finally narrowed it down. It turns out we had a problem merging the previous release, but it did not show up because we never got a chance to test it. Sam added an op_release in namei.c to fix a kmem_cache leak, and it was merged in twice without warning. Taking out the duplicate fixed the problem.
Bart.

On Tue, Jul 29, 2008 at 8:03 AM, Phil Carns <[EMAIL PROTECTED]> wrote:

> I'm having a hard time thinking of anything specific that would have
> impacted this. You could maybe try to narrow it down some by taking a diff
> of just the src/kernel/linux-2.6 directory and apply that to a 2.7.1 tree to
> test and see if it is something specifically in the kernel module code.
>
> -Phil
>
> Bart Taylor wrote:
>
>> I ran the test the same way you mentioned - outside of the LTP framework -
>> and still had the problem. I have applied the patch that fixed the rename06
>> test as well as the kernel buffer overflow fix from a few days ago and still
>> have the problem.
>>
>> I did a CVS export of head this morning and used the same configure and
>> build as last time. I ran the open file test against a file system created
>> from head and against a 271 file system (with some recent patches) and both
>> tests succeed, so it seems like the fix is somewhere between the 271 release
>> and head, but I am not sure where. Do you have an idea where it might be
>> lurking?
>>
>> Bart.
>>
>> On Fri, Jul 25, 2008 at 7:16 AM, Phil Carns <[EMAIL PROTECTED]> wrote:
>>
>>     Phil Carns wrote:
>>
>>         Bart Taylor wrote:
>>
>>             I am having a problem with an LTP test from the 20080630
>>             set of LTP tests. The 'openfile01' test does 10 threaded
>>             opens of 10 files. It is attached in case you need a copy.
>>             The test completes successfully, but an 'ls' command
>>             immediately after that hangs and cannot be killed.
>>             Eventually the node hangs as well. Any command that
>>             touches the file system will trigger the problem.
>>
>>             We also tried this with the 2.7.1 release tarball and see
>>             the same problem. A single node file system running RHEL4
>>             and a 2.6.9-67 kernel. The client was on the same node.
>>
>>             Here is the configure line used:
>>
>>                 ./configure --with-kernel=/lib/modules/`uname -r`/build
>>
>>             and how the client was started:
>>
>>                 ./pvfs2-client -p ./pvfs2-client-core
>>
>>             The fs.conf file is attached.
>>
>>             The client debug mask was set to 'all', and
>>             /proc/sys/pvfs2/debug had a value of 32767. But once the
>>             'ls' command was issued, there were no log messages.
>>
>>             Does anyone else see this error?
>>
>>             Bart.
>>
>>         Are you able to reproduce this running openfile by itself
>>         after a fresh boot? It looks like openfile operates on a file
>>         in the current working directory, so I have been trying to run
>>         it like this:
>>
>>             <mount pvfs2 on /mnt/pvfs2>
>>             cd /mnt/pvfs2
>>             ~/openfile -f10 -t10
>>             ls -alh
>>
>>         So far I haven't had any trouble with that particular
>>         combination. I'm running it on a centos4 box with a very
>>         similar kernel. The openfile tests looks fairly innocent- with
>>         those arguments each of 10 separate threads open the same
>>         single file 10 times (for a total of 100 file descriptors open
>>         to the same file) if I understand correctly.
>>
>>         If I try to run a full LTP test, however, I do have other
>>         problems. In particular the rename06 test hangs. I can trigger
>>         that one by itself as follows:
>>
>>             export TMPDIR=/mnt/pvfs2
>>             ~/rename06
>>
>>         The same suite of tests runs fine on a 2.6.24 kernel and a
>>         trunk build of PVFS. I'm not sure yet if the difference is
>>         between pvfs versions or between kernel versions.
>>
>>     The rename06 test passes with pvfs trunk; I think that particular
>>     problem has already been fixed. I still haven't figured out why
>>     openfile01 would be a problem, though.
>>
>>     -Phil
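For anyone reading the thread later without the attached test, here is a rough sketch of the access pattern Phil describes for openfile (10 threads each opening the same file 10 times, 100 descriptors held at once). It is an illustration only: the real LTP test is written in C, the Python function names below are hypothetical, and Phil's reading of the -f10 -t10 arguments was itself hedged with "if I understand correctly".

```python
import os
import tempfile
import threading


def open_many(path, opens_per_thread, held, lock):
    """Open `path` repeatedly and keep every descriptor open."""
    fds = [os.open(path, os.O_RDONLY) for _ in range(opens_per_thread)]
    with lock:
        held.extend(fds)


def run_openfile_sketch(directory, threads=10, opens_per_thread=10):
    """Hypothetical stand-in for the openfile access pattern.

    Returns how many descriptors to the same file were open at once.
    """
    # Create one scratch file in the directory under test.
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)

    held, lock = [], threading.Lock()
    workers = [
        threading.Thread(target=open_many,
                         args=(path, opens_per_thread, held, lock))
        for _ in range(threads)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    count = len(held)

    # Closing the descriptors is the step that reaches the file
    # system's release handling.
    for f in held:
        os.close(f)
    os.unlink(path)
    return count


if __name__ == "__main__":
    print(run_openfile_sketch(tempfile.gettempdir()))
```

To mirror the steps in the thread, pass the PVFS mount point (for example /mnt/pvfs2) as the directory and run ls -alh there afterwards.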
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
