On Oct 8, 2007, at 7:48 PM, Murali Vilayannur wrote:

Hi Sam,
I was able to verify that your latest patch fixes the problem with
the simul test #7, so I went ahead and committed it.

Awesome! I thought Kevin mentioned that it crashed somewhere else.. :(


Also, when the problem actually existed, running simul #7 a bunch and
then trying to unload the kernel module was giving an error:

Ah.. Maybe this was the error he mentioned to me..?


[  806.396608] slab error in kmem_cache_destroy(): cache
`pvfs2_op_cache': Can't free all objects

I did some debugging and it looks like the op cache entry that wasn't
getting released was from a lookup. There is a case where lookup can
return an error that's not ENOENT, and in that case the op entry
doesn't get released.  I've attached a patch that I think fixes the
problem.  Can you verify that this looks ok?
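
The attached patch is not reproduced here. As a rough user-space illustration of the kind of leak described above (not the actual pvfs2 code), the sketch below routes every exit of a lookup through a single release point, so an error other than ENOENT cannot skip the release. The names op_alloc(), op_release(), do_lookup() and the ops_outstanding counter are hypothetical stand-ins for the kmod's op cache handling.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for an entry allocated from pvfs2_op_cache. */
struct op {
    int status;
};

/* Incremented on alloc, decremented on release; nonzero at exit means a leak. */
static int ops_outstanding;

static struct op *op_alloc(void)
{
    struct op *op = calloc(1, sizeof(*op));
    if (op)
        ops_outstanding++;
    return op;
}

static void op_release(struct op *op)
{
    assert(op != NULL);
    ops_outstanding--;
    free(op);
}

/* Pretend to service a lookup; 'result' is the status the server returned. */
static int do_lookup(int result)
{
    struct op *op = op_alloc();
    int ret;

    if (!op)
        return -ENOMEM;

    op->status = result;
    ret = op->status;

    if (ret == -ENOENT) {
        /* Negative lookup: nothing to instantiate, but still release. */
        goto out;
    }
    if (ret < 0) {
        /* Any other error: this is the path that is easy to forget. */
        goto out;
    }
    /* Success: a real lookup would set up the dentry here. */
out:
    op_release(op);
    return ret;
}
```

Calling do_lookup() with 0, -ENOENT and -EIO should leave ops_outstanding at zero in all three cases.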

Awesome! Nice catch! That looks great!

Also, I've seen this
error before on other systems (I think Pete has too) and I'm not sure
it's always from this one case.  Is there a good way to verify that
we're releasing ops (and possibly other cache entries)
appropriately? Just looking for ideas to harden the code in the kmod.

I guess we could embed an extra list_head in each object allocated
off the slab, chain the objects on a private global linked list, and
verify that the list is empty just before kmem_cache_destroy() at
module unload. If it is not empty, we free the remaining elements.
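
A minimal user-space sketch of this tracking idea, under stated assumptions: struct node stands in for the kernel's struct list_head, malloc()/free() stand in for the slab cache, and tracked_alloc(), tracked_free() and tracked_reap() are hypothetical names. There is no locking here; in the kernel module the list would presumably need a spinlock around insert and remove.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Minimal doubly linked list node, standing in for struct list_head. */
struct node {
    struct node *prev, *next;
};

/* Global list of live objects; an empty list points back at itself. */
static struct node live = { &live, &live };

/* Hypothetical tracked object with the list link embedded in it. */
struct tracked_op {
    struct node link;
    int tag;
};

static struct tracked_op *tracked_alloc(int tag)
{
    struct tracked_op *op = malloc(sizeof(*op));
    if (!op)
        return NULL;
    op->tag = tag;
    /* Insert at the head of the live list. */
    op->link.next = live.next;
    op->link.prev = &live;
    live.next->prev = &op->link;
    live.next = &op->link;
    return op;
}

static void tracked_free(struct tracked_op *op)
{
    assert(op != NULL);
    /* Unlink from the live list before freeing. */
    op->link.prev->next = op->link.next;
    op->link.next->prev = op->link.prev;
    free(op);
}

/* Run before destroying the cache: warn about and free any leftovers.
 * Returns the number of leaked objects found. */
static int tracked_reap(void)
{
    int leaked = 0;

    while (live.next != &live) {
        struct tracked_op *op = (struct tracked_op *)
            ((char *)live.next - offsetof(struct tracked_op, link));
        fprintf(stderr, "WARNING: leaked op, tag=%d\n", op->tag);
        tracked_free(op);
        leaked++;
    }
    return leaked;
}
```

In the module, the equivalent of tracked_reap() would run in the exit path right before kmem_cache_destroy(), so the cache is empty by the time it is destroyed and the leak is still reported in the log.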

Yeah and write some big warnings to the log about leaking cache entries. Are there any interfaces to the kmem_cache that allow you to query the entries you've allocated so that we don't have to do that ourselves?

To find out where we are leaking, we could store the return address of
the caller of the alloc() function in the object and use that to track
down the offending leaks.
What do you think?
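
A small user-space sketch of recording the allocation site, assuming GCC or Clang (it relies on the __builtin_return_address() builtin). The struct traced_op type and the traced_alloc() and traced_report() helpers are hypothetical. The allocator is marked noinline so that the recorded address really belongs to its caller.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical object that remembers where it was allocated. */
struct traced_op {
    void *alloc_site;   /* return address of the caller of traced_alloc() */
};

/* noinline: if this were inlined, the "return address" would be that of
 * some outer function rather than the allocation site. */
__attribute__((noinline))
static struct traced_op *traced_alloc(void)
{
    struct traced_op *op = calloc(1, sizeof(*op));
    if (!op)
        return NULL;
    op->alloc_site = __builtin_return_address(0);
    return op;
}

/* Print a warning naming the address that allocated a leaked object. */
static void traced_report(const struct traced_op *op)
{
    assert(op != NULL);
    fprintf(stderr, "WARNING: op %p leaked, allocated from %p\n",
            (const void *)op, op->alloc_site);
}
```

The printed address can then be mapped back to a function with the symbol table of the binary or module that performed the allocation.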

Sounds good to me.
-sam
Thanks!
Murali


-sam



On Aug 16, 2007, at 12:59 PM, Murali Vilayannur wrote:

Kevin,
Instead of the call to d_add(), can you replace it by a
pvfs2_d_splice_alias() with the same parameters as before and
recompile/reload and see if that fixes the crash.
Something like the attached..
thanks,
Murali

On 8/16/07, Kevin Harms <[EMAIL PROTECTED]> wrote:
Murali,

        I tried the patch (applied it to the 2.6.3 source). I still get
crashes on one of the machines.
        I sent you an email with the dmesg output.

Kevin

<dcache.patch>
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers





