> 
> On Fri, 30 Aug 2013, James Harper wrote:
> > I finally got a valgrind memtest hit... output attached below email. I
> > recompiled all of tapdisk and ceph without any -O options (thought I had
> > already...) and it seems to have done the trick.
> 
> What version is this?  The line numbers don't seem to match up with my
> source tree.

0.67.2, but I've peppered it with debug prints, so the line numbers are off.

> > Basically it looks like an instance of AioRead is being accessed after
> > being freed. I need some hints on what API behaviour by the tapdisk
> > driver could be causing this to happen in librbd...
> 
> It looks like refcounting for the AioCompletion is off.  My first guess
> would be premature (or extra) calls to rados_aio_release or
> AioCompletion::release().
> 
> I did a quick look at the code and it looks like aio_read() is carrying a
> ref for the AioCompletion for the entire duration of the function, so it
> should not be disappearing (and taking the AioRead request struct with it)
> until well after where the invalid read is.  Maybe there is an error path
> somewhere that is dropping a ref it shouldn't?
> 

I'll see if I can find a way to track that. It's the c->get() and c->put()
calls that track this, right?
 
The crash seems a little bit different every time, so it could still be
something stomping on memory, e.g. overwriting the ref count or something.
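
A minimal sketch of one way to catch an extra put(), assuming a
hypothetical stand-in class (TrackedCompletion is not librbd's
AioCompletion, and the "who" labels are made up for illustration). Unlike
a real refcounted object it does not delete itself at zero, so a surplus
release is counted and logged instead of turning into a use-after-free:

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for a refcounted completion object.
// It only models the get()/put() bookkeeping discussed above.
class TrackedCompletion {
public:
    TrackedCompletion() : ref_(1), extra_puts_(0) {}  // creator holds one ref

    // Take a reference and log who took it.
    void get(const char *who) {
        int now = ref_.fetch_add(1) + 1;
        std::fprintf(stderr, "get by %-12s ref=%d\n", who, now);
    }

    // Drop a reference. Returns true if this call dropped the last one.
    // Returns false, and records the error, if no references were left.
    bool put(const char *who) {
        int prev = ref_.load();
        // CAS loop so the counter is never decremented below zero.
        while (prev > 0 && !ref_.compare_exchange_weak(prev, prev - 1)) {
        }
        if (prev <= 0) {
            extra_puts_.fetch_add(1);
            std::fprintf(stderr, "EXTRA put by %s (ref already 0)\n", who);
            return false;
        }
        std::fprintf(stderr, "put by %-12s ref=%d\n", who, prev - 1);
        return prev == 1;
    }

    int refs() const { return ref_.load(); }
    int extra_puts() const { return extra_puts_.load(); }

private:
    std::atomic<int> ref_;
    std::atomic<int> extra_puts_;
};

// Simulates the suspected bug: an error path drops a ref it does not own.
// Returns the number of surplus put() calls that were detected.
int simulate_extra_release() {
    TrackedCompletion c;      // ref=1, held by the caller
    c.get("aio_read");        // ref=2, held for the duration of the call
    c.put("aio_read");        // ref=1
    c.put("caller");          // ref=0, the legitimate last release
    assert(c.refs() == 0);
    c.put("error path");      // surplus release, detected and counted
    return c.extra_puts();
}
```

If the log shows balanced get/put pairs and the crash still moves around,
that would point more towards memory corruption than a refcount bug.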

Thanks

James
