On Thursday, May 31, 2012 at 4:58 PM, Noah Watkins wrote:
> 
> On May 31, 2012, at 3:39 PM, Greg Farnum wrote:
> > > 
> > > Never mind my last comment. Hmm, I've seen this, but very rarely.
> > Noah, do you have any leads on this? Do you think it's a bug in your Java 
> > code or in the C/C++ libraries?
> 
> I _think_ this is because the JVM uses its own threading library, and Ceph 
> assumes pthreads and pthread-compatible mutexes. Is that assumption about 
> Ceph correct? That would explain the error that looks like Mutex::lock(bool) 
> being referenced for context during the segfault. To verify this, all that 
> is needed is some synchronization added to the Java code.
I'm not quite sure what you mean here. Ceph is definitely using pthread 
threading and mutexes, but I don't see how the use of a different threading 
library can break pthread mutexes (which are just using the kernel futex stuff, 
AFAIK).
But I admit I'm not real good at handling those sorts of interactions, so maybe 
I'm missing something?
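
A minimal sketch of the "add some synchronization to the Java" experiment 
suggested above, in case it helps frame the test. NativeClient, 
SerializedClient and SerializedClientDemo are hypothetical names, not the 
real wrapper classes, and the "native" call is a plain Java stand-in. The 
sketch only shows the idea: funnel every call through one lock so that at 
most one Java thread is inside the native code at a time.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a JNI-backed client; not the real wrapper API.
interface NativeClient {
    void call();
}

// Funnels every call through one lock so that at most one Java thread is
// inside the (stand-in) native code at any time.
final class SerializedClient implements NativeClient {
    private final ReentrantLock lock = new ReentrantLock();
    private final NativeClient delegate;

    SerializedClient(NativeClient delegate) {
        this.delegate = delegate;
    }

    @Override
    public void call() {
        lock.lock();
        try {
            delegate.call();
        } finally {
            lock.unlock();
        }
    }
}

final class SerializedClientDemo {
    // Returns the highest number of threads observed inside the delegate at once.
    static int maxConcurrentCalls(int threads, int callsPerThread) {
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        // Fake "native" call that records how many threads are in it.
        NativeClient fake = () -> {
            int now = inside.incrementAndGet();
            maxSeen.accumulateAndGet(now, Math::max);
            Thread.yield(); // widen the window for overlap if the lock were missing
            inside.decrementAndGet();
        };
        NativeClient client = new SerializedClient(fake);
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all workers together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int j = 0; j < callsPerThread; j++) {
                    client.call();
                }
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent calls: " + maxConcurrentCalls(8, 1000));
    }
}
```

If the segfault disappears once the real wrapper calls are serialized like 
this, that points at concurrent entry into the library rather than at the 
mutex implementation itself. It would not by itself confirm or rule out the 
signal handler theory.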

> There are only two segfaults that I've ever encountered, one in which the C 
> wrappers are used with an unmounted client, and the error Nam is seeing 
> (although they could be related). I will re-submit an updated patch for the 
> former, which should rule that out as the culprit.
> 
> Nam: where are you grabbing the Java patches from? I'll push some updates.
> 
> 
> The only other scenario that comes to mind is related to signaling:
> 
> The RADOS Java wrappers suffered from an interaction between the JVM and 
> RADOS client signal handlers, in which either the JVM or RADOS would replace 
> the handlers for the other (not sure which order). Anyway, the solution was 
> to link in the JVM's libjsig.so signal-chaining library. This might be the same 
> thing we are seeing here, but I'm betting it is the first theory I mentioned.
Hmm. I think that's an issue we've run into but I thought it got fixed for 
librados. Perhaps I'm mixing that up with libceph, or just pulling past 
scenarios out of thin air. It never manifested as Mutex count bugs, though!
-Greg
