On Monday, June 4, 2012 at 1:47 PM, Noah Watkins wrote:
> On Mon, Jun 4, 2012 at 1:17 PM, Greg Farnum <[email protected]> wrote:
> > I'm not quite sure what you mean here. Ceph is definitely using pthread
> > threading and mutexes, but I don't see how the use of a different threading
> > library can break pthread mutexes (which are just using the kernel futex
> > stuff, AFAIK).
> > But I admit I'm not real good at handling those sorts of interactions, so
> > maybe I'm missing something?
>
> The basic idea was that threads in Java did not map 1:1 with kernel
> threads (think co-routines), which would break a lot of stuff,
> especially futex. Looking at some documentation, old JVMs had
> something called Green Threads, but have now been abandoned in favor
> of native threads. So maybe this theory is now irrelevant, and
> evidence seems to suggest you're right and Java is using native
> threads.

Gotcha, that makes sense.

> > > The RADOS Java wrappers suffered from an interaction between the JVM and
> > > RADOS client signal handlers, in which either the JVM or RADOS would
> > > replace the handlers for the other (not sure which order). Anyway, the
> > > solution was to link in the JVM libjsig.so signal chaining library. This
> > > might be the same thing we are seeing here, but I'm betting it is the
> > > first theory I mentioned.
> >
> > Hmm. I think that's an issue we've run into but I thought it got fixed for
> > librados. Perhaps I'm mixing that up with libceph, or just pulling past
> > scenarios out of thin air. It never manifested as Mutex count bugs, though!
>
> I haven't tested the Rados wrappers in a while. I've never had to link
> in the signal chaining library for libcephfs.
>
> I wonder if the Mutex::lock(bool) being printed out is a red herring...

Well, it's a SIGSEGV. So my guess is that's the frame that happens to be
going outside its allowed bounds, probably because it's the first frame
actually accessing the memory off of a bad (probably NULL) pointer. For
instance, if it not only failed to mount the client, but even to create
the context object?
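
For anyone following the signal-handler part of the thread: the problem Noah
describes is two components installing a handler for the same signal, with the
later one silently replacing the earlier one. "Chaining" means the later
handler keeps a reference to the earlier one and calls it. The sketch below
only illustrates that general idea with Python's standard signal module. It is
not how libjsig, the JVM, or librados implement it, and the names
first_handler, second_handler, install_chained and demo are made up for the
example. It assumes a POSIX system (SIGUSR1) and Python 3.8 or later.

```python
import signal

calls = []  # records which handlers ran, in order


def first_handler(signum, frame):
    # Stands in for the handler that was installed first (hypothetical).
    calls.append("first")


def second_handler(signum, frame):
    # Stands in for the handler installed later (hypothetical).
    calls.append("second")


def install_chained(signum, new_handler):
    """Install new_handler, then forward the signal to the previous handler."""
    previous = signal.getsignal(signum)

    def chained(sig, frame):
        new_handler(sig, frame)
        # SIG_DFL / SIG_IGN / None are not callable, so skip them.
        if callable(previous):
            previous(sig, frame)

    signal.signal(signum, chained)
    return previous


def demo():
    calls.clear()
    try:
        signal.signal(signal.SIGUSR1, first_handler)
        # Without chaining, this second install would drop first_handler.
        install_chained(signal.SIGUSR1, second_handler)
        signal.raise_signal(signal.SIGUSR1)
        return list(calls)
    finally:
        # Restore the default disposition so the demo leaves no handler behind.
        signal.signal(signal.SIGUSR1, signal.SIG_DFL)


if __name__ == "__main__":
    print(demo())
```

If second_handler were installed with a plain signal.signal() call instead of
install_chained(), only "second" would be recorded, which is the kind of
silent replacement described above.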
