On Mon, Nov 30, 2009 at 9:22 PM, Jason Baker <[email protected]> wrote:
> There's just *one* set of tests that isn't passing against trunk. I haven't
> had time to look in to it. James and I have discussed it in this thread:
> https://lists.ubuntu.com/archives/storm/2009-November/001198.html
> To answer James's question, I ran the test with something that looked like
> this:
> print "before select"
> rlist, wlist, xlist = select.select(readers, [], [], TIMEOUT)
> print "after select"
> After a few iterations, "before select" will print out but "after select"
> won't. There may be something I'm misunderstanding about Python's
> threading, but I believe that means that it's blocking on the select call.
> Of course, another possibility is that that's a coincidence and it's
> deadlocking somewhere else.

I had another thought about this. The body of the select() function
call will look something like this:

 1. prepare arguments to select() system call
 2. drop GIL
 3. make the select() system call
 4. acquire GIL
 5. prepare result

I'd been assuming that it was blocking at (3), which didn't make sense
because of the use of the timeout. But it could also block at (4),
which would occur if the Oracle database adapter held on to the GIL
while talking to the database: the system call would time out as
expected, but the thread could not continue until the GIL was released.

This should be pretty easy to check for using gdb to see what the
interpreter is doing when you hit the deadlock.

If that is the case, the fix would be to either (a) make the Oracle
adapter drop the GIL at the appropriate points, or (b) adjust the test
infrastructure so that the TCP proxy runs in a subprocess instead of a
thread (and hence gets a different GIL).

I'd be happy to see the Oracle code merged with those tests disabled
until the problem gets resolved. Some manual testing would be
desirable in this case though.

James.
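
P.S. Short of attaching gdb, a rough way to tell (3) from (4) apart is
to time the call. The system call at (3) cannot take much longer than
the timeout, so if select() comes back well after it, the extra time
was most likely spent waiting at (4). This is only a sketch: the names
timed_select, overran and slack are made up for illustration, the
timeout value is a stand-in for the test suite's TIMEOUT, and it uses
Python 3 syntax rather than the print statements quoted above. It also
only helps if the call eventually returns; for a hard deadlock gdb is
still the tool.

```python
import select
import socket
import time

TIMEOUT = 0.2  # stand-in value; the real test suite defines its own TIMEOUT


def timed_select(readers, timeout=TIMEOUT):
    """Run select() and return (readable sockets, wall-clock seconds taken)."""
    start = time.monotonic()
    rlist, _wlist, _xlist = select.select(readers, [], [], timeout)
    # We only get here once the thread is running Python code again,
    # so the elapsed time covers steps (2) through (4).
    return rlist, time.monotonic() - start


def overran(elapsed, timeout=TIMEOUT, slack=1.0):
    """True if the call took noticeably longer than the timeout allows."""
    return elapsed > timeout + slack


if __name__ == "__main__":
    a, b = socket.socketpair()
    try:
        # Nothing is written to b, so select() should simply time out.
        rlist, elapsed = timed_select([a])
        print("readable: %r, elapsed: %.2fs" % (rlist, elapsed))
        if overran(elapsed):
            print("select() overran its timeout; suspect the wait for the GIL")
    finally:
        a.close()
        b.close()
```

The slack is there because some overshoot from ordinary scheduling is
normal; only an overrun of seconds rather than milliseconds would
point at another thread holding on to the GIL.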
