Re: [Jruby-devel] Thread studying

Charles O Nutter Sat, 08 Jul 2006 21:51:43 -0700

I think I finally figured out when Ruby's threads yield to each other: every 10ms! When the first non-main thread is launched, Ruby will do one of two things:

- set up a system timer (setitimer-style) to signal the process every 10ms
- create a new native thread that sleeps for 10ms between signalling

At any rate, it keeps a reference to the main thread (i.e. the only thread that runs Ruby code) and does something akin to a pthread_kill on it, sending an appropriate signal to tell it to reschedule. If we're in a critical section when the signal goes off, it just ignores the signal and lets execution continue until the next 10ms signal fires.

It's basically just a 10ms timeslicing thread scheduler.
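A minimal sketch of that timeslicing idea, using simulated time rather than real signals. The names reschedule_times, TICK_MS and critical_ranges are made up for illustration; nothing here is Ruby's actual implementation.

```ruby
# Hypothetical model: a timer "fires" every TICK_MS of simulated time.
TICK_MS = 10

# Returns the simulated times (in ms) at which a reschedule is requested.
# critical_ranges lists the periods the running thread spends in a critical
# section; ticks that land inside one are simply dropped, as described above.
def reschedule_times(total_ms, critical_ranges = [])
  (TICK_MS..total_ms).step(TICK_MS).reject do |t|
    critical_ranges.any? { |range| range.cover?(t) }
  end
end

p reschedule_times(50)            # [10, 20, 30, 40, 50]
p reschedule_times(50, [15..25])  # [10, 30, 40, 50]
```

The tick at 20ms is lost in the second call because it falls inside the critical section; the thread simply runs on until the 30ms tick.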

There are various ways it handles this signalling and timeslicing, but I believe the following macro actually triggers the scheduling:

# define CHECK_INTS do {\
    if (!(rb_prohibit_interrupt || rb_thread_critical)) {\
        if (rb_thread_pending) rb_thread_schedule();\
        if (rb_trap_pending) rb_trap_exec();\
    }\
} while (0)

All the various signal handlers set rb_thread_pending = 1, which causes this macro to initiate thread scheduling. So where is this macro used? A search points out the places...and it's definitely not at every node. From what I can see, thread scheduling can occur BEFORE executing the following nodes (in addition to the other places I mention below):

NODE_NEXT
NODE_REDO
NODE_RETRY
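For readers who would rather see the CHECK_INTS logic outside of macro form, here is a rough Ruby model of it. The Flags struct and the check_ints method are hypothetical stand-ins for the interpreter's global variables; the method returns the names of the calls the macro would make instead of making them.

```ruby
# Hypothetical stand-in for the four global flags the macro reads.
Flags = Struct.new(:prohibit_interrupt, :thread_critical,
                   :thread_pending, :trap_pending)

# Same shape as CHECK_INTS: do nothing while interrupts are prohibited or a
# critical section is active; otherwise act on whichever flags are pending.
def check_ints(flags)
  actions = []
  return actions if flags.prohibit_interrupt || flags.thread_critical
  actions << :rb_thread_schedule if flags.thread_pending
  actions << :rb_trap_exec if flags.trap_pending
  actions
end

p check_ints(Flags.new(false, false, true, false))  # [:rb_thread_schedule]
p check_ints(Flags.new(false, true, true, true))    # []
```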

Then there's also this:

#define RETURN(v) do { \
    result = (v); \
    goto finish; \
} while (0)

and this:

finish:
    CHECK_INTS;
    ....

inside rb_eval. So if we look for calls to the RETURN() macro we find a few more places rescheduling can happen...when returning a value (leaf nodes setting a result, like 'self', or after control structures) for the following:

when node is null
NODE_OPT_N
NODE_SELF
NODE_NIL
NODE_TRUE
NODE_FALSE
NODE_WHEN
NODE_CASE
NODE_WHILE
NODE_UNTIL
NODE_OP_ASGN1 (if short circuiting && or || but not otherwise...weird)
NODE_OP_ASGN2 (ditto)

That's about it, as far as I can see. Nowhere else in the code is rb_thread_pending checked, so thread context switches appear to happen only at these points. That simplifies things a bit, yes? This also explains why Ruby's threading seems so coarse-grained; a 10ms timeslice is fairly quick, but how much code might execute before you encounter one of the above nodes? That's probably why a test like x = false; Thread.new { x = true }; p x; almost always prints "true"...thread scheduling is kicked off immediately after the new thread is created, and only in very rare circumstances is the new thread not selected to run (I haven't worked that part out yet). It is conceivable that, without IO, signals, thread events, or control structures like those above, you could write a thread that would never give up control. But I digress.
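To make the coarse-grained effect concrete, here is a rough Ruby model of an evaluator that honours a pending reschedule only at certain node types. It is a simplification built on assumptions: the node list is flattened, CHECK_NODES covers only part of the lists above (the null-node and NODE_OP_ASGN cases are left out), and switch_points is a made-up helper.

```ruby
# Node types at which this model runs its check (a subset of the lists above).
CHECK_NODES = [:next, :redo, :retry, :opt_n, :self, :nil, :true, :false,
               :when, :case, :while, :until].freeze

# Walks a flat list of node types. pending_at holds the indices at which the
# timer marks a reschedule as pending. Returns the indices where a switch
# actually happens, i.e. the first check node at or after each request.
def switch_points(nodes, pending_at)
  pending = false
  switches = []
  nodes.each_with_index do |type, i|
    pending ||= pending_at.include?(i)
    if pending && CHECK_NODES.include?(type)
      switches << i
      pending = false
    end
  end
  switches
end

nodes = [:call, :lasgn, :call, :call, :self, :call, :while]
p switch_points(nodes, [0])     # [4]
p switch_points(nodes, [0, 5])  # [4, 6]
```

In the first call the request arrives at node 0, but nothing happens until the :self node at index 4; however long the intervening calls take, the switch waits for them.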

I think it's probably reasonable then that we only do our threading checks in these same locations. We may be able to tighten up some of the internal threading code as well now that we know we don't need to hit it for every single node.

On 7/8/06, Charles O Nutter < [EMAIL PROTECTED]> wrote:

Some notes on threading as I find them...I'm guessing at some of this because there's practically no commenting in the 250-line rb_thread_schedule function.

I. Scope appears to be dup'ed when creating a new thread. I think we may "move" the existing scope to the new thread.

II. When scheduling the next thread to run, Ruby does the following for each live thread in order (of creation, it appears):
1. is runnable? found = 1
2. is it stopped? continue to next thread
3. is it joined on another thread?
3a. if yes, is the second thread alive? if it is not, stop waiting and set the first thread runnable, found = 1
4. is it waiting on a file descriptor? then we will need to use select.
5. is it waiting on a select? we'll need to select; dup file descriptors and check how long it's been waiting.
6. has this thread waited longest? if so, we save this delay and continue

It seems to do this over the entire collection of threads every time there's a context switch. In order to enforce priorities and give threads a fair timeslice, it has to scan them all. Once it's complete it has a runnable thread that has been waiting the longest, and it schedules it to run.
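A rough sketch of that scan, following my reading of steps 1-3 and 6 only. ThreadRec and its fields are hypothetical, and the file-descriptor and select handling of steps 4-5 is left out entirely.

```ruby
# Hypothetical thread record; the field names are illustrative, not Ruby's.
ThreadRec = Struct.new(:name, :status, :joined_on, :waiting_since)

# One pass over every thread in creation order. Returns the runnable thread
# that has waited longest, or nil if none is runnable.
def pick_next(threads, now)
  best = nil
  threads.each do |th|
    next if th.status == :stopped
    if th.status == :joined && th.joined_on.status == :dead
      th.status = :runnable  # join target is gone, so stop waiting
    end
    next unless th.status == :runnable
    waited = now - th.waiting_since
    best = th if best.nil? || waited > (now - best.waiting_since)
  end
  best
end

main   = ThreadRec.new("main",   :runnable, nil,  90)
dead   = ThreadRec.new("dead",   :dead,     nil,  0)
joiner = ThreadRec.new("joiner", :joined,   dead, 40)
idle   = ThreadRec.new("idle",   :stopped,  nil,  10)

p pick_next([main, dead, joiner, idle], 100).name  # "joiner"
```

Note that every call walks the whole list, which is the cost the paragraph above points out.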

III. Next it does the select, if necessary; various select results will cause it to scan through all threads again looking for bad file descriptors, dup'ed file descriptors, trying to resolve the select across threads using the same FDs. If at the end of the select logic we still don't have a runnable thread, go back to II above and start over. It will poll like this when select is involved until a thread is runnable.

IV. Scan through all threads for any that are to be killed; schedule them next. Otherwise, scan through all threads looking for a thread of a higher priority. (does Ruby run all higher priority threads to completion first?)

V. If there are no selects in progress and no threads to be killed and we still don't have a thread, we're likely deadlocked. Print a deadlock warning for each thread and its current state. Choose the main thread and set it ready to be killed. Call rb_thread_deadlock (which does something).

VI. However, if we have a runnable thread and it's the current thread, just return and continue executing.

VII. If we have a runnable thread and it's NOT the current thread, we're doing a context switch. Save thread context and return.

VIII. Finally, if we've gotten past all this and we're killing a thread, choose that thread, restore its context, and let it die.
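The outcomes in sections III and V-VIII can be summarised as a small decision function. This is only my reading of the notes above, not the structure of rb_thread_schedule itself; schedule_outcome and its keyword arguments are hypothetical.

```ruby
# candidate is the thread chosen by the scan (or nil); current is the thread
# that is running the scheduler right now.
def schedule_outcome(candidate, current, select_pending: false, kill_pending: false)
  if candidate.nil?
    return :retry_scan if select_pending   # III: keep polling the select
    return :kill_thread if kill_pending    # VIII: let the doomed thread die
    return :deadlock                       # V: nothing can ever run
  end
  candidate == current ? :continue_current : :context_switch  # VI / VII
end

p schedule_outcome(nil, :main)                        # :deadlock
p schedule_outcome(:main, :main)                      # :continue_current
p schedule_outcome(:worker, :main)                    # :context_switch
p schedule_outcome(nil, :main, select_pending: true)  # :retry_scan
```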

...

- rb_thread_schedule is called after select waits too long, join waits too long (in the scheduler's eyes), and so on. Any long-running or blocking events *could* fire immediately, but if they take too long to continue, control is thrown back to the thread scheduler.
- There are various places where the code checks if we're in a critical section; if so, the call to rb_thread_schedule is skipped
- rb_thread_schedule is called when you start, pass, stop, put a thread to sleep forever, kill a thread, and so on, unsurprisingly. It's called immediately after you set any thread's priority. It's called when a trap event fires.
- Ruby almost always schedules a new thread immediately. The following code only prints out 'false' about 25-35 times on my system:
100000.times { x = false; Thread.new { x = true }; p x}
- A really bad way of building our own scheduler occurs to me now: make all threads wait until signalled and build our own scheduler that runs them like Ruby does. Of course that would be absurd, and no better than pure green threads. Plus we'd have to do all the same polling and scanning of threads. However, it would act just like Ruby! :)
- It's worth noting that this is cooperative multithreading rather than explicit preemptive multithreading, since the current thread executing BECOMES the thread scheduler at some point and chooses to yield to another thread. Calls out to C code, dynamic libraries, and possible extensions will block all other threads from running.
- I'm still trying to figure out how context switches happen in Ruby, other than on thread and process events.
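The cooperative point in the notes above can be illustrated with Ruby's Fiber class. This is an analogy only: Fiber arrived in later Ruby versions and is not how the threads discussed here are implemented. The names polite and greedy are made up.

```ruby
require 'fiber'  # needed for Fiber#alive? on older Rubies; harmless on newer ones

log = []

# Gives up control after every step.
polite = Fiber.new do
  3.times { |i| log << "polite #{i}"; Fiber.yield }
end

# Never yields, so nothing else can run until it has finished.
greedy = Fiber.new do
  3.times { |i| log << "greedy #{i}" }
end

# Round-robin "scheduler": resume each live fiber in turn until all are done.
fibers = [polite, greedy]
until fibers.empty?
  fibers.each(&:resume)
  fibers.select!(&:alive?)
end

p log
# ["polite 0", "greedy 0", "greedy 1", "greedy 2", "polite 1", "polite 2"]
```

Once greedy is resumed it runs all three steps back to back; polite only gets its next turn after greedy is completely done, which is the behaviour a thread stuck in C code would impose on every other thread.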

--
Charles Oliver Nutter @ headius.blogspot.com
JRuby Developer @ www.jruby.org
Application Architect @ www.ventera.com



