Hi,

We discussed (not) moving fibers between threads on DConf last week, and later it was discussed in the announce group, I think this matter is important enough to get a thread of it's own.

Software fibers/coroutines were created to make asynchronous programming using a Reactor (or another "event loop i/o scheduler") more seamless.

For those unaware of the Reactor Pattern, I advise reading [ http://en.wikipedia.org/wiki/Reactor_pattern ; http://www.dre.vanderbilt.edu/~schmidt/PDF/reactor-siemens.pdf ], and for some perspective at how other languages have addressed this I recommend watching Guido Van Rossum's talk about acyncio and Python: https://www.youtube.com/watch?v=aurOB4qYuFM

The Reactor pattern is a long-time widely accepted way to achieve low latency async io operations, that fortunately became famous thanks to the Web and the C10k requirement/problem. Using the Reactor is the most efficient way to leverage current CPU architectures to perform lots of IO for many reasons outside of this scope. Another very important quality to using a rector based approach, is that since all event handlers just serialize on a single IO scheduler ("the reactor") on each thread, if designed correctly programmers don't have to think about concurrency and care about code-races.

Another thing to note: when using the reactor pattern you have to make sure that no event handler blocks at all, never! Once an event-handler blocks, since being a non-preemptive model, the other event handlers will not be able to run, basically starving themselves and the clients on the other side of the network. Reactor implementations usually detect, and notify when an event handler took too much time until giving away control (this is dependent on application, but should be in the usec range on current hw).

The downside for the reactor pattern (used to be) that the programmer has to manually keep the state/context of how the event handler worked. Since each "logical" operation was comprised by many i/o transactions (some NW protocol to keep track, maybe accessing a networked DB for some data, reading/writing to local/remote files/ etc) the reactor would also keep a context for each callback and IO event and the programmer had to either update the context and keep registering new event handlers manually for all extra I/O transactions and in many cases change callback registration in some cases. This downside means that it's more difficult to program for a Reactor model, but since programmers don't have to think about races and concurrency issues (and then debug them...) from our experience it still more efficient to program than general-purpose threads if you care about correctness/coherency. One way so mitigate this complexity was through the Proactor pattern -- implementing higher-level async. IO services over the reactor, thus sparing the programmer a lot of the low-level context headaches.

Up until now I did not say anything about Fibers/coroutines.

What Fibers bring to the table, is the ability to program within the reactor model without having to manually keep a context that is separate for the program logic, and without the requirement to manually re/register callbacks for different IO events. D's Fibers allowed us to create an async io library with support for network/file/disk operations and higher level conditions (waiters, barriers, etc) that allows the programmer to write code as-if it runs in its own thread (almost, sometimes fibers are explicitly "spawned" -- added to the reactor, and fiber-conditions are slightly different than spawning and joining threads) without paying the huge correctness/coherence and performance penalties of the threading model.

There are two main reasons why it does not make sense to move fibers between threads:

1. You'll start having concurrency issues. Lets assume we have a main fiber that received some request, and it spawns 3 fibers looking into different DBs to get some info and update an array with the data. The array will probably be on the stack of the first fiber. If fibers don't move between threads, there is nothing to worry about (as expected by the model). If you start moving fibers across threads you have to start guarding this array now, to make sure it's still coherent. This is a simple example, but basically shows that you're "losing" one of the biggest selling point of the whole reactor based model.

2. Fibers and reactor based IO make work well (read: make sense) when you have a situation where you have lots of concurrent very small transactions (similar to the Web C10k problem or a storage machine). In this case, if one of the threads has more capacity than the rest, then the IO scheduler ("reactor") will just make sure to spawn new fibers accepting new transactions in that fiber. If you don't have a situation that balancing can be done via placing new requests in the right place, then probably you should not use the reactor model, but a different one that suits your application better. Currently we can spawn another reactor to take more load, but the load is balanced statically at a system-wide level. On previous projects we had several reactors running on different threads and providing very different functionality (with different handlers, naturally). We never got to a situation that moving a fiber between threads made any sense.

As we see, there is nothing to gain and lots to lose by moving fibers between threads.

Now, if we want to make sure fibers are well supported in D there are several other things we should do:

1. Implement a good asyncIO library that supports fiber based programming. I don't know Vibe.d very well (e.g. at all), maybe we (Weka.IO) can help review it and suggest ways to make it into a general async IO library (we have over 15 years experience developing with the reactor model in many environments)

2. Adding better compiler support. The one problem with fibers is that upon creation you have to know the stack size for that fiber. Different functions will create different stack depths. It is very convenient to use the stack to hold all objects (recall Walter's first day talk, for example), and it can be used as very convenient way to "garbage collect" all resources added during the run of that fiber, but currently we don't leverage it to the max since we don't have a good way to know/limit the amount of memory used this way. If the compiler will be able to analyze stack usage by functions (recursively) and be able to give us hints regarding the upper-bounds of stack usage, we will be able to use the stack more aggressively and utilize memory much better. Also -- I think such static analysis will be a big selling point for D for systems like ours.

I think now everything is written down, and we can move the discussion here.

Liran.

Reply via email to