I have been thinking about the costs and benefits we get from tasks. I had discussed some of them with Graydon both by email and on IRC. This email is a quick summary to open the discussion.

First, on the "copy stacks" vs. "linked stacks" question, some of the problems with copying stacks:

*) We cannot, in general, inline from C into Rust. For example, we cannot LTO LLVM into rustc. The problem is that a C compiler cannot prove where a pointer to the stack might be hidden, so it is not safe to move the stack.
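To make this concrete, here is a small illustrative sketch (written against current std, purely as an illustration; nothing here is from our tree) of the kind of thing a C callee is free to do: stash the address of a stack slot somewhere no compiler can see.

    use std::sync::atomic::{AtomicUsize, Ordering};

    // Somewhere the optimizer cannot reason about: any global, heap cell, or
    // foreign data structure will do. A C callee can hide an address like this.
    static STASHED: AtomicUsize = AtomicUsize::new(0);

    fn callee() {
        let local = 42_i32; // lives in this function's stack frame
        STASHED.store(&local as *const i32 as usize, Ordering::SeqCst);
        // If the runtime copied this stack elsewhere to grow it, the stashed
        // address would still point into the old, now-invalid copy.
        println!("stack slot at {:#x}", STASHED.load(Ordering::SeqCst));
    }

    fn main() {
        callee();
    }

Code that plays by our rules never does this, but once we inline across the C boundary we cannot rule it out.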

*) The idea of using a special calling convention for Rust-to-C calls only works if the C stack is in a really easy-to-find place, like a pinned register. We could do better than we do now by converting the upcall functions into intrinsics, but that is still not ideal.

*) The Rust compiler knows what points to the stack, but LLVM also has to keep track of it. This is equivalent to what other languages have to track for GC, and unfortunately LLVM is not very good at it right now: it only tracks GC roots in memory, which would force us to always access pointers to the stack via a load from a root.

*) One case I am not sure how to handle is that of a function that takes a reference argument. That reference could point to the stack, so it has to go into a GC root, but the "do I need more stack space" check runs before we have a chance to store it in a root :-(

Given this, and the fact that there is already interest in having LLVM support linked stacks, for the rest of this email I will assume we will use stack linking instead of copying.

The way I see it, the big advantage of tasks would be if they could be used like Erlang processes or goroutines: the programmer just creates lots of them, uses blocking APIs, and they get scheduled as needed.

Unfortunately, that model *cannot* be implemented in Rust as it stands. A task cannot move from one thread to another, so two tasks that could both be executing can end up on the same thread, with one blocking the other.

Consider the example of a browser that wants to fetch many objects and handle them. It would be very tempting to create one task for each of the objects, but we cannot do that: the task creation would happen before the network request, and we would already be pinned to a thread before knowing which resource will be available first.

A similar problem happens for pure IO, like a static HTTP server. Open a task per request and you don't know which read will finish first. Some of this can be avoided by having a clever IO library where a read just sends a message to an IO thread that uses select. Unfortunately, this will not work when using mmap, for example: a fault on mapped memory blocks the whole thread, and there is no read call for the library to intercept.
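For what it's worth, that "clever IO library" shape is easy to sketch with threads and channels (again, illustrative only; the path and the names are placeholders): a read becomes a message to a dedicated IO thread, and the caller blocks on a channel instead of on the descriptor.

    use std::fs;
    use std::sync::mpsc;
    use std::thread;

    // A read request: a path to read plus a channel to send the result back on.
    struct ReadReq {
        path: String,
        reply: mpsc::Sender<std::io::Result<Vec<u8>>>,
    }

    fn main() {
        let (tx, rx) = mpsc::channel::<ReadReq>();

        // The IO thread. A real library would multiplex many descriptors with
        // select; here it just services one request at a time.
        thread::spawn(move || {
            for req in rx {
                let _ = req.reply.send(fs::read(&req.path));
            }
        });

        // A "task" issues a read by sending a message and blocking on the reply.
        let (reply_tx, reply_rx) = mpsc::channel();
        tx.send(ReadReq { path: "/etc/hostname".into(), reply: reply_tx }).unwrap();
        match reply_rx.recv().unwrap() {
            Ok(bytes) => println!("read {} bytes", bytes.len()),
            Err(e) => println!("read failed: {}", e),
        }
    }

But as noted above, none of this helps with mmap, since there is no library call to intercept.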

For these reasons it looks to me as if tasks add a lot of cost for a small benefit. My main proposal in this email (other than avoiding the stack-copying implementation) is:

--------------------------------------------------------------------
Let's implement just processes and threads for now. With these in place we can see how far they take us. Once we have a need for more abstraction, we can revisit what a task is and implement it.
--------------------------------------------------------------------

And here are some alternative implementation ideas for when we do decide to implement tasks:

* Use an OS thread for each task. What we currently call a thread in Rust then just becomes a control over which tasks can run in parallel: a coarse, easy-to-use form of parallelism that the user can refine when they find contention.

This solves the "task blocked by an unrelated task" problem with no extra code, even for memory-mapped IO.
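Roughly what I have in mind (sketched against current std; the stack size is an arbitrary example, not a proposal):

    use std::thread;

    fn main() {
        // One OS thread per "task", each with a deliberately small stack.
        // The kernel scheduler decides what runs, so a task blocking on IO
        // (including a fault on memory-mapped IO) never stalls the others.
        let handles: Vec<_> = (0..8)
            .map(|i| {
                thread::Builder::new()
                    .stack_size(32 * 1024) // small fixed stack per task
                    .spawn(move || {
                        // blocking work for task `i` would go here
                        i * i
                    })
                    .expect("spawn failed")
            })
            .collect();

        let total: i32 = handles.into_iter().map(|h| h.join().unwrap()).sum();
        println!("total = {}", total);
    }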

Another advantage of this implementation is that we can expose any OS-level services to the tasks. For example, we can deliver signals without having to demultiplex them.

This is not as expensive as it looks, since we would still be using small stacks. It is hard to imagine a case where this is too expensive but the existing proposal is not. If there are cases that do need very light tasks:

* Go with an even lighter notion of what a task is. The idea is to implement something like GCD (Grand Central Dispatch). In the current implementation of GCD (as in most C code), the burden of safety is always on the programmer. We can probably do a bit better for Rust in the common case.

Thread pools could take ownership of what their tasks can access. In the case of constant data, that is always safe. In the case of mutable data they can use some form of exclusion (running in a single thread, as current tasks do, or locks) or delegate to the programmer via an unsafe block.
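A minimal sketch of that ownership split, using a shared reference-counted box for the frozen data and a lock for the mutable part (illustrative only, not a proposed API):

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        // Constant data owned by the pool: every worker may read it freely.
        let frozen: Arc<Vec<u8>> = Arc::new(vec![1, 2, 3, 4]);

        // Mutable data: handed out through some form of exclusion.
        let results: Arc<Mutex<Vec<usize>>> = Arc::new(Mutex::new(Vec::new()));

        let workers: Vec<_> = (0..4)
            .map(|_| {
                let frozen = Arc::clone(&frozen);
                let results = Arc::clone(&results);
                thread::spawn(move || {
                    let sum: usize = frozen.iter().map(|&b| b as usize).sum();
                    results.lock().unwrap().push(sum);
                })
            })
            .collect();

        for w in workers {
            w.join().unwrap();
        }
        println!("{:?}", results.lock().unwrap());
    }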

The example of a browser reading images becomes:

* The image loading is done using threads, async IO or tasks. Each image is fetched, frozen, and sent to the pool.
* Each image rendering is a task. They get issued as the images become available. After this, the programmer has some options:
  * Do an unsafe write to the assigned memory position.
  * Freeze the rendered image and send it to a thread managing the final buffer.
  * Create a "splat this into the final buffer" task if we are really into the GCD way.

This is more code than what would be written with the current tasks, but at least it behaves as expected: you never get an image that is not displayed because another one is slow to load.
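To show it is not that much code, here is a rough end-to-end sketch of the pipeline (byte vectors stand in for fetched and rendered images, and all the names are mine):

    use std::sync::mpsc;
    use std::thread;

    // Placeholder "rendering": a real browser would decode the image here.
    fn render(raw: &[u8]) -> Vec<u8> {
        raw.iter().map(|b| b.wrapping_mul(2)).collect()
    }

    fn main() {
        let (to_render, render_rx) = mpsc::channel::<(usize, Vec<u8>)>();
        let (to_compositor, compositor_rx) = mpsc::channel::<(usize, Vec<u8>)>();

        // Fetching: one thread per image here; async IO or tasks would also do.
        // Each image is "fetched", frozen, and sent to the rendering pool.
        for id in 0..3_usize {
            let to_render = to_render.clone();
            thread::spawn(move || {
                let fetched = vec![id as u8; 16]; // stand-in for a network fetch
                to_render.send((id, fetched)).unwrap();
            });
        }
        drop(to_render);

        // Rendering: a single worker stands in for the pool. Each image is
        // rendered as it becomes available, then frozen and sent to the thread
        // managing the final buffer.
        let pool_tx = to_compositor.clone();
        let renderer = thread::spawn(move || {
            for (id, raw) in render_rx {
                pool_tx.send((id, render(&raw))).unwrap();
            }
        });
        drop(to_compositor);

        // The final-buffer thread: images arrive in whatever order they finish,
        // so a slow image never holds back the others.
        for (id, rendered) in compositor_rx {
            println!("image {} rendered ({} bytes)", id, rendered.len());
        }
        renderer.join().unwrap();
    }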