On 11-04-14 05:31 PM, Rafael Avila de Espindola wrote:
I have been thinking about the costs and benefits we get from tasks. I
had discussed some of them with Graydon both by email and on IRC. This
email is a quick summary to open the discussion.
Thanks for the write-up. I'll try to be more coherent here :)
*) The idea of using a special calling convention for Rust-to-C calls
only works if the C stack is in a really easy-to-find place, like a
pinned register. We could do better than we do now by replacing the
upcall functions with intrinsic functions, but that is still not ideal.
I think we need something here in any case, since even with linked
stacks we might be calling into C code that was compiled without them.
It will overrun the current segment and crash if it goes too deep. We
need to switch stacks, sometimes :(
But aside from that I agree with the other points on linked stacks. It's
worth trying them; they look better than the approach rustboot tried.
Given this and the fact that there is already interest in having LLVM
support linked stacks, for the rest of the email I will assume we will
use stack linking instead of copying.
Yup.
The way I see it, the big advantage of tasks would be if they could be
used like Erlang processes or goroutines. The programmer can just create
lots of them, use blocking APIs, and they get scheduled as needed.
Here is probably where we part ways. You're glossing over a lot of
detail by saying this:
- "Like Erlang" means that a runtime-provided, very careful async
I/O manager is standing in the way of every "blocking"-looking API
and intercepting it, multiplexing it through I/O facilities. This
is *largely* what I've been proposing we do, with exceptions for
unsafe calls when the user really wants the option to shoot their
own foot.
- "Like Go" only works in Go because they are willing to let all
goroutines get remapped to threads at will and race on shared access
to memory. We're not.
So let's please keep in mind that nobody, at the moment, has a "general"
solution to the speed/complexity/safety tradeoff: those that pick safety
have to compensate with a certain amount of runtime machinery for I/O
and possibly lose speed when dealing with concurrent access
(particularly with mutable things); those that pick speed wind up
(usually) losing safety.
I'd very much like to chart a path towards a nice balance between these
tensions, but we have to be honest about the tensions existing.
Unfortunately, that model *cannot* be implemented in Rust. A task cannot
move from one thread to another, so two tasks that could both be
executing may end up stuck on the same thread.
Agreed. If you make an unsafe, blocking C OS API call in the current
model, it will block your thread and every task in it. Erlang can dodge
this bullet because there's *no* sharing between tasks; you can always
reschedule all tasks -- in their entirety -- to other threads, so one
thread blocking is no big deal.
But that's also rare; Erlang discourages you *heavily* from making any
blocking C calls directly. The point in that model (and in many dozens
of other systems, from Node to win32 IOCP) is that the work you do
saturating the CPU with "calculation" and the work you do feeding data
in and out of I/O facilities use such dramatically different OS
interfaces and concurrency requirements that they tend to run on
different threads and communicate using queues local to the process.
Separate IO thread pool and CPU thread pool. That's how we've been
proceeding so far (with a further dividing-up of memory into tasks to
provide multiplexed control within a thread). It's not strictly
necessary but it's also not completely made up or novel.
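To make that shape concrete, here is a rough sketch in a Rust-flavoured
pseudo-syntax (the thread/channel primitives and the IoRequest /
blocking_read names are purely illustrative, and a real I/O manager would
multiplex over select/epoll/kqueue rather than serving requests one at a
time):

    use std::fs;
    use std::sync::mpsc;
    use std::thread;

    // A request from a task to the I/O manager: a path to read, plus a
    // private channel on which the reply should be delivered.
    struct IoRequest {
        path: String,
        reply: mpsc::Sender<std::io::Result<Vec<u8>>>,
    }

    // Looks like a blocking call from the task's point of view; it is
    // really just a queue round-trip through the I/O manager.
    fn blocking_read(io: &mpsc::Sender<IoRequest>, path: &str)
        -> std::io::Result<Vec<u8>>
    {
        let (reply_tx, reply_rx) = mpsc::channel();
        io.send(IoRequest { path: path.to_string(), reply: reply_tx }).unwrap();
        reply_rx.recv().unwrap()
    }

    fn main() {
        let (io_tx, io_rx) = mpsc::channel::<IoRequest>();

        // The I/O manager thread: owns all I/O, services every task.
        let io_manager = thread::spawn(move || {
            for req in io_rx {
                // Stand-in for an event-driven multiplexer.
                let _ = req.reply.send(fs::read(&req.path));
            }
        });

        // CPU-side "tasks": straight-line code, no hand-written state machine.
        let workers: Vec<_> = ["/etc/hostname", "/etc/hosts"]
            .iter()
            .map(|path| {
                let io = io_tx.clone();
                let path = path.to_string();
                thread::spawn(move || match blocking_read(&io, &path) {
                    Ok(bytes) => println!("{}: {} bytes", path, bytes.len()),
                    Err(e) => println!("{}: {}", path, e),
                })
            })
            .collect();

        for w in workers {
            w.join().unwrap();
        }
        drop(io_tx); // dropping the last sender lets the I/O manager exit
        io_manager.join().unwrap();
    }

The point is only the shape: the CPU-side tasks are written as
straight-line "blocking" code, and all of the actual waiting gets funneled
through one place over plain queues.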
Consider the example of a browser that wants to fetch many objects and
handle them. It would be very tempting to create one task for each of
the objects, but we cannot do that. The task creation would happen
before the network request and we would already be pinned to a thread
before knowing which resource would be available first.
A similar problem happens for pure IO, like a static HTTP server. Open a
task per request and you don't know which read will finish first. Some of
this can be avoided by having a clever IO library where read just sends a
message to an IO thread that uses select. Unfortunately, this will not
work when using mmap, for example.
For these reasons it looks to me as if tasks add a lot of cost for a
small benefit.
I disagree, obviously. The benefit has to do with conceptual simplicity.
In concurrent programs, you often have two Very Hard Reasoning Problems
that are typically Done Wrong:
- Multiplexing ownership of memory (usually via locks, atomic
refcounts or concurrent GC) between multiple cores.
- Multiplexing blocking serial execution (usually via state machines)
between multiple I/O endpoints.
A task unifies both these problems into a simple abstraction that a user
is likely to get right, while keeping costs .. low. Depending on which
cost you want to measure. I agree there's a price -- implementing a
scheduler, occasional mis-assignment of tasks to threads so you starve a
runnable task, writing a decent I/O multiplexing library -- but there
are major benefits to be had in providing a simplified cognitive model.
This is why we have OS processes with control stacks in the first place,
rather than a single address space and 'goto'.
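As a rough illustration of what I mean by "a simple abstraction a user is
likely to get right" (same illustrative pseudo-syntax as above,
approximating one task per work item with one spawned thread per item):

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (results_tx, results_rx) = mpsc::channel();

        for item in vec![String::from("alpha"), String::from("beta")] {
            let results = results_tx.clone();
            // `move` transfers ownership of `item` into the task; nothing
            // else can touch it while the task runs.
            thread::spawn(move || {
                let processed = item.to_uppercase(); // stand-in for real work
                results.send(processed).unwrap();
            });
        }
        drop(results_tx); // the receive loop ends once every task is done

        for r in results_rx {
            println!("done: {}", r);
        }
    }

Each item is moved into exactly one task, the task body is ordinary serial
code, and results come back over a queue; there is nothing to lock and no
state machine to hand-interleave.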
--------------------------------------------------------------------
Let's implement just processes and threads for now. With these in place we
can see how far we can go. Once we have a need for more abstraction, we
can revisit what a task is and implement it.
--------------------------------------------------------------------
My feeling is we're *starting* with a need for more abstraction: we are
presently shooting our feet regularly trying to write safe programs that
do concurrent memory access and blocking I/O using just threads and
processes, in C. It's too hard in general. People always get it wrong.
And some different implementation ideas for when we do decide to
implement tasks:
* Use an OS thread for each task. What we currently call a thread in Rust
will then just be a control for which tasks can run in parallel. A coarse
and easy-to-use form of parallelism that the user can refine if they find
contention.
I agree that there might be a way to make this model of a task work if
threads are sufficiently cheap -- on many unixes they are now, not sure
how win32 is doing these days -- and we can make guarantees about
exclusive ownership of memory. Even if it only works on some platforms,
it may well be worth investigating as a way to instantiate the model.
This solves the "task blocking because of unrelated task" problem with
no extra code, even for memory mapped IO.
Agreed. That's a very desirable feature of it.
Another advantage of this implementation is that we can expose OS-level
services to the tasks. For example, we can deliver signals without having
to demultiplex them.
To the extent that the OS service does not rely on running on our stack,
maybe. Even signals tend to want something for that (see sigaltstack).
But in some cases I imagine you are right. The OS generally presents
APIs that are ... if not thread *friendly*, at least thread *aware*.
We'd probably get a few for free.
This is not as expensive as it looks, since we would still be using
small stacks. It is hard to imagine a case where this is too expensive but
the existing proposal is not.
It's not the expense of thread-per-task that concerns me. That's just a
question of arithmetic: either it is or it isn't. We can simply measure:
how much is N thousand threads on each OS? Easy to research and figure
out threshold values.
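Something like this throwaway harness would do (illustrative pseudo-syntax
again; the 32 KB stack size and the 10,000-thread count are arbitrary
knobs to turn per OS):

    use std::thread;
    use std::time::Instant;

    fn main() {
        const N: usize = 10_000;
        const STACK_BYTES: usize = 32 * 1024; // deliberately small stacks

        let start = Instant::now();
        let handles: Vec<_> = (0..N)
            .map(|_| {
                thread::Builder::new()
                    .stack_size(STACK_BYTES)
                    // each thread just waits to be released
                    .spawn(|| thread::park())
                    .expect("thread creation failed")
            })
            .collect();
        println!("spawned {} threads in {:?}", N, start.elapsed());

        for h in &handles {
            h.thread().unpark();
        }
        for h in handles {
            h.join().unwrap();
        }
    }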
My concern is with preserving the abstractions of multiplexed sequential
I/O (rather than manual continuation-passing / interleaved state-machine
style) and non-contended ownership of a private piece of memory (or the
functional equivalent, say "only immutable memory"). I want to hold on
to those, because those are the parts that are hardest about concurrent
programming today.
* Go with an even lighter notion of what a task is. The idea is to
implement something like GCD. In the current implementation of GCD (as
in most C code), the burden of safety is always on the programmer. We
can probably do a bit better for Rust in the common case.
Maybe so. I'm interested in exploring this sort of interface more when
we have implemented unique pointers, so that we can talk about a type
kind that is known to require neither GC nor atomic refcounting. It
seems like a safe(r) GCD workalike could be written there.
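Roughly the shape I have in mind, sketched with a made-up Queue type in
the same illustrative pseudo-syntax (the boxed, sendable FnOnce job type
below stands in for "a closure that uniquely owns everything it touches"):

    use std::sync::mpsc;
    use std::thread;

    // A job owns everything it needs and can be sent to another thread.
    type Job = Box<dyn FnOnce() + Send + 'static>;

    struct Queue {
        tx: Option<mpsc::Sender<Job>>,
        worker: Option<thread::JoinHandle<()>>,
    }

    impl Queue {
        fn new() -> Queue {
            let (tx, rx) = mpsc::channel::<Job>();
            let worker = thread::spawn(move || {
                for job in rx {
                    job(); // run each job to completion, in submission order
                }
            });
            Queue { tx: Some(tx), worker: Some(worker) }
        }

        // dispatch_async-flavoured: the job must own everything it touches.
        fn dispatch(&self, job: impl FnOnce() + Send + 'static) {
            self.tx.as_ref().unwrap().send(Box::new(job)).unwrap();
        }
    }

    impl Drop for Queue {
        fn drop(&mut self) {
            drop(self.tx.take()); // closing the channel lets the worker drain
            if let Some(w) = self.worker.take() {
                let _ = w.join();
            }
        }
    }

    fn main() {
        let q = Queue::new();
        let data = vec![1, 2, 3]; // uniquely owned; moved into the job below
        q.dispatch(move || println!("sum = {}", data.iter().sum::<i32>()));
        q.dispatch(|| println!("second job runs after the first"));
        // dropping `q` at the end of main waits for the queue to drain
    }

Because each job owns its data outright, handing it to the queue's worker
thread can't create shared mutable state; that is exactly the property
unique pointers would let the compiler check.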
But I feel there remains a strong need for an abstraction that lets you
trade some portion of maximum possible throughput (in terms of CPU *
I/O) for a simplified mental model that multiplexes I/O and control flow
safely and automatically (if not optimally). I agree that this necessarily
entails a good AIO multiplexing library, somewhat baked into the
standard library.
I think you overestimate the degree to which normal human programmers
are currently able to write a correct program that *just* handles lots
of concurrent I/O, even setting aside the "saturate the CPU" problem.
AIO libraries are in the dark ages. Node and Erlang both achieved what
notoriety they did mostly by providing *any* kind of saner abstraction
over it (they didn't even pick the same abstraction; node doesn't do
coroutines). Simply exposing the AIO interfaces in a non-crazy way is a
huge win, and (IMO) providing a coroutine library on top is better yet.
Cognitive and correctness costs are real.
-Graydon
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev