On 15/04/2011 8:32 AM, Marijn Haverbeke wrote:
>> They have pointers into a heap shared with other tasks in their
>> thread. We'd have to dig through that heap cloning everything they
>> point to.
>
> Right. I can see the costs, but you have to agree that migrating
> tasks would be a *great* thing to have. Having shared values become
> more complicated might be worth it. Unique boxes are one good
> solution. You also alluded to a task-lifetime trick last week (task X
> holding onto immutable value Z, and not being allowed to die until
> task Y, which accesses this value, finishes -- if I understood it
> correctly). There are probably other hacks that can be applied when
> sharing big structures. For small ones, copying is a good idea
> anyway.

It's entirely possible to go down this road. I'm much, much more
comfortable if our research leads us to a design in which:

  - Domains don't exist.
  - Tasks are scheduled M:N on threads, where at the limit you may
    choose to make that 1:1; if you have cost reasons to prefer a
    different M and N, you can, and nothing breaks.
  - Tasks always own all their reachable data; there is no sharing.
  - Messages are therefore always either deep-copied or moved.

That's an ok cognitive model. Fewer parts, fewer corner cases; it gives
up one case (shared messages) but might be a net win given the
simplification. Might, might not. Maybe that's all Rafael was suggesting
in the first place, or close enough not to matter.
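
To pin down that last bullet: here's a toy sketch, in C, of the two
disciplines a send could use under such a model -- deep-copying the
payload into the receiver's heap versus moving a uniquely-owned
pointer. Untested, and all names and the struct layout are
illustrative only, not anything in our runtime:

    #include <stdlib.h>
    #include <string.h>

    /* A message whose payload lives in the sending task's local
       heap. (Illustrative layout only.) */
    typedef struct msg {
        size_t len;
        char  *buf;
    } msg;

    /* Deep copy: the receiver gets a fresh allocation in its own
       heap; the sender's copy remains valid and unshared. */
    static msg msg_deep_copy(const msg *m) {
        msg out;
        out.len = m->len;
        out.buf = malloc(m->len);
        memcpy(out.buf, m->buf, m->len);
        return out;
    }

    /* Move: ownership transfers to the receiver and the sender's
       handle is nulled out, so the sender can no longer reach the
       payload -- the invariant a unique pointer would enforce
       statically rather than at runtime. */
    static msg msg_move(msg *m) {
        msg out = *m;
        m->buf = NULL;
        m->len = 0;
        return out;
    }
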
I wasn't sure what we were pushing toward; a conclusion like "we have to
use unsafe blocks everywhere" is unacceptable to me. So is one where we
lose important parts of the per-task structure like its local GC pool,
incoming lockless queues for ports, or its unwind semantics on failure.
These are strict improvements to the notion of a thread, and they're
hard ones for users to simulate.
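
By "incoming lockless queues" I mean something in the spirit of the
following single-producer / single-consumer ring buffer -- a
from-scratch sketch using C11 atomics, not the runtime's actual
structure:

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>

    #define QCAP 1024  /* power of two, so we can mask instead of mod */

    typedef struct {
        void *slots[QCAP];
        _Atomic size_t head;  /* advanced only by the receiving task */
        _Atomic size_t tail;  /* advanced only by the sending task */
    } port_queue;

    /* Sender side: returns false if the queue is full. */
    static bool port_send(port_queue *q, void *msg) {
        size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
        size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
        if (tail - head == QCAP)
            return false;                       /* full */
        q->slots[tail & (QCAP - 1)] = msg;
        atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
        return true;
    }

    /* Receiver side: returns NULL if the queue is empty. */
    static void *port_recv(port_queue *q) {
        size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
        size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
        if (head == tail)
            return NULL;                        /* empty */
        void *msg = q->slots[head & (QCAP - 1)];
        atomic_store_explicit(&q->head, head + 1, memory_order_release);
        return msg;
    }
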
Losing the ability to share message substructures after sending is .. a
cost though. And it's one that's worth caring at least somewhat about;
maybe we sacrifice it but I want everyone to know why it's worth
keeping, so I will place it in a Special Attention-Getting Block:

    I want users to feel comfortable making lots of tasks, not just
    for concurrency: it's a way to *isolate* code from the side
    effects of other code. Even if I were developing a completely
    serial program I'd want to be able to carve it up into tasks. They
    are natural boundaries, like namespaces or such, where you have a
    line drawn that the language is telling you the semantics will
    prevent anyone from crossing, even considering dynamic
    reachability. That's important for maintaining partial correctness
    and system function in the presence of errors.

That said, Erlang seems able to encourage users to make lots of tasks --
for robustness -- while having them pay for deep copies every time. So
maybe it's just something where users get accustomed to the copy
boundaries and learn to live with that tax. And maybe, if we have unique
pointers, most serious code will lean on them heavily so that ownership
handoff is more common. I'm unsure.

> Of course, this'd also upset our current design of domains and such.
> I'm not really putting myself behind any new approach at this point,
> but I think we should definitely be open to anything that would help
> us avoid costly and awkward I/O multiplexing magic.

Yeah .. I'm not wedded to domains; they seemed necessary to
differentiate cases, but if those cases wind up collapsing (and if,
absent the effect system, there's no reason for *processes* to be
reified in the language either) then removing the domain concept
lowers cognitive costs, while removing an awkward case (task
starvation), so I'm ... tentatively ok with it. If users will accept
the loss of cheap isolation in exchange for a simplified model, and we
don't run into I/O scalability issues. But regarding that, here is
another Attention-Getting Block:

    Another thing to keep in mind: "awkward I/O multiplexing magic" is
    likely *necessary* on some platforms to scale well. Or at least
    this is the mythology. This is a numeric question that demands
    research. Try writing a C program that makes "the smallest thread
    you can make" on each OS, spawning 100,000 of them doing
    concurrent blocking reads on 100,000 file descriptors. See if it
    scales as well as a 100,000-way IOCP/kqueue/epoll approach. It
    might. It might not. Kernel people are always tilting the balance
    one way or another; sometimes userspace is just misinformed,
    working on old information. Even with a 4KB (1 page) stack, I'd
    expect to be able to make a million tasks on an 8GB machine
    (that's only 4GB of stacks). I .. actually demo'ed this on the old
    rustboot approach, way back when tasks started with 300 bytes of
    stack, so it's plausible.
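
For concreteness, here is roughly the shape such a probe might take --
an untested, Linux-flavored POSIX sketch where N, the stack size, and
the one-pipe-per-thread scheme are all knobs to vary per platform (you
will almost certainly need to raise RLIMIT_NOFILE and the OS thread
cap before it gets far):

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    #define N 100000

    static int wfds[N];   /* write ends, so we can wake everyone */

    static void *reader(void *arg) {
        char c;
        read((int)(long)arg, &c, 1);   /* park in a blocking read */
        return NULL;
    }

    int main(void) {
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        /* As small a stack as the platform allows; many round up to
           PTHREAD_STACK_MIN or a whole page anyway. */
        pthread_attr_setstacksize(&attr, 16 * 1024);

        for (long i = 0; i < N; i++) {
            int fds[2];
            pthread_t t;
            if (pipe(fds) != 0) { perror("pipe"); return 1; }
            wfds[i] = fds[1];
            if (pthread_create(&t, &attr, reader,
                               (void *)(long)fds[0]) != 0) {
                fprintf(stderr, "thread creation died at %ld\n", i);
                return 1;
            }
            pthread_detach(t);
        }
        printf("spawned %d blocked readers\n", N);

        /* Wake them all so the process can exit cleanly. */
        for (int i = 0; i < N; i++)
            write(wfds[i], "x", 1);
        return 0;
    }
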
So: if you want to pursue this simplification, go forth and research!
See what our limits are. Otherwise we're making stuff up based on
folklore and blog posts.
-Graydon
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev