A short followup:
I spent the last several days wondering if I'd exaggerated the various
risks of committing early to a 1:1 model for tasks (modeling tasks
strictly as threads). I don't like over-engineering any more than the
next person and the possibility that threads are "sufficiently" fast and
small on all platforms continued to nag at me. If we're injecting
yield-points in all cases to make code interruptible (which we are) then
I think *all* the arguments I had roughly boil down to "potential cost
problems".
So I did a little research on costs. It might be worth doing more
substantial research (measurements, gathering hard scalability numbers
ourselves, making benchmarks) but I'm a *bit* more convinced that I
wasn't just speaking nonsense the other day. The following turned up in
my search:
- Limits of Windows: kernel stack for threads is minimum 12k resident
on win32, 24k on win64, and expands to 20k and 48k respectively if the
thread touches GDI. Looks like you can push a process into the 10,000s
of threads, but probably not the 100,000s.
- Limits of OS X: weird kernel restrictions. Non-server 10.6 has an
arbitrary clamp at 2,500 threads per system. Server 10.6 and both
versions of 10.7 clamp at 12,500 per 8gb installed memory, with only 20%
available to a given process (i.e. 2,500 threads per process, per 8gb).
This still sounds like an arbitrary non-adjustable limit though, unless
they happen to be dedicating 640k of kernel memory to each thread or
something.
- Limits of iOS: iPhones clamp to 1024 threads.
- Limits of Solaris 10: kernel stacks are 8k on x86 (later bumped to
12k), and 20k on x64. But they're in a 512mb (x86) or 24gb (x64) pinned
segment; so on x86 this will clamp to around 45,000 threads.
- Linux has 8k or 4k kernel stacks. Much smaller, much better; but still
a fair bit larger than seems *necessary* for our task segment granularity.
- Even then, Intel claims (at least in '07) that its TBB tasks are ~18x
faster than a Linux thread setup/teardown.
- Erlang processes are 300 bytes (?) whereas Haskell tasks get 1k stack
segments by default.
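
For anyone who wants to redo the back-of-envelope arithmetic behind a
couple of the bullets above, here is a rough sketch in Python. The
function names are made up for illustration, and it only counts kernel
stacks: it ignores user stacks and every other per-thread cost, so the
results are loose upper bounds rather than measurements. It also assumes
the 12k Solaris stack size and decimal units for the OS X figure, neither
of which is stated in the sources.

```python
# Back-of-envelope thread limits from kernel stack sizes.
# All names here are hypothetical helpers, not part of any real API.

KIB = 1024
MIB = 1024 * KIB


def max_threads(budget_bytes, stack_bytes):
    """Upper bound on threads whose kernel stacks fit in budget_bytes."""
    return budget_bytes // stack_bytes


def bytes_per_thread(memory_bytes, thread_limit):
    """Memory that a thread clamp would imply per thread."""
    return memory_bytes / thread_limit


# Solaris 10 on x86: 512mb pinned segment, 8k or 12k kernel stacks.
print("solaris x86,  8k stacks:", max_threads(512 * MIB, 8 * KIB))
print("solaris x86, 12k stacks:", max_threads(512 * MIB, 12 * KIB))

# OS X: 12,500 threads per 8gb. Taking 8gb as 8 * 10**9 bytes (an
# assumption) gives the 640k-per-thread figure.
print("os x bytes per thread:", bytes_per_thread(8 * 10**9, 12_500))

# win32: resident kernel stack needed for N threads at 12k each.
for n in (10_000, 100_000):
    print("win32,", n, "threads:", n * 12 * KIB / MIB, "MiB of kernel stack")
```

With 12k stacks the Solaris x86 figure comes out a little under the
"around 45,000" quoted above, which is close enough for this kind of
estimate.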
Some light reading for the interested:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
http://blogs.technet.com/b/markrussinovich/archive/2009/07/08/3261309.aspx
http://support.apple.com/kb/HT3854
http://www.codeguru.com/cpp/misc/misc/threadsprocesses/article.php/c13533
http://gurkulindia.com/2011/05/05/solaris-reference-understanding-solaris-kernel-stack-overflows/
In support of threads!
http://www.mailinator.com/tymaPaulMultithreaded.pdf is an interesting
counterpoint, where threads-and-blocking-IO are shown to have pulled
back ahead of the NIO interface (at least in Java).
http://www.theserverside.com/discussions/thread.tss?thread_id=26700
-Graydon