On Wed, 2008-09-17 at 13:44 -0700, Evan Laforge wrote: > >> systems that don't use an existing user-space thread library (such as > >> Concurrent Haskell or libthread [1]) emulate user-space threads by > >> keeping a pool of processors and re-using them (e.g., IIUC Apache does > >> this). > > > > Your response seems to be yet another argument that processes are too > > expensive to be used the same way as threads. In my mind pooling vs > > new-creation is only relevant to process vs thread in the performance > > aspects. The fact that people use thread-pools means that they think > > that even thread-creation is too expensive. The central aspect in my > > mind is a default share-everything, or default share-nothing. One is > > much easier to reason about and encourages writing systems that have > > less shared-memory contention. > > This is similar to the plan9 conception of processes. You have a > generic rfork() call that takes flags that say what to share with your > parent: namespace, environment, heap, etc. Thus the only difference > between a thread and a process is different flags to rfork().
As I mentioned, Plan 9 also has a user-space thread library, similar to Concurrent Haskell. > Under the covers, I believe linux is similar, with its clone() call. > > The fast context switching part seems orthogonal to me. Why is it > that getting the OS involved for context switches kills the > performance? Read about CPU architecture. > Is it that the ghc RTS can switch faster because it > knows more about the code it's running (i.e. the OS obviously couldn't > switch on memory allocations like that)? Or is jumping up to kernel > space somehow expensive by nature? Yes. Kernel code is very different on the bare metal from userspace code; RTS code of course is not at all different. Switching processes in the kernel requires an interrupt or a system call. Both of those require the processor to dump the running process's state so it can be restored later (userspace thread-switching does the same thing, but it doesn't dump as much state because it doesn't need to be as conservative about what it saves). > And why does the OS need so many > more K to keep track of a thread than the RTS? An OS thread (Linux/Plan 9) stores: * Stack (definitely a stack pointer and stored registers (> 40 bytes on i686) and includes a special set of page tables on Plan 9) * FD set (even if it's the same as the parent thread, you need to keep a pointer to it * uid/euid/gid/egid (Plan 9 I think omits euid and egid) * Namespace (Plan 9 only; again, you need at least a pointer even if it's the same as the parent process) * Priority * Possibly other things I can't think of right now A Concurrent Haskell thread stores: * Stack * Allocation area (4KB) The kernel offers more to a process (and offers a wider separation between processes) than Concurrent Haskell offers to a thread. jcc _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe