RE: About Haskell Thread Model
On Mon, Oct 13, 2003 at 11:13:56AM +0200, Wolfgang Thaller wrote: The reason why we currently do not take advantage of SMP is that the Haskell Heap is a shared data structure which is modified whenever a thunk (an unevaluated expression) is evaluated. Using synchronisation primitives like pthread_mutex_lock for every evaluation of a thunk would be deadly for performance. There is some old (1999) code in the GHC RTS that attempts this (using intel cmpxchg instructions for synchronisation), but it's currently bitrotted and I don't know how successful it was. cmpxchg and then taking a blocking lock sounds like the 2-tier locking supported with Linux' new futex system calls. I wonder how they chose to block in the older GHC RTS. It was a spinlock. There was also an attempt to avoid having to lock every thunk entry and update by partitioning the allocation area per-thread, taking advantage of the fact that most thunk entries and updates are to objects created recently by the current thread. This is about where the experiment stalled - it was interesting, but even if the heap partitioning turned out to be effective, the gains weren't really going to be there without tackling multithreaded GC too (as Jan-Willem already pointed out). Cheers, Simon ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: About Haskell Thread Model
William Lee Irwin III [EMAIL PROTECTED] discusses SMP threading issues: On Mon, Oct 13, 2003 at 12:02:52PM -0400, Jan-Willem Maessen wrote: In reality you probably want to use some sort of thin lock / fat lock approach a la Java. Is the location thinly locked? If so, CAS in a fat lock instead, then wait on that. Now update requires a CAS as well, to see whether the thin lock was fattened. If so, the fat lock must be unlocked and deallocated. Naturally, we'd like to keep track of unshared objects so we can avoid any kind of complicated update protocol in the common case. This requires the implementor to establish a careful and clear contract between the compiler, the GC, and the thread scheduler. This sounds very heavyweight. I'd be tempted to start looking at lockless algorithms from the outset if you need to synchronize such basic/frequently exercised operations. The above is about as lockless as you can get. The reader can read the object header, and only does fancy stuff if the object is uncomputed. The writer can fill the object at the cost of a memory barrier (if stores are not ordered) and a compare and swap. All the rest of the complexity is to get blocked threads to play nicely with the OS. It's possible that the futex abstraction in Linux would be really helpful here; I don't know it well enough to say. Ideally the OS would give us a primitive that says block this thread until this location changes value, but don't expect the OS to be told when that happens---in which case we need only a memory barrier, and not a CAS. In this latter case, if we have a machine with ordered stores we don't need any special operations at all, and thus we need not make a local/global distinction (though it may still be useful for multiprocessor GC). [Re: multiprocessor GC] I've not really heard horror stories per se, but indirectly I've gotten the idea that the JVM's out there spent a lot of time on this. I suspect some of the Java userspace lock contention I've seen was related to GC. Possibly. Mostly it involve a lot of development effort to make it correct, and still more dvelopment effort to tune it so that it performs reasonably. Again, because Haskell is allocation-intensive it's likely to require more intensive tuning, and probably a different GC strategy (for example, copying GC looks more attractive). -Jan-Willem Maessen [EMAIL PROTECTED] ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: About Haskell Thread Model
Thanks very much. It's very useful. PS: Sorry for the inconvenience I think I should have sent the mail in plain text. Qing -Original Message- From: Wolfgang Thaller [mailto:[EMAIL PROTECTED] Sent: Sunday, October 12, 2003 4:58 AM To: Yang, Qing Cc: [EMAIL PROTECTED] Subject: Re: About Haskell Thread Model I am a new learner of Haskell and I am interested in Haskell's concurrent model. Can somebody give me a brief intro about Haskell's thread model, like how the use-level threads are mapped to kernel thread and what scheduling mechanism that Haskell uses, or point me to some links or documents that I can learn from. In the interpreter Hugs, all Haskell threads are run in one kernel thread. They are scheduled cooperatively; thread switches only take place when a function from the Concurrent module is called. In the currently released version of GHC, all Haskell threads are run in one kernel thread, too; however, thread switches can take place whenever memory is allocated --- and in Haskell, that means almost always. The optimizer manages to compile a fibonacci function that doesn't allocate any memory, but in the real world, it's as good as real preemption. If you compile the bleeding-edge GHC from the CVS HEAD, you'll get something else; while most threads (those created using forkIO) are still light-weight threads that are scheduled in just one kernel thread, you can also create threads that get their own operating system thread. This is solves all the problems that lightweight threads can have with foreign (i.e. non-Haskell) libraries. You should also note that no Haskell implementation currently supports SMP; even when multiple kernel threads are used, there is a mutual exclusion lock on the Haskell heap, so a multithreaded Haskell program will use only one CPU on an SMP system. I hope my answer was useful... Cheers, Wolfgang P.S.: If you want to do me a favour, you could tell your mail program not to send multipart or HTML messages to the list; they look terrible to people like me who get a daily digest from the mailing list. ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: About Haskell Thread Model
On Sat, Oct 11, 2003 at 10:58:04PM +0200, Wolfgang Thaller wrote: If you compile the bleeding-edge GHC from the CVS HEAD, you'll get something else; while most threads (those created using forkIO) are still light-weight threads that are scheduled in just one kernel thread, you can also create threads that get their own operating system thread. This is solves all the problems that lightweight threads can have with foreign (i.e. non-Haskell) libraries. You should also note that no Haskell implementation currently supports SMP; even when multiple kernel threads are used, there is a mutual exclusion lock on the Haskell heap, so a multithreaded Haskell program will use only one CPU on an SMP system. I hope my answer was useful... That's a painful-sounding state of affairs, though not entirely unexpected. It would be interesting to hear of BKL breakup efforts for Haskell runtime systems, though anymore I'm totally ignorant of what the devil is going on in userspace except database-only syscalls. -- wli ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: About Haskell Thread Model
William Lee Irwin III wrote: On Sat, Oct 11, 2003 at 10:58:04PM +0200, Wolfgang Thaller wrote: You should also note that no Haskell implementation currently supports SMP; even when multiple kernel threads are used, there is a mutual exclusion lock on the Haskell heap, so a multithreaded Haskell program will use only one CPU on an SMP system. I hope my answer was useful... That's a painful-sounding state of affairs, though not entirely unexpected. It would be interesting to hear of BKL breakup efforts for Haskell runtime systems, though anymore I'm totally ignorant of what the devil is going on in userspace except database-only syscalls. I'm not sure I understand you. I found out that BKL refers to the Big Kernel Lock (in the Linux kernel), but I have no idea what database-only syscalls are. The reason why we currently do not take advantage of SMP is that the Haskell Heap is a shared data structure which is modified whenever a thunk (an unevaluated expression) is evaluated. Using synchronisation primitives like pthread_mutex_lock for every evaluation of a thunk would be deadly for performance. There is some old (1999) code in the GHC RTS that attempts this (using intel cmpxchg instructions for synchronisation), but it's currently bitrotted and I don't know how successful it was. Cheers, Wolfgang ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: About Haskell Thread Model
William Lee Irwin III wrote: That's a painful-sounding state of affairs, though not entirely unexpected. It would be interesting to hear of BKL breakup efforts for Haskell runtime systems, though anymore I'm totally ignorant of what the devil is going on in userspace except database-only syscalls. On Mon, Oct 13, 2003 at 11:13:56AM +0200, Wolfgang Thaller wrote: I'm not sure I understand you. I found out that BKL refers to the Big Kernel Lock (in the Linux kernel), but I have no idea what database-only syscalls are. remap_file_pages() (lightweight mmap() for doing, say, 65536 mappings of distinct 32KB chunks of a file without kernel memory usage exploding) and async io syscalls are examples of database-only syscalls, that I'm aware userspace is doing largely from various kernel mailing list controversies. On Mon, Oct 13, 2003 at 11:13:56AM +0200, Wolfgang Thaller wrote: The reason why we currently do not take advantage of SMP is that the Haskell Heap is a shared data structure which is modified whenever a thunk (an unevaluated expression) is evaluated. Using synchronisation primitives like pthread_mutex_lock for every evaluation of a thunk would be deadly for performance. There is some old (1999) code in the GHC RTS that attempts this (using intel cmpxchg instructions for synchronisation), but it's currently bitrotted and I don't know how successful it was. cmpxchg and then taking a blocking lock sounds like the 2-tier locking supported with Linux' new futex system calls. I wonder how they chose to block in the older GHC RTS. -- wli ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
RE: About Haskell Thread Model
Wolfgang, According to the User Guide of GHC 6.1 that GHC supports both Concurrent Haskell and Parallel Haskell. And Concurrent Haskell and Parallel Haskell have very different purpose. To my understanding about the document and your reply, it's Concurrent Haskell who uses Threads to process tasks concurrently and the threads are actually user-level thread. And no SMP is supported now in Concurrent Haskell. Is this right? For Parallel Haskell, it is said in the User Guide document as follows: --- Parallel Haskell is about speed - spawning threads onto multiple processors so that your program will run faster. .. A Parallel Haskell program implies multiple processes running on multiple processors, under a PVM (Parallel Virtual Machine) framework. .. Do you have some experience or knowledge about Parallel Haskell? And what you mentioned in you previous email is all about Concurrent Haskell or about the both? Thanks, Qing Yang System Software Lab Intel Corporation [EMAIL PROTECTED] 86-10-85298800 Ext. 1955 (Office) 8751-1955 (iNet) -Original Message- From: Wolfgang Thaller [mailto:[EMAIL PROTECTED] Sent: Sunday, October 12, 2003 4:58 AM To: Yang, Qing Cc: [EMAIL PROTECTED] Subject: Re: About Haskell Thread Model I am a new learner of Haskell and I am interested in Haskell's concurrent model. Can somebody give me a brief intro about Haskell's thread model, like how the use-level threads are mapped to kernel thread and what scheduling mechanism that Haskell uses, or point me to some links or documents that I can learn from. In the interpreter Hugs, all Haskell threads are run in one kernel thread. They are scheduled cooperatively; thread switches only take place when a function from the Concurrent module is called. In the currently released version of GHC, all Haskell threads are run in one kernel thread, too; however, thread switches can take place whenever memory is allocated --- and in Haskell, that means almost always. The optimizer manages to compile a fibonacci function that doesn't allocate any memory, but in the real world, it's as good as real preemption. If you compile the bleeding-edge GHC from the CVS HEAD, you'll get something else; while most threads (those created using forkIO) are still light-weight threads that are scheduled in just one kernel thread, you can also create threads that get their own operating system thread. This is solves all the problems that lightweight threads can have with foreign (i.e. non-Haskell) libraries. You should also note that no Haskell implementation currently supports SMP; even when multiple kernel threads are used, there is a mutual exclusion lock on the Haskell heap, so a multithreaded Haskell program will use only one CPU on an SMP system. I hope my answer was useful... Cheers, Wolfgang P.S.: If you want to do me a favour, you could tell your mail program not to send multipart or HTML messages to the list; they look terrible to people like me who get a daily digest from the mailing list. ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: About Haskell Thread Model
Do you have some experience or knowledge about Parallel Haskell? And what you mentioned in you previous email is all about Concurrent Haskell or about the both? Everything I said was about Concurrent Haskell. I have no experience with Parallel Haskell. All the binaries available on the GHC web page are built for Concurrent Haskell. Cheers, Wolfgang ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: About Haskell Thread Model
On Mon, 13 Oct 2003, Wolfgang Thaller wrote: Do you have some experience or knowledge about Parallel Haskell? And Parallel Haskell runs, but there are problems. Unless someone has slipped something past me, there is no parallel implementation for Release 6 yet, so if you want to tinker with it, you'll need to go to the CVS repository for Release 5 of Haskell and the parallel run-time system (look for the GUM branch). Please note, the Release 5 version of parallel Haskell should be considered experimental and best left aside if you are unwilling to fix code. Murray Gross Metis Project, Brooklyn College ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: About Haskell Thread Model
William Lee Irwin III [EMAIL PROTECTED] asks about true SMP haskell. On Mon, Oct 13, 2003 at 11:13:56AM +0200, Wolfgang Thaller wrote: The reason why we currently do not take advantage of SMP is that the Haskell Heap is a shared data structure which is modified whenever a thunk (an unevaluated expression) is evaluated. Using synchronisation primitives like pthread_mutex_lock for every evaluation of a thunk would be deadly for performance. There is some old (1999) code in the GHC RTS that attempts this (using intel cmpxchg instructions for synchronisation), but it's currently bitrotted and I don't know how successful it was. cmpxchg and then taking a blocking lock sounds like the 2-tier locking supported with Linux' new futex system calls. I wonder how they chose to block in the older GHC RTS. This invariably proves tricky, even more so now that GHC effectively uses a many-to-one threading model (and where it may therefore be wrong to simply block the OS-level thread). In pH, which isn't quite haskell but has the same syntax and runs on multiple processors with a shared heap, we kludged: we simply called yield or sleep and kept spinning. We were actually using work stealing (with lock-free work queues as I recall) along with wait queues on empty locations, so we didn't sleep very often. Nonetheless, this did occasionally turn out to cause real performance problems. In reality you probably want to use some sort of thin lock / fat lock approach a la Java. Is the location thinly locked? If so, CAS in a fat lock instead, then wait on that. Now update requires a CAS as well, to see whether the thin lock was fattened. If so, the fat lock must be unlocked and deallocated. Naturally, we'd like to keep track of unshared objects so we can avoid any kind of complicated update protocol in the common case. This requires the implementor to establish a careful and clear contract between the compiler, the GC, and the thread scheduler. Finally, the complexity of thunk update pales in comparison to the complexity of multiprocessor GC. In pH we punted on this issue---debugging a GC with per-thread allocation pools was hard enough. It killed us; thread performance didn't scale because GC didn't scale. Eager Haskell was designed with multithreaded shared-memory execution in mind, but I simply couldn't find the year or more it would take to build and debug a true multiprocessor GC---so it remains a uniprocessor implementation. -Jan-Willem Maessen [EMAIL PROTECTED] ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: About Haskell Thread Model
William Lee Irwin III [EMAIL PROTECTED] asks about true SMP cmpxchg and then taking a blocking lock sounds like the 2-tier locking supported with Linux' new futex system calls. I wonder how they chose to block in the older GHC RTS. On Mon, Oct 13, 2003 at 12:02:52PM -0400, Jan-Willem Maessen wrote: This invariably proves tricky, even more so now that GHC effectively uses a many-to-one threading model (and where it may therefore be wrong to simply block the OS-level thread). In pH, which isn't quite haskell but has the same syntax and runs on multiple processors with a shared heap, we kludged: we simply called yield or sleep and kept spinning. We were actually using work stealing (with lock-free work queues as I recall) along with wait queues on empty locations, so we didn't sleep very often. Nonetheless, this did occasionally turn out to cause real performance problems. Abusing sched_yield() like this has been a performance problem with threading engines on Linux for a while, though usually as the middle tier of 3-tier locks as opposed to being the sole blocking primitive. On Mon, Oct 13, 2003 at 12:02:52PM -0400, Jan-Willem Maessen wrote: In reality you probably want to use some sort of thin lock / fat lock approach a la Java. Is the location thinly locked? If so, CAS in a fat lock instead, then wait on that. Now update requires a CAS as well, to see whether the thin lock was fattened. If so, the fat lock must be unlocked and deallocated. Naturally, we'd like to keep track of unshared objects so we can avoid any kind of complicated update protocol in the common case. This requires the implementor to establish a careful and clear contract between the compiler, the GC, and the thread scheduler. This sounds very heavyweight. I'd be tempted to start looking at lockless algorithms from the outset if you need to synchronize such basic/frequently exercised operations. On Mon, Oct 13, 2003 at 12:02:52PM -0400, Jan-Willem Maessen wrote: Finally, the complexity of thunk update pales in comparison to the complexity of multiprocessor GC. In pH we punted on this issue---debugging a GC with per-thread allocation pools was hard enough. It killed us; thread performance didn't scale because GC didn't scale. Eager Haskell was designed with multithreaded shared-memory execution in mind, but I simply couldn't find the year or more it would take to build and debug a true multiprocessor GC---so it remains a uniprocessor implementation. I've not really heard horror stories per se, but indirectly I've gotten the idea that the JVM's out there spent a lot of time on this. I suspect some of the Java userspace lock contention I've seen was related to GC. -- wli ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: About Haskell Thread Model
From www.haskell.org, there's a link Haskell Compilers and Interpreters; and from there, the first link leads to www.haskell.org/hugs, which is Hugs' home page. Cheers, Wolfgang ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: About Haskell Thread Model
I am a new learner of Haskell and I am interested in Haskell's concurrent model. Can somebody give me a brief intro about Haskell's thread model, like how the use-level threads are mapped to kernel thread and what scheduling mechanism that Haskell uses, or point me to some links or documents that I can learn from. In the interpreter Hugs, all Haskell threads are run in one kernel thread. They are scheduled cooperatively; thread switches only take place when a function from the Concurrent module is called. In the currently released version of GHC, all Haskell threads are run in one kernel thread, too; however, thread switches can take place whenever memory is allocated --- and in Haskell, that means almost always. The optimizer manages to compile a fibonacci function that doesn't allocate any memory, but in the real world, it's as good as real preemption. If you compile the bleeding-edge GHC from the CVS HEAD, you'll get something else; while most threads (those created using forkIO) are still light-weight threads that are scheduled in just one kernel thread, you can also create threads that get their own operating system thread. This is solves all the problems that lightweight threads can have with foreign (i.e. non-Haskell) libraries. You should also note that no Haskell implementation currently supports SMP; even when multiple kernel threads are used, there is a mutual exclusion lock on the Haskell heap, so a multithreaded Haskell program will use only one CPU on an SMP system. I hope my answer was useful... Cheers, Wolfgang P.S.: If you want to do me a favour, you could tell your mail program not to send multipart or HTML messages to the list; they look terrible to people like me who get a daily digest from the mailing list. ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: About Haskell Thread Model
Wolfgang, How can I get source of Hugs? (frm http://www.haskell.org?). Regards, Vasili From: Wolfgang Thaller <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] CC: [EMAIL PROTECTED] Subject: Re: About Haskell Thread Model Date: Sat, 11 Oct 2003 22:58:04 +0200 I am a new learner of Haskell and I am interested in Haskell's concurrent model. Can somebody give me a brief intro about Haskell's thread model, like how the use-level threads are mapped to kernel thread and what scheduling mechanism that Haskell uses, or point me to some links or documents that I can learn from. In the interpreter Hugs, all Haskell threads are run in one kernel thread. They are scheduled cooperatively; thread switches only take place when a function from the "Concurrent" module is called. In the currently released version of GHC, all Haskell threads are run in one kernel thread, too; however, thread switches can take place whenever memory is allocated --- and in Haskell, that means "almost always". The optimizer manages to compile a fibonacci function that doesn't allocate any memory, but in the real world, it's as good as real preemption. If you compile the bleeding-edge GHC from the CVS HEAD, you'll get something else; while "most" threads (those created using "forkIO") are still light-weight threads that are scheduled in just one kernel thread, you can also create threads that get their own operating system thread. This is solves all the problems that lightweight threads can have with foreign (i.e. non-Haskell) libraries. You should also note that no Haskell implementation currently supports SMP; even when multiple kernel threads are used, there is a mutual exclusion lock on the Haskell heap, so a multithreaded Haskell program will use only one CPU on an SMP system. I hope my answer was useful... Cheers, Wolfgang P.S.: If you want to do me a favour, you could tell your mail program not to send multipart or HTML messages to the list; they look terrible to people like me who get a daily digest from the mailing list. ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell Send and receive larger attachments with Hotmail Extra Storage. ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell
Re: About Haskell Thread Model
Hello Yang Qing, Go to: http://www.haskell.org/~simonmar/bib.htmland also http://sourceforge.net/forum/forum.php?forum_id=253134. The last time I read code I believe Simon Marlowe used monads to implement threads in Haskell. Read the code and you will see. Regards, Vasili I.Galchin From: "Yang, Qing" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Subject: About Haskell Thread Model Date: Fri, 10 Oct 2003 10:06:46 +0800 Hello, I am a new learner of Haskell and I am interested in Haskell's concurrent model. Can somebody give me a brief intro about Haskell's thread model, like how the use-level threads are mapped to kernel thread and what scheduling mechanism that Haskell uses, or point me to some links or documents that I can learn from. Thanks, Qing Yang [EMAIL PROTECTED] ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell Get 10MB of e-mail storage! Sign up for Hotmail Extra Storage. ___ Haskell mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell