RE: Lessons to learn from ithreads (was: threads?)
Date: Tue, 12 Oct 2010 23:46:48 +0100
From: tim.bu...@pobox.com
To: faw...@gmail.com
CC: ben-goldb...@hotmail.com; perl6-language@perl.org
Subject: Lessons to learn from ithreads (was: threads?)

> On Tue, Oct 12, 2010 at 03:42:00PM +0200, Leon Timmermans wrote:
> > On Mon, Oct 11, 2010 at 12:32 AM, Ben Goldberg ben-goldb...@hotmail.com wrote:
> > > If thread-unsafe subroutines are called, then something like ithreads might be used.
> >
> > For the love of $DEITY, let's please not repeat ithreads!
>
> It's worth remembering that ithreads are far superior to the older 5005threads model, where multiple threads ran within one interpreter. [Shudder]
>
> It's also worth remembering that real O/S level threads are needed to work asynchronously with third-party libraries that would block. Database client libraries that don't offer async support are an obvious example.
>
> I definitely agree that threads should not be the dominant form of concurrency, and I'm certainly no fan of working with O/S threads. They do, however, have an important role and can't be ignored.
>
> So I'd like to use this sub-thread to try to identify what lessons we can learn from ithreads. My initial thoughts are:
>
> - Don't clone a live interpreter. Start a new thread with a fresh interpreter.
>
> - Don't try to share mutable data or data structures. Use message passing and serialization.
>
> Tim.

If the starting subroutine for a thread is reentrant, then no message passing is needed, and the only serialization that might be needed is for the initial arguments and for the return values (which the main thread will get via join).

As for starting a new thread in a fresh interpreter, I think that it might be necessary to populate that fresh interpreter with (copies of) data which is reachable from the subroutine that the thread calls... reachability can probably be identified by using the same technique the garbage collector uses. This would provide an effect similar to ithreads, but only copying what's really needed.

To minimize copying, we would only treat things as reachable when we have to -- for example, if there's no eval-string used in a given sub, then the sub only reaches those scopes (lexical and global) which it actually uses, not every scope that it could use.
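[Editorial sketch, not part of the original message.] The idea of copying only what a sub actually reaches can be illustrated in Python, which is used here only because it is easy to run; nothing below is Perl 6 or a proposed design. The names `reachable_names`, `make_worker`, `GREETING` and `UNRELATED` are made up for the example. The standard `inspect.getclosurevars` call reports the closed-over lexicals and the globals a function names in its own body. It is shallow (it does not follow calls into other functions), and a string eval inside the function would defeat it, which matches the caveat above.

```python
import inspect

GREETING = "hello"      # referenced by worker()
UNRELATED = [0] * 1000  # never referenced by worker()


def make_worker(prefix):
    # `prefix` is a closed-over lexical; GREETING is a global.
    def worker(name):
        return f"{prefix}{GREETING}, {name}"
    return worker


def reachable_names(func):
    """Return the lexicals and globals that `func` statically refers to."""
    found = inspect.getclosurevars(func)
    return {"lexical": dict(found.nonlocals), "global": dict(found.globals)}


worker = make_worker(">> ")
reach = reachable_names(worker)
```

Under these assumptions only `prefix` and `GREETING` would need to be copied into a new interpreter; `UNRELATED` is never named by the function and is left out.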
Re: Lessons to learn from ithreads (was: threads?)
On Thu, Oct 14, 2010 at 11:52:00PM -0400, Benjamin Goldberg wrote:
> > From: tim.bu...@pobox.com
> >
> > So I'd like to use this sub-thread to try to identify what lessons we can learn from ithreads. My initial thoughts are:
> >
> > - Don't clone a live interpreter. Start a new thread with a fresh interpreter.
> >
> > - Don't try to share mutable data or data structures. Use message passing and serialization.
>
> If the starting subroutine for a thread is reentrant, then no message passing is needed, and the only serialization that might be needed is for the initial arguments and for the return values (which will be gotten by the main thread via join).
>
> As for starting a new thread in a fresh interpreter, I think that it might be necessary to populate that fresh interpreter with (copies of) data which is reachable from the subroutine that the thread calls... reachability can probably be identified by using the same technique the garbage collector uses. This would provide an effect similar to ithreads, but only copying what's really needed.
>
> To minimize copying, we would only treat things as reachable when we have to -- for example, if there's no eval-string used in a given sub, then the sub only reaches those scopes (lexical and global) which it actually uses, not every scope that it could use.

Starting an empty interpreter, connected to the parent by some 'channels', is simple to understand, implement and test.

In contrast, I suspect the kind of partial-cloning you describe above would be complex, hard to implement, hard to test, and fragile to use. It is, for example, more complex than ithreads, so the long history of ithreads bugs should serve as a warning.

I'd rather err on the side of simplicity.

Tim.
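[Editorial sketch, not part of the original message.] The "empty interpreter connected by channels" model can be approximated in Python with nothing but the standard library. This is an illustration under stated assumptions, not anything proposed for Perl 6: a child process stands in for the new thread, the two pipes stand in for the channels, and JSON lines stand in for the serialization format. `CHILD_SOURCE` and `run_in_fresh_interpreter` are hypothetical names.

```python
import json
import subprocess
import sys

# Source run by the child. It starts with no state inherited from the parent.
CHILD_SOURCE = """
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"id": request["id"], "result": sum(request["numbers"])}
    sys.stdout.write(json.dumps(reply) + "\\n")
    sys.stdout.flush()
"""


def run_in_fresh_interpreter(requests):
    """Send each request to a new interpreter and collect the replies."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Everything crossing the boundary is serialized; nothing is shared.
    payload = "".join(json.dumps(r) + "\n" for r in requests)
    out, _ = child.communicate(payload)
    return [json.loads(line) for line in out.splitlines()]


replies = run_in_fresh_interpreter([
    {"id": 1, "numbers": [1, 2, 3]},
    {"id": 2, "numbers": [10, 20]},
])
```

Because the parent and child exchange only serialized messages, the child could in principle be a thread, a process, or a remote machine without the calling code changing.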
Re: Ruby Fibers (was: threads?)
On Fri, Oct 15, 2010 at 09:57:26AM -0400, Mark J. Reed wrote:
> On Fri, Oct 15, 2010 at 7:42 AM, Leon Timmermans faw...@gmail.com wrote:
> > Continuations and fibers are incredibly useful and should be easy to implement on parrot/rakudo but they aren't really concurrency. They're a solution to a different problem.
>
> I would argue that concurrency isn't a problem to solve; it's one form of solution to the problem of maximizing efficiency. Continuations/fibers and asynchronous event loops are different solutions to the same problem.

Pardon my ignorance, but are continuations the same thing as co-routines, or is it more primitive than that?

Also, doesn't this really just allow context switching outside of the knowledge of a kernel thread, thus allowing one to implement tasks at the user level?

Concurrency can apply to a lot of different things, but the problem is now not only implementing an algorithm concurrently but also using the concurrency available in the hardware efficiently.

Brett

> --
> Mark J. Reed markjr...@gmail.com

--
B. Estrade estr...@gmail.com
Re: Lessons to learn from ithreads (was: threads?)
Earlier, Leon Timmermans wrote:
: * Code sharing is actually quite nice. Loading Moose separately in a
:   hundred threads is not. This is not trivial though, Perl being so
:   dynamic. I suspect this is not possible without running into the same
:   issues as ithreads does.

On Fri, Oct 15, 2010 at 01:22:10PM +0200, Leon Timmermans wrote:
> On Wed, Oct 13, 2010 at 1:13 PM, Tim Bunce tim.bu...@pobox.com wrote:
> > If you wanted to start a hundred threads in a language that has good support for async constructs you're almost certainly using the wrong approach. In the world of perl6 I expect threads to be used rarely and for specific unavoidably-blocking tasks, like db access, and where true concurrency is needed.
>
> I agree starting a large number of threads is usually the wrong approach, but at the same time I see more reasons to use threads than just avoiding blocking. We live in a multicore world, and it would be nice if it was easy to actually use those cores. I know people who are deploying to 24 core systems now, and that number will only grow. Processes shouldn't be the only way to utilize that.

We certainly need to be able to make good use of multiple cores.

As I mentioned earlier, we should aim to be able to reuse shared pages of readonly bytecode and jit-compiled subs. So after a module is loaded into one interpreter it should be much cheaper to load it into others. That's likely to be a much simpler/safer approach than trying to clone interpreters.

Another important issue here is portability of concepts across implementations of perl6. I'd guess that starting a thread with a fresh interpreter is likely to be supportable across more implementations than starting a thread with a cloned interpreter.

Also, if we do it right, it shouldn't make much difference if the new interpreter is just a new thread or also a new process (perhaps even on a different machine). The IPC should be sufficiently abstracted to just work.

> > (Adding thread/multiplicity support to NYTProf shouldn't be too hard. I don't have the time/inclination to do it at the moment, but I'll fully support anyone who has.)
>
> I hate how you once again make my todo list grow :-p

Well volunteered! ;)

Tim.
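[Editorial sketch, not part of the original message.] The pattern of keeping async constructs as the main tool and using O/S threads only for unavoidably blocking calls can be shown with Python's standard `asyncio`. This is an illustration only; `blocking_query` is a hypothetical stand-in for a database client call with no async interface, and the sleep simulates the time it would block.

```python
import asyncio
import time


def blocking_query(n):
    """Stand-in for a database client call that blocks (hypothetical)."""
    time.sleep(0.05)
    return n * n


async def main():
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default thread pool executor, so each
    # blocking call runs on an O/S thread while the event loop stays free.
    pending = [loop.run_in_executor(None, blocking_query, n) for n in range(4)]
    return await asyncio.gather(*pending)


results = asyncio.run(main())
```

The event loop itself never blocks here; only the small pool of worker threads does, which is the narrow role for threads described above.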
Re: Ruby Fibers
On 10/15/10 10:22, B. Estrade wrote:
> Pardon my ignorance, but are continuations the same thing as co-routines, or is it more primitive than that? Also, doesn't this really just allow context switching outside of the knowledge of a kernel thread, thus allowing one to implement tasks at the user level?

Continuations are lower level; coroutines are one of many things you can implement with them.

--
brandon s. allbery [linux,solaris,freebsd,perl] allb...@kf8nh.com
system administrator [openafs,heimdal,too many hats] allb...@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH
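[Editorial sketch, not part of the original message.] User-level task switching of the kind asked about above can be demonstrated with Python generators. Note the assumption: generators are coroutines, not full continuations, so this shows only the higher-level construct, and the round-robin scheduler (`run_round_robin`, a made-up name) is the simplest policy that works. No kernel thread is created; every switch happens at a `yield`.

```python
from collections import deque


def task(name, steps, log):
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)  # resume the task until its next yield
        except StopIteration:
            continue  # task finished; drop it
        ready.append(current)


log = []
run_round_robin([task("a", 2, log), task("b", 3, log)])
```

The log ends up interleaved between the two tasks, even though everything runs on a single kernel thread.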
Re: Lessons to learn from ithreads (was: threads?)
On Fri, Oct 15, 2010 at 10:56 AM, Tim Bunce tim.bu...@pobox.com wrote:
> ...
> Another important issue here is portability of concepts across implementations of perl6. I'd guess that starting a thread with a fresh interpreter is likely to be supportable across more implementations than starting a thread with a cloned interpreter.
> ...
> Well volunteered! ;)
>
> Tim.

Hi, I don't have much to offer on the topic of concurrency, but as someone who is in the process of slowly implementing a native-ish code compiler for Perl 6 (technically probably a compiler to LLVM assembly with the intention of then compiling to native code), I'd like to remind everyone that not every implementation will have an interpreter.

I don't think you actually necessarily mean an interpreter here, but rather whatever structure is analogous to that which, in an interpreter, would hold the interpreter's global state. If this is the case, I think it may be helpful to state more precisely what state you think would need to be cloned or recreated between threads or processes and what would not.

Also, it is important to consider how different designs will affect the complexity and performance of concurrency primitives for Perl 6 implementations (especially for more common implementation strategies), but neither "interpreter" nor "VM" appears in a quick grepping of S17. I don't think that should change.

--
Tyler Curtis
Re: threads?
Damian (>>), Matt (>):
>> Perhaps we need to think more Perlishly and reframe the entire question. Not: "What threading model do we need?", but: "What kinds of non-sequential programming tasks do we want to make easy...and how would we like to be able to specify those tasks?"
>
> I watched a presentation by Guy Steele at the Strange Loop conference on Thursday where he talked about non-sequential programming. One of the interesting things that he mentioned was to use the algebraic properties of an operation to know when a large grouping of operations can be done non-sequentially. For example, we know that the meta reduction operator could take very large lists and split them into smaller lists across all available cores when performing certain operations, like addition and multiplication. If we could mark new operators that we create with this knowledge we could do this for custom operators too. This isn't a new idea, but it seems like it would be a helpful tool in simplifying non-sequential programming and I didn't see it mentioned in this thread yet.

This idea seems to be in the air somehow. (Even though all copies of the meme might have their roots in that Guy you mention.)

http://irclog.perlgeek.de/perl6/2010-10-15#i_2914961

Perl 6 has all the prerequisites for making this happen. It's mostly a question of marking things up with some trait or other.

    our multi sub infix:<+>($a, $b) will optimize<associativity> { ... }

(User-defined ops can be marked in exactly the same way.)

All that's needed after that is a reduce sub that's sensitive to such traits. Oh, and threads.

// Carl
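[Editorial sketch, not part of the original message.] A reduce that is "sensitive to such traits" can be approximated in Python. Everything here is an assumption made for illustration: the `associative` decorator stands in for the trait, `trait_aware_reduce` is a made-up name, and a thread pool stands in for "all available cores". In CPython, threads will not speed up pure-Python arithmetic, so the sketch shows only the structure. Chunks are combined in their original order, so associativity alone is enough; commutativity is not required. Like `functools.reduce` without an initial value, it raises on an empty input.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def associative(func):
    """Mark `func` as associative (hypothetical marker, standing in for a trait)."""
    func.is_associative = True
    return func


@associative
def add(a, b):
    return operator.add(a, b)


def subtract(a, b):  # not associative, so left unmarked
    return a - b


def trait_aware_reduce(func, items, chunks=4):
    items = list(items)
    if not getattr(func, "is_associative", False) or len(items) < 2 * chunks:
        return reduce(func, items)  # plain sequential left fold
    size = -(-len(items) // chunks)  # ceiling division
    parts = [items[i:i + size] for i in range(0, len(items), size)]
    with ThreadPoolExecutor(max_workers=chunks) as pool:
        # pool.map keeps the chunks in order, so partial results line up.
        partials = list(pool.map(lambda part: reduce(func, part), parts))
    return reduce(func, partials)


total = trait_aware_reduce(add, range(1, 101))
difference = trait_aware_reduce(subtract, range(1, 101))
```

The unmarked `subtract` always takes the sequential path, so its result is the same as an ordinary left fold; only operators that declare associativity are split up.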
Re: threads?
On Oct 12, 2010, at 9:22 AM, Damian Conway wrote:
> Perhaps we need to think more Perlishly and reframe the entire question. Not: "What threading model do we need?", but: "What kinds of non-sequential programming tasks do we want to make easy...and how would we like to be able to specify those tasks?"

I watched a presentation by Guy Steele at the Strange Loop conference on Thursday where he talked about non-sequential programming. One of the interesting things that he mentioned was to use the algebraic properties of an operation to know when a large grouping of operations can be done non-sequentially. For example, we know that the meta reduction operator could take very large lists and split them into smaller lists across all available cores when performing certain operations, like addition and multiplication. If we could mark new operators that we create with this knowledge we could do this for custom operators too. This isn't a new idea, but it seems like it would be a helpful tool in simplifying non-sequential programming and I didn't see it mentioned in this thread yet.

Here are the slides to the talk to which I'm referring:

http://strangeloop2010.com/talk/presentation_file/14299/GuySteele-parallel.pdf

~Matt
Re: Methodicals: A better way to monkey type
Stefan (>):
> A methodical is an operator which syntactically behaves as a method but is subject to scoping rules. Methodicals are defined using the ordinary method keyword, qualified with my or our. (TODO: This seems the most natural syntax to me, but it conflicts with existing usage. Which is more worth having on it?) Methodicals do not need to be declared in classes, but they should generally have declared receiver types.
>
>     {
>         my method make-me-a-sandwich(Num:) { ... }
>
>         2.make-me-a-sandwich;     # calls our method
>
>         "ham".make-me-a-sandwich; # either an error, or method dispatch
>     }
>
>     2.make-me-a-sandwich;         # ordinary method dispatch

With just the addition of a single character, this already works in Perl 6:

    $ perl6 -e 'my &make-me-a-sandwich = method (Numeric $sandwichee:) { say "Hokay, $sandwichee" }; 2.&make-me-a-sandwich; "ham".&make-me-a-sandwich'
    Hokay, 2
    Nominal type check failed for parameter '$sandwichee'; expected Numeric but got Str instead

It is maybe a testament to Perl 6's strange consistency that no-one had anticipated this combination of syntax and semantics, and yet there it is, fallen out of more fundamental design choices. There was some surprise and amusement on #perl6 when we discovered this.

In summary, the .& syntax:

* can be used on lexical variables that don't pollute the global referencing environment,
* does not compromise encapsulation,
* does not require MONKEY_TYPING,
* does not conflict with the spec'd usage of the 'method' declaration,
* is implemented today, as it turns out.

's all good.

// Carl