[perl6/specs] 8b6d0b: Elaborate a bit on exit, END blocks and threads
Branch: refs/heads/master
Home: https://github.com/perl6/specs
Commit: 8b6d0bb3fef29ad61540730c8ff99b5c69c99709
  https://github.com/perl6/specs/commit/8b6d0bb3fef29ad61540730c8ff99b5c69c99709
Author: Elizabeth Mattijsen l...@dijkmat.nl
Date: 2014-08-11 (Mon, 11 Aug 2014)
Changed paths: M S29-functions.pod
Log Message: Elaborate a bit on exit, END blocks and threads
Re: Lessons to learn from ithreads (was: threads?)
On Tue, Oct 12, 2010 at 3:46 PM, Tim Bunce tim.bu...@pobox.com wrote: On Tue, Oct 12, 2010 at 03:42:00PM +0200, Leon Timmermans wrote: On Mon, Oct 11, 2010 at 12:32 AM, Ben Goldberg ben-goldb...@hotmail.com wrote: If thread-unsafe subroutines are called, then something like ithreads might be used. For the love of $DEITY, let's please not repeat ithreads! It's worth remembering that ithreads are far superior to the older 5005threads model, where multiple threads ran within a single interpreter. [Shudder] It's also worth remembering that real O/S level threads are needed to work asynchronously with third-party libraries that would block. Database client libraries that don't offer async support are an obvious example.

Hi, I'm showing up only because I happened to check my Perl 6 inbox. For various work-related reasons, I'd peeked my head into a couple of other language VMs' threading implementations. It seems relevant to mention more possibilities:

ruby-1.8:
- green threads
- single actual process
- scheduling handled by switching to something else after N opcodes are dispatched
- system() and other blocking system calls are implemented as non-blocking alternatives
- C extensions must also use non-blocking code and be written to call back into the scheduling core
- able to share data easily without onerous user-level synchronization, because there's really no such thing as being concurrent

ruby-1.9 + python:
- real threads
- global interpreter lock over the core, so they're not CPU-concurrent
- don't have the story for C extensions
- able to share data easily without onerous user-level synchronization, because there's really no such thing as being concurrent

jruby:
- Java
- real threads
- fully concurrent
- mostly can't use C extensions
- able to share data easily without onerous user-level synchronization because ... ? Java magic?

Those implementations all do very well for tasks where actual CPU concurrency isn't needed.
A common sweet spot is web services and other things that divide their time over IO waiting. They also do well by not having to instantiate separate VMs per thread. They also don't require the user (as in Perl 5) to carefully mark their data as shared, because by default everything is shared (but then they don't have actual concurrency either). They do poorly when expected to take advantage of multiple cores. When using this kind of concurrent software for web services, I've compensated by just running enough processes to keep my CPUs busy. I got the advantage of having something that behaved like threads but was extremely easy to work with. This is very unlike my experience with Perl 5 threads, which I still fear to work with (mostly because I worry about dangling pointers from difficult-to-spot miscellaneous magic attachments). Our own threading story could use tricks from the above and include more than just what you've mentioned. Josh
Re: Lessons to learn from ithreads (was: threads?)
On Fri, Oct 15, 2010 at 11:04:18AM +0100, Tim Bunce wrote: On Thu, Oct 14, 2010 at 11:52:00PM -0400, Benjamin Goldberg wrote: From: tim.bu...@pobox.com So I'd like to use this sub-thread to try to identify what lessons we can learn from ithreads. My initial thoughts are: - Don't clone a live interpreter. Start a new thread with a fresh interpreter. - Don't try to share mutable data or data structures. Use message passing and serialization. If the starting subroutine for a thread is reentrant, then no message passing is needed, and the only serialization that might be needed is for the initial arguments and for the return values (which will be gotten by the main thread via join). As for starting a new thread in a fresh interpreter, I think that it might be necessary to populate that fresh interpreter with (copies of) data which is reachable from the subroutine that the thread calls... reachability can probably be identified by using the same technique the garbage collector uses. This would provide an effect similar to ithreads, but only copying what's really needed. To minimize copying, we would only treat things as reachable when we have to -- for example, if there's no eval-string used in a given sub, then the sub only reaches those scopes (lexical and global) which it actually uses, not every scope that it could use. Starting an empty interpreter, connected to the parent by some 'channels', is simple to understand, implement and test. I was recently reminded of the ongoing formal specification of Web Workers, which fits that description quite well: http://www.whatwg.org/specs/web-workers/current-work/ Since no one seems to have mentioned it in the thread I thought I would. From the intro: This specification defines an API for running scripts in the background independently of any user interface scripts.
This allows for long-running scripts that are not interrupted by scripts that respond to clicks or other user interactions, and allows long tasks to be executed without yielding to keep the page responsive. Workers (as these background scripts are called herein) are relatively heavy-weight, and are not intended to be used in large numbers. For example, it would be inappropriate to launch one worker for each pixel of a four megapixel image. The examples below show some appropriate uses of workers. Generally, workers are expected to be long-lived, have a high start-up performance cost, and a high per-instance memory cost. Tim. In contrast, I suspect the kind of partial-cloning you describe above would be complex, hard to implement, hard to test, and fragile to use. It is, for example, more complex than ithreads, so the long history of ithreads bugs should serve as a warning. I'd rather err on the side of simplicity. Tim.
Re: Ruby Fibers (was: threads?)
On Fri, Oct 15, 2010 at 10:22 AM, B. Estrade estr...@gmail.com wrote: Pardon my ignorance, but are continuations the same thing as co-routines, or is it more primitive than that? Continuations are not the same thing as coroutines, although they can be used to implement coroutines - in fact, continuations can be used to implement any sort of flow control whatsoever, because they are a way of generalizing flow control. Goto's, function calls, coroutines, setjmp/longjmp, loops, exception throwing and catching - these and more can all be regarded as special cases of continuation manipulation. A continuation is just a snapshot of a point in a program's run, which can then be 'called' later to return control to that point. The entire execution context is preserved, so you can call the same continuation multiple times, re-enter a function that has already returned, etc. But state changes are not undone, so the program can still behave differently after the continuation is called. -- Mark J. Reed markjr...@gmail.com
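[Editorial aside: Python has no first-class continuations, but a generator captures a restricted form of the idea Mark describes: a suspended execution context that can be re-entered later, with its state preserved. Unlike a true continuation it is one-shot and resumable only at its yield points, but it is enough to illustrate "return control to that point":]

```python
# A generator as a restricted continuation: each `yield` is a snapshot
# of a point in the function's run, and `next()` returns control there.
# (A real continuation could be called many times; this cannot.)
def counter():
    n = 0
    while True:
        n += 1
        yield n        # suspend here; the caller can resume us later

c = counter()
print(next(c))  # 1 -- enter the function
print(next(c))  # 2 -- re-enter where we left off; state preserved
```

Note that, as in Mark's description, state changes are not undone between re-entries: `n` keeps growing.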
Re: threads?
I would implement threads in the following form:

    $thread_counter = 0;
    $global = lock;
    $thread = new thread( \&thread_sub );
    $thread->start();

    thread_sub {
        lock( $global ) {
            print "i'm thread ", ++$thread_counter, "\n";
        }
    }

It's a mixture of ithreads and the C# threading model. The thread works in the same interpreter. You have to do the locking yourself. That would make it lightweight and give you the power to do everything you want. I don't think that normal threads are very difficult to understand. But it gives the highest flexibility.
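[Editorial aside: the model proposed above, threads sharing one interpreter with explicit user-managed locking, is essentially what Python's threading module provides today. A rough Python rendering of the same pseudocode, for comparison:]

```python
# Shared-state threading with explicit locking, as in the proposal:
# all threads run in the same interpreter and the programmer is
# responsible for locking shared data ($global / global_lock).
import threading

thread_counter = 0
global_lock = threading.Lock()

def thread_sub():
    global thread_counter
    with global_lock:              # explicit, user-managed lock
        thread_counter += 1
        print(f"i'm thread {thread_counter}")

threads = [threading.Thread(target=thread_sub) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the lock, the `+= 1` would be a classic read-modify-write race; this is the "highest flexibility, do the locking yourself" trade-off the poster describes.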
Re: threads?
On Thu, Oct 21, 2010 at 6:04 PM, Darren Duncan dar...@darrenduncan.net wrote: Aaron Sherman wrote: Things that typically precipitate threading in an application:
- Blocking IO
- Event management (often as a crutch to avoid asynchronous code)
- Legitimately parallelizable, intense computing
Interestingly, the first two tend to be where most of the need comes from and the last one tends to be what drives most discussion of threading. The last one in particular would legitimately get attention when one considers that it is for this that the concern about using multi-core machines efficiently comes into play. That sounds great, but what's the benefit to a common use case? Sorting lists with higher processor overhead and waste heat in applications that traditionally weren't processor-bound in the first place? Over the past 20+ years, I've seen some very large, processor-bound applications that could (and in some cases, did) benefit from threading over multiple cores. However, they were so far in the minority as to be nearly invisible, and in many cases such applications can simply be run multiple times per host in order to VERY efficiently consume every available processor. The vast majority of my computing experience has been in places where I'm actually willing to use Perl, a grossly inefficient language (I say this, coming as I do from C, not in comparison to other HLLs), because my performance concerns are either non-existent or related almost entirely to non-trivial IO (i.e. anything sendfile can do). The first 2 are more about lowering latency and appearing responsive to a user on a single core machine. Write me a Web server, and we'll talk. Worse, write a BitTorrent client that tries to store its results into a high performance, local datastore without reducing theoretical, back-of-the-napkin throughput by a staggering amount. Shockingly enough, neither of these frequently used examples are processor-bound.
The vast majority of today's applications are written with network communications in mind to one degree or another. The interesting part isn't so much the user as servicing network and disk IO responsively enough that hardware and network protocol stacks wait on you to empty or fill a buffer as infrequently as possible. This is essential in such rare circumstances as:
- Database-intensive applications
- Moving large data files across wide area networks
- Parsing and interpreting highly complex languages inline from data received over multiple, simultaneous network connections (sounds like this should be rare, but your browser does it every time you click on a link)
Just in working with Rakudo, I have to use git, make and Perl itself, all of which can improve CPU performance all they like, but will ultimately run slow if they don't handle reading dozens of files, possibly from multiple IO devices (disks, network filesystems, remote repositories, etc.) as responsively as possible. Now, to back up and think this through, there is one place where multi-core processor usage is going to become critical over the next few years: phones. Android-based phones are going multi-core within the next six months. My money is on a multi-core iPhone within a year. These platforms are going to need to take advantage of multiple cores for primarily single-application performance in a low-power environment. So, I don't want you to think that I'm blind to the need you describe. I just don't want you to be unrealistic about the application balance out there. I think that Perl 6's implicit multi-threading approach, such as for hyperops or junctions, is a good best first choice to handle many common needs, the last list item above, without users having to think about it. Likewise any pure functional code.
-- Darren Duncan It's very common for people working on the design or implementation of a programming language to become myopic with respect to the importance of executing code as quickly as possible, and I'm not faulting anyone for that. It's probably a good thing in most circumstances, but in this case, assuming that the largest need is going to be the execution of code turns out to be a misleading instinct. Computers execute code far, far less than you would expect, and the cost of failing to service events is often orders of magnitude greater than the cost of spending twice the number of cycles doing so. PS: Want an example of how important IO is? Google has their own multi-core friendly network protocol modifications to Linux that have been pushed out in the past 6 months: http://www.h-online.com/open/features/Kernel-Log-Coming-in-2-6-35-Part-3-Network-support-1040736.html They had to do this because single cores can no longer keep up with the network.
Re: threads?
I've done quite a lot of concurrent programming over the past 23ish years, from the implementation of a parallelized version of CLIPS back in the late 80s to many C, Perl, and Python projects involving everything from shared memory to process pooling to every permutation of hard and soft thread management. To say I'm rusty, however, would be an understatement, and I'm sure my information is sorely out of date. What I can contribute to such a conversation, however, is this:
- Make the concept of process and thread an implementation detail rather than separate worlds, and your users won't learn to fear one or the other.
- If the programmer has to think about semaphore management, there's already a problem.
- If the programmer's not allowed to think about semaphore management, there's already a problem.
- Don't paint yourself into a corner when it comes to playing nice with local interfaces.
- If your idea of instantiating a thread involves creating an OS VM, then you're probably lighter weight than Python's threading model, but I'd suggest paring it down some more. It's a thread, not Ringworld (I was going to say not 'space elevator', but it seemed insufficient to the examples I've seen).
I know that's pretty high-level, but it's what I've got. I think I wrote my last threaded application in 2007.
Re: threads?
On Tue, Oct 12, 2010 at 10:22 AM, Damian Conway dam...@conway.org wrote: Perhaps we need to think more Perlishly and reframe the entire question. Not: What threading model do we need?, but: What kinds of non-sequential programming tasks do we want to make easy...and how would we like to be able to specify those tasks? Things that typically precipitate threading in an application: - Blocking IO - Event management (often as a crutch to avoid asynchronous code) - Legitimately parallelizable, intense computing Interestingly, the first two tend to be where most of the need comes from and the last one tends to be what drives most discussion of threading. Perhaps it would make more sense to discuss Perl 6's event model (glib, IMHO, is an excellent role model, here -- http://en.wikipedia.org/wiki/Event_loop#GLib_event_loop ) and async IO model before we deal with how to sort a list on 256 cores...
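[Editorial aside: the "event model and async IO first" design Aaron argues for is the shape that asyncio later took in Python: a single thread running one event loop that multiplexes many concurrent waits. A small hedged sketch, using `asyncio.sleep` as a stand-in for blocking IO:]

```python
# One thread, one event loop, many overlapping waits: the "event model
# first" approach. asyncio.sleep stands in for network or disk IO.
import asyncio

async def fetchish(name: str, delay: float) -> str:
    await asyncio.sleep(delay)     # yields to the loop while "waiting"
    return f"{name} done"

async def main() -> list[str]:
    # Both "requests" run concurrently on a single thread; the loop
    # interleaves the waits instead of blocking on each in turn.
    return list(await asyncio.gather(
        fetchish("a", 0.02), fetchish("b", 0.01)))

results = asyncio.run(main())
print(results)   # ['a done', 'b done']
```

No threads, no locks: concurrency here comes entirely from the event loop, which is exactly why this style fits IO-bound work (the common case in the thread above) but does nothing for CPU-bound sorting on 256 cores.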
Re: threads?
Aaron Sherman wrote: On Tue, Oct 12, 2010 at 10:22 AM, Damian Conway dam...@conway.org wrote: Perhaps we need to think more Perlishly and reframe the entire question. Not: What threading model do we need?, but: What kinds of non-sequential programming tasks do we want to make easy...and how would we like to be able to specify those tasks? Things that typically precipitate threading in an application: - Blocking IO - Event management (often as a crutch to avoid asynchronous code) - Legitimately parallelizable, intense computing Interestingly, the first two tend to be where most of the need comes from and the last one tends to be what drives most discussion of threading. Perhaps it would make more sense to discuss Perl 6's event model (glib, IMHO, is an excellent role model, here -- http://en.wikipedia.org/wiki/Event_loop#GLib_event_loop ) and async IO model before we deal with how to sort a list on 256 cores... The last one in particular would legitimately get attention when one considers that it is for this that the concern about using multi-core machines efficiently comes into play. The first 2 are more about lowering latency and appearing responsive to a user on a single core machine. I think that Perl 6's implicit multi-threading approach such as for hyperops or junctions is a good best first choice to handle many common needs, the last list item above, without users having to think about it. Likewise any pure functional code. -- Darren Duncan
Re: threads?
On Sun, Oct 17, 2010 at 01:18:09AM +0200, Carl Mäsak wrote: Damian (), Matt (): Perhaps we need to think more Perlishly and reframe the entire question. Not: What threading model do we need?, but: What kinds of non-sequential programming tasks do we want to make easy...and how would we like to be able to specify those tasks? I watched a presentation by Guy Steele at the Strange Loop conference on Thursday where he talked about non-sequential programming. One of the interesting things that he mentioned was to use the algebraic properties of an operation to know when a large grouping of operations can be done non-sequentially. For example, we know that the meta reduction operator could take very large lists and split them into smaller lists across all available cores when performing certain operations, like addition and multiplication. If we could mark new operators that we create with this knowledge we could do this for custom operators too. This isn't a new idea, but it seems like it would be a helpful tool in simplifying non-sequential programming and I didn't see it mentioned in this thread yet. This idea seems to be in the air somehow. (Even though all copies of the meme might have its roots in that Guy you mention.) http://irclog.perlgeek.de/perl6/2010-10-15#i_2914961 Perl 6 has all the prerequisites for making this happen. It's mostly a question of marking things up with some trait or other. our multi sub infix:<+>($a, $b) will optimize<associativity> { ... } (User-defined ops can be marked in exactly the same way.) All that's needed after that is a reduce sub that's sensitive to such traits. Oh, and threads. Minimizing the overhead of such a mechanism would be crucial to making it beneficial for use on non-massive data sets. For this kind of thing to work well we'd need to have multiple threads able to work in a single interpreter. Tim.
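[Editorial aside: the property such an "associativity" trait would record is what licenses splitting a reduction into chunks. A hedged Python sketch of the regrouping itself (shown sequentially; each per-chunk reduction is the step that could be farmed out to a core):]

```python
# For an associative operator, a reduction over a list can be split
# into per-chunk reductions whose partial results are then combined,
# in any grouping, without changing the answer. That regrouping is
# what the proposed trait would make safe to do automatically.
from functools import reduce
from operator import add, mul

def chunked_reduce(op, xs, chunks=4):
    # Split xs into roughly equal chunks (each could go to its own
    # core), reduce each chunk independently, then reduce the partials.
    size = max(1, len(xs) // chunks)
    parts = [xs[i:i + size] for i in range(0, len(xs), size)]
    partials = [reduce(op, part) for part in parts]  # parallelizable step
    return reduce(op, partials)

xs = list(range(1, 11))
print(chunked_reduce(add, xs))   # 55, same as reduce(add, xs)
print(chunked_reduce(mul, xs))   # 3628800, i.e. 10!
```

For a non-associative operator (subtraction, say) the regrouping changes the result, which is exactly why the trait has to be opt-in per operator, including user-defined ones.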
RE: Lessons to learn from ithreads (was: threads?)
Date: Tue, 12 Oct 2010 23:46:48 +0100
From: tim.bu...@pobox.com
To: faw...@gmail.com
CC: ben-goldb...@hotmail.com; perl6-language@perl.org
Subject: Lessons to learn from ithreads (was: threads?)
On Tue, Oct 12, 2010 at 03:42:00PM +0200, Leon Timmermans wrote: On Mon, Oct 11, 2010 at 12:32 AM, Ben Goldberg ben-goldb...@hotmail.com wrote: If thread-unsafe subroutines are called, then something like ithreads might be used. For the love of $DEITY, let's please not repeat ithreads! It's worth remembering that ithreads are far superior to the older 5005threads model, where multiple threads ran within a single interpreter. [Shudder] It's also worth remembering that real O/S level threads are needed to work asynchronously with third-party libraries that would block. Database client libraries that don't offer async support are an obvious example. I definitely agree that threads should not be the dominant form of concurrency, and I'm certainly no fan of working with O/S threads. They do, however, have an important role and can't be ignored. So I'd like to use this sub-thread to try to identify what lessons we can learn from ithreads. My initial thoughts are: - Don't clone a live interpreter. Start a new thread with a fresh interpreter. - Don't try to share mutable data or data structures. Use message passing and serialization. Tim. If the starting subroutine for a thread is reentrant, then no message passing is needed, and the only serialization that might be needed is for the initial arguments and for the return values (which will be gotten by the main thread via join). As for starting a new thread in a fresh interpreter, I think that it might be necessary to populate that fresh interpreter with (copies of) data which is reachable from the subroutine that the thread calls... reachability can probably be identified by using the same technique the garbage collector uses. This would provide an effect similar to ithreads, but only copying what's really needed.
To minimize copying, we would only treat things as reachable when we have to -- for example, if there's no eval-string used in a given sub, then the sub only reaches those scopes (lexical and global) which it actually uses, not every scope that it could use.
Re: Lessons to learn from ithreads (was: threads?)
On Thu, Oct 14, 2010 at 11:52:00PM -0400, Benjamin Goldberg wrote: From: tim.bu...@pobox.com So I'd like to use this sub-thread to try to identify what lessons we can learn from ithreads. My initial thoughts are: - Don't clone a live interpreter. Start a new thread with a fresh interpreter. - Don't try to share mutable data or data structures. Use message passing and serialization. If the starting subroutine for a thread is reentrant, then no message passing is needed, and the only serialization that might be needed is for the initial arguments and for the return values (which will be gotten by the main thread via join). As for starting a new thread in a fresh interpreter, I think that it might be necessary to populate that fresh interpreter with (copies of) data which is reachable from the subroutine that the thread calls... reachability can probably be identified by using the same technique the garbage collector uses. This would provide an effect similar to ithreads, but only copying what's really needed. To minimize copying, we would only treat things as reachable when we have to -- for example, if there's no eval-string used in a given sub, then the sub only reaches those scopes (lexical and global) which it actually uses, not every scope that it could use. Starting an empty interpreter, connected to the parent by some 'channels', is simple to understand, implement and test. In contrast, I suspect the kind of partial-cloning you describe above would be complex, hard to implement, hard to test, and fragile to use. It is, for example, more complex than ithreads, so the long history of ithreads bugs should serve as a warning. I'd rather err on the side of simplicity. Tim.
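[Editorial aside: the "fresh interpreter connected to the parent by channels" design Tim favors maps closely onto what Python's multiprocessing module provides: each worker is a brand-new interpreter, and the only sharing is serialized messages over queues. A hedged sketch of that model, not Perl 6 code:]

```python
# The "fresh interpreter + message passing" model: each Process is a
# new interpreter; arguments and results cross over Queues, serialized
# (pickled) rather than shared as mutable data.
from multiprocessing import Process, Queue

def worker(inbox: Queue, outbox: Queue) -> None:
    # Runs in a fresh interpreter; no mutable state from the parent.
    for item in iter(inbox.get, None):   # None is the shutdown message
        outbox.put(item * item)

if __name__ == "__main__":
    inbox, outbox = Queue(), Queue()
    p = Process(target=worker, args=(inbox, outbox))
    p.start()
    for n in (1, 2, 3):
        inbox.put(n)          # serialized on the way in...
    inbox.put(None)
    results = [outbox.get() for _ in range(3)]  # ...and on the way out
    p.join()
    print(results)            # [1, 4, 9]
```

Note the trade-off debated above: this is simple and robust precisely because nothing mutable is shared, at the cost of copying everything that crosses the channel.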
Re: Ruby Fibers (was: threads?)
On Fri, Oct 15, 2010 at 09:57:26AM -0400, Mark J. Reed wrote: On Fri, Oct 15, 2010 at 7:42 AM, Leon Timmermans faw...@gmail.com wrote: Continuations and fibers are incredibly useful and should be easy to implement on parrot/rakudo but they aren't really concurrency. They're a solution to a different problem. I would argue that concurrency isn't a problem to solve; it's one form of solution to the problem of maximizing efficiency. Continuations/fibers and asynchronous event loops are different solutions to the same problem. Pardon my ignorance, but are continuations the same thing as co-routines, or is it more primitive than that? Also, doesn't this really just allow context switching outside of the knowledge of a kernel thread, thus allowing one to implement tasks at the user level? Concurrency can apply to a lot of different things, but the problem is now not only implementing an algorithm concurrently but also using the concurrency available in the hardware efficiently. Brett -- Mark J. Reed markjr...@gmail.com -- B. Estrade estr...@gmail.com
Re: Lessons to learn from ithreads (was: threads?)
Earlier, Leon Timmermans wrote: : * Code sharing is actually quite nice. Loading Moose separately in a : hundred threads is not. This is not trivial though, Perl being so : dynamic. I suspect this is not possible without running into the same : issues as ithreads does. On Fri, Oct 15, 2010 at 01:22:10PM +0200, Leon Timmermans wrote: On Wed, Oct 13, 2010 at 1:13 PM, Tim Bunce tim.bu...@pobox.com wrote: If you wanted to start a hundred threads in a language that has good support for async constructs you're almost certainly using the wrong approach. In the world of perl6 I expect threads to be used rarely and for specific unavoidably-blocking tasks, like db access, and where true concurrency is needed. I agree starting a large number of threads is usually the wrong approach, but at the same time I see more reasons to use threads than just avoiding blocking. We live in a multicore world, and it would be nice if it was easy to actually use those cores. I know people who are deploying to 24 core systems now, and that number will only grow. Processes shouldn't be the only way to utilize that. We certainly need to be able to make good use of multiple cores. As I mentioned earlier, we should aim to be able to reuse shared pages of readonly bytecode and jit-compiled subs. So after a module is loaded into one interpreter it should be much cheaper to load it into others. That's likely to be a much simpler/safer approach than trying to clone interpreters. Another important issue here is portability of concepts across implementations of perl6. I'd guess that starting a thread with a fresh interpreter is likely to be supportable across more implementations than starting a thread with a cloned interpreter. Also, if we do it right, it shouldn't make much difference if the new interpreter is just a new thread or also a new process (perhaps even on a different machine). The IPC should be sufficiently abstracted to just work.
(Adding thread/multiplicity support to NYTProf shouldn't be too hard. I don't have the time/inclination to do it at the moment, but I'll fully support anyone who has.) I hate how you once again make my todo list grow :-p Well volunteered! ;) Tim.
Re: Lessons to learn from ithreads (was: threads?)
On Fri, Oct 15, 2010 at 10:56 AM, Tim Bunce tim.bu...@pobox.com wrote: ... Another important issue here is portability of concepts across implementations of perl6. I'd guess that starting a thread with a fresh interpreter is likely to be supportable across more implementations than starting a thread with cloned interpreter. ... Well volunteered! ;) Tim. Hi, I don't have much to offer on the topic of concurrency, but as someone who is in the process of slowly implementing a native-ish code compiler for Perl 6 (technically probably a compiler to LLVM assembly with the intention of then compiling to native code), I'd like to remind everyone that not every implementation will have an interpreter. I don't think you actually necessarily mean an interpreter here, but rather whatever structure is analogous to that which, in an interpreter, would hold the interpreter's global state. If this is the case, I think it may be helpful to state more precisely what state you think would need to be cloned or recreated between threads or processes and what would not. Also, it is important to consider how different designs will affect the complexity and performance of concurrency primitives for Perl 6 implementations (especially for more common implementation strategies), but neither interpreter nor VM appears in a quick grepping of S17. I don't think that should change. -- Tyler Curtis
Re: threads?
Damian (), Matt (): Perhaps we need to think more Perlishly and reframe the entire question. Not: What threading model do we need?, but: What kinds of non-sequential programming tasks do we want to make easy...and how would we like to be able to specify those tasks? I watched a presentation by Guy Steele at the Strange Loop conference on Thursday where he talked about non-sequential programming. One of the interesting things that he mentioned was to use the algebraic properties of an operation to know when a large grouping of operations can be done non-sequentially. For example, we know that the meta reduction operator could take very large lists and split them into smaller lists across all available cores when performing certain operations, like addition and multiplication. If we could mark new operators that we create with this knowledge we could do this for custom operators too. This isn't a new idea, but it seems like it would be a helpful tool in simplifying non-sequential programming and I didn't see it mentioned in this thread yet. This idea seems to be in the air somehow. (Even though all copies of the meme might have its roots in that Guy you mention.) http://irclog.perlgeek.de/perl6/2010-10-15#i_2914961 Perl 6 has all the prerequisites for making this happen. It's mostly a question of marking things up with some trait or other. our multi sub infix:<+>($a, $b) will optimize<associativity> { ... } (User-defined ops can be marked in exactly the same way.) All that's needed after that is a reduce sub that's sensitive to such traits. Oh, and threads. // Carl
Re: threads?
On Oct 12, 2010, at 9:22 AM, Damian Conway wrote: Perhaps we need to think more Perlishly and reframe the entire question. Not: What threading model do we need?, but: What kinds of non-sequential programming tasks do we want to make easy...and how would we like to be able to specify those tasks? I watched a presentation by Guy Steele at the Strange Loop conference on Thursday where he talked about non-sequential programming. One of the interesting things that he mentioned was to use the algebraic properties of an operation to know when a large grouping of operations can be done non-sequentially. For example, we know that the meta reduction operator could take very large lists and split them into smaller lists across all available cores when performing certain operations, like addition and multiplication. If we could mark new operators that we create with this knowledge we could do this for custom operators too. This isn't a new idea, but it seems like it would be a helpful tool in simplifying non-sequential programming and I didn't see it mentioned in this thread yet. Here are the slides to the talk to which I'm referring: http://strangeloop2010.com/talk/presentation_file/14299/GuySteele-parallel.pdf ~Matt
Re: Lessons to learn from ithreads (was: threads?)
On Wed, Oct 13, 2010 at 1:13 PM, Tim Bunce tim.bu...@pobox.com wrote: If you wanted to start a hundred threads in a language that has good support for async constructs you're almost certainly using the wrong approach. In the world of perl6 I expect threads to be used rarely and for specific unavoidably-blocking tasks, like db access, and where true concurrency is needed. I agree starting a large number of threads is usually the wrong approach, but at the same time I see more reasons to use threads than just avoiding blocking. We live in a multicore world, and it would be nice if it was easy to actually use those cores. I know people who are deploying to 24 core systems now, and that number will only grow. Processes shouldn't be the only way to utilize that. (Adding thread/multiplicity support to NYTProf shouldn't be too hard. I don't have the time/inclination to do it at the moment, but I'll fully support anyone who has.) I hate how you once again make my todo list grow :-p
Re: Ruby Fibers (was: threads?)
On Wed, Oct 13, 2010 at 1:20 AM, Tim Bunce tim.bu...@pobox.com wrote: I've not used them, but Ruby 1.9 Fibers (continuations) and the EventMachine Reactor pattern seem interesting. Continuations and fibers are incredibly useful and should be easy to implement on parrot/rakudo but they aren't really concurrency. They're a solution to a different problem.
Re: Ruby Fibers (was: threads?)
On Fri, Oct 15, 2010 at 7:42 AM, Leon Timmermans faw...@gmail.com wrote: Continuations and fibers are incredibly useful and should be easy to implement on parrot/rakudo but they aren't really concurrency. They're a solution to a different problem. I would argue that concurrency isn't a problem to solve; it's one form of solution to the problem of maximizing efficiency. Continuations/fibers and asynchronous event loops are different solutions to the same problem. -- Mark J. Reed markjr...@gmail.com
Re: Ruby Fibers (was: threads?)
On Fri, Oct 15, 2010 at 01:42:06PM +0200, Leon Timmermans wrote: On Wed, Oct 13, 2010 at 1:20 AM, Tim Bunce tim.bu...@pobox.com wrote: I've not used them, but Ruby 1.9 Fibers (continuations) and the EventMachine Reactor pattern seem interesting. Continuations and fibers are incredibly useful and should be easy to implement on parrot/rakudo Forget Parrot, fibers can be implemented in pure Perl 6.

    module Fibers;

    my @runq;

    sub spawn(&entry) is export {
        push @runq, $( gather entry() );
    }

    sub yield() is export {
        take True;
    }

    sub scheduler() is export {
        while @runq {
            my $task = shift @runq;
            if $task {
                $task.shift;
                push @runq, $task;
            }
        }
    }

-sorear
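[Editorial aside: sorear's gather/take scheduler translates almost line for line into Python generators, which may make the control flow easier to follow for non-Perl readers. A hedged sketch of the same round-robin cooperative scheduler; `fiber` and `log` are illustrative names, not part of the original:]

```python
# A generator-based analogue of the pure-Perl 6 Fibers sketch above:
# spawn() puts a generator (the fiber) on the run queue, each `yield`
# inside a fiber is the cooperative "take True", and scheduler()
# round-robins until every fiber has finished.
runq = []
log = []   # records execution order, to show the interleaving

def spawn(entry):
    runq.append(entry())          # entry must be a generator function

def scheduler():
    while runq:
        task = runq.pop(0)
        try:
            next(task)            # run the fiber to its next yield
            runq.append(task)     # still alive: back of the queue
        except StopIteration:
            pass                  # fiber finished; drop it

def fiber(name, steps):
    def gen():
        for i in range(steps):
            log.append(f"{name}{i}")
            yield                 # cooperative yield point
    return gen

spawn(fiber("a", 2))
spawn(fiber("b", 2))
scheduler()
print(log)   # ['a0', 'b0', 'a1', 'b1'] -- the fibers interleave
```

As Leon notes in the follow-up, this is cooperative multitasking, not concurrency: only one fiber ever runs at a time, and a fiber that never yields starves the rest.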
Ruby Fibers (was: threads?)
On Tue, Oct 12, 2010 at 07:22:33AM -0700, Damian Conway wrote: What we really need is some anecdotal evidence from folks who are actually using threading in real-world situations (in *any* languages). What has worked in practice? What has worked well? What was painful? What was error-prone? And for which kinds of tasks? And we also need to stand back a little further and ask: is threading the right approach at all? Do threads work in *any* language? Are there better metaphors? I've not used them, but Ruby 1.9 Fibers (continuations) and the EventMachine Reactor pattern seem interesting. http://www.igvita.com/2009/05/13/fibers-cooperative-scheduling-in-ruby/ http://www.igvita.com/2010/03/22/untangling-evented-code-with-ruby-fibers/ There's also an *excellent* screencast by Ilya Grigorik. It's from a Ruby/Rails perspective but he gives a good explanation of the issues. He shows how writing async code using callbacks rapidly gets complex and how continuations can be used to avoid that. Well worth a look: http://blog.envylabs.com/2010/07/no-callbacks-no-threads-ruby-1-9/ Tim. p.s. If short on time start at 15:00 and watch to at least 28:00.
Lessons to learn from ithreads (was: threads?)
On Tue, Oct 12, 2010 at 03:42:00PM +0200, Leon Timmermans wrote: On Mon, Oct 11, 2010 at 12:32 AM, Ben Goldberg ben-goldb...@hotmail.com wrote: If thread-unsafe subroutines are called, then something like ithreads might be used. For the love of $DEITY, let's please not repeat ithreads!

It's worth remembering that ithreads are far superior to the older 5005threads model, where multiple threads ran within a single interpreter. [Shudder] It's also worth remembering that real O/S-level threads are needed to work asynchronously with third-party libraries that would block. Database client libraries that don't offer async support are an obvious example.

I definitely agree that threads should not be the dominant form of concurrency, and I'm certainly no fan of working with O/S threads. They do, however, have an important role and can't be ignored. So I'd like to use this sub-thread to try to identify what lessons we can learn from ithreads. My initial thoughts are:

- Don't clone a live interpreter. Start a new thread with a fresh interpreter.
- Don't try to share mutable data or data structures. Use message passing and serialization.

Tim.
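Tim's two lessons (fresh interpreter per thread; no shared mutable data, only serialized messages) can be sketched in miniature. This is a Python illustration of the discipline, not anything from threads::lite; the message format is invented for the example:

```python
# Share-nothing threading: the only channel between threads is a queue of
# serialized (JSON) messages, so neither side ever touches the other's data.
import json
import queue
import threading

inbox = queue.Queue()   # the sole communication channel
results = []            # written only by the worker; read only after join()

def worker():
    while True:
        raw = inbox.get()        # receive a serialized message
        msg = json.loads(raw)    # deserialize into a private copy
        if msg["op"] == "stop":
            return
        results.append(msg["n"] * 2)

t = threading.Thread(target=worker)
t.start()
for n in (1, 2, 3):
    inbox.put(json.dumps({"op": "double", "n": n}))  # serialize on send
inbox.put(json.dumps({"op": "stop"}))
t.join()
# results == [2, 4, 6]
```

The serialization step is exactly where Leon's later point bites: every message is copied and re-parsed, which is safe but must be cheap to scale.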
Re: threads?
I haven't enough smarts to see if this is at all what you're looking for, but it uses some of the same terms: http://dpj.cs.uiuc.edu/DPJ/Home.html?cid=nl_ddjupdate_2010-10-12_html

Welcome to the home page for the Deterministic Parallel Java (DPJ) project at the University of Illinois at Urbana-Champaign. Project Overview: The broad goal of our project is to provide deterministic-by-default semantics for an object-oriented, imperative parallel language, using primarily compile-time checking. "Deterministic" means that the program produces the same visible output for a given input, in all executions. "By default" means that deterministic behavior is guaranteed unless the programmer explicitly requests nondeterminism. This is in contrast to today's shared-memory programming models (e.g., threads and locks), which are inherently nondeterministic and can even have undetected data races. Our paper at HotPar 2009 states our research goals in more detail. The other pages of this site provide additional information about the DPJ type system and language.

a -- Andy Bach Systems Mangler Internet: andy_b...@wiwb.uscourts.gov Voice: (608) 261-5738; Cell: (608) 658-1890 "No, no, you're not thinking, you're just being logical." -Niels Bohr, physicist (1885-1962)
Re: threads?
On Tue, Oct 12, 2010 at 10:43:44PM +0200, Leon Timmermans wrote: On Tue, Oct 12, 2010 at 4:22 PM, Damian Conway dam...@conway.org wrote: The problem is: while most people can agree on what have proved to be unsatisfactory threading models, not many people can seem to agree on what would constitute a satisfactory threading model (or, possibly, models). What we really need is some anecdotal evidence from folks who are actually using threading in real-world situations (in *any* languages). What has worked in practice? What has worked well? What was painful? What was error-prone? And for which kinds of tasks?

Most languages either implement concurrency in a way that's not very useful (CPython, CRuby) or implement it in a way that's slightly (Java/C/C++) to totally (perl 5) insane. Erlang is the only language I've worked with whose threads I really like, but sadly it's rather weak at a lot of other things. In general, I don't feel that a shared memory model is a good fit for a high level language. I'm very much a proponent of message passing. Unlike shared memory, it's actually easier to do the right thing than not. Implementing it correctly and efficiently is not easier than doing a shared memory system though, in my experience (I'm busy implementing it on top of ithreads; yeah, I'm a masochist like that).

And we also need to stand back a little further and ask: is threading the right approach at all? Do threads work in *any* language? Are there better metaphors? Perhaps we need to think more Perlishly and reframe the entire question. Not: What threading model do we need?, but: What kinds of non-sequential programming tasks do we want to make easy...and how would we like to be able to specify those tasks?

I agree. I would prefer implicit over explicit concurrency wherever possible.
I know you're speaking about the Perl interface to concurrency, but you seem to contradict yourself, because message passing is explicit whereas shared memory is implicit - two different models, both of which could be used together to implement a pretty flexible system. It'd be a shame not to provide a way to either use threads directly or to fall back to some implicitly concurrent constructs. Brett -- B. Estrade estr...@gmail.com
Re: threads?
On Tue, Oct 12, 2010 at 07:22:33AM -0700, Damian Conway wrote: Leon Timmermans wrote: For the love of $DEITY, let's please not repeat ithreads! $AMEN! Backwards compatibility is not the major design criterion for Perl 6, so there's no need to recapitulate our own phylogeny here. The problem is: while most people can agree on what have proved to be unsatisfactory threading models, not many people can seem to agree on what would constitute a satisfactory threading model (or, possibly, models). What we really need is some anecdotal evidence from folks who are actually using threading in real-world situations (in *any* languages). What has worked in practice? What has worked well? What was painful? What was error-prone? And for which kinds of tasks? And we also need to stand back a little further and ask: is threading the right approach at all? Do threads work in *any* language? Are there better metaphors?

A more general metaphor would be asynchronous tasking, a thread being a long-running implicit task. Other issues include memory consistency models, tasking granularity, scheduling, and flexible synchronization options. I am coming from the OpenMP world, so a lot of this falls on the shoulders of the runtime - a clear strength of Perl IMHO. It may be worth someone taking the time to read what the OpenMP spec has to say about tasking, as well as exploring tasking support in Chapel, Fortress, X10, and Cilk. PGAS-based languages may also offer some inspiration as a potential alternative to threads or tasks. The only scripting language I know of that supports threading natively is Qore. I've mentioned this before. Perl's functional aspects also make it fairly easy to create concurrency without the worry of side effects, but not everyone is lucky enough to have a loosely coupled problem or to not need I/O. Now, how to distill what's been learned in practice into a Perlish approach?

Perhaps we need to think more Perlishly and reframe the entire question.
Not: What threading model do we need?, but: What kinds of non-sequential programming tasks do we want to make easy...and how would we like to be able to specify those tasks? There are something like 12 HPC domains that have been identified, all needing something a little different from the compiler, runtime, and platform - these do not include things for which Perl is often (ab)used. As someone who doesn't (need to) use threading to solve the kinds of problems I work on, I'm well aware that I'm not the right person to help in this design work. We need those poor souls who already suffer under threads to share their tales of constant misery (and their occasional moments of triumph) so we can identify successful patterns of use and steal^Wborg^Wborrow the very best available solutions. Are you sure you couldn't use threading over shared memory? :) Cheers, Brett Damian -- B. Estrade estr...@gmail.com
Re: Lessons to learn from ithreads (was: threads?)
On Wed, Oct 13, 2010 at 04:00:02AM +0200, Leon Timmermans wrote: On Wed, Oct 13, 2010 at 12:46 AM, Tim Bunce tim.bu...@pobox.com wrote: So I'd like to use this sub-thread to try to identify what lessons we can learn from ithreads. My initial thoughts are: - Don't clone a live interpreter. Start a new thread with a fresh interpreter. - Don't try to share mutable data or data structures. Use message passing and serialization.

Actually, that sounds *exactly* like what I have been trying to implement for perl 5 based on ithreads (threads::lite; it's still in a fairly early state though). My experience with it so far taught me that:

* Serialization must be cheap for this to scale. For threads::lite this turns out to be the main performance bottleneck. Erlang gets away with this because it's purely functional and thus doesn't need to serialize between local threads; maybe we could do something similar with immutable objects. Here micro-optimizations are going to pay off.

Being able to optionally define objects as structures in contiguous memory could be a useful optimization, both for serialization and for general cache-friendly CPU performance. Just a thought.

* Code sharing is actually quite nice. Loading Moose separately in a hundred threads is not. This is not trivial though, Perl being so dynamic. I suspect this is not possible without running into the same issues as ithreads does.

If you wanted to start a hundred threads in a language that has good support for async constructs you're almost certainly using the wrong approach. In the world of perl6 I expect threads to be used rarely, for specific unavoidably-blocking tasks like db access, and where true concurrency is needed. Also, it should be possible to share read-only bytecode and perhaps read-only jit'd executable pages, to avoid the full cost of reloading modules.

* Creating a thread (/interpreter) should be as cheap as possible, both in CPU time and in memory.
Creating an ithread is relatively expensive, especially memory-wise. You can't realistically create a very large number of them the way you can in Erlang.

Erlang has just the one concurrency mechanism, the thread, so you need to create lots of them (which Erlang makes very cheap). We're looking at two concurrency mechanisms for perl6: heavy-weight O/S thread+interpreter pairs (as above), and lightweight async behaviours within a single interpreter (e.g. continuations/fibers). Those lightweight mechanisms are most like Erlang threads. They'll be cheap and plentiful, so there'll be far less need to start O/S threads.

Tim.

Leon (well, actually I learned a lot more, like about non-deterministic unit tests and profilers that don't like threads, but that's an entirely different story)

(Adding thread/multiplicity support to NYTProf shouldn't be too hard. I don't have the time/inclination to do it at the moment, but I'll fully support anyone who has.)
Re: threads?
On Tue, Oct 12, 2010 at 02:31:26PM +0200, Carl Mäsak wrote: Ben (): If perl6 can statically (at compile time) analyse subroutines and methods and determine if they're reentrant, then it could automatically use the lightest weight threads when it knows that the entry sub won't have side effects or alter global data. I'm often at the receiving end of this kind of reply, but... ...to a first approximation, I don't believe such analysis to be possible in Perl 6. Finding out whether something won't have side effects is tricky at best, squeezed in as we are between eval, exuberant dynamism, and the Halting Problem.

If one knows which variables are shared, some degree of side-effect potential can be determined. But yes, in general, a tough problem.

Brett // Carl -- B. Estrade estr...@gmail.com
Re: threads? - better metaphors
On 2010-Oct-12, at 10:22, Damian Conway wrote: What we really need is some anecdotal evidence from folks who are actually using threading in real-world situations (in *any* languages). What has worked in practice? What has worked well? What was painful? What was error-prone? And for which kinds of tasks? And we also need to stand back a little further and ask: is threading the right approach at all? Do threads work in *any* language? Are there better metaphors?

"Channels are a good model of the external world" - Russ Cox, Threads without Locks, slide 39

Perhaps the work on the 'channel' model done in Plan9 (and Inferno) will be helpful. It has many years of experience in publicly available code, libraries, discussion archives, and verification tools. See particularly the work of Russ Cox http://swtch.com/~rsc/ and especially Threads without Locks, Bell Labs, Second International Plan 9 Workshop, December 2007: http://swtch.com/~rsc/talks/threads07/

This talk has a nice crisp overview of the issues in different models and mentions several real-world applications: a concurrent prime sieve (by McIlroy), a file system indexer implementation, publish and subscribe, re-entrant IO multiplexing, and window systems. http://swtch.com/~rsc/thread/cws.pdf -- amazing stuff! http://video.google.com/videoplay?docid=810232012617965344 -- the classic 'Squinting at Power Series' - and several others (see slide 31): http://swtch.com/~rsc/thread/squint.pdf (this could be an excellent test suite for any 'threading' implementation). The model is extended in the work of PlanB: http://lsub.org/ls/planb.html http://lsub.org/index.html#demos It is available on many OSs in the port of Plan9 to user space http://swtch.com/plan9port/ and in the C-based libthread, which builds multiple-reader, multiple-writer finite queues.

There is a lot to like and borrow from Plan9, including the 9P2000 protocol as a core organizing meme: http://9p.cat-v.org/faq The 'Spin' verification tool and its history are *very* interesting also: http://swtch.com/spin/ Note that many of the people doing 'go' were the ones that did Plan9.

Regards, Todd Olson

PS I'd really like to have their channel model available in Perl6. Many things I'd like to model would work well with channels. I have (unpublished) Perlish syntax to lay over channels.

PPS Russ has also done some nice work on regular expression engines: http://swtch.com/~rsc/regexp/
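McIlroy's concurrent prime sieve mentioned above is small enough to sketch. This is a Python illustration (not the Plan9/libthread code), with channels modelled as bounded blocking queues and one filter thread per prime found:

```python
# McIlroy-style concurrent prime sieve: a counter thread feeds a chain of
# filter threads, each of which removes multiples of one prime.
import queue
import threading

def counter(out):
    n = 2
    while True:
        out.put(n)               # blocks until the next stage is ready
        n += 1

def filter_multiples(prime, inp, out):
    while True:
        n = inp.get()
        if n % prime != 0:       # pass along only non-multiples
            out.put(n)

def sieve(k):
    """Return the first k primes by growing the filter chain."""
    primes = []
    ch = queue.Queue(maxsize=1)  # a channel: bounded, blocking queue
    threading.Thread(target=counter, args=(ch,), daemon=True).start()
    for _ in range(k):
        p = ch.get()             # whatever survives all filters is prime
        primes.append(p)
        nxt = queue.Queue(maxsize=1)
        threading.Thread(target=filter_multiples, args=(p, ch, nxt),
                         daemon=True).start()
        ch = nxt                 # subsequent reads come from the new filter
    return primes

first_five = sieve(5)            # [2, 3, 5, 7, 11]
```

The daemon threads stay blocked on their queues and are reaped at interpreter exit; a production version would need explicit shutdown, but the point is how naturally the algorithm decomposes into communicating stages.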
threads?
Has there been any decision yet over what model(s) of threads perl6 will support? Will they be POSIX-like? ithread-like? green-thread-like?

It is my hope that more than one model will be supported... something that would allow the most lightweight threads possible to be used where possible, ithread-like behavior for backwards compatibility, and perhaps something in between for where the lightest threads won't work but ithreads are too slow.

If perl6 can statically (at compile time) analyse subroutines and methods and determine if they're reentrant, then it could automatically use the lightest-weight threads when it knows that the entry sub won't have side effects or alter global data. If an otherwise-reentrant subroutine calls other subs which have been labelled by their authors as thread-safe, then that top subroutine can also be assumed to be thread-safe. This would be when the intermediate-weight threads might be used. If thread-unsafe subroutines are called, then something like ithreads might be used.

To allow the programmer to force perl6 to use lighter threads than it would choose by static analysis, he should be able to declare methods, subs, and blocks to be reentrant or thread-safe, even if they don't look that way to the compiler. Of course, he would be doing so at his own risk, but he should be allowed to do it (maybe with a warning).
Re: threads?
On Mon, Oct 11, 2010 at 12:32 AM, Ben Goldberg ben-goldb...@hotmail.com wrote: If thread-unsafe subroutines are called, then something like ithreads might be used. For the love of $DEITY, let's please not repeat ithreads!
Re: threads?
Leon Timmermans wrote: For the love of $DEITY, let's please not repeat ithreads! $AMEN! Backwards compatibility is not the major design criterion for Perl 6, so there's no need to recapitulate our own phylogeny here.

The problem is: while most people can agree on what have proved to be unsatisfactory threading models, not many people can seem to agree on what would constitute a satisfactory threading model (or, possibly, models). What we really need is some anecdotal evidence from folks who are actually using threading in real-world situations (in *any* languages). What has worked in practice? What has worked well? What was painful? What was error-prone? And for which kinds of tasks?

And we also need to stand back a little further and ask: is threading the right approach at all? Do threads work in *any* language? Are there better metaphors? Perhaps we need to think more Perlishly and reframe the entire question. Not: What threading model do we need?, but: What kinds of non-sequential programming tasks do we want to make easy...and how would we like to be able to specify those tasks?

As someone who doesn't (need to) use threading to solve the kinds of problems I work on, I'm well aware that I'm not the right person to help in this design work. We need those poor souls who already suffer under threads to share their tales of constant misery (and their occasional moments of triumph) so we can identify successful patterns of use and steal^Wborg^Wborrow the very best available solutions.

Damian
RE: threads?
Although anecdotal, I've heard good things about Go's channel mechanism as a simple lightweight concurrency model and a good alternative to typical threading. Channels are first-class in the language and leverage simple goroutine semantics to invoke concurrency. --- Phil

-Original Message- From: thoughtstr...@gmail.com [mailto:thoughtstr...@gmail.com] On Behalf Of Damian Conway Sent: October 12, 2010 10:23 AM To: perl6-language@perl.org Subject: Re: threads?
Re: threads?
Damian, I use threads in C++ a lot in my day to day job. We use an in-house library which isn't much more than a thread class which you inherit from and give a Run method to, and a load of locks of various (sometimes ill-defined) kinds. Let me say: it's not good. Threads with semaphores and mutexes and all that are just horrible, horrible things. It's probably not helped at all by how C++ itself has no awareness at all of the threading, so there are no hints in the code that something runs in a particular thread, you can't put lock preconditions on functions or data structures or anything like that... I'm not sure what a better model is, but what I'd like to see is something which: - can enforce that certain bits of data are only accessed if you have certain locks, at compile time - can enforce that certain bits of code can only be run when you have certain locks, at compile time - can know that you shouldn't take lock B before lock A if you want to avoid a deadlock - uses a completely different model that nobody's probably thought of yet where none of this matters because all those three things are utterly foul I always liked Software Transactional Memory, which works very nicely in Haskell - but not for all solutions. Whatever concurrency model Perl 6 might support, it's probably going to need more than one of them. Since the language is so extensible, it may be that the core should only implement the very basic primitives, and then there are libraries which provide the rest - some of which might ship alongside the compiler. I don't know, but I do not want people to end up having to count semaphores and verify locking integrity by eye because it's really, truly horrible. I did read a bit about Go's mechanism, and it did look interesting. Some systems are very well-modelled as completely independent processes (which might be threads) throwing messages at each other... 
Actually something that's very nice as a mental model for server-type systems is a core routine which responds to a trigger (say, a new connection) by spawning a new thread to handle it, which is the only thing which handles it, and maybe uses something like channels to interact with any global data store that's required. For that though you need cheap thread creation or easy thread pool stuff, and you need to have a global data model which isn't going to completely bottleneck your performance.

I'm totally rambling now, but I do get the distinct impression from all my experience that safe concurrency is very difficult to do quickly in the general case. Of course, the safest concurrency boils down to sequencing everything and running it all on one core...

On 12 October 2010 16:25, philippe.beauch...@bell.ca wrote: Although anecdotal, I've heard good things about Go's channel mechanism as a simple lightweight concurrency model and a good alternative to typical threading. Channels are first-class in the language and leverage simple goroutine semantics to invoke concurrency. --- Phil
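One of the wishes earlier in this message - knowing "that you shouldn't take lock B before lock A" - has a conventional runtime mitigation: impose a single global ordering on locks and always acquire in that order. A minimal Python sketch (the names `LOCK_ORDER` and `acquire_in_order` are invented for the example):

```python
# Deadlock avoidance by lock ordering: any subset of locks is always
# acquired in one agreed global order, so no cyclic wait can form.
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
LOCK_ORDER = [lock_a, lock_b]          # the one agreed ordering

def acquire_in_order(*locks):
    for lock in sorted(locks, key=LOCK_ORDER.index):
        lock.acquire()

def release_all(*locks):
    for lock in locks:
        lock.release()

counter = 0

def worker(locks):
    global counter
    for _ in range(1000):
        acquire_in_order(*locks)       # never B-then-A, whatever the caller says
        counter += 1
        release_all(*locks)

# These two threads request the locks in opposite orders - the classic
# deadlock setup - yet run safely because acquisition is reordered.
t1 = threading.Thread(target=worker, args=((lock_a, lock_b),))
t2 = threading.Thread(target=worker, args=((lock_b, lock_a),))
t1.start(); t2.start(); t1.join(); t2.join()
# counter == 2000
```

The wish in the message is stronger: to have this checked at compile time rather than enforced by convention, which is exactly what C++ of the day could not express.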
Re: threads?
When Larry decided that Perl 6 would incorporate concepts from prototype-based objects, he did so at least in part because it's more intuitive for people to work with, e.g., a cow than it is to try to work with the concept of a cow as a thing unto itself. In a similar way, I think that Perl's dominant concurrency system ought to be of a type that people who aren't computer scientists can grok, at least well enough to do something useful without first having to delve into the arcane depths of computing theory. As such, I'm wondering if an Actor-based concurrency model[1] might be a better way to go than the current threads-based mindset. Certainly, it's often easier to think of actors who talk to each other to get things done than it is to think of processes (or threads) as things unto themselves. [1] http://en.wikipedia.org/wiki/Actor_model -- Jonathan Dataweaver Lang
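The actor intuition above is concrete enough to sketch: an actor is private state plus a mailbox plus one thread that drains it, so no locks are needed around the state. A minimal Python illustration (the `Actor` class and its message format are invented for the example):

```python
# A minimal actor: all state is touched only by the actor's own thread,
# and the outside world interacts with it solely by sending messages.
import queue
import threading

class Actor:
    def __init__(self):
        self.mailbox = queue.Queue()
        self.total = 0                 # private state; no lock required
        self.done = threading.Event()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            msg = self.mailbox.get()   # process one message at a time
            if msg == "stop":
                self.done.set()
                return
            self.total += msg          # react; only this thread mutates state

    def send(self, msg):
        self.mailbox.put(msg)          # the only public operation

a = Actor()
for n in (1, 2, 3):
    a.send(n)
a.send("stop")
a.done.wait()                          # synchronize before reading the result
# a.total == 6
```

The appeal for non-specialists is exactly as stated: "actors who talk to each other" needs no mental model of interleaved memory accesses.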
Re: threads?
On Tue, Oct 12, 2010 at 4:22 PM, Damian Conway dam...@conway.org wrote: The problem is: while most people can agree on what have proved to be unsatisfactory threading models, not many people can seem to agree on what would constitute a satisfactory threading model (or, possibly, models). What we really need is some anecdotal evidence from folks who are actually using threading in real-world situations (in *any* languages). What has worked in practice? What has worked well? What was painful? What was error-prone? And for which kinds of tasks?

Most languages either implement concurrency in a way that's not very useful (CPython, CRuby) or implement it in a way that's slightly (Java/C/C++) to totally (perl 5) insane. Erlang is the only language I've worked with whose threads I really like, but sadly it's rather weak at a lot of other things. In general, I don't feel that a shared memory model is a good fit for a high level language. I'm very much a proponent of message passing. Unlike shared memory, it's actually easier to do the right thing than not. Implementing it correctly and efficiently is not easier than doing a shared memory system though, in my experience (I'm busy implementing it on top of ithreads; yeah, I'm a masochist like that).

And we also need to stand back a little further and ask: is threading the right approach at all? Do threads work in *any* language? Are there better metaphors? Perhaps we need to think more Perlishly and reframe the entire question. Not: What threading model do we need?, but: What kinds of non-sequential programming tasks do we want to make easy...and how would we like to be able to specify those tasks?

I agree. I would prefer implicit over explicit concurrency wherever possible.
Re: threads?
On Tue, Oct 12, 2010 at 10:28 PM, B. Estrade estr...@gmail.com wrote: I agree. I would prefer implicit over explicit concurrency wherever possible. I know you're speaking about the Perl interface to concurrency, but you seem to contradict yourself because message passing is explicit whereas shared memory is implicit - two different models, both of which could be used together to implement a pretty flexible system.

By implicit I mean stuff like concurrent hyperoperators and junctions. Shared-memory systems are explicitly concurrent to me, because you have to either explicitly lock or explicitly do a transaction.

It'd be a shame to not provide a way to both use threads directly or to fall back to some implicitly concurrent constructs.

I agree
Re: threads?
Damian Conway wrote: Perhaps we need to think more Perlishly and reframe the entire question. Not: What threading model do we need?, but: What kinds of non-sequential programming tasks do we want to make easy...and how would we like to be able to specify those tasks? The mindset that I use goes something like most tasks are potentially concurrent: sequentialization is an optimization that most people perform without even thinking. Generally, I would split concurrency into producer-consumer (i.e. message passing) and stream-processing (for hyper and reduction operators -- possibly also for feeds, with a kernel per step). When dealing with compute-tasks, you're basically just choosing how to map a dependency graph to the available compute resources. When dealing with external resources (e.g. sockets, GUI) then explicit parallelism (via message passing) becomes useful. P6 already specifies a whole bunch of non-sequential tasks (hypers, reductions, feeds, background-lazy lists), so no need to reframe the entire question just yet. Implementing the existing concurrency will flush out plenty of flaws in the specs.
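The "stream-processing" category above maps naturally onto a parallel map, which is roughly the contract a hyperoperator offers: a result of the same shape as the input, with no promise about when each element is computed. A Python sketch of that contract (illustrative; Perl 6's `»op«` would express the same thing directly):

```python
# Data-parallel map: the pool may compute elements in any order and on any
# thread, but the output preserves the input's shape and element order.
from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

with ThreadPoolExecutor(max_workers=4) as pool:
    out = list(pool.map(square, [1, 2, 3, 4]))
# out == [1, 4, 9, 16]
```

This is the "implicit concurrency" side of the thread: the programmer states a dependency-free mapping and the runtime chooses how to schedule it across compute resources.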
Re: threads?
I agree that threads are generally a difficult issue to cope with. What is worse, there are a lot of Java developers who tell us that it is not difficult for them, but in the end the software fails on the production system, for example because the load is different than on the test system, causing different threads to be slowed down to different extents, etc. So people who are having difficulties with multithreading still use threads a lot and don't admit the difficulties, which might not even appear during testing... Even though I did see software that heavily uses multithreading and works well.

On the other hand, I think that there are certain tasks that need some kind of parallelism, either for making use of parallel CPU infrastructure or for implementing patterns that can more easily be expressed using something like multithreading. I think that the approach of running several processes instead of several threads is something that can be considered in some cases, but it does come with a performance price tag that might not be justified in all situations.

Maybe the actor model from Scala is worth looking at; at least the Scala guys claim that it solves the issue, but I don't know if that concept can easily be adapted for Perl 6.

Best regards, Karl
Re: Lessons to learn from ithreads (was: threads?)
On Wed, Oct 13, 2010 at 12:46 AM, Tim Bunce tim.bu...@pobox.com wrote: So I'd like to use this sub-thread to try to identify what lessons we can learn from ithreads. My initial thoughts are: - Don't clone a live interpreter. Start a new thread with a fresh interpreter. - Don't try to share mutable data or data structures. Use message passing and serialization.

Actually, that sounds *exactly* like what I have been trying to implement for perl 5 based on ithreads (threads::lite; it's still in a fairly early state though). My experience with it so far taught me that:

* Serialization must be cheap for this to scale. For threads::lite this turns out to be the main performance bottleneck. Erlang gets away with this because it's purely functional and thus doesn't need to serialize between local threads; maybe we could do something similar with immutable objects. Here micro-optimizations are going to pay off.

* Code sharing is actually quite nice. Loading Moose separately in a hundred threads is not. This is not trivial though, Perl being so dynamic. I suspect this is not possible without running into the same issues as ithreads does.

* Creating a thread (/interpreter) should be as cheap as possible, both in CPU time and in memory. Creating an ithread is relatively expensive, especially memory-wise. You can't realistically create a very large number of them the way you can in Erlang.

Leon (well, actually I learned a lot more, like about non-deterministic unit tests and profilers that don't like threads, but that's an entirely different story)
Re: Gather/Take and threads
On Wed, Dec 06, 2006 at 07:44:39PM -0500, Joe Gottman wrote: : Suppose I have a gather block that spawns several threads, each of which : calls take several times. Obviously, the relative order of items returned : from take in different threads is indeterminate, but is it at least : guaranteed that no object returned from take is lost? Currently gather/take is defined over a dynamic scope, and I think that a different thread is a different dynamic scope (after all, why does it have its own call stack?), so by default you get nothing from another thread, and the other thread would get a "take outside of gather" error. You'd have to set up some kind of queue from the other thread and take the output of that explicitly. The microthreading done by hyperops perhaps doesn't count, though. In playing around the other day with a list flattener without explicit recursion, I found that pugs currently returns hyper-takes in random order: pugs> say ~gather { [1,2,[3,4],5]».take } 1 2 4 5 3 pugs> say ~gather { [1,2,[3,4],5]».take } 2 1 4 5 3 The random order is according to spec, assuming take is allowed at all. But perhaps it shouldn't be allowed, since the threaded side of a hyperop could conceivably become as elaborate as a real thread, and maybe we should not make a distinction between micro and macro threads. And it's not like a take can guarantee the order anyway--the hyperop is merely required to return a structure of the same shape, but that does not enforce any order on the actual take calls (as shown above). Only the return values of take end up in the same structure, which is then thrown away. In fact, I'd argue that the value returned by take is exactly the value passed to it, but I don't think that's specced yet. A take is just a side effect performed en passant under this view. Then we could write things like: while take foo() {...} @thisbatch = take bar() {...} Larry
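Larry's suggestion -- set up a queue from the other thread and take its output explicitly -- answers Joe's lost-item worry directly: a queue makes delivery explicit, so nothing is dropped even though cross-thread order stays indeterminate. A minimal sketch in Python, where `producer` and the sentinel protocol are illustrative choices, not spec:

```python
import queue
import threading

def producer(take, items):
    # This thread's stand-in for calling take() inside the gather's scope.
    for x in items:
        take(x)

q = queue.Queue()
threads = [threading.Thread(target=producer, args=(q.put, chunk))
           for chunk in ([1, 2], [3, 4], [5])]
for t in threads:
    t.start()
for t in threads:
    t.join()            # the gather's dynamic scope "collects" its subthreads
q.put(None)             # sentinel: no more takes are coming

gathered = []
while (item := q.get()) is not None:
    gathered.append(item)

# Relative order across threads is indeterminate, but no item is lost.
print(sorted(gathered))
```

Joining every producer before enqueuing the sentinel is the queue version of "collect all subthreads before terminating the lazy list" discussed in the follow-ups.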
Re: Gather/Take and threads
Of course, it's also possible that the flipside is true--that gather/take is just another normal way to set up interthread queueing, if the thread is spawned in the dynamic scope of the gather. Under that view all the subthreads share the outer dynamic scope. Maybe that's saner... Larry
Re: Gather/Take and threads
On Wed, Dec 13, 2006 at 11:01:10AM -0800, Larry Wall wrote: : Of course, it's also possible that the flipside is true--that : gather/take is just another normal way to set up interthread queueing, : if the thread is spawned in the dynamic scope of the gather. : Under that view all the subthreads share the outer dynamic scope. : Maybe that's saner... And a subdivision of that view is whether subthreads are naturally collected at the end of the dynamic scope that spawned them, or whether they are considered independent unless some dynamic scope claims them all for collection. In that case a gather would be one way (the normal way?) to require termination of all subthreads before terminating its lazy list. Larry
Gather/Take and threads
Suppose I have a gather block that spawns several threads, each of which calls take several times. Obviously, the relative order of items returned from take in different threads is indeterminate, but is it at least guaranteed that no object returned from take is lost? Joe Gottman
Threads, magic?
How does one do this: http://www.davidnaylor.co.uk/archives/2006/10/19/threaded-data-collection-with-python-including-examples/ in perl 6? Assuming get_feed_list, get_feed_contents, parse_feed, and store_feed_items are handled by modules like LWP and XML::Parser. Will there be something native and magic like: @feeds = threaditize( Parallelize => fetch_feeds, Parameters => @urls, Thread_Limit => 20, Timeout => 5 sec ); with @urls being a list of urls, or a list of lists (each sublist would contain the parameters for fetch_feeds). -- Relipuj, just curious.
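The shape being asked for -- map a function over a list of URLs with a concurrency limit and a timeout -- is what a bounded thread pool provides. A sketch using Python's standard pool, where `fetch_feed` is a placeholder for the real get_feed_contents/parse_feed work (no network calls here):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_feed(url):
    # Placeholder for get_feed_contents + parse_feed on a real feed.
    return "parsed:" + url

urls = ["http://a.example/rss", "http://b.example/rss", "http://c.example/rss"]

results = []
with ThreadPoolExecutor(max_workers=20) as pool:     # Thread_Limit => 20
    futures = [pool.submit(fetch_feed, u) for u in urls]
    for fut in as_completed(futures, timeout=5):     # Timeout => 5 sec
        results.append(fut.result())

print(sorted(results))
```

Completion order is whatever the threads happen to produce, so results arrive unordered -- the same indeterminacy discussed in the gather/take thread above.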
Threads and types
What happens to a program that creates a thread with a shared variable between it and the parent, and then the parent modifies the class from which the variable derives? Does the shared variable pick up the type change? Does the thread see this change?
Re: Threads and Progress Monitors
--- Dave Whipp [EMAIL PROTECTED] wrote: Dulcimer wrote: sub slow_fn { my $tick = Timer.new(60, { print "..." }); return slow_fn_imp @_; } Now if I could just get the compiler to not complain about that unused variable... Maybe I'm being dense, but why not just sub slow_fn { Timer.new(1, { print "." }); return slow_fn_imp @_; } Geez. I read my response this morning, which I wrote just before going to bed, and realized that I must've been dead on my feet. The problem is that I want the timer to last for the duration of the slow_fn_imp. If I don't assign it to a variable, then it may be GCed at any time. I was making several assumptions which don't hold, apparently, such as that the underlying Timer would iterate until stopped. Not an ideal default, lol. I thought the point was to have the function print dots repeatedly, tho? I've just realised, however, that I'm relying on it being destroyed on leaving the scope. I'm not sure that the GC guarantees that. I might need sub slow_fn { my $timer is last { .stop } = Timer.new(60, { print "." }); return slow_fn_imp @_; } but that's starting to get cluttered again. I don't really consider that clutter. It's clear and to the point, and Does What You Want. How about sub slow_fn { my $timer is last { .stop } = new Timer secs => 1, reset => 1, code => {print "."}; return slow_fn_imp @_; } so that the timer goes off after a second, prints a dot, and resets itself to go off again after another second? And I still like the idea of an expanding temporal window between dots: sub slow_fn { my $pause = 1; my $timer is last { .stop } = new Timer secs => $pause++, reset => {$pause++}, code => {print "."}; return slow_fn_imp @_; } As a sidenote, although it would actually reduce readability here, I'm still trying to wrap my brain thoroughly around the new dynamics of $_. Would this work correctly, maybe?
sub slow_fn { my $timer is last { .stop } = new Timer secs = $_=1, reset = {$_++}, code = {print .}; return slow_fn_imp @_; } Isn't that $_ proprietary to slow_fn such that it *would* work? __ Do you Yahoo!? Yahoo! Calendar - Free online calendar with sync to Outlook(TM). http://calendar.yahoo.com
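The pattern being circled here -- a repeating timer that fires while slow_fn_imp runs and is reliably stopped when the enclosing scope exits -- can be sketched with a plain OS thread. `DotTimer` is a hypothetical stand-in for the speculative Timer class, and the try/finally plays the role of `is last { .stop }` (explicit cleanup rather than trusting GC timing, which was exactly the worry above):

```python
import threading
import time

class DotTimer:
    """Repeating timer: run `code` every `secs` seconds until stopped."""
    def __init__(self, secs, code):
        self._secs, self._code = secs, code
        self._stop = threading.Event()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # Event.wait doubles as an interruptible sleep: it returns True
        # (ending the loop) as soon as stop() is called.
        while not self._stop.wait(self._secs):
            self._code()

    def stop(self):
        self._stop.set()

ticks = []

def slow_fn():
    timer = DotTimer(0.02, lambda: ticks.append("."))
    try:
        time.sleep(0.2)       # stands in for slow_fn_imp's work
        return "done"
    finally:
        timer.stop()          # the `is last { .stop }` cleanup, spelled out

result = slow_fn()
print(result, len(ticks))
```

Tying the stop to scope exit deterministically (rather than to garbage collection) is the design point Dave raises: GC gives no guarantee about when the timer object dies.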
Timers (was Threads and Progress Monitors)
Dulcimer [EMAIL PROTECTED] wrote so that the timer goes off after a second, prints a dot, and resets itself to go off again after another second? And I still like the idea of an expanding temporal window between dots: sub slow_fn { my $pause = 1; my $timer is last { .stop } = new Timer secs => $pause++, reset => {$pause++}, code => {print "."}; return slow_fn_imp @_; } I'm thinking there's a way to avoid the $pause variable: sub slow_fn { my $tmp = new Timer( secs => 1, code => { print "." and .reset(.count+1) }); return slow_fn_imp @_; } But exposing the object like that still bothers me: I shouldn't need the $tmp, nor the .new. When someone writes the Std::Timer module, we can add a macro to it such that: sub slow_fn { timeout(1) { print "." and .reset(.count+1) }; return slow_fn_imp @_; } I think the implementation is obvious, given the previous example of the inline code. Though s/timeout/???/. A semantic question: what output would you expect for this: sub silly { timeout(5) { print ".HERE."; sleep 4; print ".THERE." }; for 1..5 -> $count { sleep 2; print $count }; } possible answers are 12.HERE.34.THERE.5 or 12.HERE..THERE.345 I'm thinking probably the latter, because it's easier to launch a thread in the codeblock than to un-launch it. As a sidenote, although it would actually reduce readability here, I'm still trying to wrap my brain thoroughly around the new dynamics of $_. Would this work correctly, maybe? sub slow_fn { my $timer is last { .stop } = new Timer secs => $_=1, reset => {$_++}, code => {print "."}; return slow_fn_imp @_; } Isn't that $_ proprietary to slow_fn such that it *would* work? I had to stare at it for a few moments, but yes: I think it should work (if we define a .reset attribute that accepts a codeblock). Dave.
Re: Timers (was Threads and Progress Monitors)
sub slow_fn { my $pause = 1; my $timer is last { .stop } = new Timer secs = $pause++, reset = {$pause++}, code = {print .}; return slow_fn_imp @_; } I'm thinking there's a way to avoid the $pause variable: sub slow_fn { my $tmp = new Timer( secs=1, code = { print . and .reset(.count+1) }); return slow_fn_imp @_; } But exposing the object like that still bothers be: I shouldn't need the $tmp, nor the .new. I'm not so sure I agree with losing the new(). I kinda like that just for readability. Less isn't always more. :) Ok, how about this: sub slow_fn { temp _.timer is last { .stop } = new Timer ( secs = 1, code = { .reset += print . } ); return slow_fn_imp @_; } That's only superficially different, but is a little more aesthetically satisfying, somehow. Then again, I'm a perverse bastard, lol... :) On the other hand, if this threads, does each call to slow_fn() get a unique _, or did I just completely hose the whole process? Could I say my temp _.timer is last { .stop } = new Timer ( ... ); # ? or is it even necessary with temp? When someone writes the Std::Timer module, we can add a macro to it such that: sub slow_fn { timeout(1) { print . and .reset(.count+1) }; return slow_fn_imp @_; } Dunno -- I see what you're doing, but it's a little *too* helpful. I'd rather see a few more of the engine parts on this one. I think the implementation is obvious, given the previous example of the inline code. Though s/timeout/???/. alarm? trigger(), maybe? A semantic question: what output would you expect for this: sub silly { timeout(5) { print .HERE.; sleep 4; print .THERE. }; for 1..5 - $count { sleep 2; print $count }; } possible answers are 12.HERE.34.THERE.5 or 12.HERE..THERE.345 I'm thinking probably the latter, because its easier to launch a thread in the codeblock than to un-launch it. un-launch? If they're threaded, aren't they running asynchronously? I see 12.HERE.34.THERE.5 as the interleaved output. I have no idea what you mean by un-launch, sorry. 
As a sidenote, although it would actually reduce readability here, I'm still trying to wrap my brain thoroughly around the new dynamics of $_. Would this work correctly, maybe? sub slow_fn { my $timer is last { .stop } = new Timer secs => $_=1, reset => {$_++}, code => {print "."}; return slow_fn_imp @_; } Isn't that $_ proprietary to slow_fn such that it *would* work? I had to stare at it for a few moments, but yes: I think it should work (if we define a .reset attribute that accepts a codeblock). lol -- I was assuming we'd have to make reset accept codeblocks, and yes, I'd expect you to have to stare a bit. It's ugly, and I'd rather create a new variable than do this, tho we've seen that you don't need to.
Re: Timers (was Threads and Progress Monitors)
Dulcimer [EMAIL PROTECTED] wrote But exposing the object like that still bothers me: I shouldn't need the $tmp, nor the .new. I'm not so sure I agree with losing the new(). I kinda like that just for readability. Less isn't always more. :) Ok, how about this: sub slow_fn { temp _.timer is last { .stop } = new Timer ( secs => 1, code => { .reset += print "." } ); return slow_fn_imp @_; } Wrong semantics: First, you're assuming that .reset is an attribute, rather than a command (yes, I believe in command/query separation, where possible). Second, my intention was that if C<print> ever fails (e.g. broken pipe), then I'd stop resetting the timer. Your program merely stops incrementing the timeout. Even if we assume that the temp _.prop thing works, I'm not sure I'd want it littering my code. I could see it being used in a macro defn though. Dunno -- I see what you're doing, but it's a little *too* helpful. I'd rather see a few more of the engine parts on this one. We can expose a few more parts by having the macro return the timer object, so you could write: my $timer = timeout(60) { ... }; A semantic question: what output would you expect for this: sub silly { timeout(5) { print ".HERE."; sleep 4; print ".THERE." }; for 1..5 -> $count { sleep 2; print $count }; } possible answers are 12.HERE.34.THERE.5 or 12.HERE..THERE.345 I'm thinking probably the latter, because it's easier to launch a thread in the codeblock than to un-launch it. un-launch? If they're threaded, aren't they running asynchronously? I see 12.HERE.34.THERE.5 as the interleaved output. I have no idea what you mean by un-launch, sorry. Sorry, it was my feeble attempt at humor. What I was getting at is that, if we assume the codeblock executes as a coroutine, then you'd get the latter output. If you wanted a thread, you could write: timeout(5) { thread { ... } }; but if we assume that the codeblock is launched as an asynchronous thread, then there is no possible way to coerce it back into the coroutine (i.e.
to un-launch it). Now here's another semantics question: would we want the following to be valid? sub slow { timeout(60) { return undef but Error: timed out }; return @slow_imp; } How about: sub slow { timeout(60) { throw TimeoutException.new(Error: slow_fn timed out) }; return @slow_imp; } Dave.
Re: Timers (was Threads and Progress Monitors)
--- Dave Whipp [EMAIL PROTECTED] wrote: Dulcimer [EMAIL PROTECTED] wrote But exposing the object like that still bothers me: I shouldn't need the $tmp, nor the .new. I'm not so sure I agree with losing the new(). I kinda like that just for readability. Less isn't always more. :) Ok, how about this: sub slow_fn { temp _.timer is last { .stop } = new Timer ( secs => 1, code => { .reset += print "." } ); return slow_fn_imp @_; } Wrong semantics: First, you're assuming that .reset is an attribute, rather than a command (yes, I believe in command/query separation, where possible). Ok. And that leads to the next thing -- Second, my intention was that if C<print> ever fails (e.g. broken pipe), then I'd stop resetting the timer. Your program merely stops incrementing the timeout. Agreed. Even if we assume that the temp _.prop thing works, I'm not sure I'd want it littering my code. I could see it being used in a macro defn though. Maybe. It isn't pretty, but I've seen worse. Hell, I've posted worse. :) Dunno -- I see what you're doing, but it's a little *too* helpful. I'd rather see a few more of the engine parts on this one. We can expose a few more parts by having the macro return the timer object, so you could write: my $timer = timeout(60) { ... }; Ok, poorly phrased on my part. I just meant I'd like to visually see more of what's going on in the code. In other words, I'm not fond of the syntax proposal. I find C<timeout(60) { ... }> too terse, and would rather see a more verbose version. Merely a style issue, though. Still, your response to what it *looked* like I meant is a good idea, too. A semantic question: what output would you expect for this: sub silly { timeout(5) { print ".HERE."; sleep 4; print ".THERE." }; for 1..5 -> $count { sleep 2; print $count }; } possible answers are 12.HERE.34.THERE.5 or 12.HERE..THERE.345 I'm thinking probably the latter, because it's easier to launch a thread in the codeblock than to un-launch it. un-launch?
If they're threaded, aren't they running asynchronously? I see 12.HERE.34.THERE.5 as the interleaved output. I have no idea what you mean by un-launch, sorry. Sorry, it was my feeble attempt at humor. lol -- and I was too dense to get it. :) What I was getting at is that, if we assume the codeblock executes as a coroutine, then you'd get the latter output. If you wanted a thread, you could write: timeout(5) { thread { ... } }; but if we assume that the codeblock is launched as an asynchronous thread, then there is no possible way to coerce it back into the coroutine (i.e. to un-launch it). Ah. Ok, but if that's the case, you could as easily write it timeout(5) { coro { ... } }; and have the compiler build it accordingly. The same logic works either way from that end. Threads seem more sensible for a timeout, but as a general rule I'd probably prefer to see implicit coros rather than implicit threads as the default. Now here's another semantics question: would we want the following to be valid? sub slow { timeout(60) { return undef but Error: timed out }; return @slow_imp; } Dunno... returns from a single routine that are in different threads could produce some real headaches. How about: sub slow { timeout(60) { throw TimeoutException.new("Error: slow_fn timed out") }; return @slow_imp; } I like that a lot better, but I'm still not sure how it would fly. (Sorry for the reformat, btw -- got pretty cramped on my screen.)
Re: Timers (was Threads and Progress Monitors)
Dulcimer [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] I find C<timeout(60) { ... }> too terse, and would rather see a more verbose version I'm obviously more lazy than you ;-). Ah. Ok, but if that's the case, you could as easily write it timeout(5) { coro { ... } }; and have the compiler build it accordingly. The same logic works either way from that end. Given that it's a macro, it's probably true that I could do some up-front manipulation like that. But it's a lot more work than launching a thread. However, given that it is a macro, we could eliminate the outer curlies. Let's see if I know how to write a macro... macro timeout is parsed( rx:w/ $secs := <Perl.expression> $code_type := ( thread | coro )? $code := <Perl.block> / ) { $code_type eq 'thread' and $code = { thread $code }; my \$$(Perl.tmpvarname) = Timer.new(secs => $secs, code => $code); } timeout(60) coro { ... } timeout(60) thread { ... } timeout(60) { ... } # default is coro Even if it works, I have a feeling that the macro has plenty of scope for improvement. Now here's another semantics question: would we want the following to be valid? sub slow { timeout(60) { return undef but Error: timed out }; return slow_imp; } Dunno... returns from a single routine that are in different threads could produce some real headaches. How about: sub slow { timeout(60) { throw TimeoutException.new("Error: slow_fn timed out") }; return slow_imp; } I like that a lot better, but I'm still not sure how it would fly. Actually, I think that both have pretty much the same problems. I assume @Dan could work out a way to get the mechanics to work (basically a mugging: we kill a thread/coro and steal its continuation point). But the semantics of the cleanup could cause a world of pain. But realistically, a common use of a timeout is to kill something that's been going on too long. I suppose we could require some level of cooperation from the dead code. Dave.
Threads and Progress Monitors
OK, we've beaten the producer/consumer thread/coro model to death. Here's a different use of threads: how simple can we make this in P6: sub slow_func { my $percent_done = 0; my $tid = thread { slow_func_imp( \$percent_done ) }; thread { status_monitor($percent_done) and sleep 60 until $tid.done }; return wait $tid; } I think this would work under Austin's A17; but it feels a bit clunky. The fact that the sleep 60 isn't broken as soon as the function is done is untidy, though I wouldn't want to start killing threads. Perhaps: { ... $tid = thread { slow... } status_monitor(\$percent_done) and wait(60 | $tid) until $tid.done; return $tid.result; } The thing is, that wait 60 probably can't work -- replace C<60> with C<$period>, and the semantics change. There are some obvious hacks that could work here: but is there a really nice solution? Ideally, we won't need the low-level details such as C<$tid>. Dave.
Re: Threads and Progress Monitors
--- Dave Whipp [EMAIL PROTECTED] wrote: OK, we've beaten the producer/consumer thread/coro model to death. Here's a different use of threads: how simple can we make this in P6: sub slow_func { my $percent_done = 0; my $tid = thread { slow_func_imp( \$percent_done ) }; thread { status_monitor($percent_done) and sleep 60 until $tid.done }; return wait $tid; } I think this would work under Austin's A17; but it feels a bit clunky. The fact that the sleep 60 isn't broken as soon as the function is done is untidy, though I wouldn't want to start killing thread. perhaps: { ... $tid = thread { slow... } status_monitor(\$percent_done) and wait(60 | $tid) until $tid.done; return $tid.result; } The thing is, that wait 60 probably can't work -- replace C60 with C$period, and the semantics change. There are some obvious hacks that could work here: but is there a really nice solution. Ideally, we won't need the low level details such as C$tid sub slow_func_imp { my $pct_done = 0; ... yield $pct_done++; # Per my recent message ... } sub slow_func { my $tid := thread slow_func_imp; status_monitor($tid.resume) while $tid.active; }
Re: Threads and Progress Monitors
On Thursday, May 29, 2003, at 10:47 AM, Dave Whipp wrote: OK, we've beaten the producer/consumer thread/coro model to death. Here's a different use of threads: how simple can we make this in P6: Hey, good example. Hmm... Well, for starters I think it wouldn't be a big deal to associate a progress attribute with each thread object. It should be that thread's responsibility to fill it out, if it wants to -- so you shouldn't ever have to pass \$percent_done as an argument, it should be a basic attribute of every thread instance. That might encourage people to add progress calculations to their threads after-the-fact, without changing the basic interface of what they wrote. I'll also claim that I would still prefer the auto-parallel, auto-lazy-blocking behavior on the thread results we've mused about previously. So coming from the semantics end, I'd love to see it written like this: # Declaring a threaded calculation sub slow_func_impl is threaded { while (...stuff...) { ... do stuff ... _.thread.progress += 10.0; # or however you want to guesstimate[*] this } return $result; } # If you don't care about getting the actual thread object, just the result, # call it this way: { ... my $result = slow_func_impl(...); ... return $result; } # But if you want to get the thread object, so you can monitor its progress, # call it this way: { ... my $tid = thread slow_func_impl(...); while $tid.active { status_monitor($tid.progress); sleep 60; } return $tid.result; } To my eye, that looks pretty darn slick. MikeL [*] Huh. Imagine my surprise to find out that my spellcheck considers guesstimate to be a real word. And I always thought that was just a spasmostical pseudolexomangloid.
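MikeL's proposal -- make `.progress` a standard attribute that the thread fills in and callers may poll -- maps cleanly onto a thread subclass. A sketch under that assumption (`ProgressThread` and the 10%-step work loop are invented for illustration):

```python
import threading
import time

class ProgressThread(threading.Thread):
    """A thread object carrying its own .progress and .result attributes,
    filled in by the thread body itself, as MikeL suggests."""
    def __init__(self, target):
        # The thread body receives the thread object so it can report.
        super().__init__(target=target, args=(self,), daemon=True)
        self.progress = 0.0
        self.result = None

def slow_func_impl(tid):
    for _ in range(10):
        time.sleep(0.01)        # ... do stuff ...
        tid.progress += 10.0    # however you want to guesstimate this

    tid.result = "finished"

tid = ProgressThread(slow_func_impl)
tid.start()
while tid.is_alive():
    # status_monitor(tid.progress) would go here
    time.sleep(0.02)
print(tid.progress, tid.result)
```

Polling `is_alive()` plays the role of `$tid.active`; the later posts in this thread are about replacing that polling sleep with something interruptible.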
Re: Threads and Progress Monitors
Michael Lazzaro [EMAIL PROTECTED] wrote in # But if you want to get the thread object, so you can monitor its { ... my $tid = thread slow_func_impl(...); while $tid.active { status_monitor($tid.progress); sleep 60; } return $tid.result; } To my eye, that looks pretty darn slick. You might be a bit frustrated if the slow_func_impl took 61 seconds :-(. How do we interrupt the C<sleep>? Possibly in the same way as we'd timeout a blocking IO operation. But I wonder if this could work: my $tid = thread slow_func_impl(...); until wait $tid, timeout => 60 { status_monitor($tid.progress); } return $tid.result; Here I assume that C<wait> returns a true value if its waited condition occurs, but false if it times out. Hmm. A few days ago I tried introducing a syntax for thread with a sensitivity list in place of an explicit loop-forever thread. Perhaps I can reuse that syntax: my $tid = thread slow_func_impl(...); thread $tid | timeout(60) { when $tid => { return $tid.result } default => { status_monitor $tid.progress } } Perhaps a different keyword would be better: C<always> as the looping counterpart to C<wait> -- then extend C<wait> to accept a code block. Dave.
Re: Threads and Progress Monitors
On Thu, May 29, 2003 at 10:47:35AM -0700, Dave Whipp wrote: OK, we've beaten the producer/consumer thread/coro model to death. Here's a different use of threads: how simple can we make this in P6: sub slow_func { my $percent_done = 0; my $tid = thread { slow_func_imp( \$percent_done ) }; thread { status_monitor($percent_done) and sleep 60 until $tid.done }; return wait $tid; } At first glance, this doesn't need a thread - a coroutine is sufficient. Resume the status update coroutine whenever there has been some progress. It doesn't wait and poll a status variable; it just lets the slow function work at its own speed without interruption until there is a reason to change the display. In fact, it probably doesn't need to be a coroutine either. A subroutine - display_status( $percent ) - shouldn't require any code state to maintain, just a bit of data, so all it needs is a closure or an object. At second glance, there is a reason for a higher-powered solution. If updating the display to a new status takes a significant amount of time, especially I/O time, it would both block the slow function unnecessarily and would update for every percent-point change. Using a separate process or thread allows the function to proceed without blocking, and allows the next update to jump ahead to the current actual level, skipping all of the levels that occurred while the previous display was happening. Instead of sleep, though, I'd use a pipeline and read it with a non-blocking read until there is no data. Then, if the status has changed since the last update, do a display update and repeat the non-blocking read. If the status has not changed, do a blocking read to wait for the next status change.
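John's scheme -- block for the next status change, then drain with non-blocking reads so a slow display jumps ahead to the newest level -- can be sketched with a queue standing in for the pipeline. The names (`slow_func`, `monitor`) and the `None` end-of-stream sentinel are illustrative choices:

```python
import queue
import threading
import time

status_q = queue.Queue()

def slow_func(q):
    for pct in range(0, 101, 10):
        q.put(pct)          # report every change; a queue put never blocks the worker
        time.sleep(0.005)
    q.put(None)             # sentinel: work finished

def monitor(q):
    """Blocking read for the next change, then non-blocking reads to jump
    ahead to the newest level, skipping any that arrived mid-display."""
    shown = []
    latest = q.get()                      # block until the first update
    while latest is not None:
        try:
            while (nxt := q.get_nowait()) is not None:
                latest = nxt              # skip stale intermediate levels
        except queue.Empty:
            shown.append(latest)          # the (possibly slow) display update
            latest = q.get()              # block until the next change
            continue
        shown.append(latest)              # sentinel arrived while draining
        latest = None
    return shown

threading.Thread(target=slow_func, args=(status_q,), daemon=True).start()
shown = monitor(status_q)
print(shown[0], shown[-1])
```

The monitor may display fewer levels than the worker reported (that is the point), but it always ends on the final 100% level and never goes backwards.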
Re: Threads and Progress Monitors
John Macdonald [EMAIL PROTECTED] wrote At first glance, this doesn't need a thread - a Instead of sleep, though, I'd use a pipeline and read it with a non-blocking read until there is no data. ... ++ For the lateral thinking. Definitely a valid solution to the problem, as given. So I'll change the problem to prevent it: the slow fn is a 3rd-party blob with no access to source code and no progress indication. sub slow_fn { print "starting slow operation: this sometimes takes half an hour!\n"; my $tid = thread { slow_fn_imp @_ }; $start = time; loop { wait $tid | timeout(60); return $tid.result if $tid.done; print "... $(time-$start) seconds\n"; } } Still a bit too complex for my taste: perhaps we can use C<timeout> to generate exceptions: my lazy::threaded $result := { slow_fn_imp @_ }; loop { timeout(60); return $result; CATCH Timeout { print "...$(time)\n" } } At last, no C<$tid>! (Reminder: the suggested semantics of the threaded variable were that a FETCH to it blocks until the result of the thread is available). Dave.
Re: Threads and Progress Monitors
Dave wrote: Still a bit too complex for my taste: perhaps we can use C<timeout> to generate exceptions: my lazy::threaded $result := { slow_fn_imp @_ }; loop { timeout(60); return $result; CATCH Timeout { print "...$(time)\n" } } At last, no C<$tid>! (Reminder: the suggested semantics of the threaded variable were that a FETCH to it blocks until the result of the thread is available). To nitpick: my $result is lazy::threaded := { slow_fn_imp @_ }; Because lazy::threaded isn't the I<return> type, it's the I<variable> type. loop { timeout(60); return $result; CATCH { when Timeout { print "...$(time)\n" } } Because C<CATCH> is like C<given $!>. I like that elegant use of threaded variables, by the way. Now write the C<timeout> function :-P. Luke
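The lazy::threaded semantics in the reminder -- binding starts the work in a thread, and any later FETCH blocks until the result exists -- is essentially a future. A sketch in Python, where `LazyThreaded` and its explicit `.fetch()` are illustrative stand-ins (a real tied variable would make the blocking read look like a plain variable access):

```python
from concurrent.futures import ThreadPoolExecutor
import time

class LazyThreaded:
    """Binding starts the computation in its own thread; reading the
    value (FETCH) blocks until the thread has produced it."""
    def __init__(self, code):
        self._future = ThreadPoolExecutor(max_workers=1).submit(code)

    def fetch(self):
        return self._future.result()   # blocks until the result is available

def slow_fn_imp():
    time.sleep(0.1)
    return 42

result = LazyThreaded(slow_fn_imp)     # returns immediately, work proceeds
# ... the caller is free to do other work here ...
answer = result.fetch()                # first FETCH blocks, then yields 42
print(answer)
```

This is the "no C<$tid>" property Dave likes: the caller never touches a thread handle, only the value.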
Re: Threads and Progress Monitors
On Thursday, May 29, 2003, at 12:45 PM, Dave Whipp wrote: Michael Lazzaro [EMAIL PROTECTED] wrote in # But if you want to get the thread object, so you can monitor its { ... my $tid = thread slow_func_impl(...); while $tid.active { status_monitor($tid.progress); sleep 60; } return $tid.result; } To my eye, that looks pretty darn slick. You might be a bit frustrated if the slow_func_impl took 61 seconds :-(. How do we interrupt the C<sleep>? Possibly in the same way as we'd timeout a blocking IO operation. Personally, I'd be happy with just making the C<sleep> a smaller number, like one second, or a fifth of a second, or whatever. You want the status_monitor to be updated no more often than it needs to be, but often enough that it's not lagging. But if you really wanted wake-immediately-upon-end, I'd add that as a variant of C<sleep>. For example, you might want a variant that blocked until a given variable changed, just like in debuggers; that would allow: { my $tid = thread slow_func_impl(...); while $tid.active { status_monitor($tid.progress); sleep( 60, watch => \($tid.progress) ); # do you even need the '\'? } return $tid.result; } ... which would sleep 60 seconds, or until the .progress attribute changed, whichever came first. You could make more builtins for that, but I think I'd like them to just be C<sleep> or C<wait> variants. Obvious possibilities: sleep 60; # sleep 60 seconds sleep( block => $tid ); # sleep until given thread is complete sleep( watch => \$var ); # sleep until given var changes value sleep( 60, block => $tid, watch => [\$var1, \$var2, \$var3] ); five tests $tid.sleep(...); # sleep the given thread, instead of this one MikeL
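The primitive MikeL is asking for -- sleep for at most N seconds, but wake immediately if a watched condition fires -- already exists as `Event.wait(timeout)`. A sketch, with `sleep_or_watch` as a hypothetical wrapper name:

```python
import threading

progress_changed = threading.Event()

def sleep_or_watch(secs, event):
    """Sleep up to `secs` seconds OR until `event` fires, whichever comes
    first. Returns True if woken by the event, False if the timeout expired,
    mirroring the wait-returns-true-on-condition idea in the thread above."""
    return event.wait(timeout=secs)

# Woken early: another thread signals the watched condition after 0.05s,
# so a nominal 5-second sleep returns almost immediately.
threading.Timer(0.05, progress_changed.set).start()
woke_early = sleep_or_watch(5, progress_changed)

# Timeout path: nothing ever signals this event.
timed_out = sleep_or_watch(0.05, threading.Event())

print(woke_early, timed_out)
```

A monitor loop would then call `sleep_or_watch(60, progress_changed)` in place of a bare `sleep 60`, avoiding the 61-second frustration Dave describes.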
Re: Threads and Progress Monitors
On Thursday, May 29, 2003, at 04:48 PM, Luke Palmer wrote: To nitpick: my $result is lazy::threaded := { slow_fn_imp @_ }; Pursuing this lazy-threaded variables notion, a question. Given: sub slow_func is threaded { # me likey this auto-parallelizing syntax! ... } Would we want to say that _both_ of these have the lazy-blocking behavior? my $result := slow_func(); print $result; my $result = slow_func(); print $result; Or would the first one block at C<print>, but the second block immediately at the C<=>? The obvious answer is that the := binding passes through the laziness, but the = assignment doesn't. But I wonder if that isn't a bit too obscure, to put it mildly. MikeL
Re: Threads and Progress Monitors
sub slow_fn { my $tick = Timer.new(60, { print "..." }); return slow_fn_imp @_; } Now if I could just get the compiler to not complain about that unused variable... Maybe I'm being dense, but why not just sub slow_fn { Timer.new(1, { print "." }); return slow_fn_imp @_; } or maybe even sub slow_fn { my $tick = 1; Timer.new({$tick++}, { print "." }); return slow_fn_imp @_; } for a slowly slowing timer? Or to my taste, sub slow_fn { Timer.new(60, { print "..." }); return slow_fn_imp @_; }
Re: Threads and Progress Monitors
Dulcimer wrote: sub slow_fn { my $tick = Timer.new(60, { print "..." }); return slow_fn_imp @_; } Now if I could just get the compiler to not complain about that unused variable... Maybe I'm being dense, but why not just sub slow_fn { Timer.new(1, { print "." }); return slow_fn_imp @_; } The problem is that I want the timer to last for the duration of the slow_fn_imp. If I don't assign it to a variable, then it may be GCed at any time. I've just realised, however, that I'm relying on it being destroyed on leaving the scope. I'm not sure that the GC guarantees that. I might need sub slow_fn { my $timer is last { .stop } = Timer.new(60, { print "." }); return slow_fn_imp @_; } but that's starting to get cluttered again. Dave.
Re: Threads and Progress Monitors
Dave Whipp said: I've just realised, however, that I'm relying on it being destroyed on leaving the scope. I'm not sure that the GC guarentees that. GC doesn't, but I would be surprised if Perl 6 doesn't and in that case Parrot will be accommodating. Take a look at the recent p6i archives for the gory details. -- Paul Johnson - [EMAIL PROTECTED] http://www.pjcj.net
Re: How shall threads work in P6?
On Tue, Apr 01, 2003 at 08:44:25AM -0500, Dan Sugalski wrote: There isn't any, particularly. We're doing preemptive threads. It isn't up for negotiation. This is one of the few things where I truly don't care what people's opinions on the matter are. Sorry, I haven't been following this too closely - but is it the intention to support the 5.005, or the ithreads model (or both? or neither?). -- To collect all the latest movies, simply place an unprotected ftp server on the Internet, and wait for the disk to fill
Re: How shall threads work in P6?
On Thu, Apr 03, 2003 at 08:37:46PM +0100, Dave Mitchell wrote: : On Tue, Apr 01, 2003 at 08:44:25AM -0500, Dan Sugalski wrote: : There isn't any, particularly. We're doing preemptive threads. It : isn't up for negotiation. This is one of the few things where I truly : don't care what people's opinions on the matter are. : : Sorry, I haven't been following this too closely - but is it the intention : to support the 5.005, or the ithreads model (or both? or neither?). At a language level this has very little to do with whether threads are implemented preemptively or not. The basic philosophical difference between the pthreads and ithreads models is in whether global variables are shared by default or not. The general consensus up till now is that having variables unshared by default is the cleaner approach, though it does cost more to spawn threads that way, at least the way Perl 5 implements it. The benefit of ithreads over pthreads may be somewhat lessened in Perl 6. There may well be intermediate models in which some kinds of variables are shared by default and others are not. We don't know how that will work--we haven't run the experiment yet. For anything other than existential issues, I believe that most arguments about the future containing the words either, or, both, or neither are likely to be wrong. In particular, human psychology is rarely about the extremes of binary logic. As creatures we are more interested in balance than in certainty, all the while proclaiming our certainty that we've achieved the correct balance. In short, I think both the pthreads and ithreads models are wrong to some extent. Larry
Re: How shall threads work in P6? [OT :o]
--- Larry Wall [EMAIL PROTECTED] wrote: For anything other than existential issues, I believe that most arguments about the future containing the words either, or, both, or neither are likely to be wrong. In particular, human psychology is rarely about the extremes of binary logic. As creatures we are more interested in balance than in certainty, all the while proclaiming our certainty that we've achieved the correct balance. In short, I think both the pthreads and ithreads models are wrong to some extent. Larry soapbox sincerity=high relevance=low And this is why I believe it's best to have a philosopher guiding the design, as long as he's been properly indoctrinated into the issues surrounding the relevant mechanics. ;o] /soapbox
Re: How shall threads work in P6?
[EMAIL PROTECTED] (Matthijs Van Duin) writes: Well, if you optimize for the most common case, throw out threads altogether. Well, I almost would agree with you, since cooperative threading can almost entirely be done in perl code, since continuations are built in. I actually gave an example of that earlier. You thoroughly missed my point, but then I didn't make it very clearly: the Huffman-encoding argument works well for language design, but doesn't apply too well for implementation design. We'll not bother implementing an exponentiation operator since exponentiation is only used very rarely in Perl programs and we can get around it with some shifts, multiplication and a loop if we need it. -- They laughed at Columbus, they laughed at Fulton, they laughed at the Wright brothers. But they also laughed at Bozo the Clown. -- Carl Sagan
Re: How shall threads work in P6?
At 11:09 AM -0800 3/31/03, Austin Hastings wrote: --- Dan Sugalski [EMAIL PROTECTED] wrote: At 8:13 PM +0200 3/31/03, Matthijs van Duin wrote: On Mon, Mar 31, 2003 at 07:45:30AM -0800, Austin Hastings wrote: I've been thinking about closures, continuations, and coroutines, and one of the interfering points has been threads. What's the P6 thread model going to be? As I see it, parrot gives us the opportunity to implement preemptive threading at the VM level, even if it's not available via the OS. I think we should consider cooperative threading, implemented using continuations. Yielding to another thread would automatically happen when a thread blocks, or upon explicit request by the programmer. It has many advantages: And one disadvantage: Dan doesn't like it. :) Well, there are actually a lot of disadvantages, but that's the only important one, so it's probably not worth much thought over alternate threading schemes for Parrot at least--it's going with an OS-level preemptive threading model. No, this isn't negotiable. More information please. There isn't any, particularly. We're doing preemptive threads. It isn't up for negotiation. This is one of the few things where I truly don't care what people's opinions on the matter are. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: How shall threads work in P6?
At 7:35 AM -0800 4/1/03, Austin Hastings wrote: --- Dan Sugalski [EMAIL PROTECTED] wrote: At 11:09 AM -0800 3/31/03, Austin Hastings wrote: --- Dan Sugalski [EMAIL PROTECTED] wrote: At 8:13 PM +0200 3/31/03, Matthijs van Duin wrote: On Mon, Mar 31, 2003 at 07:45:30AM -0800, Austin Hastings wrote: I've been thinking about closures, continuations, and coroutines, and one of the interfering points has been threads. What's the P6 thread model going to be? As I see it, parrot gives us the opportunity to implement preemptive threading at the VM level, even if it's not available via the OS. I think we should consider cooperative threading, implemented using continuations. Yielding to another thread would automatically happen when a thread blocks, or upon explicit request by the programmer. It has many advantages: And one disadvantage: Dan doesn't like it. :) Well, there are actually a lot of disadvantages, but that's the only important one, so it's probably not worth much thought over alternate threading schemes for Parrot at least--it's going with an OS-level preemptive threading model. No, this isn't negotiable. More information please. There isn't any, particularly. We're doing preemptive threads. It isn't up for negotiation. This is one of the few things where I truly don't care what people's opinions on the matter are. Okay, but what does OS-level mean? Are you relying on the OS for implementing the threads (a sub-optimal idea, IMO) or something else? Yes, we're using the OS-level threading facilities as part of the threading implementation. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
How shall threads work in P6?
Okay, I've been thinking about closures, continuations, and coroutines, and one of the interfering points has been threads. What's the P6 thread model going to be? As I see it, parrot gives us the opportunity to implement preemptive threading at the VM level, even if it's not available via the OS. Thinking about coroutines and continuations leads to the conclusion that you need a segmented stack (duh). But it also makes you wonder about how that stack interoperates with the threading subsystem. If there's one stack per thread (obvious) then you're committing to an overt threading model (user declares threads). If the stacks are allowed to fork (which they must, to support even Damian's generator-style coroutines) then there's the possibility of supporting a single, unified stack-tree which means that generators might contain their state across threads, and full coroutines may run in parallel. Anyway, I thought the list was too quiet... =Austin
Re: How shall threads work in P6?
On Mon, Mar 31, 2003 at 07:45:30AM -0800, Austin Hastings wrote:

> I've been thinking about closures, continuations, and coroutines, and one
> of the interfering points has been threads. What's the P6 thread model
> going to be? As I see it, parrot gives us the opportunity to implement
> preemptive threading at the VM level, even if it's not available via the OS.

I think we should consider cooperative threading, implemented using continuations. Yielding to another thread would automatically happen when a thread blocks, or upon explicit request by the programmer. It has many advantages:

1. fast: low task switching overhead and no superfluous task switching
2. no synchronization problems; locking not needed in most common cases
3. thanks to (2), shared state by default without issues
4. most code will not need any special design to be thread-safe, even when it uses globals shared by all threads
5. no interference with continuations etc., since they're based on it
6. less VM code since an existing mechanism is used, which also means less code over which to spread optimization efforts

And optionally, if round-robin scheduling is really desired for some odd reason (it's not easy to think of a situation), then that can be easily added by using a timer of some kind that does a context switch -- but you'd regain the synchronization problems you have with preemptive threading.

One problem with this threading model is that code that runs a long time without blocking or yielding will hold up other threads. Preventing rude code from affecting the system is one of the reasons modern OSes use preemptive scheduling. This problem is obviously much smaller in perl scripts, however, since all of the code is under control of the programmer. And if a CPAN module contains rude code, this would be known soon enough.
(the benefits of Open Source :-) Another problem is the inability to easily take advantage of symmetrical multiprocessing, but this basically only applies to code that does heavy computation. I think if we apply the Huffman principle here by optimizing for the most common case, cooperative threading wins over preemptive threading. People who really want to do SMP should just fork() and use IPC, or use the Thread::Preemptive module which *someone* will no doubt write :-) -- Matthijs van Duin -- May the Forth be with you!
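The advantages claimed for cooperative threading above can be sketched in a few lines. This is a Python illustration (not Parrot or Perl 6): tasks are generators, a switch can only happen at an explicit `yield`, so the shared list needs no locking and the interleaving is fully deterministic.

```python
from collections import deque

def run(tasks):
    """Round-robin cooperative scheduler: a task runs until it yields."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # run until the task yields control
            queue.append(task)  # re-queue it behind the others
        except StopIteration:
            pass                # task finished; drop it

log = []

def worker(name, steps):
    for i in range(steps):
        log.append((name, i))   # no lock needed: no preemption mid-step
        yield                   # explicit yield point

run([worker("A", 2), worker("B", 2)])
```

The flip side from the discussion is visible too: if a worker never executes `yield`, every other task starves, which is exactly the "rude code" problem preemptive scheduling exists to solve.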
Re: How shall threads work in P6?
[EMAIL PROTECTED] (Matthijs Van Duin) writes: I think if we apply the Huffman principle here by optimizing for the most common case, cooperative threading wins from preemptive threading. Well, if you optimize for the most common case, throw out threads altogether. -- The bad reputation UNIX has gotten is totally undeserved, laid on by people who don't understand, who have not gotten in there and tried anything. -- Jim Joyce, former computer science lecturer at the University of California
Re: How shall threads work in P6?
On Mon, Mar 31, 2003 at 08:13:09PM +0200, Matthijs van Duin wrote: I think we should consider cooperative threading, implemented using continuations. Yielding to another thread would automatically happen when a thread blocks, or upon explicit request by the programmer. It has many advantages: It has major disadvantages: I must write my code so each operation only takes a small fraction of time, or I must try to predict when an operation will take a long time and yield periodically. Worse, I must trust that everyone else has written their code to the above spec and has accurately predicted when their code will take a long time. Cooperative multitasking is essentially syntax sugar for an event loop. We already have those (POE, Event, MultiFinder). They're nice when you don't have real, good preemptive threads, but cannot replace them. It is a great leap forward to 1987. The simple reason is that with preemptive threads I don't have to worry about how long an operation is going to take and tailor my code to it; the interpreter will take care of it for me. All the other problems with preemptive threads aside, that's the killer app. We need preemptive threads. We need good support at the very core of the language for preemptive threads. perl5 has shown what happens when you bolt them on both internally and externally. It is not something we can leave for later. Cooperative multitasking, if you really want it, can be bolted on later or provided as an alternative backend to a real threading system. -- I'm spanking my yacht.
Re: How shall threads work in P6?
--- Dan Sugalski [EMAIL PROTECTED] wrote: At 8:13 PM +0200 3/31/03, Matthijs van Duin wrote: On Mon, Mar 31, 2003 at 07:45:30AM -0800, Austin Hastings wrote: I've been thinking about closures, continuations, and coroutines, and one of the interfering points has been threads. What's the P6 thread model going to be? As I see it, parrot gives us the opportunity to implement preemptive threading at the VM level, even if it's not available via the OS. I think we should consider cooperative threading, implemented using continuations. Yielding to another thread would automatically happen when a thread blocks, or upon explicit request by the programmer. It has many advantages: And one disadvantage: Dan doesn't like it. :) Well, there are actually a lot of disadvantages, but that's the only important one, so it's probably not worth much thought over alternate threading schemes for Parrot at least--it's going with an OS-level preemptive threading model. No, this isn't negotiable. More information please. =Austin
Re: How shall threads work in P6?
On Mon, Mar 31, 2003 at 10:50:59AM -0800, Michael G Schwern wrote: I must write my code so each operation only takes a small fraction of time or I must try to predict when an operation will take a long time and yield periodically. Really.. why? When you still have computation to be done before you can produce your output, why yield? There are certainly scenarios where you'd want each thread to get a fair share of computation time, but if the output from all threads is desired, whoever is waiting for them probably won't care who gets to do computation first. Worse, I must trust that everyone else has written their code to the above spec and has accurately predicted when their code will take a long time. Both this and the above can be easily solved by a timer event that forces a yield. Most synchronization issues this would introduce can probably be avoided by deferring the yield until the next checkpoint determined by the compiler (say, the loop iteration). I think this is a minor problem compared to the hurdles (and overhead!) of synchronization. Cooperative multitasking is essentially syntax sugar for an event loop. No, since all thread state is saved. In syntax and semantics they're much closer to preemptive threads than to event loops. We need good support at the very core of the langauge for preemptive threads. perl5 has shown what happens when you bolt them on both internally and externally. It is not something we can leave for later. I think perl 6 will actually make it rather easy to bolt it on later. You can use fork(), let the OS handle the details, and use tied variables for sharing. I believe something already exists for this in p5 and is apparently faster than ithreads. I haven't dug into that thing though, maybe it has other problems again. No doubt you'll point 'em out for me ;-) Cooperative multitasking, if you really want it, can be bolted on later or provided as an alternative backend to a real threading system.
I agree it can be bolted on later, but so can preemptive threads, probably. As Simon pointed out, optimizing for the common case means skipping threads altogether for now. And I resent how you talk about non-preemptive threading as not being real threading. Most embedded systems use tasking/threading models without round-robin scheduling, and people who try to move applications that perform real-time tasks from MacOS 9 to MacOS X curse the preemptive multitasking the latter has. -- Matthijs van Duin -- May the Forth be with you!
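The "timer event that forces a yield, deferred to the next compiler-determined checkpoint" idea above can be made concrete. A Python sketch (all names invented; the fixed step budget stands in for a real timer interrupt): tasks call `checkpoint()` at loop boundaries, and only there can the pending-yield flag actually deschedule them, so no mid-statement preemption and no locking.

```python
from collections import deque

class Scheduler:
    """Cooperative threads with automatic yield at safe checkpoints.

    A (simulated) timer "fires" every `budget` checkpoints; the running
    task then yields at its next loop boundary rather than at an
    arbitrary instruction, preserving cooperative semantics.
    """
    def __init__(self, budget=3):
        self.budget = budget
        self.queue = deque()
        self._used = 0

    def spawn(self, gen):
        self.queue.append(gen)

    def checkpoint(self):
        # Called at loop boundaries; answers "should I yield now?"
        self._used += 1
        if self._used >= self.budget:   # the "timer fired" flag
            self._used = 0
            return True
        return False

    def run(self):
        while self.queue:
            task = self.queue.popleft()
            try:
                next(task)              # run until the task yields
                self.queue.append(task)
            except StopIteration:
                pass

sched = Scheduler(budget=3)
trace = []

def busy(name, n):
    # Deliberately "rude" code: never yields voluntarily...
    for i in range(n):
        trace.append(name)
        if sched.checkpoint():          # ...but still gets descheduled here
            yield

sched.spawn(busy("A", 5))
sched.spawn(busy("B", 5))
sched.run()
```

Each task gets three loop iterations per slice before the forced switch, so even code that never yields on its own cannot starve the others.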
Re: How shall threads work in P6?
On Mon, Mar 31, 2003 at 01:58:19PM -0500, Dan Sugalski wrote: Dan doesn't like it. :) Well, there are actually a lot of disadvantages, but that's the only important one, so it's probably not worth much thought over alternate threading schemes for Parrot at least--it's going with an OS-level preemptive threading model. If you can ensure me that the hooks will be available in critical routines (blocking operations) to allow proper implementation of cooperative threads in a perl module, then that's all the support from the parrot VM I need :-) I just hope you won't make my non-preemptive-threaded applications slower with your built-in support for preemptive threads :-) -- Matthijs van Duin -- May the Forth be with you!
Re: How shall threads work in P6?
On Mon, Mar 31, 2003 at 11:58:01AM -0800, Michael G Schwern wrote: Off-list since this tastes like it will rapidly spin out of control. On-list since this is relevant for others participating in the discussion. Classic scenario for threading: GUI. GUI uses my module which hasn't been carefully written to be cooperative. The entire GUI pauses while it waits for my code to do its thing. No window updates, no button pushes, no way to *cancel the operation that's taking too long*. OK, very true, I was more thinking of something like a server that uses a thread for each connection. Luckily I already mentioned that automatic yielding is not too hard. A timer that sets a yield-asap flag that's tested each iteration of a loop should work -- maybe something with even less overhead can be cooked up. I hope this is not a serious suggestion to implement preemptive threads using fork() and tied vars. That way lies ithreads. Actually, ithreads are slower because they don't do copy-on-write while the OS usually does. fork() moves the problem to the OS, where bright people have already spent a lot of time optimizing things, I hope at least ;) I suppose how much faster it is to do things within the VM rather than using forked processes depends on how much IPC happens. In your GUI example, the answer is: very little, only status updates. The existing system you probably mean is POE. No, I wasn't. I looked it up, it's called forks. Besides, it would be silly as Dan has already said Parrot will support preemptive multitasking and that's half the hard work done. The other half is designing a good language gestalt around them. OK, as long as it doesn't hurt performance of non-threaded apps I obviously have no problem with *supporting* preemptive threading, since it's certainly useful for some applications. But coop threads are more useful in the general case - especially since they're simpler to use thanks to the near-lack of synchronization problems.
Simplicity is good, especially in a language like perl. And I resent how you talk about non-preemptive threading as not being real threading. My biases come from being a MacOS user since System 6. MultiFinder nightmares. Valid point (I'm also a long-time MacOS user), but cooperative multitasking isn't the same as cooperative threading. We're talking about the scheduling between threads inside one process; and we can avoid the lockup problem in the VM with automatic yielding. This makes most of the problems of cooperative threading disappear, while leaving the advantages intact. If we want to support real-time programming in Perl No, I was merely pointing out that it's not always a step forward for all applications. Some people made good use of the ability to grab all the CPU time they needed on old MacOS. None of this precludes having a cooperative threading system, but we *must* have a preemptive one. must is a big word; people happily used computers a long time before any threading was used ;-) It looks like we could use both very well, though. -- Matthijs van Duin -- May the Forth be with you!
Re: How shall threads work in P6?
On Mon, Mar 31, 2003 at 07:21:03PM +0100, Simon Cozens wrote: [EMAIL PROTECTED] (Matthijs Van Duin) writes: I think if we apply the Huffman principle here by optimizing for the most common case, cooperative threading wins over preemptive threading. Well, if you optimize for the most common case, throw out threads altogether. Well, I almost would agree with you, since cooperative threading can almost entirely be done in perl code, since continuations are built in. I actually gave an example of that earlier. The only thing is that blocking system operations like file-descriptor operations need some fiddling. (First try the non-blocking fd operation; if that fails, block the thread and yield to another; if all threads are blocked, do a select() or kqueue() or something similar over all fds on which threads are waiting.) If the hooks exist to handle this in a perl module, then I think we can skip the issue mostly, except maybe the question what to include with perl in the default installation. -- Matthijs van Duin -- May the Forth be with you!
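The fd-handling recipe in the parenthesis above (try the non-blocking operation; on failure park the thread and yield; when everyone is blocked, one select() over all waited-on fds) maps directly onto a small scheduler. A Python sketch under those assumptions, with all names invented; "threads" are generators that yield a read request:

```python
import selectors
import socket
from collections import deque

sel = selectors.DefaultSelector()

def reader(sock, out):
    data = yield ("read", sock)   # park until sock is readable
    out.append(data)

def writer(sock, msg):
    sock.send(msg)                # a socketpair send of a few bytes won't block
    return
    yield                         # unreached; makes this a generator

def run(tasks):
    ready = deque((t, None) for t in tasks)
    parked = {}                   # fd -> (task, sock) waiting on that fd
    while ready or parked:
        while ready:
            task, value = ready.popleft()
            try:
                kind, sock = task.send(value)   # run until next request
            except StopIteration:
                continue                        # task finished
            sock.setblocking(False)
            try:
                # first try the non-blocking operation...
                ready.append((task, sock.recv(64)))
            except BlockingIOError:
                # ...and if it would block, park the thread on the fd
                parked[sock.fileno()] = (task, sock)
                sel.register(sock, selectors.EVENT_READ)
        if parked:
            # every thread is blocked: one select() over all waited-on fds
            for key, _ in sel.select():
                task, sock = parked.pop(key.fd)
                sel.unregister(sock)
                ready.append((task, sock.recv(64)))

a, b = socket.socketpair()
out = []
run([reader(a, out), writer(b, b"hello")])
```

The reader blocks first (no data yet), the writer runs and sends, and the select() pass wakes the reader with the data, so blocking I/O never stalls the whole process.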
Re: How shall threads work in P6?
At 8:13 PM +0200 3/31/03, Matthijs van Duin wrote: On Mon, Mar 31, 2003 at 07:45:30AM -0800, Austin Hastings wrote: I've been thinking about closures, continuations, and coroutines, and one of the interfering points has been threads. What's the P6 thread model going to be? As I see it, parrot gives us the opportunity to implement preemptive threading at the VM level, even if it's not available via the OS. I think we should consider cooperative threading, implemented using continuations. Yielding to another thread would automatically happen when a thread blocks, or upon explicit request by the programmer. It has many advantages: And one disadvantage: Dan doesn't like it. :) Well, there are actually a lot of disadvantages, but that's the only important one, so it's probably not worth much thought over alternate threading schemes for Parrot at least--it's going with an OS-level preemptive threading model. No, this isn't negotiable. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
RFC 178 (v5) Lightweight Threads
This and other RFCs are available on the web at http://dev.perl.org/rfc/

=head1 TITLE

Lightweight Threads

=head1 VERSION

Maintainer: Steven McDougall [EMAIL PROTECTED]
Date: 30 Aug 2000
Last Modified: 26 Sep 2000
Mailing List: [EMAIL PROTECTED]
Number: 178
Version: 5
Status: Frozen

=head1 ABSTRACT

A lightweight thread model for Perl.

=over 4

=item * All threads see the same compiled subroutines

=item * All threads share the same global variables

=item * Threads can create thread-local storage by C<local>izing global variables

=item * All threads share the same file-scoped lexicals

=item * Each thread gets its own copy of block-scoped lexicals upon execution of C<my>

=item * Threads can share block-scoped lexicals by passing a reference to a lexical into a thread, by declaring one subroutine within the scope of another, or with closures.

=item * Open code can only be executed by a thread that compiles it

=item * The language guarantees atomic data access. Everything else is the user's problem.

=back

=over 4

=item Perl

Swiss-army chain saw

=item Perl with threads

juggling chain saws

=back

=head1 CHANGES

=head2 v5

Frozen

=head2 v4

=over 4

=item * Traded in data coherence for L<Atomic data access>. Added examples 16 and 17.

=item * Traded in Primitive operations for L<Locking>

=item * Dropped L</local> section

=item * Revised L</Performance> section

=back

=head2 v3

=over 4

=item * Simplified example 9

=item * Added L</Performance> section

=back

=head2 v2

=over 4

=item * Added section on sharing block-scoped lexicals between threads

=item * Added examples 9, 10, and 11. (N.B. renumbered following examples)

=item * Fixed some typos

=back

=head1 FROZEN

There was substantial--if somewhat disjointed--discussion of thread models on perl6-internals. The consensus among those with internals experience is that this RFC shares too much data between threads, and that the CPU cost of acquiring a lock for every variable access will be prohibitive.
Dan Sugalski discussed some of the tradeoffs and sketched an alternate threading model at http://www.mail-archive.com/perl6-internals%40perl.org/msg01272.html however, this has not been submitted as an RFC.

=head1 DESCRIPTION

The overriding design principle in this model is that there is one program executing in multiple threads. One body of code; one set of global variables; many threads of execution. I like this model because

=over 4

=item * I understand it

=item * It does what I want

=item * I think it can be implemented

=back

=head2 Notation

=over 4

=item I<main> and I<spawned> threads

We'll call the first thread that executes in a program the I<main> thread. It isn't distinguished in any other way. All other threads are called I<spawned> threads.

=item I<open> code

Code that isn't contained in a BLOCK.

=back

Examples are written in Perl5, and use the thread programming model documented in C<Thread.pm>. Discussions of performance and implementation are based on the Perl5 internals; obviously, these are subject to change.

=head2 All threads see the same compiled subroutines

Subroutines are typically defined during the initial compilation of a program. C<use>, C<require>, C<do>, and C<eval> can later define additional subroutines or redefine existing ones. Regardless, at any point in its execution, a program has one and only one collection of defined subroutines, and all threads see this collection.

Example 1

    sub foo      { print 1 }
    sub hack_foo { eval 'sub foo { print 2 }' }

    foo();
    Thread->new(\&hack_foo)->join;
    foo();

Output: 12. The main thread executes C<foo>; the spawned thread redefines C<foo>; the main thread executes the redefined subroutine.

Example 2

    sub foo      { print 1 }
    sub hack_foo { eval 'sub foo { print 2 }' }

    foo();
    Thread->new(\&hack_foo);
    foo();

Output: 11 or 12, according as the main thread does or does not make the second call to C<foo()> before the spawned thread redefines it.
If the user cares which happens first, then they are responsible for doing their own synchronization, for example, with C<join>, as shown in Example 1.

Code refs (like all Perl data objects) are reference counted. Threads increment the reference count upon entry to a subroutine, and decrement it upon exit. This ensures that the op tree won't be garbage collected while the thread is executing it.

=head2 All threads share the same global variables

Example 3

    #!/my/path/to/perl
    $a = 1;
    Thread->new(\&foo)->join;
    print $a;
    sub foo { $a++ }

Output: 2. C<$a> is a global, and it is the I<same> global in both the main thread and the spawned thread.

=head2 Threads can create thread-local storage by C<local>izing global variables

Example 4

    #!/my/path/to/perl
    $a = 1;
    Thread->new(\&foo);
    print $a;
    sub foo { local $a = 2 }

Output: 1. The spawned thread gets its own copy of C<$a>. The copy of C<$a> in the main thread is unaffected
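The RFC's pair of examples (a global shared by all threads vs. a C<local>ized, thread-private copy) has a close analogue in Python, shown here purely as illustration: a module-level variable plays the shared global, and `threading.local` plays the role of `local $a`.

```python
import threading

seen = {}

# Example 3 analogue: a plain global is the *same* variable in all threads.
g = 1

# Example 4 analogue: threading.local gives each thread its own copy,
# like `local $a = 2` in the spawned thread.
tls = threading.local()
tls.value = 1                   # the main thread's copy

def foo():
    global g
    g += 1                      # mutates the one shared global
    tls.value = 2               # this thread's private copy only
    seen["spawned"] = tls.value

t = threading.Thread(target=foo)
t.start()
t.join()
seen["main"] = tls.value        # main thread's copy is unaffected
```

After the join, the shared global reflects the spawned thread's increment, while the main thread still sees its own `tls.value` of 1, matching the RFC's Output: 2 and Output: 1 respectively.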
RFC 178 (v2) Lightweight Threads
This and other RFCs are available on the web at http://dev.perl.org/rfc/

=head1 TITLE

Lightweight Threads

=head1 VERSION

Maintainer: Steven McDougall [EMAIL PROTECTED]
Date: 30 Aug 2000
Last Modified: 02 Sep 2000
Version: 2
Mailing List: [EMAIL PROTECTED]
Number: 178
Status: Developing

=head1 ABSTRACT

A lightweight thread model for Perl.

=over 4

=item * All threads see the same compiled subroutines

=item * All threads share the same global variables

=item * Threads can create thread-local storage by C<local>izing global variables

=item * All threads share the same file-scoped lexicals

=item * Each thread gets its own copy of block-scoped lexicals upon execution of C<my>

=item * Threads can share block-scoped lexicals by passing a reference to a lexical into a thread, by declaring one subroutine within the scope of another, or with closures.

=item * Open code can only be executed by a thread that compiles it

=item * The interpreter guarantees data coherence

=back

=head1 DESCRIPTION

The overriding design principle in this model is that there is one program executing in multiple threads. One body of code; one set of global variables; many threads of execution. I like this model because

=over 4

=item * I understand it

=item * it does what I want

=item * I think it can be implemented

=back

=head2 Notation

=over 4

=item I<main> and I<spawned> threads

We'll call the first thread that executes in a program the I<main> thread. It isn't distinguished in any other way. All other threads are called I<spawned> threads.

=item I<open> code

Code that isn't contained in a BLOCK.

=back

Examples are written in Perl5, and use the thread programming model documented in C<Thread.pm>. Discussions of performance and implementation issues are all based on the Perl5 internals; obviously, these are subject to change.

=head2 All threads see the same compiled subroutines

Subroutines are typically defined during the initial compilation of a program.
C<use>, C<require>, C<do>, and C<eval> can later define additional subroutines or redefine existing ones. Regardless, at any point in its execution, a program has one and only one collection of defined subroutines, and all threads see this collection.

Example 1

    sub foo      { print 1 }
    sub hack_foo { eval 'sub foo { print 2 }' }

    foo();
    Thread->new(\&hack_foo)->join;
    foo();

Output: 12. The main thread executes C<foo>; the spawned thread redefines C<foo>; the main thread executes the redefined subroutine.

Example 2

    sub foo      { print 1 }
    sub hack_foo { eval 'sub foo { print 2 }' }

    foo();
    Thread->new(\&hack_foo);
    foo();

Output: 11 or 12, according as the main thread does or does not make the second call to C<foo()> before the spawned thread redefines it. If the user cares which happens first, then they are responsible for doing their own synchronization, for example, with C<join>, as shown in Example 1.

Code refs (like all Perl data objects) are reference counted. Threads increment the reference count upon entry to a subroutine, and decrement it upon exit. This ensures that the op tree won't be garbage collected while the thread is executing it.

=head2 All threads share the same global variables

Example 3

    #!/my/path/to/perl
    $a = 1;
    Thread->new(\&foo)->join;
    print $a;
    sub foo { $a++ }

Output: 2. C<$a> is a global, and it is the I<same> global in both the main thread and the spawned thread.

=head2 Threads can create thread-local storage by C<local>izing global variables

Example 4

    #!/my/path/to/perl
    $a = 1;
    Thread->new(\&foo);
    print $a;
    sub foo { local $a = 2 }

Output: 1. The spawned thread gets its own copy of C<$a>. The copy of C<$a> in the main thread is unaffected. It doesn't matter whether the assignment in C<foo> executes before or after the C<print> in the main thread. It doesn't matter whether the copy of C<$a> goes out of scope before or after the C<print> executes. As in Perl5, C<local>ized variables are visible to any subroutines called while they remain in scope.
Example 5

    #!/my/path/to/perl
    $a = 1;
    Thread->new(\&foo);
    bar();
    sub foo { local $a = 2; bar(); }
    sub bar { print $a }

Output: 12 or 21, depending on the order in which the calls to C<bar> execute. Dynamic scopes are not inherited by spawned threads.

Example 6

    #!/my/path/to/perl
    $a = 1;
    foo();
    sub foo { local $a = 2; Thread->new(\&bar)->join; }
    sub bar { print $a }

Output: 1. The spawned thread sees the original value of C<$a>.

=head2 All threads share the same file-scoped lexicals

Example 7

    #!/my/path/to/perl
    my $a = 1;
    Thread->new(\&foo)->join;
    print $a;
    sub foo { $a = 2 }

Output: 2. C<$a> is a file-scoped lexical. It is the same variable in both the main thread and the spawned thread.

=head2 Each thread gets its own copy of block-scoped lexicals upon execution
Re: RFC 178 (v1) Lightweight Threads
There is a fundamental issue on how values are passed between threads: does the value leave one thread and enter the other, or are they shared? The idea tossed around -internals was that a value that crosses a thread boundary would have a wrapper/proxy attached to handle the mediation. The mediation would be activated only if the value is passed via a shared variable. In your case the shared variable is the argument being passed through the thread creation call. [...] If we don't require a :shared on variables, anything and everything has to have protection. If you want the variable to be shared, declare it. [...] Aha, I get it. -internals has been assuming that one _must_ specify the sharing. You want it to be inferred. I think that's asking for too much DWIMery.

Question: Can the interpreter determine when a variable becomes shared?

Answer: No. Then neglecting to put a :shared attribute on a shared variable will crash the interpreter. This doesn't seem very Perlish.

Answer: Yes. Then the interpreter can take the opportunity to install a mutex on the variable. - SWM
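The wrapper/proxy idea discussed above (mediation attached only to values that cross a thread boundary, so unshared values pay nothing) can be sketched as follows. This is a Python illustration with invented names, not anything from the RFC or -internals: the proxy installs the mutex and mediates every access.

```python
import threading

class SharedProxy:
    """Wrapper attached to a value when it crosses a thread boundary.

    All access goes through the proxy, which holds the per-value mutex;
    values that never cross a boundary never pay for one.
    """
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def update(self, fn):
        # read-modify-write as one atomic step under the mutex
        with self._lock:
            self._value = fn(self._value)

# The "thread creation call" passes the proxied value to the workers.
counter = SharedProxy(0)

def worker():
    for _ in range(1000):
        counter.update(lambda v: v + 1)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the mutex mediating each update, the four threads' 1000 increments never lose a write; without it, the unsynchronized read-modify-write could.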
Re: RFC 178 (v1) Lightweight Threads
"SWM" == Steven W McDougall [EMAIL PROTECTED] writes: Not unless it is so declared my $a :shared. SWM Sure it is. SWM Here are some more examples. SWM Example 1: Passing a reference to a block-scoped lexical into a thread. Depends on how locking/threading is designed. There is a fundemental issue on how values are passed between threads. Does the value leave one thread and enter the other or are they shared. The idea tossed around -internals was that a value that crosses a thread boundary would have a wrapper/proxy attached to handle the mediation. The mediation would be activated only if the value is passed via a shared variable. In your case the shared variable is the argument being passed through the thread creation call. SWM Example 2: Declaring one subroutine within the scope of another If we don't require a :shared on variable anything and everything has to have protection. If you want the variable to be shared declare it. SWM Example 3: Closures (Ken's example) Aha, I get it. -internals has been assuming that one _must_ specify the sharing. You want it to be infered. I think that's asking for too much DWIMery. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 178 (v1) Lightweight Threads
> The more interesting case is this:
>
>     #!/my/path/to/perl
>     sub foo_generator
>     {
>         my $a = shift;
>         sub { print $a++ }
>     }
>     my $foo = foo_generator(1);
>     $foo->();
>     Thread->new($foo);
>
> Is $a shared between threads or not?

$a is shared between threads. The anonymous subroutine is a closure.
Closures work across threads.

> IMHO the rule is not as simple as this RFC states. (Partly because of
> confusion about "executing" my.)

It is almost as simple. I should add an example (like this one) showing
the behavior of closures.

> Perhaps Thread->new should deep copy the code ref before executing it?
> Deep copy lexicals but not globals? Deep copy anything that doesn't
> already have a mutex lock?

no no no...perlref.pod, Making References, item 4 says

    A reference to an anonymous subroutine can be created by using sub
    without a subname:

        $coderef = sub { print "Boink!\n" };

    ...no matter how many times you execute that particular line (unless
    you're in an eval("...")), $coderef will still have a reference to
    the same anonymous subroutine.

We can, and should, retain this behavior.

- SWM
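The "closures work across threads" answer is easy to demonstrate in a language that already behaves this way. This Python sketch (an analogue, not Perl) mirrors the foo_generator example: every thread that runs the closure touches the same captured cell, and nothing is deep-copied at thread creation.

```python
import threading

def foo_generator(start):
    a = [start]              # one cell, captured by the closure (the shared $a)
    def foo():
        a[0] += 1
        return a[0]
    return foo

foo = foo_generator(1)
first = foo()                # main thread bumps the shared cell: 1 -> 2
t = threading.Thread(target=foo)
t.start(); t.join()          # spawned thread bumps the same cell: 2 -> 3
result = foo()               # main thread again: 3 -> 4
print(first, result)
```

All three calls, across two threads, observed one shared cell; the final call returns 4. A deep copy at `Thread(...)` would instead have given the spawned thread a private counter, which is exactly the behavior the post argues against.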
Re: Are Perl6 threads preemptive or cooperative?
At 12:11 PM 8/28/00 -0500, David L. Nicol wrote:

> What if every subroutine tagged itself with a list of the globals it
> uses, so a calling routine would know to add those to the list of
> globals it wants locked?

If you're looking for automagic locking of variables, you're treading
deep into "Interesting Research Problem" territory (read: Solve the
Halting Problem and win a prize!) if you want it to not deadlock all over
the place.

Been there. Tried that. Backed away *real* slowly... :)

Dan

--------------------------------"it's like this"-------------------
Dan Sugalski                    even samurai
[EMAIL PROTECTED]               have teddy bears and even
                                teddy bears get drunk
Re: Are Perl6 threads preemptive or cooperative?
> That a user may need to have two or more variables in sync for proper
> operation. And cooperative threads don't address that issue.
> Cooperative only helps _perhaps_ with perl not needing to protect its
> own structures.

We are in agreement. I was specifically addressing the problem of
protecting internal interpreter data structures, and specifically
ignoring the problem of synchronizing user access to data.

- SWM
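The user-level problem being set aside here is worth a concrete sketch. Even if the interpreter makes every single access atomic, keeping *two* variables consistent requires the user to hold one lock across both updates. A Python illustration (names invented; the invariant is that the two balances always sum to 100):

```python
import threading

balance_a, balance_b = 100, 0
lock = threading.Lock()

def transfer(amount):
    global balance_a, balance_b
    with lock:               # both updates inside one critical section,
        balance_a -= amount  # so no thread ever observes a state where
        balance_b += amount  # the invariant balance_a + balance_b == 100 fails

threads = [threading.Thread(target=transfer, args=(10,)) for _ in range(5)]
for t in threads: t.start()
for t in threads: t.join()
print(balance_a, balance_b)  # 50 50 -- invariant preserved
```

Per-variable atomicity alone cannot express this: between an atomic decrement of one variable and an atomic increment of the other, another thread could see the broken intermediate state. That is why neither cooperative scheduling nor interpreter-internal locking answers the synchronization question for users.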
Re: Threads and run time/compile time
I wish I knew why you are discussing an -internals issue on this list.
You should be specifying behaviour, not how it is implemented. A mention
of implementation is reasonable, but _don't_ spend too much time. If
Larry wants it, -internals will give it to him.

Anyway, please recall that because of threading concerns, the final
internal form of any compiled piece of code will be as immutable as
possible. So that if another thread needs to reslurp a module, the
compiled form will be available. Of course, some initializations would
have to be rerun, but that is minor compared to the other costs.

Remember: specify _as if_ it would do X. -internals will make it so. As
fast as possible.

chaim

(Of course some requests will not be doable and some revisiting will have
to be performed, but the first cut should not be too concerned.)

> "SWM" == Steven W McDougall [EMAIL PROTECTED] writes:

SWM> Based on your examples, I have to assume that you are serious about
SWM> RFC1v3 item 6:

SWM>     6. Modules should be loaded into the thread-global space when used
SWM> [...]
SWM>     Subsequent threads should then reslurp these modules back in on
SWM>     their start up.
SWM> [...]
SWM>     each thread needs to reuse the original modules upon creation.
SWM> [...]
SWM>     This, of course, could lead to massive program bloat

SWM> This is a non-starter for me. Right now, I am working on apps that
SWM> may create 10 threads per *second*. I cannot possibly get the
SWM> performance I need if every thread has to recompile all its own
SWM> modules.

SWM> We could either discuss alternate approaches for RFC1, or I could
SWM> submit a new RFC for a thread architecture that gives me the
SWM> performance I want.

--
Chaim Frenkel                                  Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                              +1-718-236-0183
Re: Are Perl6 threads preemptive or cooperative?
--On 25.08.2000 20:02 Uhr -0400 Steven W McDougall wrote:

> Others have pointed out that code inside sub-expressions and blocks
> could also assign to our variables. This is true, but it isn't our
> problem. As long as each assignment is carried out correctly by the
> interpreter, then each variable always has a valid value, computed from
> other valid values.

Depends on who 'our' is. If you're an internals guy you need not care,
that's true, but as a user of the language you care about how much
sync-ing you have to do yourself in your perl code - the preemptive vs.
cooperative discussion is valid there as well, though it would probably
be good to separate these discussions :-)

--
Markus Peter - SPiN GmbH
[EMAIL PROTECTED]
Threads and file-scoped lexicals
Do separate threads

- all see the same file-scoped lexicals
- each get their own file-scoped lexicals

    #!/usr/local/bin/perl
    use Threads;

    my $a = 0;

    my $t1 = Thread->new(\&inc_a);
    my $t2 = Thread->new(\&inc_a);

    $t1->join;
    $t2->join;

    print "$a";

    sub inc_a { $a++ }

What should the output be? 0? 1? 2?

- SWM
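The two readings give observably different answers, which a Python analogue (illustration only, not Perl) makes concrete. Under the "shared" reading, both increments land on one variable and the main thread prints 2; if each thread got its own copy, the main thread's copy would still be 0. The joins below are sequential purely to keep the result deterministic, sidestepping the separate lost-update question.

```python
import threading

a = 0   # plays the role of the file-scoped lexical

def inc_a():
    global a
    a += 1

t1 = threading.Thread(target=inc_a)
t2 = threading.Thread(target=inc_a)
t1.start(); t1.join()   # each join before the next start: no interleaving
t2.start(); t2.join()

print(a)   # 2 under the shared reading; a per-thread copy would leave 0 here
```

If the threads ran concurrently and the increment were not atomic, 1 would also become possible (both threads read 0, both write 1), which is exactly why the original question lists 0, 1, and 2 as candidate outputs.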
Threads and run time/compile time
RFC1v3 says

    5. Threads are a runtime construct only. Lexical scoping and compile
       issues are independent of any thread boundaries. The initial
       interpretation is done by a single thread. use Threads may set up
       the thread constructs, but threads will not be spawned until
       runtime.

However, the distinction between compile time and run time that it relies
on doesn't exist in Perl. For example, if we chase through perlfunc.pod a
bit, we find

    use Module;

is exactly equivalent to

    BEGIN { require Module; import Module; }

and

    require Module;

locates Module.pm and does a

    do 'Module.pm';

which is equivalent to

    scalar eval `cat Module.pm`;

and eval is documented as

    eval EXPR
        the return value of EXPR is parsed and executed as if it were a
        little Perl program.

Users can (and do) write open code in modules. There is nothing to
prevent users from writing modules like

    # Module.pm
    use Threads;
    Thread->new(\&foo);
    sub foo { ... }

Users can also write their own BEGIN blocks to start threads before the
rest of the program has been compiled

    sub foo { ... }
    BEGIN { Thread->new(\&foo) }

Going in the other direction, users can write

    require 'Foo.pm';

or even

    eval "sub foo { ... }";

to compile code after the program (and other threads) have begun
execution.

Given all this, I don't think we can sequester thread creation into "run
time". We need a model that works uniformly, no matter when threads are
created and run.

- SWM
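The "going in the other direction" case has a direct analogue in any language with a runtime eval. In the Python sketch below (an illustration, not Perl), `exec` plays the role of `eval "sub foo { ... }"`: new code is compiled while threads may already be running, and an already-running thread can immediately call the freshly compiled subroutine, so there is no clean "compile time" boundary to sequester thread creation behind.

```python
import threading

namespace = {}
# Compile a new "subroutine" at run time, after the program has started.
exec('def foo():\n    return "compiled at run time"', namespace)

# A thread spawned afterwards happily calls the late-compiled code.
result = []
t = threading.Thread(target=lambda: result.append(namespace["foo"]()))
t.start(); t.join()
print(result[0])
```

The same blurring runs the other way too: module-level code executes at import, so an `import` can spawn threads before the rest of the program is even compiled, matching the `# Module.pm` example above.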
Re: RFC 86 (v1) IPC Mailboxes for Threads and Signals
"NI" == Nick Ing-Simmons [EMAIL PROTECTED] writes: NI Uri Guttman [EMAIL PROTECTED] writes: i think we do because a thread can block on a mailbox while it can't on an array. NI Why not ? - I see no reason that a "shared" array could not have NI whatever-it-takes to allow blocking. then every op that could touch an array has to have code to support blocking. i think that would be a mess. and what is the definition of blocking on an array, is it empty? can i pop or shift it? how do you handle atomicity? how do you specify a non-blocking access (poll) on an array? mailboxes are defined to work fine with those requirements. a get on a mailbox will block until one item is retrieved and that is an atoamic operation. a get can be made non-blocking (polling) with a optional argument. uri -- Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting The Perl Books Page --- http://www.sysarch.com/cgi-bin/perl_books The Best Search Engine on the Net -- http://www.northernlight.com
Re: RFC 86 (v1) IPC Mailboxes for Threads and Signals
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS Nope. The code that accessses the array needs to support it. Different DS animal entirely. The ops don't actually need to know. but still that is overhead code for all arrays and not just the mailbox ones. DS s/mailboxes/filehandles/; DS If we're talking a generic communication pipe between things, we DS should overload the filehandle. It's a nice construct that DS provides an ordered, serialized, blockable, pollable DS communications channel with well-defined behavior and a DS comfortable set of primitives to operate on it. pollable is a good thing. some mailbox designs are not pollable and some are. i like the idea of supporting polling then you can also have callbacks. but this does imply an implementation as semaphores and shared memory are not pollable. you would have to build this with pipes and filehandles. overlaying it on filehandles is another question. i would like to see a single operation which does an atomic lock, block, retrieve, unlock. we don't have that for filehandles. you could use a new method on that special handle (i like 'get') which has the desired semantics. i think making mailboxes in some form is a good idea. but they should be special objects (even if they are filehandles) with their own methods to support the desired semantics. uri -- Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting The Perl Books Page --- http://www.sysarch.com/cgi-bin/perl_books The Best Search Engine on the Net -- http://www.northernlight.com
Re: RFC 86 (v1) IPC Mailboxes for Threads and Signals
At 12:54 PM 8/11/00 -0400, Uri Guttman wrote:

> DS> Nope. The code that accesses the array needs to support it.
> DS> Different animal entirely. The ops don't actually need to know.
>
> but still that is overhead code for all arrays and not just the mailbox
> ones.

Nope. Just for the shared ones.

> DS> s/mailboxes/filehandles/;
>
> DS> If we're talking a generic communication pipe between things, we
> DS> should overload the filehandle. It's a nice construct that provides
> DS> an ordered, serialized, blockable, pollable communications channel
> DS> with well-defined behavior and a comfortable set of primitives to
> DS> operate on it.
>
> pollable is a good thing. some mailbox designs are not pollable and
> some are. i like the idea of supporting polling; then you can also have
> callbacks. but this does imply an implementation, as semaphores and
> shared memory are not pollable. you would have to build this with pipes
> and filehandles.

So? Inter-thread communication is almost undoubtedly not going to be
built with something as heavyweight as pipes, shm, or mailboxes, so I
don't see their limitations as relevant here. Regardless, don't design to
the limitations of one particular implementation method. We can work
around their limits if need be.

> overlaying it on filehandles is another question. i would like to see a
> single operation which does an atomic lock, block, retrieve, unlock. we
> don't have that for filehandles. you could use a new method on that
> special handle (i like 'get') which has the desired semantics.

So we enhance filehandles to make reads on them atomic. <$handle> does an
atomic read on the filehandle. NBD.

> i think making mailboxes in some form is a good idea. but they should
> be special objects (even if they are filehandles) with their own
> methods to support the desired semantics.

Overload filehandles. They really are a good fit for what you're looking
for.
Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: RFC 86 (v1) IPC Mailboxes for Threads and Signals
"CF" == Chaim Frenkel [EMAIL PROTECTED] writes: CF How does this look different from an inter-thread visible array CF treated as a queue? CF Thread A CF push(@workqueue, $val) CF Thread B CF $val = pop(@workqueue) CF Where accessing the global variable is guaranteed by perl to be atomic. CF (i.e. Do we need another construct?) i think we do because a thread can block on a mailbox while it can't on an array. also the mailbox idea allows delivery of signals and other asynch callbacks so it serves double duty. the idea is that the core manages it instead of the application. it has a builtin mutex so you don't have to use one or declare somthign shared. it does not stop you from sharing stuff but it provides an core level interface for comunications. uri -- Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting The Perl Books Page --- http://www.sysarch.com/cgi-bin/perl_books The Best Search Engine on the Net -- http://www.northernlight.com