------- Forwarded message -------
From: nigelsande...@btconnect.com
To: "Dave Whipp - d...@whipp.name" <+nntp+browseruk+e66dbbe0cf.dave#whipp.n...@spamgourmet.com>
Cc:
Subject: Re: Parallelism and Concurrency was Re: Ideas for a "Object-Belongs-to-Thread" threading model
Date: Mon, 17 May 2010 22:31:45 +0100

On Mon, 17 May 2010 20:33:24 +0100, Dave Whipp - dave_wh...@yahoo.com
<+nntp+browseruk+2dcf7cf254.dave_whipp#yahoo....@spamgourmet.com> wrote:

From that statement, you do not appear to understand the subject matter
of this thread: Perl 6 concurrency model.

Actually, the reason for my post was that I fear that I did understand the subject matter of the thread: it seems to me that any reasonable discussion of "perl 6 concurrency" should not be too focused on pthreads-style threading.

Okay. Now we're at cross-purposes about the heavily overloaded term
"threading". Whilst GPUs overload the term "threading" for their internal
operations, those operations are for the most part invisible to the
applications programmer, and quite different to the 100,000-thread demos
in the Go and Erlang documentation to which I referred. The latter are
MIMD algorithms, for which it is significantly harder to find applications
than for SIMD algorithms, which are commonplace and well understood.

My uses of the terms "threading" and "threads" are limited specifically to
MIMD threading of two forms:

Kernel threading    : pthreads, Win32/64 threads etc.
User-space threading: green threads; coroutines; goroutines; Actors; etc.

See below for why I've been limiting myself to these two definitions.

OpenCL/Cuda are not exotic $M hardware: they are available (and performant) on any PC (or Mac) that is mainstream or above. Millions of threads is not a huge number: it's one thread per pixel on a 720p video frame (and I see no reason, other than performance, not to use Perl6 for image processing).

If the discussion is strictly limited to abstracting remote procedure calls, then I'll back away. But the exclusion of modules that map hyper-operators (and feeds, etc.) to OpenCL from the generic concept of "perl6 concurrency" seems rather blinkered.



FWIW, I absolutely agree with you that the mapping between Perl 6
hyper-operators and (GPU-based or otherwise) SIMD instructions is a
natural fit. But, in your post above you said:

"Pure SIMD (vectorization) is insufficient for many of these workloads:
programmers really do need to think in terms of threads (most likely
mapped to OpenCL or Cuda under the hood)."

By which I took you to mean that in-box SIMD (be it x86/x64 CPU or GPU
SIMD instruction sets) was "insufficient for many of the[se] workloads"
you were considering. And therefore took you to be suggesting that
Perl 6 should also be catering for the heterogeneous aspects of OpenCL in
core.

I now realise that you were distinguishing between CPU SIMD instructions
and GPU SIMD instructions. But the real point here is that Perl 6 doesn't need
a threading model to use and benefit from using GPU SIMD.

Any bog-standard single-threaded process can benefit from using CUDA or
the homogeneous aspect of OpenCL where available, for SIMD algorithms.
Their use can be entirely transparent to the language semantics for
built-in operations like the hyper-operators. Ideally, the Perl 6 runtime
would implement roles for OpenCL or CUDA for hyper-operations; fall back
to CPU SIMD instructions; and fall back again to old-fashioned loops if
neither were available. This would all be entirely transparent to the
Perl 6 programmer, just as utilising discrete FPUs was transparent to the
C programmer back in the day. In an ideal world, Perl 6.0.0.0.0 would ship
with just the looping hyper-operator implementation; users would then load
an appropriately named Role matching their hardware's capabilities, and
the hyper-operations would transparently pick it up to give them CPU-SIMD
or GPU-SIMD as available. Or perhaps these would become perl6 build-time
configuration options.
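
To illustrate the fallback chain described above (GPU SIMD, then CPU SIMD,
then a plain loop), here is a minimal Python sketch. It is not Perl 6 and
not a real OpenCL or CUDA binding: the names HyperDispatcher, register,
select, hyper and loop_backend are all hypothetical, and the "GPU" and
"CPU SIMD" backends are stand-ins whose availability probes simply return
False. The only point is that the caller's code does not change whichever
backend is picked.

```python
# Hypothetical sketch: every name here is invented for illustration and is
# not part of Perl 6, OpenCL, CUDA, or any real library.
from typing import Callable, List, Sequence, Tuple

Op = Callable[[float, float], float]
Backend = Callable[[Op, Sequence[float], Sequence[float]], List[float]]
Probe = Callable[[], bool]


def loop_backend(op: Op, xs: Sequence[float], ys: Sequence[float]) -> List[float]:
    """Old-fashioned element-by-element loop; always available."""
    return [op(x, y) for x, y in zip(xs, ys)]


class HyperDispatcher:
    """Picks the first available backend, in registration order."""

    def __init__(self) -> None:
        self._backends: List[Tuple[str, Probe, Backend]] = []

    def register(self, name: str, is_available: Probe, backend: Backend) -> None:
        self._backends.append((name, is_available, backend))

    def select(self) -> Tuple[str, Backend]:
        for name, is_available, backend in self._backends:
            if is_available():
                return name, backend
        return "loop", loop_backend  # final fallback, always present

    def hyper(self, op: Op, xs: Sequence[float], ys: Sequence[float]) -> List[float]:
        if len(xs) != len(ys):
            raise ValueError("operands must have the same length")
        _, backend = self.select()
        return backend(op, xs, ys)


dispatcher = HyperDispatcher()
# Stand-in probes: a real runtime would query the hardware. Both report
# "unavailable" here, so the plain loop ends up being selected.
dispatcher.register("gpu-simd", lambda: False, loop_backend)
dispatcher.register("cpu-simd", lambda: False, loop_backend)

print(dispatcher.select()[0])
print(dispatcher.hyper(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))
```

Registering a backend whose probe returns True would change which
implementation runs, but not the call to hyper() or its result.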

The discussion (which originally started outside of this list) was about
MIMD threading--the two categories above--in order to utilise the multiple
*C*PU cores that are now ubiquitous. For this Perl 6 does need to sort out
a threading model.

The guts of the discussion has been whether kernel threading (and mutable
shared state) is necessary. The perception being that by using
user-threading (on a single core at a time), you avoid the need for and
complexities of locking and synchronisation. And one of the (I believe
spurious) arguments for the use of user-space (MIMD) threading is that
such threads are lightweight, which allows you to run thousands of
concurrent threads.

And it does. I've done it with Erlang right here on my dirt-cheap Intel
Core2 Quad Q6600 processor. But, no matter how hard you try, you can never
push the CPU utilisation above 25%, because those 100,000 user-threads all
run in a single kernel-thread. And that means I waste 75% of my processing
power. And next year (or maybe the spring after), when the lowest spec
Magny-Cours 12-core processor systems have fallen to my 'dirt-cheap' price
point, I'd be wasting 92% of my processing power.
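
The single-kernel-thread behaviour described above can be sketched in
Python, with generators standing in for user-space threads. This is an
illustration only, not Erlang and not a benchmark; worker and
run_round_robin are hypothetical names. Each task records the kernel
thread it runs on, and the hand-rolled scheduler never leaves the thread
it was started on, however many tasks there are.

```python
# Hypothetical sketch: generators used as cooperative user-space threads.
import threading
from collections import deque


def worker(steps, seen_threads):
    """A user-space 'thread': does a little work, then yields control."""
    for _ in range(steps):
        seen_threads.add(threading.get_ident())  # kernel thread running us
        yield  # cooperative switch point back to the scheduler


def run_round_robin(n_tasks, steps):
    """Run n_tasks workers to completion on the calling kernel thread."""
    seen_threads = set()
    ready = deque(worker(steps, seen_threads) for _ in range(n_tasks))
    finished = 0
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task until its next yield
            ready.append(task)  # not done yet: back of the queue
        except StopIteration:
            finished += 1       # task ran to completion
    return finished, seen_threads


finished, seen = run_round_robin(n_tasks=100_000, steps=3)
print(finished, "tasks finished on", len(seen), "kernel thread(s)")
```

All 100,000 tasks complete, but the set of kernel thread ids contains a
single entry, so only one core can ever be doing the work.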

And for those geneticists and engineers trying to use Perl 6 on their
relatively cheap 48-core boxes to chug through their inherently MIMD
algorithms, they'd be wasting 98% of their CPU power if Perl 6 does not
provide for a threading model that scales across multiple cores.
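
The percentages quoted above follow from assuming that exactly one core is
kept busy and the rest sit idle. A small Python check of that arithmetic
(idle_percent is a hypothetical helper, and the one-busy-core assumption is
the simplification used in the text):

```python
def idle_percent(cores: int, busy_cores: int = 1) -> float:
    """Percentage of total CPU capacity left unused."""
    if cores < 1 or not 0 <= busy_cores <= cores:
        raise ValueError("need cores >= 1 and 0 <= busy_cores <= cores")
    return 100.0 * (cores - busy_cores) / cores


for cores in (4, 12, 48):
    print(f"{cores:2d} cores: {idle_percent(cores):.1f}% idle")
```

This gives 75.0% for 4 cores, 91.7% for 12 and 97.9% for 48, which round
to the 75%, 92% and 98% figures used here.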

I hope that gives some context to a) my misunderstanding of your post; and
b) my continued advocacy that kernel threading has to underpin Perl 6's
threading model.

Java has had user-space threading (green threads) for years; and it was an
ongoing nightmare until they adopted kernel threads in Java 1.5.
Erlang has had user-space threading (coroutines) for years; but they've
had to add kernel threading in the last couple of versions in order to
scale.
Io has had coroutines; but has now added kernel threading in order to
scale.

Buk
