On Mon, 17 May 2010 17:20:28 +0100, Dave Whipp <d...@dave.whipp.name> wrote:
>> There are very few algorithms that actually benefit from using even low
>> hundreds of threads, let alone thousands. The ability of Erlang (and Go
>> and Io and many others) to spawn 100,000 threads makes an impressive
>> demo for the uninitiated, but finding practical uses of such abilities
>> is very hard.
> It may be true that there are only a small number of basic algorithms
> that benefit from massive parallelization.
There is a distinct difference between "massive parallelisation" and
"thousands of threads".
> The important thing is not the number of algorithms: it's the number of
> programs and workloads.
From that statement, you do not appear to understand the subject matter of
this thread: the Perl 6 concurrency model.
> Even if there was only one parallel algorithm, if that algorithm was
> needed for the majority of parallel workloads then it would be
> important.
>
> In fact, though utilizing thousands of threads may be hard, once you get
> to millions of threads then things become interesting again. Physical
> simulations, image processing, search, finance, etc., are all fields
> that exhibit workloads amenable to large scale parallelization.
Again, "large scale parallelisation" does not equate to "millions of
For CPU-bound processes, there is no benefit in trying to utilise more
than one thread per core (or per hardware thread, if your cores have
hyper-threading). Context switches are expensive, and running hundreds
(let alone thousands or millions) of threads on 2/4/8/12-core commodity
hardware means that you'll spend more time context switching than doing
actual work, with the net result of less, rather than more, throughput.
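
To make that concrete, here's a minimal Perl 5 sketch of the
one-thread-per-core approach (the core count is assumed rather than
detected, and the squaring is just a stand-in for real CPU-bound work):

    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $cores = 4;    # assumed; real code would detect the core count
    my $queue = Thread::Queue->new(1 .. 1_000);
    $queue->end();    # requires Thread::Queue 3.02+; dequeue returns
                      # undef once the queue is drained

    # One worker per core, not one thread per work item.
    my @workers = map {
        threads->create(sub {
            my $sum = 0;
            while (defined(my $item = $queue->dequeue())) {
                $sum += $item * $item;    # stand-in for real work
            }
            return $sum;
        });
    } 1 .. $cores;

    my $total = 0;
    $total += $_->join() for @workers;
    print "total: $total\n";

Four threads saturate four cores; a thousand threads on the same box
would just thrash the scheduler.
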
Sure, there is exotic hardware out there with thousands of cores, but it
hardly seems likely that, having spent $millions on such hardware to run
your massively parallel algorithms and solve problems in realistic time
frames, you're going to use a dynamic (non-compiled) language to write
your solutions in.
And for IO-bound processes, asynchronous IO scales far better than
throwing threads at the problem.
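
Again, a minimal sketch, this time using core IO::Select (a Unix-ish
perl is assumed for the list-form pipe open, and the child processes are
just stand-ins for slow network peers): one thread multiplexes many
handles, where the thread-per-connection approach would need one thread
each.

    use strict;
    use warnings;
    use IO::Select;

    # Spawn a few children that write after different delays.
    my @handles;
    for my $delay (1 .. 3) {
        open(my $fh, '-|', $^X, '-e',
            "sleep $delay; print qq{child $delay done\\n}")
            or die "pipe open failed: $!";
        push @handles, $fh;
    }

    # One thread services all of them as they become ready.
    my $sel = IO::Select->new(@handles);
    while ($sel->count) {
        for my $fh ($sel->can_read) {    # blocks until something is ready
            if (defined(my $line = <$fh>)) {
                print "got: $line";
            }
            else {
                $sel->remove($fh);       # EOF: stop watching this handle
                close $fh;
            }
        }
    }
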
> Pure SIMD (vectorization) is insufficient for many of these workloads:
> programmers really do need to think in terms of threads (most likely
> mapped to OpenCL or CUDA under the hood).
If you care to review the thread, you'll find that I'm the
pro-(kernel)threading candidate in this debate.
Sure, OpenCL goes far beyond just throwing SIMD workloads at the local
GPU, extending out to heterogeneous clusters and other massively parallel
hardware setups, but I think it is beyond the scope of Perl 6 to consider
catering to such systems as a core requirement. Catering to such exotic
systems is far better left to external modules written and tested by
those with the need for, and experience of, such systems, and with the
hardware to run them on. There is simply no purpose in burdening the 99%
of Perl 6 installations that will never have a need for such things with
the infrastructure to support them in the core.
> To use millions of threads you don't focus on what the algorithm is
> doing: you focus on where the data is going. If you move data
> unnecessarily (or fail to move it when it was necessary) then you'll
> burn power and lose performance.
Sorry, but I've got to call you on this.
Parallelisation (threading) is all about improving performance. And the
first three rules of performance are: algorithm; algorithm; algorithm.
Choose the wrong algorithm and you are wasting cycles. Parallelise that
wrong algorithm, and you're just multiplying the number of cycles you're
wasting.
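
A minimal sketch of what I mean (the numbers are invented, and Fibonacci
is just the classic example of a bad algorithm): four threads running
the exponential recursive version still lose, by orders of magnitude, to
one thread running the linear version, because parallelism divides the
constant, not the complexity.

    use strict;
    use warnings;
    use threads;
    use Time::HiRes qw(time);

    sub fib_slow {    # O(2**n): the wrong algorithm
        my $n = shift;
        return $n < 2 ? $n : fib_slow($n - 1) + fib_slow($n - 2);
    }

    sub fib_fast {    # O(n): the right algorithm
        my $n = shift;
        my ($x, $y) = (0, 1);
        ($x, $y) = ($y, $x + $y) for 1 .. $n;
        return $x;
    }

    my $t = time;
    my @threads = map { threads->create(\&fib_slow, 28) } 1 .. 4;
    $_->join for @threads;
    printf "4 threads, wrong algorithm: %.2fs\n", time - $t;

    $t = time;
    fib_fast(28) for 1 .. 4;
    printf "1 thread, right algorithm:  %.5fs\n", time - $t;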