On Mon, 17 May 2010 17:20:28 +0100, Dave Whipp - d...@dave.whipp.name <+nntp+browseruk+a2ac8a2dcb.dpuu#dave.whipp.n...@spamgourmet.com> wrote:

nigelsande...@btconnect.com wrote:

There are very few algorithms that actually benefit from using even low hundreds of threads, let alone thousands. The ability of Erlang (and Go, Io and many others) to spawn 100,000 threads makes an impressive demo for the uninitiated, but finding practical uses of such abilities is very hard.

It may be true that there are only a small number of basic algorithms that benefit from massive parallelization.

There is a distinct difference between "massive parallelisation" and "thousands of threads".

The important thing is not the number of algorithms: it's the number of programs and workloads.

From that statement, you do not appear to understand the subject matter of this thread: Perl 6 concurrency model.

Even if there was only one parallel algorithm, if that algorithm was needed for the majority of parallel workloads then it would be significant.

In fact, though utilizing thousands of threads may be hard, once you get to millions of threads then things become interesting again. Physical simulations, image processing, search, finance, etc., are all fields that exhibit workloads amenable to large scale parallelization.

Again, "large scale parallelisation" does not equate to "millions of threads".

For CPU-bound processes, there is no benefit in trying to utilise more than one thread per core--or per hardware thread if your cores have hyper-threading. Context switches are expensive, and running hundreds (let alone thousands or millions) of threads on 2/4/8/12-core commodity hardware means that you'll spend more time context switching than doing actual work, with the net result of less rather than more throughput.
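To make the "one thread per core" point concrete, here is a minimal sketch in Python (not Perl 6, and not anything proposed in this thread). The helper names `split_evenly` and `parallel_sum_of_squares` are made up for illustration. It sizes the worker pool from `os.cpu_count()` and hands each worker one contiguous chunk, rather than spawning a thread per item. Note that CPython's global interpreter lock means these particular threads will not actually speed up pure-Python arithmetic; the sketch only shows the sizing and partitioning idea.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, n_chunks):
    """Split items into n_chunks contiguous slices whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(items, workers=None):
    """Sum x*x over items using one worker per core (hypothetical example workload)."""
    # Assumption: one worker per logical CPU reported by the OS.
    workers = workers or os.cpu_count() or 1
    chunks = split_evenly(items, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task per worker: no more runnable threads than cores.
        partials = pool.map(lambda chunk: sum(x * x for x in chunk), chunks)
        return sum(partials)
```

The design choice is that the number of threads follows the hardware, while the number of work items only affects the chunk size.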

Sure, there is exotic hardware with thousands of cores, but it hardly seems likely that, having spent $millions on such hardware to run your massively parallel algorithms and solve problems in realistic time frames, you're going to use a dynamic (non-compiled) language to write your solutions in.

And for IO-bound processes, asynchronous IO scales far better than throwing threads at the problem.
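As a rough illustration of that claim, the following Python sketch keeps a thousand simulated IO waits in flight on a single thread using `asyncio`. The names `fake_request`, `fetch_all` and `run_demo` are hypothetical, and `asyncio.sleep` merely stands in for a real network or disk wait; nothing here measures real IO.

```python
import asyncio


async def fake_request(i, delay=0.01):
    """Pretend to wait on IO, then return a value derived from the request id."""
    await asyncio.sleep(delay)  # stand-in for a socket or disk wait
    return i * 2


async def fetch_all(n):
    """Start n simulated requests at once and collect results in request order."""
    return await asyncio.gather(*(fake_request(i) for i in range(n)))


def run_demo(n=1000):
    """Run the event loop on the calling thread; no extra threads are created by this code."""
    return asyncio.run(fetch_all(n))
```

All the waits overlap, so the total wall-clock time is close to one `delay` rather than `n` times it, without a thread per request.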

Pure SIMD (vectorization) is insufficient for many of these workloads: programmers really do need to think in terms of threads (most likely mapped to OpenCL or Cuda under the hood).

If you care to review the thread, you'll find that I'm the pro-(kernel)threading candidate in this debate.

Sure, OpenCL goes far beyond just throwing SIMD workloads at the local GPU, extending out to heterogeneous clusters and other massively parallel hardware setups, but I think that it is beyond the scope of Perl 6 to consider catering to such systems as a core requirement. Catering to such exotic systems is far better left to external modules written and tested by those with the need for and experience of such systems--and with the hardware to run them on. There is simply no purpose in burdening the 99% of Perl 6 installations that will never have a need for such things with the infrastructure to support them in the core.

To use millions of threads you don't focus on what the algorithm is doing: you focus on where the data is going. If you move data unnecessarily (or fail to move it when it was necessary) then you'll burn power and lose performance.

Sorry, but I've got to call you on this.

Parallelisation (threading) is all about improving performance. And the first three rules of performance are: algorithm; algorithm; algorithm. Choose the wrong algorithm and you are wasting cycles. Parallelise that wrong algorithm, and you're just multiplying the number of cycles you're wasting.
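A small sketch of why the algorithm comes first, using a made-up example (duplicate detection) that is not from this thread. The function names are hypothetical. Both functions count the work they do, so the comparison does not depend on timing.

```python
def has_duplicate_naive(items):
    """All-pairs check: O(n^2) comparisons in the worst case."""
    comparisons = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            comparisons += 1
            if items[i] == items[j]:
                return True, comparisons
    return False, comparisons


def has_duplicate_hashed(items):
    """Set-based check: one membership lookup per item, O(n) on average."""
    seen = set()
    lookups = 0
    for x in items:
        lookups += 1
        if x in seen:
            return True, lookups
        seen.add(x)
    return False, lookups
```

On 1,000 distinct items the naive version makes 499,500 comparisons and the hashed version makes 1,000 lookups. Even an ideal 8-way split of the naive version still leaves about 62,000 comparisons per worker, so the single-threaded better algorithm wins comfortably.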
