Eric Holk wrote:
> I've been doing some background thinking on what a good data
> parallel story for Rust could be. Most data parallel languages so
> far fall into the camp of requiring a lot of compiler heroics to
> make them work well. Once we get to this point in Rust, it'll
> probably be a good time to think about this some more. I like
> keeping the core language as simple as possible, but maybe we can do
> more aggressive things with syntax extensions.

Keeping the language or the compiler simple?  Most compilers seem to
get more and more complex over time as they aim for greater
performance for the generated code.

A simple and working compiler is stage one.  However, I think there is
an advantage to finding a clean language design that doesn't create
obstacles to later optimisation.  The current minimalistic combination
of tasks, ports and channels seems like a good foundation for later
optimisation, so long as it stays clean.

Are you thinking of syntax extensions to tell the compiler: "optimise
this in a certain way"?  I can see the idea: instead of letting the
compiler recognise certain task-usage patterns, explicitly specify
that that pattern is being used, and let the compiler complain if the
code breaks that pattern.  That does make sense as well.

(I guess an iterator could be one of those patterns, which might
optimise down to something like the existing implementation.)


> As it currently stands, the scheduler is free to run tasks on
> whatever core it wants, although last I checked it doesn't actually
> move tasks except when they are first created. Spawning tasks still
> creates parallelism. The case where the "yield to receiver" pattern
> would work well, I think, is in a program like MapReduce, where a
> bunch of tasks are doing independent work and sending intermediate
> results to a single accumulator task. One way to think of it is that
> rather than everyone sending their results to the receiver, the
> receiver would travel around gathering results from the sender.

I think that there are two sources of information that could be useful
to the runtime task scheduler:

- Static information about a task's pattern of use and how it relates
  to other tasks via channels and ports (whether derived from syntax
  extensions or from static analysis)

- Dynamic observation of the behaviour of the tasks at runtime

The first needs some help from the compiler, and potentially could
result in compiler-based optimisations as well.


> Two examples I was thinking of are stream processing languages like
> StreamIt, and task parallel languages like Cilk. In Cilk, spawn is
> basically a function call, but it marks the stack with a stealable
> continuation. Some other thread can then steal the continuation and
> pick up executing after the spawn statement. This works well because
> Cilk doesn't allow tasks to outlive the function they were spawned
> in, and continuation stealing allows large chunks of work to be
> stolen early on. Languages like StreamIt, on the other hand, define
> programs in terms of streams of data and filters on the data. The
> filters are typically small, so the compiler will try to take
> several adjacent filters and combine them into a bigger filter to
> reduce scheduling overhead. Data parallel languages such as
> Copperhead do similar things.

These ideas are exciting and interesting but which optimisation to
apply would likely depend on the pattern of usage of each individual
task.

Really Rust is giving the programmer an open model with massive
parallelism, but real machines provide only a relatively small amount
of parallelism, so more often than not tasks will be running in a
serial or interleaved fashion.  As soon as the cores are saturated,
certain patterns would be better run optimised inline, and this will
be the more common execution case.  (And I guess this is the case that
Cilk is optimising for.)

Well, there is enormous potential for optimisation when the time
comes.  Just look how many GC schemes Java has gone through behind the
scenes based on a simple, minimalistic, few-promises GC contract with
the programmer.

Jim

-- 
 Jim Peters                  (_)/=\~/_(_)                 [email protected]
                          (_)  /=\  ~/_  (_)
 UazĂș                  (_)    /=\    ~/_    (_)                http://
 in Peru            (_) ____ /=\ ____ ~/_ ____ (_)            uazu.net
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev

Reply via email to