On 20/12/2010 9:18 AM, Sebastian Sylvan wrote:

Not sure I understand this. What happens if I grab the value from a
view[T], and then store a reference to some internal sub-part of that T
(e.g. let's say the data is a tree, and I want to keep a reference to
some sub-tree)? I can't increment its ref-count, but does that mean I
can't keep a hold of a reference to it at all? I'd have to assume so
since the view[T] only tracks the root and can only give me a "none" for
that, which means I must be prohibited from taking a reference to any
sub-structure?

There are a few options. One is that we develop a linear (non-copyable) layer, whose semantics Patrick has been working on (it's useful for modeling resources and dtors), and your view data becomes one of those: no heap-copies allowed, only aliases. Another option is that each pin 'view' acquired by the viewer actually survives until its *domain* dies (for one-shot worker threads, this is plausible). Another is that you use a 'pinned' magic rc that differs from the 'const' magic rc and causes a local copy (again: detachment) when you do a heap-store of a pointer to a pinned box (aliases are free, though).
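
A rough sketch of what the first option might feel like, written in
present-day Rust-ish notation purely as illustration: the names `Tree`
and `View` are made up, and the lifetime parameter stands in for the
non-copyable layer, not for any syntax we actually have.

```rust
// Illustrative sketch only: a non-copyable view over pinned data that
// hands out aliases freely but prevents any of them from outliving the
// view itself, so nothing can be stashed on the heap past its lifetime.
struct Tree {
    value: i32,
    left: Option<Box<Tree>>,
    right: Option<Box<Tree>>,
}

struct View<'pin> {
    root: &'pin Tree,
}

impl<'pin> View<'pin> {
    // Aliasing a sub-tree is free; the returned reference is tied to 'pin,
    // so it cannot escape into anything that outlives the pinned data.
    fn left_subtree(&self) -> Option<&'pin Tree> {
        self.root.left.as_deref()
    }
}

fn main() {
    let tree = Tree { value: 1, left: None, right: None };
    let view = View { root: &tree };
    if let Some(sub) = view.left_subtree() {
        // Fine: a local alias within the view's lifetime.
        println!("{}", sub.value);
    }
    // Storing such an alias in anything longer-lived than `view` would be
    // rejected, which is the "aliases are free, heap-stores are not"
    // behaviour sketched above.
}
```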

As with optimizing rc traffic, there's a pretty wide variety of tricks to try here, none of which is particularly painful to try out and measure.

You might want to look at Nested Data Parallel Haskell here. It has the
appealing property that it can flatten/fuse nested data parallel
algorithms for you (e.g. want to walk a binary tree? Just kick off two
jobs for the two children in each node in an almost task-parallel
fashion, and the compiler will flatten it and create one "pass" for each
"level" in the tree automatically).

Very different evaluation model, deep changes to semantics, etc. I don't disagree with the research programme -- and it might keep getting better, attracting programmers, etc. -- but it's not the path I want to follow.

IMO the OpenCL, DirectCompute, CUDA crew are all getting this wrong. The
biggest hurdle with GPGPU isn't that you have to work through a graphics
interface (although that sucks too), which is what they keep spending
all their effort on fixing. The biggest hurdle for me has always been
that you have to manually turn your algorithms inside-out to create a
series of "flat" data parallel passes, rather than have the compiler do
that work for you. It took me something like two days to get a
correct version of a simple odd-even merge sort a few years ago (before
there were code samples around for it!). This is really, really
difficult.
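
To make "a series of flat passes" concrete, here is the much simpler
odd-even transposition sort (a toy relative of the Batcher odd-even merge
sort mentioned above), written in Rust as n passes. Every compare-and-swap
within a pass touches a disjoint pair, so a GPU backend would run each
pass as one kernel launch; the sequential inner loop below just stands in
for that data-parallel step.

```rust
// Illustration only: the point is the shape (independent work within each
// pass, a barrier between passes), which is what "turning an algorithm
// inside-out" for flat data parallelism produces.
fn odd_even_transposition_sort(data: &mut [i32]) {
    let n = data.len();
    for pass in 0..n {
        // Even passes compare pairs (0,1), (2,3), ...; odd passes (1,2), (3,4), ...
        let start = pass % 2;
        for i in (start..n.saturating_sub(1)).step_by(2) {
            if data[i] > data[i + 1] {
                data.swap(i, i + 1);
            }
        }
    }
}

fn main() {
    let mut v = vec![5, 3, 8, 1, 9, 2];
    odd_even_transposition_sort(&mut v);
    assert_eq!(v, vec![1, 2, 3, 5, 8, 9]);
}
```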

I don't know how to avoid some programming problems remaining hard for the programmer, and I'm not (yet) aware of completely satisfactory ways to automate that sort of reasoning. We've been waiting for the "magical parallelizing compiler" for 40 years or so. They keep getting better, but there always seems to be a pretty hard limit to the tech.

Maybe the compilers aren't even the issue, and some kind of Reduceron-style Haskell-on-hardware monster will get us there! I'd love to wind up wrong on some tech bets and land in a future with "easy parallelism", because that alternative future would be very pretty. But I'm still betting on the more likely (to me) path of a heterogeneous collection of techniques that ease, but do not "make easy", the parallelism problem.

I realise this is highly "researchy"/"unproven" at the moment, but it
might be worth keeping it in mind so you don't make any design decisions
that would rule it out in the future.

Nah, not ruling it out. I just don't like depending on too many hypotheticals. If they arise and work well (and are simple, well understood, and patent-free), a good research result is hard to turn down :)

-Graydon