Re: General DSL idiom question

Dmitriy Lyubimov Sun, 13 Jul 2014 22:06:07 -0700

On Sun, Jul 13, 2014 at 4:22 PM, Ted Dunning <[email protected]> wrote:


> I have a program that I am trying to build that has this pattern:
>
>     broadcast state to all blocks
>     block map to do a bit of computation, create local state
>     merge all of the local states back to the global state
>     repeat
>
> What is the suggestion for merging the local state back to the global
> state?
>

that would be one of half a dozen of distinct shuffle types in spark they
call "reduce".
one of the realization that I had was that the only invariant part of all
shuffle types a distributed engine currently can offer is map. which was
realized in a custom operatior mapBlock(). that was (almost) the only
customization that this work was willing to offer. being an R-like
algebraic thing first, we don't offer anything beyond that. you still can
apply any of spark shuffle types if you went down to aquire a checkpoint's
`rdd` and then work with full awareness of Spark environment after that. it
would be very easy to accomplish what you say there.

Doing distributed abstractions is probably leaning more torwards data frame
work than algebra. MLI kind of has done just that -- it's just i am still
not sure i see much more there than just an attempt to translate Spark to
yet-another-flavor of `Spark II`-- i am likely wrong about it, there has to
be so much more to MLI.

Re: General DSL idiom question

Reply via email to