The only problem I see with that is that it would not be algebra any more; that would be functional programming, and as such there are probably better frameworks to address this kind of thing than a DRM. The DRM currently suggests just exiting to engine-level primitives, i.e. doing something like A.rdd.reduce(_ + _).
On Sun, Jul 13, 2014 at 10:02 PM, Anand Avati <[email protected]> wrote:
> How about a new drm API:
>
> type ReduceFunc = (Vector, Vector) => Vector
>
> def reduce(rf: ReduceFunc): Vector = { ... }
>
> The row keys in this case are ignored/erased, but I'm not sure if they are
> useful (or even meaningful) for reduction. Such an API should be sufficient
> for kmeans (in combination with mapBlock). But does this feel generic
> enough? Maybe a good start? Feedback welcome.
>
>
> On Sun, Jul 13, 2014 at 6:34 PM, Ted Dunning <[email protected]> wrote:
>
> > Yeah. Collect was where I had gotten, and was rather sulky about the
> > results.
> >
> > It does seem like a reduce is going to be necessary.
> >
> > Anybody else have thoughts on this?
> >
> > Sent from my iPhone
> >
> > > On Jul 13, 2014, at 17:58, Anand Avati <[email protected]> wrote:
> > >
> > > collect(), hoping the result fits in memory, and do the reduction
> > > in-core. I think some kind of a reduce operator needs to be introduced
> > > for doing even simple things like scalable kmeans. Haven't thought of
> > > how it would look yet.
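
[Editorial note: for concreteness, here is a minimal Scala sketch of the reduce operator proposed above, expressed at the engine level the first reply refers to. The `rowRdd` parameter (an RDD of (key, row-vector) pairs) is an assumption about how a Spark-backed DRM exposes its rows; it is not the actual Mahout binding, and the operator itself was only a proposal at this point.]

```scala
import org.apache.spark.rdd.RDD
import org.apache.mahout.math.Vector

object DrmReduceSketch {

  // Operator signature from the proposal in the thread.
  type ReduceFunc = (Vector, Vector) => Vector

  // One possible engine-level backing for the proposed drm reduce:
  // erase the row keys, then fold the row vectors pairwise with rf.
  // `rowRdd` is a hypothetical handle on the DRM's (key, row) pairs.
  def reduce[K](rowRdd: RDD[(K, Vector)], rf: ReduceFunc): Vector =
    rowRdd.map(_._2).reduce(rf)
}
```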

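[Editorial note: as a rough illustration of why such a reduce would be enough for the scalable k-means case mentioned in the thread, one centroid-update pass only needs a per-row statistics vector folded with element-wise addition. The sketch below stays at the plain RDD level; the cluster-assignment step, the flat (k * (d + 1)) stats layout, and the assumption that Mahout vector types serialize across the cluster (e.g. via the Kryo registrator in the Spark bindings) are illustrative choices, not the Mahout implementation.]

```scala
import org.apache.spark.rdd.RDD
import org.apache.mahout.math.{DenseVector, Vector}

object KMeansReduceSketch {

  // For each row x, emit a flat stats vector holding, per cluster c,
  // the sum of rows assigned to c followed by a count:
  // [sum_0 (d values), count_0, sum_1 (d values), count_1, ...].
  // The layout is an illustrative assumption.
  def rowStats(x: Vector, centroids: Array[Vector]): Vector = {
    val d = x.size
    val c = centroids.indices.minBy(i => centroids(i).getDistanceSquared(x))
    val stats = new DenseVector(centroids.length * (d + 1))
    for (j <- 0 until d) stats.setQuick(c * (d + 1) + j, x.getQuick(j))
    stats.setQuick(c * (d + 1) + d, 1.0)
    stats
  }

  // One centroid-update pass: the only thing needed from the DRM side
  // is the element-wise-sum reduction proposed in the thread,
  // i.e. reduce(_ + _) over per-row stats vectors.
  def centroidStats(rows: RDD[(Int, Vector)], centroids: Array[Vector]): Vector =
    rows.map { case (_, x) => rowStats(x, centroids) }
        .reduce((a, b) => a.plus(b))
}
```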