Agree. Could be very useful.

MapReduce functionality has been a customer ask for many years.
Have you considered extending it to K-V pairs so that the reduce shuffle is
on the key? That is the essence of MapReduce, right? It is the equivalent of
user-defined aggregation on groups of data.
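
For illustration, a keyed aggregation of this kind can be sketched with plain
Java 8 collectors. This is not Geode API: the class name, the pair() helper,
and the sample data are all made up for the example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class KeyedReduceSketch {
    // Hypothetical helper: build one (key, value) map result.
    static Map.Entry<String, Integer> pair(String key, int value) {
        return new SimpleEntry<>(key, value);
    }

    // Group the map results by key, then reduce (here: sum) the values per key.
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> pairs) {
        return pairs.stream()
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        Collectors.summingInt(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        // Sample data, invented for the sketch.
        List<Map.Entry<String, Integer>> pairs =
                Arrays.asList(pair("a", 1), pair("b", 2), pair("a", 3));
        System.out.println(sumByKey(pairs));
    }
}
```

Any other per-key reducer could stand in for summingInt; the grouping on the
key is the part that matters here.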

The distributed, parallel function for the 'map' would then hash on the key
to route all map results with the same key to the same node in the network.
Essentially a parallel reduce. Maybe this is already done?
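
The routing step can be sketched inside a single JVM, with each "node" modeled
as an in-memory map. Everything here is hypothetical (nodeFor, pair,
shuffleAndReduce, the node count); a real implementation would presumably rely
on the data store's own partitioning rather than this modulo scheme.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HashShuffleSketch {
    // Hypothetical helper: build one (key, value) map result.
    static Map.Entry<String, Integer> pair(String key, int value) {
        return new SimpleEntry<>(key, value);
    }

    // Pick the node that owns a key; floorMod keeps the index non-negative
    // even when hashCode() is negative.
    static int nodeFor(Object key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    // Route each map result to the node owning its key and reduce (sum) there.
    // Each inner map stands in for the partial results held by one node.
    static List<Map<String, Integer>> shuffleAndReduce(
            List<Map.Entry<String, Integer>> mapResults, int nodeCount) {
        List<Map<String, Integer>> nodes = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
        for (Map.Entry<String, Integer> e : mapResults) {
            Map<String, Integer> owner = nodes.get(nodeFor(e.getKey(), nodeCount));
            owner.merge(e.getKey(), e.getValue(), Integer::sum);
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Sample data, invented for the sketch.
        List<Map.Entry<String, Integer>> mapResults =
                Arrays.asList(pair("a", 1), pair("b", 2), pair("a", 3));
        System.out.println(shuffleAndReduce(mapResults, 3));
    }
}
```

Because every value for a given key lands on exactly one node, each node can
finish its reduce independently and no final merge across nodes is needed for
that key.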



On Thu, Aug 13, 2015 at 6:07 PM, Sudhir Menon <[email protected]> wrote:

> This is pretty neat. Turn this into a feature request and I am sure
> people will find this valuable.
>
> Suds
> Sent from my iPhone
>
> > On Aug 13, 2015, at 5:24 PM, Dan Smith <[email protected]> wrote:
> >
> > Along the same lines as Anthony's email on distributed classloading,
> > QiHong and I also hacked up a prototype of shipping Java 8 stream
> > operations to gemfire data stores.
> >
> > What we did:
> >
> > Java 8 allows people to do functional operations on a region using the
> > new stream API:
> >
> > region.entrySet().stream()
> >        .filter(e -> e.getKey() % 2 == 0)
> >        .map(e -> e.getValue())
> >        .reduce(0, Integer::sum);
> >
> > It would be much more efficient if the filtering, mapping, reducing, etc.
> > happen on the members that actually host the data, rather than shipping
> > all the data to a single member.
> >
> > So we implemented a new method, remoteStream, which collects the
> > operations as they are applied to a stream. When a terminal operation like
> > reduce is encountered, the whole pipeline is sent out to all members with a
> > gemfire function, and the results of processing that pipeline on the data
> > stores are brought back.
> >
> > What we did is basically a quick and dirty proof of concept. There's a lot
> > to polish up if we want to make this a real geode feature. Is this
> > something people would be interested in using?
> >
> > The proof of concept code is sitting in github if anyone is interested.
> > Check out the stream-prototype and look at the tests in
> > StreamsP2PDUnitTest:
> >
> > https://github.com/upthewaterspout/incubator-geode
>
