Just following up on this - I created GEODE-262 to track this feature request.
Thanks!
-Dan

On Tue, Aug 18, 2015 at 11:43 AM, Anthony Baker <[email protected]> wrote:

> Another place to go with this is to apply an OQL query to generate the
> stream.
>
> region.entrySet().remoteStream("select * from /myregion.entries e where e.key > 10")
>     .filter(e -> e.getKey() % 2 == 0)
>     .map(e -> e.getValue())
>     .reduce(1, Integer::sum);
>
> Anthony
>
>
> > On Aug 16, 2015, at 10:56 PM, Jags Ramnarayanan <[email protected]> wrote:
> >
> > Right. Use Spark's API as input.
> >
> > Dan, if you are extending 'streams' with 'remoteStreams' anyway, you
> > should be able to extend the API for K-V. I haven't gone through Java 8
> > streams, but one small step for you could be one giant leap into "Big
> > data" for Gem :-)
> >
> > All your tool has to be able to do is implement the "hello world" of big
> > data - count words in sentences. :-)
> > Your output needs to be a k-v collection where the key is the word and v
> > is the count. The fastest, most scalable implementation wins. And you
> > know what I am getting at - we are very used to parallel behavior
> > localized to the data but assume a central aggregator. Here you want the
> > aggregator to be parallelized too. Most common solutions use disk for
> > the shuffle. Gem's function service can pipeline with its chunking
> > support.
> >
> > After you implement map-reduce, read this perspective from Stonebraker -
> > https://homes.cs.washington.edu/~billhowe/mapreduce_a_major_step_backwards.html
> > Just kidding.
> >
> >
> > On Sun, Aug 16, 2015 at 4:12 PM, Roman Shaposhnik <[email protected]>
> > wrote:
> >
> >> On Fri, Aug 14, 2015 at 1:51 PM, Dan Smith <[email protected]> wrote:
> >>> The Java 8 reduce() method returns a scalar, so my .map().reduce()
> >>> example didn't really have a shuffle phase. We haven't implemented any
> >>> sort of shuffle, but our reduce is processed on the servers first and
> >>> then aggregated on the client. I'm not quite sure what the best way to
> >>> work a shuffle into this stream API would be, actually. I suppose
> >>> using a map followed by a sort(). We didn't do anything clever with
> >>> sort either :)
> >>
> >> Isn't what you're looking for analogous to reduce() versus reduceByKey()
> >> in Spark terminology?
> >>
> >> Thanks,
> >> Roman.
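
For anyone who wants to see what Anthony's proposed pipeline would compute, here is a
purely local sketch against a plain java.util.HashMap using only the stock
java.util.stream API. The remoteStream() method and the OQL seeding are only proposals
in the thread above and don't exist yet, so the OQL "where e.key > 10" predicate is
modeled here as an ordinary filter():

import java.util.HashMap;
import java.util.Map;

public class LocalPipelineSketch {
    public static void main(String[] args) {
        // Stand-in for a Geode region: a plain map of Integer keys to Integer values.
        Map<Integer, Integer> region = new HashMap<>();
        for (int i = 0; i < 100; i++) {
            region.put(i, i * 10);
        }

        // Local equivalent of the proposed pipeline: the OQL predicate becomes a
        // filter(), then keep even keys, pull out the values, and sum them
        // (using 0 as the identity for addition).
        int sum = region.entrySet().stream()
                .filter(e -> e.getKey() > 10)
                .filter(e -> e.getKey() % 2 == 0)
                .map(Map.Entry::getValue)
                .reduce(0, Integer::sum);

        System.out.println("sum = " + sum);
    }
}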
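
And to make Roman's reduce() vs. reduceByKey() point concrete against Jags's word-count
challenge: with stock Java 8 streams, reduce() can only produce a single scalar, while
the per-key ("reduceByKey"-style) result comes from collect() with a grouping collector.
A minimal local sketch, no Geode involved:

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordCountSketch {
    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
                "the quick brown fox",
                "the lazy dog",
                "the quick dog");

        // reduce() collapses the stream to one scalar: the total number of words.
        long total = sentences.stream()
                .flatMap(s -> Arrays.stream(s.split("\\s+")))
                .map(w -> 1L)
                .reduce(0L, Long::sum);

        // The reduceByKey-style result is a Map<word, count>, produced by collect()
        // with a grouping collector rather than by reduce().
        Map<String, Long> counts = sentences.stream()
                .flatMap(s -> Arrays.stream(s.split("\\s+")))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        System.out.println(total);   // 10
        System.out.println(counts);  // e.g. {brown=1, lazy=1, the=3, quick=2, fox=1, dog=2}
    }
}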
