Sorry. Never mind... I guess that's what "Summingbird" is all about. Never heard of it.
> On Jun 27, 2014, at 7:10 PM, Marco Shaw <marco.s...@gmail.com> wrote:
>
> Dean: Some interesting information... Do you know where I can read more about
> these coming changes to Scalding/Cascading?
>
>> On Jun 27, 2014, at 9:40 AM, Dean Wampler <deanwamp...@gmail.com> wrote:
>>
>> ... and to be clear on the point, Summingbird is not limited to MapReduce.
>> It abstracts over Scalding (which abstracts over Cascading, which is being
>> moved from MR to Spark) and over Storm for event processing.
>>
>>> On Fri, Jun 27, 2014 at 7:16 AM, Sean Owen <so...@cloudera.com> wrote:
>>> On Thu, Jun 26, 2014 at 9:15 AM, Aureliano Buendia <buendia...@gmail.com>
>>> wrote:
>>> > Summingbird is for map/reduce. Dataflow is the third generation of
>>> > google's map/reduce, and it generalizes map/reduce the way Spark does.
>>> > See more about this here: http://youtu.be/wtLJPvx7-ys?t=2h37m8s
>>>
>>> Yes, my point was that Summingbird is similar in that it is a
>>> higher-level service for batch/streaming computation, not that it is
>>> similar for being MapReduce-based.
>>>
>>> > It seems Dataflow is based on this paper:
>>> > http://pages.cs.wisc.edu/~akella/CS838/F12/838-CloudPapers/FlumeJava.pdf
>>>
>>> FlumeJava maps to Crunch in the Hadoop ecosystem. I think Dataflows is
>>> more than that but yeah that seems to be some of the 'language'. It is
>>> similar in that it is a distributed collection abstraction.
>>
>> --
>> Dean Wampler, Ph.D.
>> Typesafe
>> @deanwampler
>> http://typesafe.com
>> http://polyglotprogramming.com