Dataflow is a hosted service that tries to abstract an entire pipeline; Spark is software, and it maps to only some of the components in that pipeline. My first reaction was that Dataflow maps more closely to Summingbird, since part of it is a higher-level system for doing one specific thing in both batch and streaming: aggregations.
On Wed, Jun 25, 2014 at 8:23 PM, Aureliano Buendia <buendia...@gmail.com> wrote:
> Hi,
>
> Today Google announced their cloud dataflow, which is very similar to spark
> in performing batch processing and stream processing.
>
> How does spark compare to Google cloud dataflow? Are they solutions trying
> to aim the same problem?