That's a great post. Thanks.

- Thanks, via mobile, excuse brevity.

On Mar 14, 2016 11:49 AM, "Jean-Baptiste Onofré" <[email protected]> wrote:
> Hi Yash,
>
> you can already take a look on Google Dataflow examples, and blog posts
> (http://blog.nanthrax.net/2016/01/introducing-apache-dataflow/)
>
> Regards
> JB
>
> On 03/13/2016 11:46 PM, Yash Sharma wrote:
>
>> Thanks Jean.
>> I am excited to see some examples of Beam 'getting started' once the
>> bootstrap is complete.
>>
>> Best,
>> yash
>>
>> On Sun, Mar 13, 2016 at 4:22 PM, Jean-Baptiste Onofré <[email protected]>
>> wrote:
>>
>>> Hi Yash,
>>>
>>> Beam is a SDK, so it runs on an existing cluster.
>>>
>>> You design jobs as pipeline: it's a "programming model".
>>>
>>> For your late data arrival issues, maybe Falcon can help there.
>>>
>>> Regards
>>> JB
>>>
>>> On 03/13/2016 03:31 AM, Yash Sharma wrote:
>>>
>>>> Hi All,
>>>> I have been recently reading about Apache Beam and am interested in
>>>> exploring how it fits into our stack.
>>>>
>>>> We currently have our hive and spark pipelines. We have the late data
>>>> arrival issues and have to reprocess couple of steps to ensure the data
>>>> is consumed.
>>>>
>>>> Couple of questions on top of my mind are -
>>>>
>>>> 1. Does Beam use the existing cluster or needs its own cluster ?
>>>> 2. How Beam fits with the existing Hive and Spark jobs ? What changes
>>>> might be required in the jobs for starting with Beam ?
>>>>
>>>> Best,
>>>> Yash
>>>
>>> --
>>> Jean-Baptiste Onofré
>>> [email protected]
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
