Thanks Jean. I am excited to see some examples of Beam 'getting started' once the bootstrap is complete.
Best,
Yash

On Sun, Mar 13, 2016 at 4:22 PM, Jean-Baptiste Onofré <[email protected]> wrote:

> Hi Yash,
>
> Beam is an SDK, so it runs on an existing cluster.
>
> You design jobs as pipelines: it's a "programming model".
>
> For your late data arrival issues, maybe Falcon can help there.
>
> Regards
> JB
>
> On 03/13/2016 03:31 AM, Yash Sharma wrote:
>
>> Hi All,
>> I have recently been reading about Apache Beam and am interested in
>> exploring how it fits into our stack.
>>
>> We currently have our Hive and Spark pipelines. We have late data
>> arrival issues and have to reprocess a couple of steps to ensure the
>> data is consumed.
>>
>> A couple of questions on top of my mind are:
>>
>> 1. Does Beam use the existing cluster, or does it need its own cluster?
>> 2. How does Beam fit with the existing Hive and Spark jobs? What changes
>> might be required in the jobs to start with Beam?
>>
>> Best,
>> Yash
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
