Re: Newbie question

Yash Sharma Mon, 14 Mar 2016 01:13:09 -0700

That's a great post. Thanks.

- Thanks, via mobile,  excuse brevity.
On Mar 14, 2016 11:49 AM, "Jean-Baptiste Onofré" <[email protected]> wrote:


> Hi Yash,
>
> you can already take a look at the Google Dataflow examples and blog posts (
> http://blog.nanthrax.net/2016/01/introducing-apache-dataflow/)
>
> Regards
> JB
>
> On 03/13/2016 11:46 PM, Yash Sharma wrote:
>
>> Thanks Jean.
>> I am excited to see some examples of Beam 'getting started' once the
>> bootstrap is complete.
>>
>> Best,
>> yash
>>
>>
>>
>> On Sun, Mar 13, 2016 at 4:22 PM, Jean-Baptiste Onofré <[email protected]>
>> wrote:
>>
>>> Hi Yash,
>>>
>>> Beam is an SDK, so it runs on an existing cluster.
>>>
>>> You design jobs as pipelines: it's a "programming model".
>>>
>>> For your late data arrival issues, maybe Falcon can help there.
>>>
>>> Regards
>>> JB
>>>
>>>
>>> On 03/13/2016 03:31 AM, Yash Sharma wrote:
>>>
>>>> Hi All,
>>>> I have been recently reading about Apache Beam and am interested in
>>>> exploring how it fits into our stack.
>>>>
>>>> We currently have Hive and Spark pipelines. We have late data
>>>> arrival issues and have to reprocess a couple of steps to ensure the
>>>> data is consumed.
>>>>
>>>> A couple of questions at the top of my mind:
>>>>
>>>> 1. Does Beam use the existing cluster, or does it need its own cluster?
>>>> 2. How does Beam fit with the existing Hive and Spark jobs? What changes
>>>> might be required in the jobs to start with Beam?
>>>>
>>>> Best,
>>>> Yash
>>>>
>>>>
>>> --
>>> Jean-Baptiste Onofré
>>> [email protected]
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>>>
>>>
>>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
