For my question (2): from my understanding, checkpointing ensures recovery 
from failures.
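
To illustrate the idea, here is a minimal plain-Python sketch. It is not Spark code, and the names `run`, `load_checkpoint`, and `save_checkpoint` are made up for this example. Progress is persisted after each batch, so a restarted job resumes where the old one stopped. In Spark Streaming the analogous entry point is `StreamingContext.getOrCreate`, and the checkpoint directory has to live on storage that outlives the cluster (for example S3).

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the index of the next unprocessed batch, or 0 if no checkpoint exists."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["next_batch"]


def save_checkpoint(path, next_batch):
    """Persist progress atomically: write a temp file, then rename it over the old one."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_batch": next_batch}, f)
    os.replace(tmp, path)


def run(batches, checkpoint_path, sink, fail_at=None):
    """Process batches in order, starting from the last saved checkpoint.

    `fail_at` is a hypothetical hook that simulates a crash before that batch.
    A crash between the sink write and the checkpoint write would replay one
    batch on restart, so this gives at-least-once processing.
    """
    start = load_checkpoint(checkpoint_path)
    for i in range(start, len(batches)):
        if i == fail_at:
            raise RuntimeError("simulated failure")
        sink.append(sum(batches[i]))          # stand-in for the real per-batch work
        save_checkpoint(checkpoint_path, i + 1)


if __name__ == "__main__":
    checkpoint = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
    data = [[1, 2], [3, 4], [5, 6]]
    results = []
    try:
        run(data, checkpoint, results, fail_at=1)   # first "cluster" dies on batch 1
    except RuntimeError:
        pass
    run(data, checkpoint, results)                  # replacement resumes at batch 1
    print(results)
```

The second call to `run` skips the batch that was already completed, which is the behaviour I would expect from a new cluster pointed at the old checkpoint location.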

> On Jun 22, 2016, at 10:27 AM, pandees waran <pande...@gmail.com> wrote:
> 
> In general, suppose you have multiple steps in a workflow. For every batch:
> 1. Stream data from S3
> 2. Write it to HBase
> 3. Execute a Hive step using the data in S3
> 
> In this case all three steps are part of the workflow. That's why I 
> mentioned workflow orchestration.
> 
> The other question (2) is about how to manage the clusters without any 
> downtime or data loss (especially when you want to bring down the cluster and 
> create a new one for running Spark Streaming).
> 
> 
>> On Jun 22, 2016, at 10:17 AM, Mich Talebzadeh <mich.talebza...@gmail.com> 
>> wrote:
>> 
>> Hi Pandees,
>> 
>> Can you kindly explain what you are trying to achieve by incorporating Spark 
>> Streaming with workflow orchestration? Is this some form of back-to-back 
>> seamless integration?
>> 
>> I have not used it myself but would be interested in knowing more about your 
>> use case.
>> 
>> Cheers,
>> 
>> 
>> 
>> 
>> Dr Mich Talebzadeh
>>  
>> LinkedIn  
>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>  
>> http://talebzadehmich.wordpress.com
>>  
>> 
>>> On 22 June 2016 at 15:54, pandees waran <pande...@gmail.com> wrote:
>>> Hi Mich, please let me know if you have any thoughts on the below. 
>>> 
>>> ---------- Forwarded message ----------
>>> From: pandees waran <pande...@gmail.com>
>>> Date: Wed, Jun 22, 2016 at 7:53 AM
>>> Subject: spark streaming questions
>>> To: user@spark.apache.org
>>> 
>>> 
>>> Hello all,
>>> 
>>> I have a few questions regarding Spark Streaming:
>>> 
>>> * I am wondering whether anyone uses Spark Streaming with workflow orchestrators 
>>> such as Data Pipeline, SWF, or any other framework. Are there any advantages 
>>> or drawbacks to using a workflow orchestrator for Spark Streaming?
>>> 
>>> * How do you manage the cluster (bringing it down / creating a new one) 
>>> without any data loss in streaming?
>>> 
>>> I would like to hear your thoughts on this.
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Thanks,
>>> Pandeeswaran
>> 
