With Flume, what would be your sink?


On Mon, Jan 16, 2017 at 10:44 PM, Guillermo Ortiz <konstt2...@gmail.com>
wrote:

> I'm considering using Flume (file channel) -> Spark Streaming.
>
> I have some doubts about it:
>
> 1. The RDD size is all the data that arrives during the microbatch interval
> you have defined. Right?
>
> 2. If there are 2 GB of data, how many RDDs are generated? Just one, so
> that I have to repartition it?
>
> 3. When is the ACK sent back from Spark to Flume?
>   I guess that if Flume dies, Flume is going to send the same data again
> to Spark.
>   If Spark dies, I have no idea whether Spark is going to reprocess the
> same data when it is sent again.
>   Could it be different if I use the Kafka channel?
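
For questions 1 and 2: each DStream produces exactly one RDD per batch
interval, containing whatever arrived during that interval. With a
receiver-based stream, that RDD's partition count is roughly batch interval /
spark.streaming.blockInterval (200 ms by default) per receiver, so for a 2 GB
batch you would normally repartition before any heavy processing. A minimal
sketch of the pull-based (polling) setup, assuming a Flume agent running the
SparkSink on flume-host:41414 (the host, port, batch interval and partition
count are placeholders):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

object FlumePollingSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("flume-polling-sketch")
    // One RDD per DStream is produced every batch interval (10 s here).
    val ssc = new StreamingContext(conf, Seconds(10))

    // Pull-based receiver: Spark polls Flume's SparkSink, so the Flume
    // transaction is committed from the Spark side once the events are stored.
    val events = FlumeUtils.createPollingStream(ssc, "flume-host", 41414)

    events
      .repartition(32) // spread a large batch over more partitions
      .map(e => new String(e.event.getBody.array()))
      .count()
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}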
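
On question 3: with the polling receiver, the ACK (the commit of the Flume
transaction) happens after the receiver has stored the events in Spark, so if
Flume dies before the commit it replays the same events. If Spark dies, the
guarantee is at-least-once: with the receiver write-ahead log enabled, stored
events are replayed from the log on recovery, so duplicates are possible and
your processing should tolerate them. A sketch of the settings usually
involved (the checkpoint path is a placeholder):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("flume-reliability-sketch")
  // Persist received blocks to a write-ahead log before they count as
  // stored, so a Spark failure can replay them (at-least-once semantics).
  .set("spark.streaming.receiver.writeAheadLog.enable", "true")

val ssc = new StreamingContext(conf, Seconds(10))
// A checkpoint directory is required both for the WAL and for driver recovery.
ssc.checkpoint("hdfs:///tmp/flume-checkpoint")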
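
And yes, a Kafka channel changes the picture: instead of a receiver you can
read the channel's topic with the direct (receiver-less) Kafka stream, where
each Kafka partition becomes one RDD partition and each batch covers an
explicit offset range, so after a failure Spark re-reads a well-defined range
rather than depending on receiver ACKs. A sketch against the Kafka 0.10
integration (the broker, topic and group id are placeholders):

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

val ssc = new StreamingContext(
  new SparkConf().setAppName("kafka-direct-sketch"), Seconds(10))

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker:9092",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "flume-kafka-sketch",
  "auto.offset.reset" -> "earliest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

// One RDD partition per Kafka partition; offsets define what each batch reads.
val stream = KafkaUtils.createDirectStream[String, String](
  ssc, PreferConsistent, Subscribe[String, String](Seq("flume-topic"), kafkaParams))

stream.map(_.value).count().print()
ssc.start()
ssc.awaitTermination()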


-- 
Best Regards,
Ayan Guha
