Hi,
I've got a weird question but maybe someone has already dealt with it.
My Spark Streaming application needs to
- download a file from a S3 bucket,
- run a script with the file as input,
- create a DStream from this script output.
I've already got the second part working with the rdd.pipe() API, which
fits my needs well, but I have no idea how to handle the first part.
How can I download a file and run a script on it inside a
Spark Streaming application?
Should I use Scala's sys.process API, or won't that work?
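In case it helps the discussion: one way to wire the three steps together is to fetch the file on the driver with plain JDK I/O (or let Spark read S3 directly via an s3n:// / s3a:// path if Hadoop S3 credentials are configured), turn it into an RDD, pipe that RDD through the script, and wrap the result in a ConstantInputDStream. This is only a sketch under those assumptions; the bucket URL, local path, and script path below are hypothetical placeholders.

```scala
import java.net.URL
import java.nio.file.{Files, Paths, StandardCopyOption}

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.ConstantInputDStream

object S3PipeSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("s3-pipe-sketch")
    val ssc  = new StreamingContext(conf, Seconds(10))
    val sc   = ssc.sparkContext

    // 1) Download the file on the driver (hypothetical bucket and key).
    val s3Url     = "https://my-bucket.s3.amazonaws.com/input.txt"
    val localPath = Paths.get("/tmp/input.txt")
    val in = new URL(s3Url).openStream()
    try Files.copy(in, localPath, StandardCopyOption.REPLACE_EXISTING)
    finally in.close()

    // Alternatively, let Spark read S3 directly (needs S3 credentials
    // in the Hadoop configuration):
    // val fileRdd = sc.textFile("s3n://my-bucket/input.txt")
    val fileRdd = sc.textFile(localPath.toString)

    // 2) pipe(): each partition's lines are written to the script's
    //    stdin; its stdout lines become the records of the new RDD.
    //    The script must exist at this path on every worker node.
    val piped = fileRdd.pipe("/path/to/script.sh")

    // 3) Wrap the piped RDD in a DStream; every batch re-emits it.
    val stream = new ConstantInputDStream(ssc, piped)
    stream.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If the script is not pre-installed on the workers, sc.addFile(...) can ship it with the job and SparkFiles.get(...) resolves its path on each executor.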
Thanks
Gianluca