Hello, I have a question regarding a use case. I have an ETL pipeline using Spark, and it works great. I use CephFS, mounted on all Spark nodes, to store the data. One problem, however, is that bzip2 compression plus the transfer from the source to the Spark storage takes a very long time. I would like to be able to process the file as it is written, in chunks of about 100 MB. Is something like that possible in plain Spark, or do I need Spark Streaming? And if I use Spark Streaming, would that mean my application has to run as a daemon on the Spark nodes?
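To make it concrete, here is a minimal sketch of what I am imagining, assuming Spark Structured Streaming's file source (which, as I understand it, only picks up complete files once they appear in a directory, not files still being written). The paths and the maxFilesPerTrigger value below are placeholders, and I am not certain this is the right API for my case:

    import org.apache.spark.sql.SparkSession

    object IncrementalEtl {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("IncrementalEtl")
          .getOrCreate()

        // Watch the CephFS landing directory; each new file that
        // appears becomes part of the next micro-batch.
        val lines = spark.readStream
          .option("maxFilesPerTrigger", 10) // bound each micro-batch
          .text("/mnt/cephfs/landing")

        // ... the existing ETL transformations would go here ...

        val query = lines.writeStream
          .format("parquet")
          .option("path", "/mnt/cephfs/output")
          .option("checkpointLocation", "/mnt/cephfs/checkpoints/etl")
          .start()

        // Runs continuously, i.e. effectively as a long-lived job.
        query.awaitTermination()
      }
    }

If that awaitTermination() call is the only way to keep consuming new files, that seems to confirm the application would have to stay running, which is the part I am unsure about.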
Thank you for your help and ideas.

Antoine