Spark Streaming Stateful information

2015-10-22 Thread Arttii
Hi, so I am working on a use case where clients are walking in and out of geofences and sending messages based on that. I currently have some in-memory broadcast variables to do certain lookups for client and geofence info; some of this is also coming from Cassandra. My current quandary is that I need

Spark Streaming many subscriptions vs many jobs

2015-09-29 Thread Arttii
Hi, so I am working on a project where we might end up having a bunch of decoupled logic components that have to run inside Spark Streaming. We are using Kafka as the source of streaming data. My first question is: is it better to chain these logic components together by applying transforms to a single RDD