I have a streaming application wherein I train the model on a receiver input
stream in 4-second batches:
val stream = ssc.receiverStream(receiver) // receiver gets new data every batch
model.trainOn(stream.map(Vectors.parse))
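For context, a minimal end-to-end sketch of this setup, assuming StreamingKMeans from spark.mllib; the socket source stands in for the custom receiver, and k, the decay factor, and the vector dimension are illustrative:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.mllib.clustering.StreamingKMeans
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("streaming-kmeans")
val ssc = new StreamingContext(conf, Seconds(4)) // 4-second batches

// Stand-in for the custom receiver: any DStream[String] of vector text
// (e.g. "[1.0,2.0]") works here.
val stream = ssc.socketTextStream("localhost", 9999)

val model = new StreamingKMeans()
  .setK(3)                                  // illustrative cluster count
  .setDecayFactor(1.0)                      // weight all batches equally
  .setRandomCenters(dim = 2, weight = 0.0)  // illustrative dimension

// Update cluster centers on every batch.
model.trainOn(stream.map(Vectors.parse))

ssc.start()
ssc.awaitTermination()
```

With this in place, model.latestModel.clusterCenters reflects the centers as of the most recently processed batch.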
If I use
model.latestModel.clusterCenters.foreach(println)
the value of clu
bq. streamingContext.remember("duration") did not help
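Note that StreamingContext.remember takes a Duration, not a string, so a quoted "duration" would not compile; a hedged sketch of the intended call (the window length is illustrative):

```scala
import org.apache.spark.streaming.Minutes

// Keep generated RDDs around for 5 minutes (illustrative) so that
// later jobs in the batch can still reference them.
ssc.remember(Minutes(5))
```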
Can you give a bit more detail on the above?
Did you mean the job encountered an OOME later on?
Which Spark release are you using?
I tried these two global settings (and restarted the app) after enabling caching
for stream1:
conf.set("spark.stream
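The setting names above are truncated in the thread. In general, such global settings have to be applied to the SparkConf before the StreamingContext is constructed; the keys below are real Spark settings but are chosen here only as illustration, not as the ones originally tried:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("my-streaming-app") // hypothetical app name
  // Illustrative settings; the app must be restarted for them to take effect.
  .set("spark.streaming.unpersist", "true")
  .set("spark.cleaner.ttl", "3600")

val ssc = new StreamingContext(conf, Seconds(4))
```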
We have a streaming application running approximately 12 jobs every batch in
streaming mode (4-second batches). Each job has several transformations and
one action (output to Cassandra), which triggers execution of the job's DAG.
For example, the first job:
/job 1
---> receive Stream A -->
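A job of the shape described above might look like the following sketch, assuming the spark-cassandra-connector; the stream, the parseRecord helper, and the keyspace/table names are hypothetical:

```scala
import com.datastax.spark.connector.streaming._ // adds saveToCassandra on DStreams

// Several transformations on the received stream ...
val parsed = streamA.map(parseRecord)   // parseRecord: hypothetical parser
val filtered = parsed.filter(_.isValid) // isValid: hypothetical field

// ... then one action, which triggers the DAG for this job each batch:
filtered.saveToCassandra("my_keyspace", "my_table")
```

With 12 such jobs per 4-second batch, each job's action is what actually schedules work on the cluster.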