-streaming-log-aggregation/blob/master/src/main/scala/com/chimpler/sparkstreaminglogaggregation/LogAggregator.scala
Thanks, Mike.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Integrating-Spark-Streaming-with-Reactive-Mongo-tp21828.html
Sent from the Apache
Hi All,
I have Spark Streaming set up to write data to a replicated MongoDB database
and would like to understand whether there would be any issues with using the
Reactive Mongo library to write directly to MongoDB. My stack is Apache
Spark sitting on top of Cassandra for the datastore, so my thinking
Hey Mike,
I quickly looked through the example and found a major performance issue.
You are collecting the RDDs to the driver and then sending them to Mongo in
a foreach. Why not do a distributed push to Mongo?
WHAT YOU HAVE
val mongoConnection = ...
WHAT YOU SHOULD DO
rdd.foreachPartition
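
To make the distributed push concrete, here is a minimal sketch of the per-partition write that would run on the executors. The Sink trait and PartitionWriter object are hypothetical names used only for illustration; they stand in for whatever Mongo client you use and are not part of Spark or Reactive Mongo.

```scala
// Hypothetical stand-in for a real database client (not a Spark or Reactive Mongo type).
trait Sink[T] {
  def write(record: T): Unit
  def close(): Unit
}

object PartitionWriter {
  // Opens one connection per partition, writes every record through it,
  // and closes the connection even if a write throws.
  def writePartition[T](records: Iterator[T], open: () => Sink[T]): Unit = {
    if (records.hasNext) { // empty partition: do not open a connection at all
      val sink = open()
      try records.foreach(sink.write)
      finally sink.close()
    }
  }
}
```

In the streaming job this would be called as dstream.foreachRDD { rdd => rdd.foreachPartition { part => PartitionWriter.writePartition(part, openSink) } }, where openSink is an assumed serializable function that creates the connection on the executor. Nothing is collected to the driver, and each partition pays the connection cost once rather than once per record.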