Use foreachPartition and batch the writes
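A minimal sketch of that suggestion: inside `foreachPartition`, Spark hands you an iterator over one partition's records, and slicing that iterator into fixed-size chunks turns one write per record into one bulk write per chunk. The `write_batch` helper and the batch size below are hypothetical stand-ins for a real MongoDB bulk insert (e.g. PyMongo's `insert_many`), not part of the original thread.

```python
from itertools import islice

# Hypothetical stand-in for a MongoDB bulk insert (e.g. PyMongo's
# collection.insert_many); here it just records the batches it flushed.
flushed = []

def write_batch(docs):
    flushed.append(list(docs))

def write_partition(records, batch_size=100):
    """Body you would pass to rdd.foreachPartition.

    `records` is an iterator over one partition; grouping it into
    fixed-size chunks batches the writes instead of issuing one
    write (and one round trip) per record.
    """
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        write_batch(batch)

if __name__ == "__main__":
    # 10 enriched JSON documents, batch size 4 -> 3 bulk writes (4 + 4 + 2)
    write_partition(({"id": i} for i in range(10)), batch_size=4)
    print([len(b) for b in flushed])  # prints [4, 4, 2]
```

In a streaming job this would run once per micro-batch, e.g. `dstream.foreachRDD(lambda rdd: rdd.foreachPartition(write_partition))`, so each partition opens one MongoDB connection and performs a handful of bulk writes rather than a connection and write per record.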
On Sat, Jul 25, 2015 at 9:14 AM, wrote:
> Hello,
> I am a new user of Spark, and I need to know the best practice for the
> following scenario:
>
> - Spark Streaming receives XML messages from Kafka
> - Spark transforms each message of the RDD (xml2json + some enrichments)
> - Spark stores the transformed/enriched messages inside MongoDB