Hi guys, I am new to Spark and we are running a small project that collects data from Kinesis and inserts it into Mongo. I would like to share a high-level view of how it is done and would love your input on it.
I am fetching Kinesis data and, for each RDD, parsing the string data and inserting it into Mongo. My understanding is that the parsing logic inside each RDD is serialized and sent to the workers, so when I write to Mongo, each worker ends up creating a new connection for the write. Is there any way I can use a connection pool instead?

By the way, I am using Scala and Spark Streaming.
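To make the question concrete, here is roughly the pattern I am considering, just a sketch: `stream` stands for my Kinesis DStream of JSON strings, and the host, database, and collection names are placeholders.

```scala
import com.mongodb.MongoClient
import org.bson.Document
import scala.collection.JavaConverters._

// One client per executor JVM: a lazy val in an object is initialized
// at most once per JVM, so every task on that executor reuses it. The
// MongoClient itself maintains an internal connection pool.
object MongoConnection {
  lazy val client: MongoClient =
    new MongoClient("mongo-host", 27017)                // placeholder host/port
  lazy val collection =
    client.getDatabase("mydb").getCollection("events")  // placeholder names
}

stream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // This closure runs on the worker, so it picks up the executor-local
    // client instead of opening a fresh connection per record.
    val docs = records.map(json => Document.parse(json)).toList
    if (docs.nonEmpty) MongoConnection.collection.insertMany(docs.asJava)
  }
}
```

The foreachPartition-plus-singleton pattern is what the Spark Streaming programming guide suggests for reusing connections in foreachRDD; what I am not sure about is whether this plays well with the Mongo driver, or whether there is a better pooling approach.

A.K.M. Ashrafuzzaman
Lead Software Engineer, NewsCred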