Re: init / shutdown for complex map job?

2014-12-28 Thread Akhil Das
Something like?

val a = myRDD.mapPartitions(p => {
  // Do the init
  // Perform some operations
  // Shut it down?
})

Thanks
Best Regards

On Sun, Dec 28, 2014 at 1:53 AM, Kevin Burton bur...@spinn3r.com wrote: I have a job where I want to map over

Re: init / shutdown for complex map job?

2014-12-28 Thread Sean Owen
You can't quite do cleanup in mapPartitions in that way. Here is a bit more explanation (farther down): http://blog.cloudera.com/blog/2014/09/how-to-translate-from-mapreduce-to-apache-spark/ On Dec 28, 2014 8:18 AM, Akhil Das ak...@sigmoidanalytics.com wrote: Something like? val a =
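
A minimal sketch of why cleanup placed at the end of a mapPartitions body tends not to work, assuming the cause is iterator laziness. It uses plain Scala iterators rather than Spark, and the Resource class and the naive/eager helpers are hypothetical names invented for illustration:

```scala
object LazyCleanupDemo {
  // Hypothetical resource standing in for a connection; not a real client API.
  final class Resource {
    private var open = true
    def send(x: Int): Int = {
      require(open, "resource used after close")
      x * 2
    }
    def close(): Unit = { open = false }
  }

  // Shaped like a function one might pass to mapPartitions.
  // Iterator.map is lazy, so close() runs before any element is processed.
  def naive(p: Iterator[Int]): Iterator[Int] = {
    val r = new Resource
    val out = p.map(r.send)
    r.close()
    out
  }

  // One possible workaround: force evaluation before closing.
  // This buffers the whole partition's output in memory.
  def eager(p: Iterator[Int]): Iterator[Int] = {
    val r = new Resource
    try p.map(r.send).toList.iterator
    finally r.close()
  }

  def main(args: Array[String]): Unit = {
    // Consuming the naive result touches the already-closed resource.
    val failed = scala.util.Try(naive(Iterator(1, 2, 3)).toList).isFailure
    println(s"naive fails when consumed: $failed")
    println(eager(Iterator(1, 2, 3)).toList)
  }
}
```

The eager variant trades memory for correctness, so it only suits partitions whose output fits in memory.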

Re: init / shutdown for complex map job?

2014-12-28 Thread Ray Melton
A follow-up to the blog post cited below was hinted at, per "But Wait, There's More ... To keep this post brief, the remainder will be left to a follow-up post." Is this follow-up pending? Is it sort of pending? Did the follow-up happen, but I just couldn't find it on the web? Regards, Ray. On Sun,

Re: init / shutdown for complex map job?

2014-12-28 Thread Sean Owen
(Still pending, but believe it's in progress and being written by a colleague here.) On Sun, Dec 28, 2014 at 2:41 PM, Ray Melton rtmel...@gmail.com wrote: A follow-up to the blog cited below was hinted at, per But Wait, There's More ... To keep this post brief, the remainder will be left to a

init / shutdown for complex map job?

2014-12-27 Thread Kevin Burton
I have a job where I want to map over all data in a Cassandra database. I’m then selectively sending things to my own external system (ActiveMQ) if the item matches certain criteria. The problem is that I need to do some init and shutdown. Basically on init I need to create ActiveMQ connections and on
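
For side-effecting work of this kind, one common shape is a function from a partition's iterator to Unit, which is the type that Spark's RDD.foreachPartition accepts. The sketch below is not taken from the thread: QueueConnection is a hypothetical stand-in for an ActiveMQ connection, matches is a made-up criterion, and the code runs on a local iterator rather than a real RDD:

```scala
object PerPartitionDemo {
  // Hypothetical stand-in for an ActiveMQ connection; not a real client API.
  final class QueueConnection {
    val sent = scala.collection.mutable.ArrayBuffer.empty[String]
    var closed = false
    def send(msg: String): Unit = {
      require(!closed, "connection already closed")
      sent += msg
    }
    def close(): Unit = { closed = true }
  }

  // Hypothetical matching criterion, for illustration only.
  def matches(item: String): Boolean = item.startsWith("a")

  // Init once per partition, send matching items, always shut down.
  // foreach consumes the iterator eagerly, so close() runs after all sends.
  def sendMatching(items: Iterator[String], open: () => QueueConnection): Unit = {
    val conn = open()
    try items.filter(matches).foreach(conn.send)
    finally conn.close()
  }

  def main(args: Array[String]): Unit = {
    val conn = new QueueConnection
    sendMatching(Iterator("apple", "banana", "avocado"), () => conn)
    println(conn.sent.toList)
    println(conn.closed)
    // With Spark, assuming an RDD[String] named rdd, the call might look like:
    //   rdd.foreachPartition(p => sendMatching(p, () => new QueueConnection))
  }
}
```

Because the function returns Unit and drains the iterator itself, the shutdown step cannot run ahead of the work the way it can with a lazily mapped iterator.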