Re: Stateful mapPartitions

2014-12-05 Thread Patrick Wendell
Yeah, the main way to do this would be to have your own static cache of connections. This could be an object in Scala or just a static variable in Java (for instance, a set of connections that you can borrow from).

- Patrick

On Thu, Dec 4, 2014 at 5:26 PM, Tobias Pfeiffer wrote:
> Hi,
>
> O
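To make the "static cache you can borrow from" idea concrete, here is a minimal sketch in plain Java. It deliberately does not use Spark: `FakeConnection`, `ConnectionCache`, and `processPartition` are hypothetical names invented for illustration, and `processPartition` only plays the role of the function you would pass to mapPartitions. A real version would hold actual database connections and would need to think about limits, failures, and shutdown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a real database connection.
final class FakeConnection {
    final int id;
    FakeConnection(int id) { this.id = id; }
    String query(String key) { return key + "@conn" + id; }
}

// Static cache of idle connections, shared by all code in this JVM.
final class ConnectionCache {
    private static final ConcurrentLinkedQueue<FakeConnection> IDLE =
            new ConcurrentLinkedQueue<>();
    private static final AtomicInteger CREATED = new AtomicInteger();

    private ConnectionCache() {}

    // Reuse an idle connection if there is one, otherwise open a new one.
    static FakeConnection borrow() {
        FakeConnection c = IDLE.poll();
        return (c != null) ? c : new FakeConnection(CREATED.incrementAndGet());
    }

    // Hand the connection back so a later call can reuse it.
    static void giveBack(FakeConnection c) { IDLE.offer(c); }

    static int created() { return CREATED.get(); }
}

final class StatefulPartitionsDemo {
    // Plays the role of the function passed to mapPartitions (assumption:
    // one call per partition, rows arriving as an iterator).
    static Iterator<String> processPartition(Iterator<String> rows) {
        FakeConnection conn = ConnectionCache.borrow();
        try {
            // Results are materialized eagerly because the connection is
            // returned to the cache before the caller consumes the iterator.
            List<String> out = new ArrayList<>();
            while (rows.hasNext()) {
                out.add(conn.query(rows.next()));
            }
            return out.iterator();
        } finally {
            ConnectionCache.giveBack(conn);
        }
    }

    public static void main(String[] args) {
        processPartition(Arrays.asList("a", "b").iterator());
        processPartition(Arrays.asList("c").iterator());
        System.out.println("connections created: " + ConnectionCache.created());
    }
}
```

Because the second call borrows the connection that the first call gave back, only one connection is opened for the two sequential calls. The cache is static, so it is shared only within one JVM; each worker process would end up with its own.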

Re: Stateful mapPartitions

2014-12-04 Thread Tobias Pfeiffer
Hi,

On Fri, Dec 5, 2014 at 3:56 AM, Akshat Aranya wrote:
> Is it possible to have some state across multiple calls to mapPartitions
> on each partition, for instance, if I want to keep a database connection
> open?

If you're using Scala, you can use a singleton object, this will exist once pe

Re: Stateful mapPartitions

2014-12-04 Thread Akshat Aranya
I want to have a database connection per partition of the RDD, and then reuse that connection whenever mapPartitions is called, which results in compute being called on the partition.

On Thu, Dec 4, 2014 at 11:07 AM, Paolo Platter wrote:
> Could you provide some further details?
> What do you
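One way to picture "a connection per partition, reused on every compute" is a static map keyed by partition index. The sketch below is plain Java with no Spark dependency, and every name in it (`PartitionConnection`, `PerPartitionConnections`, `compute`) is hypothetical. It assumes the partition index is handed to the function, as something like mapPartitionsWithIndex would do in a real job, and it only helps if the same partition is recomputed in the same JVM, which is not guaranteed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a database connection tied to one partition.
final class PartitionConnection {
    final int partitionId;
    PartitionConnection(int partitionId) { this.partitionId = partitionId; }
    String lookup(String key) { return key + "@p" + partitionId; }
}

// Static map from partition index to its connection, local to this JVM.
final class PerPartitionConnections {
    private static final Map<Integer, PartitionConnection> BY_PARTITION =
            new ConcurrentHashMap<>();

    private PerPartitionConnections() {}

    // Opens the connection on first use and returns the same one afterwards.
    static PartitionConnection forPartition(int partitionId) {
        return BY_PARTITION.computeIfAbsent(partitionId, PartitionConnection::new);
    }

    static int openCount() { return BY_PARTITION.size(); }
}

final class PerPartitionDemo {
    // Plays the role of the per-partition function; the index argument is an
    // assumption about how the partition would be identified.
    static Iterator<String> compute(int partitionId, Iterator<String> rows) {
        PartitionConnection conn = PerPartitionConnections.forPartition(partitionId);
        List<String> out = new ArrayList<>();
        while (rows.hasNext()) {
            out.add(conn.lookup(rows.next()));
        }
        return out.iterator();
    }

    public static void main(String[] args) {
        compute(0, Arrays.asList("a").iterator());
        compute(1, Arrays.asList("b").iterator());
        compute(0, Arrays.asList("c").iterator()); // reuses partition 0's connection
        System.out.println("open connections: " + PerPartitionConnections.openCount());
    }
}
```

Compared with a borrow/return pool, this keeps one long-lived connection per partition index, so the number of open connections grows with the number of distinct partitions a JVM has seen.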