Hi, please see this post:
http://apache-spark-user-list.1001560.n3.nabble.com/Keep-state-inside-map-function-tp10968p11009.html

Where it says "some setup code here", you could add your code to load the library. Note, however, that this runs once per partition, not once per node, so it might be called more than once on each node.

If you want setup code that runs once per node, put it in a Scala "object" (as Matei pointed out back then). For example:

    object JvmLocalResource {
      val resource = {
        someInitFunction()
        new SomeResource()
      }
    }

Now if you use JvmLocalResource.resource, someInitFunction() will be called exactly once in each JVM, which means once per node as long as each node runs a single executor JVM. If the library loading is synchronous (i.e., it doesn't start some background action that is not yet finished when you want to start processing), that should do it.

Tobias
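
P.S. Here is a minimal sketch of the "object" approach that runs without Spark, in case it helps to see the once-per-JVM behaviour. The threads only stand in for tasks sharing one executor JVM. InitCounter, the body of someInitFunction(), SomeResource.process and Demo are hypothetical names made up for the illustration, not part of any library.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper: counts how often the setup code actually runs.
object InitCounter {
  val calls = new AtomicInteger(0)
}

// Hypothetical stand-in for whatever the loaded library provides.
final class SomeResource {
  def process(x: Int): Int = x * 2
}

object JvmLocalResource {
  // Stand-in for the real setup, e.g. loading the library.
  private def someInitFunction(): Unit = {
    InitCounter.calls.incrementAndGet()
    ()
  }

  // Evaluated when JvmLocalResource is first accessed, once per JVM.
  val resource: SomeResource = {
    someInitFunction()
    new SomeResource()
  }
}

object Demo {
  // Four threads play the role of tasks running in the same JVM.
  def run(): (Seq[Int], Int) = {
    val out = new Array[Int](4)
    val threads = (0 until 4).map { i =>
      new Thread(new Runnable {
        def run(): Unit = {
          out(i) = JvmLocalResource.resource.process(i + 1)
        }
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    (out.toSeq, InitCounter.calls.get())
  }

  def main(args: Array[String]): Unit = {
    val (results, calls) = run()
    println("results = " + results.mkString(", "))
    println("someInitFunction() calls = " + calls)
  }
}
```

In a real job, the JvmLocalResource.resource access would sit inside the function you pass to mapPartitions (or map), so that it is evaluated on the executors rather than on the driver.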