I've had good luck in a similar scenario using a static instance of Guava's LoadingCache and fetching from GCS inside its load function.
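
Here is a minimal sketch of that pattern, not my actual code. To keep it free of dependencies it uses the JDK's ConcurrentHashMap.computeIfAbsent as a stand-in for Guava's LoadingCache. The names SharedModels and Model and the idea of keying by GCS path are made up for illustration, and the load method is a placeholder where the real GCS download and model construction would go.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-JVM (per-worker) holder for expensive, non-serializable models. */
final class SharedModels {

    /** Hypothetical stand-in for the expensive, non-serializable model type. */
    static final class Model {
        final String source;

        Model(String source) {
            this.source = source;
        }
    }

    /** Counts how many times the loader actually ran (only here for illustration). */
    static final AtomicInteger LOADS = new AtomicInteger();

    // Static field: it belongs to the class, not to any DoFn instance,
    // so Java serialization of the DoFn never touches it.
    private static final ConcurrentMap<String, Model> CACHE = new ConcurrentHashMap<>();

    private SharedModels() {}

    /** Returns the model for this path, building it at most once per JVM. */
    static Model get(String gcsPath) {
        // computeIfAbsent is atomic per key, so concurrent threads share one load.
        return CACHE.computeIfAbsent(gcsPath, SharedModels::load);
    }

    private static Model load(String gcsPath) {
        LOADS.incrementAndGet();
        // Placeholder (assumption): a real loader would download the blob
        // from GCS here and construct the model from its bytes.
        return new Model(gcsPath);
    }
}
```

From the DoFn you would call something like SharedModels.get(path) inside the per-element method (or in setup), and every thread on the worker gets the same instance. The model itself still has to be safe to use from multiple threads at once.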

On Oct 7, 2017 3:52 PM, "Eugene Kirpichov" wrote:

Hi,

I'm not sure what you mean by this: "But they are non-serializable so I can't just create a static constructor and create it while starting the pipeline."

You can definitely use static variables in DoFns, the same way you can use them in any other Java code. I'm not sure how serializability is an issue here, because Java serialization doesn't serialize static variables: you serialize object instances, and static variables do not belong to the object instance (unless, of course, you're explicitly holding a reference to the static variable through your instance).

Did you hit a NotSerializableException? Can you show your code and/or try running with the JVM flag -Dsun.io.serialization.extendedDebugInfo=true ?

You need to be very careful with thread safety though. There will indeed be multiple threads running a given DoFn on a given worker (these will be different *instances* of the same DoFn class, but they'll of course still concurrently access the static variables).

On Sat, Oct 7, 2017 at 3:44 PM Derek Hao Hu wrote:

> Hi,
>
> I'm looking for ways to use a static variable in a DoFn. The background
> is that at run time I need to construct some non-serializable (but
> expensive) variables from some binary blobs downloaded from GCS buckets.
>
> The fact that the construction of these models is expensive makes me feel
> I should try to make them static, or at least static to each worker. But
> they are non-serializable so I can't just create a static constructor and
> create it while starting the pipeline.
>
> Originally I thought DoFn.Setup is what I need but after trying it seems
> DoFn.Setup would be executed per thread instead of per worker. Is there
> anything we can use so we can create something that is shared by multiple
> threads?
>
> Thanks,
> --
> Derek Hao Hu
>
> Software Engineer | Snapchat
> Snap Inc.
