Hi, I'm looking for a way to use a static variable in a DoFn. The background: at run time I need to construct some non-serializable (and expensive) objects from binary blobs downloaded from GCS buckets.
Because constructing these models is expensive, I feel I should make them static, or at least shared within each worker. But they are non-serializable, so I can't just build them statically when starting the pipeline. Originally I thought DoFn.Setup was what I needed, but after trying it, it seems DoFn.Setup is executed per thread rather than per worker. Is there anything we can use to create something that is shared by multiple threads?

Thanks,
--
Derek Hao Hu
Software Engineer | Snapchat
Snap Inc.
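
For illustration, here is a minimal sketch of the kind of per-JVM sharing the question describes: a static holder that builds the object lazily, on first use, with double-checked locking. It is not taken from Beam and does not depend on it. The names `ExpensiveModel`, `SharedModel`, `getOrCreate`, and `BUILD_COUNT` are hypothetical, and the blob loader is a stand-in for whatever code downloads the bytes from GCS.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for an expensive, non-serializable model. */
final class ExpensiveModel {
    private final byte[] blob;

    ExpensiveModel(byte[] blob) {
        this.blob = blob.clone();
    }

    int size() {
        return blob.length;
    }
}

/** Holds at most one ExpensiveModel per JVM, built lazily on first request. */
final class SharedModel {
    // volatile so a fully constructed instance is visible to all threads.
    private static volatile ExpensiveModel instance;

    // Counts how many times the model was actually built (for demonstration only).
    static final AtomicInteger BUILD_COUNT = new AtomicInteger();

    private SharedModel() {}

    /** Returns the shared model, invoking blobLoader only if nothing has been built yet. */
    static ExpensiveModel getOrCreate(Supplier<byte[]> blobLoader) {
        ExpensiveModel local = instance;
        if (local == null) {
            synchronized (SharedModel.class) {
                local = instance;
                if (local == null) {
                    BUILD_COUNT.incrementAndGet();
                    local = new ExpensiveModel(blobLoader.get());
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Under these assumptions, each DoFn instance could call `SharedModel.getOrCreate(...)` from its setup method and keep the result in a `transient` field, so the model is never serialized with the DoFn and is built only once per JVM no matter how many threads ask for it. Whether one JVM corresponds to one worker depends on the runner.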
