Thanks, guys, for your quick replies! I've just realized I had a stupid bug
in my double-checked locking implementation. :)

Things appear to be working fine now. There doesn't seem to be anything
strange about static variables after all.
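
For the archive, a minimal sketch of double-checked locking for a static
shared by all threads in one JVM. It is an illustration under assumptions,
not the code from this pipeline: the names SharedModel, ExpensiveModel and
loadCount are hypothetical, and the GCS download is replaced by a stub.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive, non-serializable object.
final class ExpensiveModel {
    final byte[] blob;

    ExpensiveModel(byte[] blob) {
        this.blob = blob;
    }
}

final class SharedModel {
    // volatile is essential: without it, a thread passing the unlocked
    // first check could observe a partially constructed object.
    private static volatile ExpensiveModel instance;

    // Counts constructions so single initialization can be verified.
    static final AtomicInteger loadCount = new AtomicInteger();

    private SharedModel() {}

    static ExpensiveModel get() {
        ExpensiveModel local = instance;          // first check, no lock
        if (local == null) {
            synchronized (SharedModel.class) {
                local = instance;                 // second check, under the lock
                if (local == null) {
                    local = load();
                    instance = local;             // publish via the volatile write
                }
            }
        }
        return local;
    }

    // Stub for the real download-and-build step (assumed, e.g. a GCS fetch).
    private static ExpensiveModel load() {
        loadCount.incrementAndGet();
        return new ExpensiveModel(new byte[] {1, 2, 3});
    }
}
```

A DoFn would presumably call SharedModel.get() from its setup or
per-element method; since the field is static, it is never part of the
serialized DoFn instance, and every instance in the same JVM gets the same
object.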

Thanks!

Derek

On Sat, Oct 7, 2017 at 4:07 PM, Kevin Peterson <[email protected]> wrote:

> I've had good luck in a similar scenario using a static instance of
> Guava's loading cache, and fetching from GCS inside the load function.
>
> On Oct 7, 2017 3:52 PM, "Eugene Kirpichov" <[email protected]> wrote:
>
> Hi,
> I'm not sure what you mean by this: "But they are non-serializable so I
> can't just create a static constructor and create it while starting the
> pipeline."
>
> You can definitely use static variables in DoFn's, same way as you can use
> them in any other Java code. I'm not sure how serializability is an issue
> here, because Java serialization doesn't serialize static variables - you
> serialize object instances, and static variables do not belong to the
> object instance (of course, unless you're explicitly holding a reference to
> the static variable through your instance). Did you hit a
> NotSerializableException? Can you show your code and/or try running with
> the JVM flag -Dsun.io.serialization.extendedDebugInfo=true ?
>
> You need to be very careful with thread safety though - indeed there will
> be multiple threads running a given DoFn on a given worker (these will be
> different *instances* of the same DoFn class, but they'll of course still
> concurrently access the static variables).
>
> On Sat, Oct 7, 2017 at 3:44 PM Derek Hao Hu <[email protected]>
> wrote:
>
>> Hi,
>>
>> I'm looking for ways to use a static variable in a DoFn. The background
>> is that at run time I need to construct some non-serializable (but
>> expensive) variables from binary blobs downloaded from GCS buckets.
>>
>> The fact that the construction of these models is expensive makes me
>> feel I should try to make them static, or at least static to each worker.
>> But they are non-serializable so I can't just create a static constructor
>> and create it while starting the pipeline.
>>
>> Originally I thought DoFn.Setup was what I needed, but after trying it, it
>> seems DoFn.Setup is executed per thread instead of per worker. Is there
>> anything we can use so we can create something that is shared by multiple
>> threads?
>>
>> Thanks,
>> --
>> Derek Hao Hu
>>
>> Software Engineer | Snapchat
>> Snap Inc.
>>
>
>


-- 
Derek Hao Hu

Software Engineer | Snapchat
Snap Inc.
