You can see an example implementation of Luke's suggestion in the
tensorflow-transform project [1]. Thread-local storage is used in that
case; this works for runners that re-use the same thread to execute
bundles.
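To make the pattern concrete, here is a minimal sketch of that
thread-local caching approach (this is not the tensorflow-transform code
itself; load_model and the model path are hypothetical stand-ins for your
own loading logic):

    import threading

    import apache_beam as beam

    # Module-level thread-local storage: each worker thread gets its own
    # cached copy of the model.
    _model_cache = threading.local()

    class PredictDoFn(beam.DoFn):
      def __init__(self, model_path):
        # Only the path is pickled and shipped to workers; the model
        # itself is loaded lazily on the worker and is never pickled.
        self._model_path = model_path

      def _get_model(self):
        # Load at most once per thread; runners that re-use threads
        # across bundles will hit this cache on subsequent bundles.
        if getattr(_model_cache, 'model', None) is None:
          _model_cache.model = load_model(self._model_path)  # hypothetical loader
        return _model_cache.model

      def process(self, element):
        yield self._get_model().predict(element)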
[1] https://github.com/tensorflow/transform/blob/master/tensorflow_transform/beam/impl.py#L253

On Wed, May 24, 2017 at 8:00 AM, Lukasz Cwik <[email protected]> wrote:

> Why not use a singleton-like pattern and have a function which either
> loads and caches the ML model from a side input or returns the
> singleton if it has already been loaded? You'll want to use some form
> of locking to ensure that you really only load the ML model once.
>
> On Wed, May 24, 2017 at 6:18 AM, Vilhelm von Ehrenheim <
> [email protected]> wrote:
>
>> Hi all!
>> I would like to load a heavy object (think ML model) into memory that
>> should be available in a ParDo for quick predictions.
>>
>> What is the preferred way of doing this without loading the model for
>> each ParDo call (slow, and it will flood memory on the nodes)? I
>> don't seem to be able to do it in the DoFn's __init__ block either,
>> as this is only done once for all nodes (my guess here, though) and
>> then it breaks when replicated internally (even on the DirectRunner;
>> I suspect it is pickled and this object cannot be pickled). If I load
>> it as a side input, it seems to still be loaded into memory
>> separately for each ParDo.
>>
>> If there is a better way to handle it in Java I'm happy to do it
>> there instead. It was just easier to attack the problem with Python,
>> as the models were developed in Python.
>>
>> Any sort of pointers or tips are welcome!
>>
>> Thanks!
>> Vilhelm von Ehrenheim
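For completeness, a rough sketch of the singleton-with-locking pattern
Lukasz describes above (again assuming a hypothetical load_model
function; the double-checked lock ensures the model is loaded at most
once per worker process, even with multiple threads):

    import threading

    class _ModelSingleton(object):
      _lock = threading.Lock()
      _model = None

      @classmethod
      def get(cls, model_path):
        # Fast path: already loaded, no locking needed.
        if cls._model is None:
          with cls._lock:
            # Re-check inside the lock so only one thread loads.
            if cls._model is None:
              cls._model = load_model(model_path)  # hypothetical loader
        return cls._model

A DoFn's process() can then call _ModelSingleton.get(path) instead of
keeping per-thread copies, trading one shared instance per process for
a small amount of locking overhead.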
