Awesome! I will remember your offer next time I visit Stockholm :)

On Thu, May 25, 2017 at 1:48 PM, Vilhelm von Ehrenheim <[email protected]> wrote:
> Wow! That answer truly solved my problem. I would never have thought of
> using threading.local for this. Thank you so much! If you ever stop by
> Stockholm I'll be happy to buy you guys a beer!
>
> On Wed, May 24, 2017 at 6:38 PM, Ahmet Altay <[email protected]> wrote:
>
>> You can see an example implementation of Luke's suggestion in the
>> tensorflow-transform project [1]. Thread local is used in that case; this
>> will work for runners that reuse the same thread to execute bundles.
>>
>> [1] https://github.com/tensorflow/transform/blob/master/tensorflow_transform/beam/impl.py#L253
>>
>> On Wed, May 24, 2017 at 8:00 AM, Lukasz Cwik <[email protected]> wrote:
>>
>>> Why not use a singleton-like pattern and have a function which either
>>> loads and caches the ML model or returns the singleton if it has
>>> already been loaded?
>>> You'll want to use some form of locking to ensure that you really only
>>> load the ML model once.
>>>
>>> On Wed, May 24, 2017 at 6:18 AM, Vilhelm von Ehrenheim <[email protected]> wrote:
>>>
>>>> Hi all!
>>>> I would like to load a heavy object (think ML model) into memory that
>>>> should be available in a ParDo for quick predictions.
>>>>
>>>> What is the preferred way of doing this without loading the model for
>>>> each ParDo call (slow, and it will flood memory on the nodes)? I don't
>>>> seem to be able to do it in the DoFn's __init__ block either, as this
>>>> is only done once for all nodes (my guess, though) and it then breaks
>>>> when replicated internally (even on the DirectRunner; I suspect the
>>>> DoFn is pickled and this object cannot be pickled). If I load it as a
>>>> side input, it seems to still be loaded into memory separately for
>>>> each ParDo.
>>>>
>>>> If there is a better way to handle it in Java I'm happy to do it there
>>>> instead. It was just easier to attack the problem with Python as the
>>>> models were developed in Python.
>>>>
>>>> Any sort of pointers or tips are welcome!
>>>>
>>>> Thanks!
>>>> Vilhelm von Ehrenheim
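
For reference, a minimal Python sketch of the singleton-with-locking pattern Lukasz describes; load_model here is a hypothetical stand-in for however the real model is deserialized:

    import threading

    _model = None
    _model_lock = threading.Lock()


    def load_model():
        # Hypothetical stand-in for the expensive deserialization of the
        # real model (e.g. unpickling it from disk).
        return object()


    def get_model():
        # Double-checked locking: the fast path skips the lock once the
        # model is loaded, and the lock guarantees only one thread pays
        # the loading cost per process.
        global _model
        if _model is None:
            with _model_lock:
                if _model is None:
                    _model = load_model()
        return _model

Calling get_model() inside the DoFn's process method then returns the cached instance on every call after the first.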

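And a sketch of the threading.local variant Ahmet points to, wired into a DoFn. The cache lives at module level so it is not pickled along with the DoFn, which sidesteps the pickling problem Vilhelm hit; the dummy model and its predict method are assumptions for illustration, and the cache only avoids reloads on runners that reuse the same thread across bundles:

    import threading

    import apache_beam as beam

    # Module-level cache: one slot per worker thread, and it is not
    # serialized along with the DoFn when the runner pickles it.
    _thread_local = threading.local()


    class _DummyModel(object):
        # Placeholder so the sketch is runnable; replace with the real model.
        def predict(self, element):
            return element


    class PredictDoFn(beam.DoFn):
        def process(self, element):
            # Load at most once per thread; runners that reuse threads
            # across bundles hit the cached copy on later bundles.
            if not hasattr(_thread_local, 'model'):
                _thread_local.model = _DummyModel()
            yield _thread_local.model.predict(element)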