Thanks for the writeup Anand, this looks like a good approach.

> There is a *WIP
> <https://docs.google.com/document/d/12j4bDwsIBhMN_8DNT2KGXPol7YS_G-DZFy6fjRsUGOQ/edit#bookmark=id.wdsu0jkyygmh>*
> section in the doc, where I am figuring out a better solution. I would love
> to hear any suggestions or alternatives for that section.

Just wanted to boost this in case there are people who don't click through
to the doc. The problem to solve is how to load a new model without
disrupting the in-progress threads that are still performing inference on
the old model (loading a model can take minutes and use a lot of memory).
Anand's current proposal is to load the second model into memory alongside
the old one, which requires machines to have enough memory to hold two
models at once. If anyone has tried loading multiple large objects into a
single process before, some insight on best practices would be helpful!
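
To make the discussion concrete, here is a rough sketch of the "keep both
models resident, swap atomically" idea in plain Python. The names
(ModelSwapper, loader, get_model, update) are made up for illustration and
are not the Beam implementation or API:

    import threading


    class ModelSwapper:
        # Hypothetical helper (not Beam API): serves the current model and
        # swaps in a replacement only after it has fully loaded, so threads
        # already running inference keep using the old model.

        def __init__(self, loader, initial_path):
            self._loader = loader        # callable: path -> loaded model
            self._lock = threading.Lock()
            self._path = initial_path
            self._model = loader(initial_path)

        def get_model(self):
            # In-flight threads hold their own reference, so the old model
            # stays alive (and usable) until they finish, even after a swap.
            with self._lock:
                return self._model

        def update(self, new_path):
            if new_path == self._path:
                return
            new_model = self._loader(new_path)  # slow part: may take minutes
            with self._lock:                    # fast part: pointer swap only
                self._model = new_model
                self._path = new_path
            # The old model is freed once the last in-flight reference drops.

The trade-off is the one the proposal accepts: during update() the process
briefly holds both models in memory at once.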

Thanks,
Danny

On Mon, Nov 21, 2022 at 4:26 PM Anand Inguva via dev <dev@beam.apache.org>
wrote:

> Hi,
>
> I created a doc
> <https://docs.google.com/document/d/12j4bDwsIBhMN_8DNT2KGXPol7YS_G-DZFy6fjRsUGOQ/edit?usp=sharing>[1]
> on a feature that I am working on for the RunInference
> <https://github.com/apache/beam/blob/814a5ded8c493d55edeaf350c808c131289165e8/sdks/python/apache_beam/ml/inference/base.py#L269>
> transform, where users can provide dynamic model updates via side inputs to
> the RunInference transform.
>
> There is a *WIP
> <https://docs.google.com/document/d/12j4bDwsIBhMN_8DNT2KGXPol7YS_G-DZFy6fjRsUGOQ/edit#bookmark=id.wdsu0jkyygmh>*
> section in the doc, where I am figuring out a better solution. I would love
> to hear any suggestions or alternatives for that section.
>
> Please go through the doc and let me know what you think.
>
> Thanks,
> Anand
>
> [1]
> https://docs.google.com/document/d/12j4bDwsIBhMN_8DNT2KGXPol7YS_G-DZFy6fjRsUGOQ/edit?usp=sharing
>
