tvalentyn commented on code in PR #31052:
URL: https://github.com/apache/beam/pull/31052#discussion_r1575524505
##########
sdks/python/apache_beam/ml/inference/base.py:
##########
@@ -952,6 +977,12 @@ def get_preprocess_fns(self) -> Iterable[Callable[[Any], Any]]:
   def should_skip_batching(self) -> bool:
     return True
+  def share_model_across_processes(self) -> bool:
Review Comment:
(cleanup, can be deferred)
We can leverage reflection here and delegate calls to the base via `__getattr__`,
like in
https://github.com/apache/beam/blob/37609ba70fab2216edc338121bf2f3a056a1e490/sdks/python/apache_beam/internal/gcp/auth.py
Per
https://stackoverflow.com/questions/2405590/how-do-i-override-getattr-without-breaking-the-default-behavior,
explicitly defined methods take priority over `__getattr__`.
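For illustration only (the class name here is hypothetical, not from this PR), the delegation pattern would look roughly like:

```python
class _DelegatingModelHandler:
  """Hypothetical sketch: forwards unknown attribute lookups to a base handler."""
  def __init__(self, base):
    self._base = base

  # Explicitly defined methods win: Python only calls __getattr__ when
  # normal attribute lookup on this object fails.
  def share_model_across_processes(self) -> bool:
    return True

  def __getattr__(self, attr):
    # Delegate everything else (get_preprocess_fns, should_skip_batching, ...)
    # to the wrapped base handler.
    return getattr(self._base, attr)
```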
##########
sdks/python/apache_beam/ml/inference/base.py:
##########
@@ -1434,19 +1518,28 @@ def load():
     if isinstance(side_input_model_path, str) and side_input_model_path != '':
       model_tag = side_input_model_path
     if self._model_handler.share_model_across_processes():
-      model = multi_process_shared.MultiProcessShared(
-          load, tag=model_tag, always_proxy=True).acquire()
+      # TODO - update this to populate a list of models of configurable length
+      models = []
+      for i in range(self._model_handler.max_shared_model_copies()):
+        models.append(
+            multi_process_shared.MultiProcessShared(
+                load, tag=f'{model_tag}{i}', always_proxy=True).acquire())
+      model_wrapper = _CrossProcessModelWrapper(models, model_tag)
     else:
       model = self._shared_model_handle.acquire(load, tag=model_tag)
+      model_wrapper = _CrossProcessModelWrapper([model], model_tag)
Review Comment:
Oh, I actually got confused by the `>=0` condition and thought that you
intended to use the multi-process-shared path regardless of the number of
models. `_SharedModelWrapper` is better.
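As a rough illustration of what such a wrapper does (a minimal sketch; the round-robin policy and names are assumptions, not necessarily what the PR implements):

```python
import threading


class _SharedModelWrapper:
  """Sketch only: holds one or more loaded model copies behind a single handle."""
  def __init__(self, models, tag):
    self.models = models
    self.tag = tag
    self._next = 0
    self._lock = threading.Lock()

  def next_model(self):
    # With a single copy (the non-shared path) this is just that model;
    # with multiple copies, hand them out round-robin.
    if len(self.models) == 1:
      return self.models[0]
    with self._lock:
      model = self.models[self._next % len(self.models)]
      self._next += 1
    return model
```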
##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -234,6 +235,9 @@ def __init__(
         memory pressure if you load multiple copies. Given a model that
         consumes N memory and a machine with W cores and M memory, you should
         set this to True if N*W > M.
+      model_copies: The exact number of models that you would like loaded
+        onto your machine. This can be useful if you exactly know your CPU or
Review Comment:
Oh, my concern was not an incorrect configuration but the cognitive burden for
users: would they wonder whether they should set only one param, or both, in
their use case, when in the end it doesn't matter? But now it also seems that
`large_model` becomes redundant, as it is equivalent to passing `model_copies =
1`, right?
Possibly except for the fact that using `model_copies` is currently disallowed
with KeyedMH, while `large_model` might still allow that.
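To make that equivalence concrete, a hypothetical resolution helper (not in the PR; the name and exact rule are assumptions) could collapse the two knobs like this:

```python
from typing import Optional


def _resolve_model_copies(large_model: bool, model_copies: Optional[int]) -> int:
  """Hypothetical sketch: reduce the two parameters to one count of shared copies."""
  if model_copies is not None:
    # An explicit count always wins.
    return model_copies
  if large_model:
    # large_model=True behaves like asking for a single cross-process copy.
    return 1
  # Neither set: no cross-process sharing requested, load per process as usual.
  return 0
```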
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]