tvalentyn commented on code in PR #31052:
URL: https://github.com/apache/beam/pull/31052#discussion_r1575524505
##########
sdks/python/apache_beam/ml/inference/base.py:
##########
@@ -952,6 +977,12 @@ def get_preprocess_fns(self) -> Iterable[Callable[[Any], Any]]:
   def should_skip_batching(self) -> bool:
     return True
+  def share_model_across_processes(self) -> bool:
Review Comment:
(cleanup, can be deferred)
We can leverage reflection here and delegate calls to the base via `__getattr__`,
like in
https://github.com/apache/beam/blob/37609ba70fab2216edc338121bf2f3a056a1e490/sdks/python/apache_beam/internal/gcp/auth.py
Per
https://stackoverflow.com/questions/2405590/how-do-i-override-getattr-without-breaking-the-default-behavior,
explicitly defined methods take priority over `__getattr__`.
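For illustration only (the class name here is hypothetical, not from this PR), the delegation pattern would look roughly like:

```python
class _DelegatingModelHandler:
  """Hypothetical sketch: forwards unknown attribute lookups to a base handler."""
  def __init__(self, base):
    self._base = base

  # Explicitly defined methods win: Python only calls __getattr__ when
  # normal attribute lookup on this object fails.
  def share_model_across_processes(self) -> bool:
    return True

  def __getattr__(self, attr):
    # Delegate everything else (get_preprocess_fns, should_skip_batching, ...)
    # to the wrapped base handler.
    return getattr(self._base, attr)
```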
##########
sdks/python/apache_beam/ml/inference/base.py:
##########
@@ -1434,19 +1518,28 @@ def load():
     if isinstance(side_input_model_path, str) and side_input_model_path != '':
       model_tag = side_input_model_path
     if self._model_handler.share_model_across_processes():
-      model = multi_process_shared.MultiProcessShared(
-          load, tag=model_tag, always_proxy=True).acquire()
+      # TODO - update this to populate a list of models of configurable length
+      models = []
+      for i in range(self._model_handler.max_shared_model_copies()):
+        models.append(
+            multi_process_shared.MultiProcessShared(
+                load, tag=f'{model_tag}{i}', always_proxy=True).acquire())
+      model_wrapper = _CrossProcessModelWrapper(models, model_tag)
     else:
       model = self._shared_model_handle.acquire(load, tag=model_tag)
+      model_wrapper = _CrossProcessModelWrapper([model], model_tag)
Review Comment:
Oh, I actually got confused by the `>=0` condition and thought that you
intended to use the multi-process-shared path regardless of the number of
models. `_SharedModelWrapper` is better.
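As a rough illustration of what such a wrapper does (a minimal sketch; the round-robin policy and names are assumptions, not necessarily what the PR implements):

```python
import threading


class _SharedModelWrapper:
  """Sketch only: holds one or more loaded model copies behind a single handle."""
  def __init__(self, models, tag):
    self.models = models
    self.tag = tag
    self._next = 0
    self._lock = threading.Lock()

  def next_model(self):
    # With a single copy (the non-shared path) this is just that model;
    # with multiple copies, hand them out round-robin.
    if len(self.models) == 1:
      return self.models[0]
    with self._lock:
      model = self.models[self._next % len(self.models)]
      self._next += 1
    return model
```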
##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -234,6 +235,9 @@ def __init__(
         memory pressure if you load multiple copies. Given a model that
         consumes N memory and a machine with W cores and M memory, you should
         set this to True if N*W > M.
+      model_copies: The exact number of models that you would like loaded
+        onto your machine. This can be useful if you exactly know your CPU or
Review Comment:
Oh, my concern was not an incorrect configuration but the cognitive burden for
users: would they wonder whether they should set only one param, or both, in
their use case, when in the end it doesn't matter? But now it also seems that
`large_model` becomes redundant, as it is equivalent to passing `model_copies =
1`, right?
Possibly except for the fact that using `model_copies` is currently disallowed
with KeyedMH, while `large_model` might still allow that.
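To make that equivalence concrete, a hypothetical resolution helper (not in the PR; the name and exact rule are assumptions) could collapse the two knobs like this:

```python
from typing import Optional


def _resolve_model_copies(large_model: bool, model_copies: Optional[int]) -> int:
  """Hypothetical sketch: reduce the two parameters to one count of shared copies."""
  if model_copies is not None:
    # An explicit count always wins.
    return model_copies
  if large_model:
    # large_model=True behaves like asking for a single cross-process copy.
    return 1
  # Neither set: no cross-process sharing requested, load per process as usual.
  return 0
```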
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]