[
https://issues.apache.org/jira/browse/BEAM-14218?focusedWorklogId=762063&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762063
]
ASF GitHub Bot logged work on BEAM-14218:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 25/Apr/22 23:01
Start Date: 25/Apr/22 23:01
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on code in PR #17448:
URL: https://github.com/apache/beam/pull/17448#discussion_r858088473
##########
sdks/python/apache_beam/ml/inference/base.py:
##########
@@ -83,14 +84,29 @@ def get_inference_runner(self) -> InferenceRunner:
class RunInference(beam.PTransform):
- """An extensible transform for running inferences."""
- def __init__(self, model_loader: ModelLoader, clock=None):
+ """An extensible transform for running inferences.
+ Args:
+ model_loader: An implementation of InferenceRunner.
+ clock: A clock implementing get_current_time_in_microseconds.
+ close_to_resource: A string representing the resource location hints.
+ """
+ def __init__(self,
+ model_loader: ModelLoader,
+ clock:_Clock=None,
+ close_to_resource:str=None):
self._model_loader = model_loader
self._clock = clock
+ self._close_to_resource = close_to_resource
Review Comment:
Should this be a property of the model loader? That's what has knowledge of
the resource, right?
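A minimal sketch of that alternative, using simplified stand-ins rather than the actual classes in base.py (GcsModelLoader, its model path, and the region string are all hypothetical, purely for illustration): the loader, which knows where its model artifact lives, exposes the location hint itself.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class ModelLoader(ABC):
    """Simplified stand-in for the PR's ModelLoader."""
    @abstractmethod
    def load_model(self) -> Any:
        ...

    @property
    def close_to_resource(self) -> Optional[str]:
        # Default: no location hint.
        return None


class GcsModelLoader(ModelLoader):
    """Hypothetical loader that knows its model's storage location."""
    def __init__(self, model_path: str):
        self._model_path = model_path

    def load_model(self) -> Any:
        return object()  # placeholder for real model loading

    @property
    def close_to_resource(self) -> str:
        # e.g. a region hint derived from where the artifact is stored
        return 'us-central1'
```

RunInference could then read `model_loader.close_to_resource` instead of taking a separate constructor argument.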
##########
sdks/python/apache_beam/ml/inference/base.py:
##########
@@ -83,14 +84,29 @@ def get_inference_runner(self) -> InferenceRunner:
class RunInference(beam.PTransform):
- """An extensible transform for running inferences."""
- def __init__(self, model_loader: ModelLoader, clock=None):
+ """An extensible transform for running inferences.
+ Args:
+ model_loader: An implementation of InferenceRunner.
+ clock: A clock implementing get_current_time_in_microseconds.
+ close_to_resource: A string representing the resource location hints.
+ """
+ def __init__(self,
+ model_loader: ModelLoader,
+ clock:_Clock=None,
Review Comment:
Unrelated: How about defining an interface like NanosecondClock to use here?
Then _Clock can just be the default implementation.
##########
sdks/python/apache_beam/ml/inference/base.py:
##########
@@ -83,14 +84,29 @@ def get_inference_runner(self) -> InferenceRunner:
class RunInference(beam.PTransform):
- """An extensible transform for running inferences."""
- def __init__(self, model_loader: ModelLoader, clock=None):
+ """An extensible transform for running inferences.
+ Args:
+ model_loader: An implementation of InferenceRunner.
+ clock: A clock implementing get_current_time_in_microseconds.
+ close_to_resource: A string representing the resource location hints.
+ """
+ def __init__(self,
+ model_loader: ModelLoader,
+ clock:_Clock=None,
+ close_to_resource:str=None):
self._model_loader = model_loader
self._clock = clock
+ self._close_to_resource = close_to_resource
# TODO(BEAM-14208): Add batch_size back off in the case there
# are functional reasons large batch sizes cannot be handled.
def expand(self, pcoll: beam.PCollection) -> beam.PCollection:
+ # TODO(BEAM-13690): Do this unconditionally once resolved.
+ if (resources.ResourceHint.is_registered('close_to_resources') and
+ self._close_to_resource):
+ pcoll |= (
+ 'CloseToResources' >> beam.Map(lambda x: x).with_resource_hints(
+ close_to_resources=self._close_to_resource))
Review Comment:
Why not add this resource hint to the _RunInferenceDoFn instead of adding an
identity ParDo?
##########
sdks/python/apache_beam/ml/inference/base.py:
##########
@@ -83,14 +84,29 @@ def get_inference_runner(self) -> InferenceRunner:
class RunInference(beam.PTransform):
- """An extensible transform for running inferences."""
- def __init__(self, model_loader: ModelLoader, clock=None):
+ """An extensible transform for running inferences.
+ Args:
+ model_loader: An implementation of InferenceRunner.
Review Comment:
```suggestion
model_loader: An implementation of ModelLoader.
```
Issue Time Tracking
-------------------
Worklog Id: (was: 762063)
Time Spent: 1h 20m (was: 1h 10m)
> Add resource location hints to base RunInference Implementation
> ---------------------------------------------------------------
>
> Key: BEAM-14218
> URL: https://issues.apache.org/jira/browse/BEAM-14218
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Ryan Thompson
> Assignee: Ryan Thompson
> Priority: P2
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Our generic version of RunInference should add resource location hints,
> similar to what TFX-BSL does here.
>
> [https://github.com/tensorflow/tfx-bsl/blob/182918aaefd5287e7669d87bfc818155470315aa/tfx_bsl/beam/run_inference.py#L117]
>
> There's no reason that these hints shouldn't be just as relevant to
> scikit-learn or PyTorch large models hosted on GC or wherever these hints
> are supported.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)