[ 
https://issues.apache.org/jira/browse/BEAM-13982?focusedWorklogId=753826&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-753826
 ]

ASF GitHub Bot logged work on BEAM-13982:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Apr/22 03:31
            Start Date: 07/Apr/22 03:31
    Worklog Time Spent: 10m 
      Work Description: ryanthompson591 commented on code in PR #16970:
URL: https://github.com/apache/beam/pull/16970#discussion_r844607799


##########
sdks/python/apache_beam/ml/inference/base.py:
##########
@@ -0,0 +1,262 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""An extensible run inference transform.
+
+Users of this module can extend the ModelLoader class for any ML framework,
+then pass their extended ModelLoader object into RunInference to create a
+RunInference Beam transform for that framework.
+
+The transform will handle standard inference functionality like metric
+collection, sharing a model between threads, and batching elements.
+"""
+
+import logging
+import os
+import pickle
+import platform
+import sys
+import time
+from typing import Any
+from typing import Iterable
+from typing import Tuple
+
+import apache_beam as beam
+from apache_beam.utils import shared
+
+try:
+  # pylint: disable=g-import-not-at-top
+  import resource
+except ImportError:
+  resource = None
+
+_MICROSECOND_TO_MILLISECOND = 1000
+_NANOSECOND_TO_MICROSECOND = 1000
+_SECOND_TO_MICROSECOND = 1_000_000
+
+
+class InferenceRunner():
+  """Implements running inferences for a framework."""
+  def run_inference(self, batch: Any, model: Any) -> Iterable[Any]:
+    """Runs inferences on a batch of examples and returns an Iterable of
+    predictions."""
+    raise NotImplementedError(type(self))
+
+  def get_num_bytes(self, batch: Any) -> int:
+    """Returns the number of bytes of data for a batch."""
+    return len(pickle.dumps(batch))
+
+  def get_metrics_namespace(self) -> str:
+    """Returns a namespace for metrics collected by the RunInference
+    transform."""
+    return 'RunInference'
+
+
+class ModelLoader():
+  """Has the ability to load an ML model."""
+  def load_model(self) -> Any:

Review Comment:
   I originally wrote it this way because I was under the impression that
shared needed to live outside of the DoFn. However, if it can live inside the
DoFn, why can't it also live inside the model loader?
   
   Originally I built the model loader and the inference runner as separate
interfaces.
   
   My question then is: why not just have a single interface?
   
   class InferenceRunner:
     def __init__(self):
        self.shared_handle = shared.Shared()
        self.model = None
     def load_model(self):
        self.model = ....
     def run_inference(self):
        ...
   
   Is this what we want to do?
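
   For reference, the combined-interface idea above can be sketched as a
self-contained snippet. The base class and the toy DoubleItRunner subclass are
illustrative only; in the actual proposal the handle would come from
apache_beam.utils.shared.Shared(), which is omitted here so the sketch runs
standalone:

```python
import pickle
from typing import Any, Iterable


class InferenceRunner:
  """Sketch of one combined interface: model loading and inference
  live on the same object (illustrative, not the actual Beam API)."""
  def __init__(self):
    self.model = None  # populated once by load_model()

  def load_model(self) -> Any:
    raise NotImplementedError(type(self))

  def run_inference(self, batch: Iterable[Any]) -> Iterable[Any]:
    raise NotImplementedError(type(self))

  def get_num_bytes(self, batch: Any) -> int:
    # Default: approximate batch size via its pickled representation.
    return len(pickle.dumps(batch))


class DoubleItRunner(InferenceRunner):
  """Toy subclass: the 'model' is just a multiplier."""
  def load_model(self) -> Any:
    self.model = 2  # stand-in for an expensive model load
    return self.model

  def run_inference(self, batch):
    return [x * self.model for x in batch]


runner = DoubleItRunner()
runner.load_model()
print(runner.run_inference([1, 2, 3]))  # → [2, 4, 6]
```

   A framework integration would subclass once and implement both methods,
instead of wiring a separate ModelLoader into an InferenceRunner.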





Issue Time Tracking
-------------------

    Worklog Id:     (was: 753826)
    Time Spent: 16h 50m  (was: 16h 40m)

> Implement Generic RunInference Base class
> -----------------------------------------
>
>                 Key: BEAM-13982
>                 URL: https://issues.apache.org/jira/browse/BEAM-13982
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Andy Ye
>            Assignee: Ryan Thompson
>            Priority: P2
>              Labels: run-inference
>          Time Spent: 16h 50m
>  Remaining Estimate: 0h
>
> This base class will have
>  * Metrics
>  * Will call dependent framework-specific classes
>  * Unit tests



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
