[jira] [Work logged] (BEAM-13984) Implement RunInference for PyTorch

ASF GitHub Bot (Jira) Thu, 07 Apr 2022 07:07:04 -0700


     [ 
https://issues.apache.org/jira/browse/BEAM-13984?focusedWorklogId=754101&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-754101
 ]


ASF GitHub Bot logged work on BEAM-13984:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Apr/22 14:06
            Start Date: 07/Apr/22 14:06
    Worklog Time Spent: 10m 
      Work Description: yeandy commented on code in PR #17196:
URL: https://github.com/apache/beam/pull/17196#discussion_r845179479


##########
sdks/python/apache_beam/ml/inference/pytorch_impl_test.py:
##########
@@ -0,0 +1,221 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# pytype: skip-file
+
+import os
+import shutil
+import tempfile
+import unittest
+from collections import OrderedDict
+
+import numpy as np
+import pytest
+import torch
+
+import apache_beam as beam
+from apache_beam.ml.inference import base
+from apache_beam.ml.inference.pytorch_impl import PytorchModelLoader
+from apache_beam.testing.test_pipeline import TestPipeline
+from apache_beam.testing.util import assert_that
+from apache_beam.testing.util import equal_to
+
+
+class PytorchLinearRegression(torch.nn.Module):
+  def __init__(self, inputSize, outputSize):
+    super().__init__()
+    self.linear = torch.nn.Linear(inputSize, outputSize)
+
+  def forward(self, x):
+    out = self.linear(x)
+    return out
+
+
+class PytorchRunInferenceTest(unittest.TestCase):
+  def setUp(self):
+    self.tmpdir = tempfile.mkdtemp()
+
+  def tearDown(self):
+    shutil.rmtree(self.tmpdir)
+
+  def test_simple_single_tensor_feature(self):
+    with TestPipeline() as pipeline:
+      examples = torch.from_numpy(
+          np.array([1, 5, 3, 10], dtype="float32").reshape(-1, 1))
+      expected = torch.Tensor([example * 2.0 + 0.5 for example in examples])
+
+      state_dict = OrderedDict([('linear.weight', torch.Tensor([[2.0]])),
+                                ('linear.bias', torch.Tensor([0.5]))])
+      path = os.path.join(self.tmpdir, 'my_state_dict_path')
+      torch.save(state_dict, path)
+
+      input_dim = 1
+      output_dim = 1
+
+      model_loader = PytorchModelLoader(
+          input_dim=input_dim,
+          state_dict_path=path,
+          model_class=PytorchLinearRegression(input_dim, output_dim))
+
+      pcoll = pipeline | 'start' >> beam.Create(examples)
+      actual = pcoll | base.RunInference(model_loader)
+      assert_that(actual, equal_to(expected))
+
+  def test_invalid_input_type(self):
+    with self.assertRaisesRegex(
+        ValueError, "PCollection must be an numpy array or a torch Tensor"):
+      with TestPipeline() as pipeline:
+        examples = [1, 5, 3, 10]
+
+        state_dict = OrderedDict([('linear.weight', torch.Tensor([[2.0]])),
+                                  ('linear.bias', torch.Tensor([0.5]))])
+        path = os.path.join(self.tmpdir, 'my_state_dict_path')
+        torch.save(state_dict, path)
+
+        input_dim = 1
+        output_dim = 1
+
+        model_loader = PytorchModelLoader(
+            input_dim=input_dim,
+            state_dict_path=path,
+            model_class=PytorchLinearRegression(input_dim, output_dim))
+
+        pcoll = pipeline | 'start' >> beam.Create(examples)
+        # pylint: disable=expression-not-assigned
+        pcoll | base.RunInference(model_loader)

Review Comment:
   It's really fast: a single test case takes 1 second when run locally.
   
   Yeah, this is getting into integration-test territory. If we really
   wanted to keep it simple, I could just construct a `PytorchModelLoader`
   object and verify that the `PytorchInferenceRunner` and the proper
   variables are set.
   
   Should I do that?
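
   For illustration, the simpler check could look roughly like the sketch below. It uses stand-in classes so that it runs without Beam or PyTorch installed. `StubModelLoader`, `StubInferenceRunner`, the stored attribute names, and the `get_inference_runner` method are all assumptions made for this example, not the actual API in the PR.

```python
import unittest


class StubInferenceRunner:
  """Hypothetical stand-in for PytorchInferenceRunner."""
  def __init__(self, input_dim):
    self.input_dim = input_dim


class StubModelLoader:
  """Hypothetical stand-in for PytorchModelLoader.

  The constructor arguments mirror the ones used in the test file above;
  the attribute names and get_inference_runner() are assumed.
  """
  def __init__(self, input_dim, state_dict_path, model_class):
    self.input_dim = input_dim
    self.state_dict_path = state_dict_path
    self.model_class = model_class

  def get_inference_runner(self):
    return StubInferenceRunner(self.input_dim)


class ModelLoaderConstructionTest(unittest.TestCase):
  def test_loader_sets_runner_and_variables(self):
    loader = StubModelLoader(
        input_dim=1, state_dict_path='my_state_dict_path', model_class=object)
    # No pipeline is run: only inspect what the constructor stored.
    self.assertEqual(loader.input_dim, 1)
    self.assertEqual(loader.state_dict_path, 'my_state_dict_path')
    self.assertIs(loader.model_class, object)
    self.assertIsInstance(loader.get_inference_runner(), StubInferenceRunner)


def run_sketch():
  """Runs the sketch test case and returns the unittest result object."""
  suite = unittest.defaultTestLoader.loadTestsFromTestCase(
      ModelLoaderConstructionTest)
  return unittest.TextTestRunner(verbosity=0).run(suite)
```

   A test in this style never builds a `TestPipeline`, so it only covers how the loader is wired up; the pipeline-based tests in the diff would still be needed to exercise `RunInference` end to end.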





Issue Time Tracking
-------------------

    Worklog Id:     (was: 754101)
    Time Spent: 2.5h  (was: 2h 20m)

> Implement RunInference for PyTorch
> ----------------------------------
>
>                 Key: BEAM-13984
>                 URL: https://issues.apache.org/jira/browse/BEAM-13984
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Andy Ye
>            Assignee: Andy Ye
>            Priority: P2
>              Labels: run-inference
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Implement RunInference for PyTorch as described in the design doc 
> [https://s.apache.org/inference-sklearn-pytorch]
> There will be a pytorch_impl.py file that contains PyTorchModelLoader and 
> PyTorchInferenceRunner classes.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
