[
https://issues.apache.org/jira/browse/BEAM-13984?focusedWorklogId=754101&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-754101
]
ASF GitHub Bot logged work on BEAM-13984:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 07/Apr/22 14:06
Start Date: 07/Apr/22 14:06
Worklog Time Spent: 10m
Work Description: yeandy commented on code in PR #17196:
URL: https://github.com/apache/beam/pull/17196#discussion_r845179479
##########
sdks/python/apache_beam/ml/inference/pytorch_impl_test.py:
##########
@@ -0,0 +1,221 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# pytype: skip-file
+
+import os
+import shutil
+import tempfile
+import unittest
+from collections import OrderedDict
+
+import numpy as np
+import pytest
+import torch
+
+import apache_beam as beam
+from apache_beam.ml.inference import base
+from apache_beam.ml.inference.pytorch_impl import PytorchModelLoader
+from apache_beam.testing.test_pipeline import TestPipeline
+from apache_beam.testing.util import assert_that
+from apache_beam.testing.util import equal_to
+
+
+class PytorchLinearRegression(torch.nn.Module):
+  def __init__(self, inputSize, outputSize):
+    super().__init__()
+    self.linear = torch.nn.Linear(inputSize, outputSize)
+
+  def forward(self, x):
+    out = self.linear(x)
+    return out
+
+
+class PytorchRunInferenceTest(unittest.TestCase):
+  def setUp(self):
+    self.tmpdir = tempfile.mkdtemp()
+
+  def tearDown(self):
+    shutil.rmtree(self.tmpdir)
+
+  def test_simple_single_tensor_feature(self):
+    with TestPipeline() as pipeline:
+      examples = torch.from_numpy(
+          np.array([1, 5, 3, 10], dtype="float32").reshape(-1, 1))
+      expected = torch.Tensor([example * 2.0 + 0.5 for example in examples])
+
+      state_dict = OrderedDict([('linear.weight', torch.Tensor([[2.0]])),
+                                ('linear.bias', torch.Tensor([0.5]))])
+      path = os.path.join(self.tmpdir, 'my_state_dict_path')
+      torch.save(state_dict, path)
+
+      input_dim = 1
+      output_dim = 1
+
+      model_loader = PytorchModelLoader(
+          input_dim=input_dim,
+          state_dict_path=path,
+          model_class=PytorchLinearRegression(input_dim, output_dim))
+
+      pcoll = pipeline | 'start' >> beam.Create(examples)
+      actual = pcoll | base.RunInference(model_loader)
+      assert_that(actual, equal_to(expected))
+
+  def test_invalid_input_type(self):
+    with self.assertRaisesRegex(
+        ValueError, "PCollection must be an numpy array or a torch Tensor"):
+      with TestPipeline() as pipeline:
+        examples = [1, 5, 3, 10]
+
+        state_dict = OrderedDict([('linear.weight', torch.Tensor([[2.0]])),
+                                  ('linear.bias', torch.Tensor([0.5]))])
+        path = os.path.join(self.tmpdir, 'my_state_dict_path')
+        torch.save(state_dict, path)
+
+        input_dim = 1
+        output_dim = 1
+
+        model_loader = PytorchModelLoader(
+            input_dim=input_dim,
+            state_dict_path=path,
+            model_class=PytorchLinearRegression(input_dim, output_dim))
+
+        pcoll = pipeline | 'start' >> beam.Create(examples)
+        # pylint: disable=expression-not-assigned
+        pcoll | base.RunInference(model_loader)
Review Comment:
It's really fast: a single test case takes about 1 second when run locally.
Yeah, this is getting into integration-test territory. If we really want to
keep it simple, I could just construct a `PytorchModelLoader` object and
check that `PytorchInferenceRunner` and the proper variables are set.
Should I do that?
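A minimal sketch of what such a construction-only test could look like. It
uses hypothetical stand-in classes (`StubModelLoader`, `StubInferenceRunner`)
rather than the real Beam classes, and it assumes the loader stores its
constructor arguments as attributes and exposes a `get_inference_runner()`
method; the real `PytorchModelLoader` may differ.

```python
import unittest


class StubInferenceRunner:
  """Hypothetical stand-in for PytorchInferenceRunner."""
  def __init__(self, input_dim):
    self.input_dim = input_dim


class StubModelLoader:
  """Hypothetical stand-in for PytorchModelLoader.

  Attribute and method names here are assumptions for illustration only.
  """
  def __init__(self, input_dim, state_dict_path, model_class):
    self.input_dim = input_dim
    self.state_dict_path = state_dict_path
    self.model_class = model_class

  def get_inference_runner(self):
    return StubInferenceRunner(self.input_dim)


class ModelLoaderConstructionTest(unittest.TestCase):
  def test_loader_sets_expected_state(self):
    # No pipeline is built: only the loader's own state is checked.
    loader = StubModelLoader(
        input_dim=1, state_dict_path='my_state_dict_path', model_class=object)

    self.assertEqual(loader.input_dim, 1)
    self.assertEqual(loader.state_dict_path, 'my_state_dict_path')
    self.assertIs(loader.model_class, object)
    self.assertIsInstance(loader.get_inference_runner(), StubInferenceRunner)


def run_construction_test():
  """Runs the test case above and returns the unittest result object."""
  suite = unittest.defaultTestLoader.loadTestsFromTestCase(
      ModelLoaderConstructionTest)
  return unittest.TextTestRunner(verbosity=0).run(suite)
```

The trade-off is that a test like this never exercises `RunInference` or
model loading, so it would not catch the invalid-input case covered by the
pipeline-based tests.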
Issue Time Tracking
-------------------
Worklog Id: (was: 754101)
Time Spent: 2.5h (was: 2h 20m)
> Implement RunInference for PyTorch
> ----------------------------------
>
> Key: BEAM-13984
> URL: https://issues.apache.org/jira/browse/BEAM-13984
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Andy Ye
> Assignee: Andy Ye
> Priority: P2
> Labels: run-inference
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> Implement RunInference for PyTorch as described in the design doc
> [https://s.apache.org/inference-sklearn-pytorch]
> There will be a pytorch_impl.py file that contains PyTorchModelLoader and
> PyTorchInferenceRunner classes.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)