[
https://issues.apache.org/jira/browse/BEAM-14068?focusedWorklogId=774242&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774242
]
ASF GitHub Bot logged work on BEAM-14068:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 24/May/22 21:36
Start Date: 24/May/22 21:36
Worklog Time Spent: 10m
Work Description: AnandInguva commented on code in PR #17462:
URL: https://github.com/apache/beam/pull/17462#discussion_r880957064
##########
sdks/python/apache_beam/examples/inference/pytorch_image_classification.py:
##########
@@ -0,0 +1,122 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import argparse
+import io
+import os
+from functools import partial
+
+import apache_beam as beam
+import torch
+import torchvision
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.ml.inference.api import RunInference
+from apache_beam.ml.inference.pytorch import PytorchModelLoader
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+from PIL import Image
+from torchvision import transforms
+
+
+def read_image(image_file_name: str, path_to_dir: str):
+ image_file_name = os.path.join(path_to_dir, image_file_name)
+ with FileSystems().open(image_file_name, 'r') as file:
+ data = Image.open(io.BytesIO(file.read())).convert('RGB')
+ return image_file_name, data
+
+
+def preprocess_data(data):
Review Comment:
Done
##########
sdks/python/apache_beam/examples/inference/pytorch_image_classification.py:
##########
@@ -0,0 +1,122 @@
+
+import argparse
+import io
+import os
+from functools import partial
+
+import apache_beam as beam
+import torch
+import torchvision
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.ml.inference.api import RunInference
+from apache_beam.ml.inference.pytorch import PytorchModelLoader
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+from PIL import Image
+from torchvision import transforms
+
+
+def read_image(image_file_name: str, path_to_dir: str):
+ image_file_name = os.path.join(path_to_dir, image_file_name)
+ with FileSystems().open(image_file_name, 'r') as file:
+ data = Image.open(io.BytesIO(file.read())).convert('RGB')
+ return image_file_name, data
+
+
+def preprocess_data(data):
+ image_size = (224, 224)
+ normalize = transforms.Normalize(
+ mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
Review Comment:
Done
##########
sdks/python/apache_beam/examples/inference/pytorch_image_classification.py:
##########
@@ -0,0 +1,122 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
Review Comment:
Done.
##########
sdks/python/apache_beam/examples/inference/pytorch_image_classification.py:
##########
@@ -0,0 +1,122 @@
+
+import argparse
+import io
+import os
+from functools import partial
+
+import apache_beam as beam
+import torch
+import torchvision
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.ml.inference.api import RunInference
+from apache_beam.ml.inference.pytorch import PytorchModelLoader
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+from PIL import Image
+from torchvision import transforms
+
+
+def read_image(image_file_name: str, path_to_dir: str):
+ image_file_name = os.path.join(path_to_dir, image_file_name)
+ with FileSystems().open(image_file_name, 'r') as file:
+ data = Image.open(io.BytesIO(file.read())).convert('RGB')
+ return image_file_name, data
+
+
+def preprocess_data(data):
+ image_size = (224, 224)
+ normalize = transforms.Normalize(
+ mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+ transform = transforms.Compose([
+ transforms.Resize(image_size),
+ transforms.ToTensor(),
+ normalize,
+ ])
+ return transform(data)
+
+
+class PostProcessor(beam.DoFn):
+ """Post process PredictionResult to output filename and
+ prediction using torch."""
Review Comment:
Reworded the docstring
##########
sdks/python/apache_beam/examples/inference/pytorch_image_classification.py:
##########
@@ -0,0 +1,122 @@
+
+import argparse
+import io
+import os
+from functools import partial
+
+import apache_beam as beam
+import torch
+import torchvision
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.ml.inference.api import RunInference
+from apache_beam.ml.inference.pytorch import PytorchModelLoader
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+from PIL import Image
+from torchvision import transforms
+
+
+def read_image(image_file_name: str, path_to_dir: str):
+ image_file_name = os.path.join(path_to_dir, image_file_name)
+ with FileSystems().open(image_file_name, 'r') as file:
+ data = Image.open(io.BytesIO(file.read())).convert('RGB')
+ return image_file_name, data
+
+
+def preprocess_data(data):
+ image_size = (224, 224)
+ normalize = transforms.Normalize(
+ mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+ transform = transforms.Compose([
+ transforms.Resize(image_size),
+ transforms.ToTensor(),
+ normalize,
+ ])
+ return transform(data)
+
+
+class PostProcessor(beam.DoFn):
+ """Post process PredictionResult to output filename and
+ prediction using torch."""
+ def process(self, element):
+ filename, prediction_result = element
+ prediction = torch.argmax(prediction_result.inference, dim=0)
+ yield filename + ',' + str(int(prediction))
Review Comment:
Without `int`, the output was `tensor(value)`. To get the bare value, I
converted it to `int`, so that the text file contains <value> instead of
<tensor(value)>. Looking into it further, torch also has a method,
`tensor.item()`, which returns the value as a plain Python number (an int or
a float, depending on the tensor).
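A minimal sketch of the difference, assuming torch is installed. The scores
below are made up for illustration and are not output from the model in this
PR:

```python
import torch

# Made-up scores for a three-class problem (illustrative only).
scores = torch.tensor([0.1, 2.5, 0.3])

# argmax over dim 0 returns a 0-dim integer tensor holding the winning index.
prediction = torch.argmax(scores, dim=0)

as_tensor = str(prediction)       # string form of the tensor object itself
as_int = str(int(prediction))     # explicit conversion, as in the current code
as_item = str(prediction.item())  # .item() unwraps to a plain Python number

print(as_tensor, as_int, as_item)
```

For an integer index such as the result of `argmax`, `int(...)` and `.item()`
give the same text; `.item()` also works for float tensors without truncating.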
##########
sdks/python/apache_beam/examples/inference/pytorch_image_classification.py:
##########
@@ -0,0 +1,122 @@
+
+import argparse
+import io
+import os
+from functools import partial
+
+import apache_beam as beam
+import torch
+import torchvision
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.ml.inference.api import RunInference
+from apache_beam.ml.inference.pytorch import PytorchModelLoader
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+from PIL import Image
+from torchvision import transforms
+
+
+def read_image(image_file_name: str, path_to_dir: str):
Review Comment:
Done. I added them for most of the methods.
##########
sdks/python/apache_beam/examples/inference/pytorch_image_classification.py:
##########
@@ -0,0 +1,122 @@
+
+import argparse
+import io
+import os
+from functools import partial
+
+import apache_beam as beam
+import torch
+import torchvision
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.ml.inference.api import RunInference
+from apache_beam.ml.inference.pytorch import PytorchModelLoader
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+from PIL import Image
+from torchvision import transforms
+
+
+def read_image(image_file_name: str, path_to_dir: str):
+ image_file_name = os.path.join(path_to_dir, image_file_name)
+ with FileSystems().open(image_file_name, 'r') as file:
+ data = Image.open(io.BytesIO(file.read())).convert('RGB')
+ return image_file_name, data
+
+
+def preprocess_data(data):
+ image_size = (224, 224)
+ normalize = transforms.Normalize(
+ mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+ transform = transforms.Compose([
+ transforms.Resize(image_size),
+ transforms.ToTensor(),
+ normalize,
+ ])
+ return transform(data)
+
+
+class PostProcessor(beam.DoFn):
+ """Post process PredictionResult to output filename and
+ prediction using torch."""
+ def process(self, element):
+ filename, prediction_result = element
+ prediction = torch.argmax(prediction_result.inference, dim=0)
+ yield filename + ',' + str(int(prediction))
+
+
+def run_pipeline(options: PipelineOptions, args=None):
+ """Sets up PyTorch RunInference pipeline"""
+ model_class = torchvision.models.mobilenet_v2
+ model_params = {'pretrained': False}
Review Comment:
This change is made in https://github.com/apache/beam/pull/17494
##########
build.gradle.kts:
##########
@@ -312,25 +312,26 @@ tasks.register("python37PostCommit") {
dependsOn(":sdks:python:test-suites:dataflow:py37:spannerioIT")
dependsOn(":sdks:python:test-suites:direct:py37:spannerioIT")
dependsOn(":sdks:python:test-suites:portable:py37:xlangSpannerIOIT")
+ dependsOn(":sdks:python:test-suites:dataflow:py37:torchTests")
+
}
tasks.register("python38PostCommit") {
dependsOn(":sdks:python:test-suites:dataflow:py38:postCommitIT")
dependsOn(":sdks:python:test-suites:direct:py38:postCommitIT")
dependsOn(":sdks:python:test-suites:direct:py38:hdfsIntegrationTest")
dependsOn(":sdks:python:test-suites:portable:py38:postCommitPy38")
+ dependsOn(":sdks:python:test-suites:dataflow:py38:torchTests")
Review Comment:
Makes sense.
##########
sdks/python/apache_beam/ml/inference/pytorch_it_test.py:
##########
@@ -0,0 +1,95 @@
+
+# pylint: skip-file
+
+"""End-to-End test for Pytorch Inference"""
+
+import logging
+import unittest
+import uuid
+
+import pytest
+
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.testing.test_pipeline import TestPipeline
+
+try:
+ import torch
+ from apache_beam.examples.inference import pytorch_image_classification
+except ImportError as e:
+ torch = None
+
+_EXPECTED_OUTPUTS = {
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005001.JPEG': '681',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005002.JPEG': '333',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005003.JPEG': '711',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005004.JPEG': '286',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005005.JPEG': '433',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005006.JPEG': '290',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005007.JPEG': '890',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005008.JPEG': '592',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005009.JPEG': '406',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005010.JPEG': '996',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005011.JPEG': '327',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005012.JPEG': '573'
+}
+
+
+def process_outputs(filepath):
+ with FileSystems().open(filepath) as f:
+ lines = f.readlines()
+ lines = [l.decode('utf-8').strip('\n') for l in lines]
+ return lines
+
+
+@unittest.skipIf(
+ torch is None,
+ 'Missing dependencies. '
+ 'Test depends on torch, torchvision and pillow')
+class PyTorchInference(unittest.TestCase):
+ @pytest.mark.uses_pytorch
+ @pytest.mark.it_postcommit
+ def test_predictions_output_file(self):
+ test_pipeline = TestPipeline(is_integration_test=True)
+ output_file_dir = 'gs://apache-beam-ml/temp_storage_end_to_end_testing/outputs'
+ output_file = '/'.join([output_file_dir, str(uuid.uuid4()), 'result.txt'])
+ file_of_image_names = 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/imagenet_samples.csv'
Review Comment:
Yes, I will update it soon (as the last step of this PR), once I clean the
bucket.
##########
sdks/python/apache_beam/ml/inference/pytorch_it_test.py:
##########
@@ -0,0 +1,95 @@
+
+# pylint: skip-file
+
+"""End-to-End test for Pytorch Inference"""
+
+import logging
+import unittest
+import uuid
+
+import pytest
+
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.testing.test_pipeline import TestPipeline
+
+try:
+ import torch
+ from apache_beam.examples.inference import pytorch_image_classification
+except ImportError as e:
+ torch = None
+
+_EXPECTED_OUTPUTS = {
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005001.JPEG': '681',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005002.JPEG': '333',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005003.JPEG': '711',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005004.JPEG': '286',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005005.JPEG': '433',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005006.JPEG': '290',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005007.JPEG': '890',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005008.JPEG': '592',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005009.JPEG': '406',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005010.JPEG': '996',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005011.JPEG': '327',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005012.JPEG': '573'
+}
+
+
+def process_outputs(filepath):
+ with FileSystems().open(filepath) as f:
+ lines = f.readlines()
+ lines = [l.decode('utf-8').strip('\n') for l in lines]
+ return lines
+
+
+@unittest.skipIf(
+ torch is None,
+ 'Missing dependencies. '
+ 'Test depends on torch, torchvision and pillow')
Review Comment:
Done
##########
sdks/python/apache_beam/ml/inference/torch_tests_requirements.txt:
##########
@@ -0,0 +1,20 @@
+
+torch>=1.7.1
+torchvision>=0.8.2
+pillow>=8.0.0 # bump the version to support Python 3.10 later
Review Comment:
I added the comment when I had a range between a lower and an upper bound,
but when I changed to the current code I forgot to remove the comment.
Thanks for catching that.
##########
sdks/python/apache_beam/examples/inference/pytorch_image_classification.py:
##########
@@ -0,0 +1,122 @@
+
+import argparse
+import io
+import os
+from functools import partial
+
+import apache_beam as beam
+import torch
+import torchvision
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.ml.inference.api import RunInference
+from apache_beam.ml.inference.pytorch import PytorchModelLoader
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+from PIL import Image
+from torchvision import transforms
+
+
+def read_image(image_file_name: str, path_to_dir: str):
+ image_file_name = os.path.join(path_to_dir, image_file_name)
+ with FileSystems().open(image_file_name, 'r') as file:
+ data = Image.open(io.BytesIO(file.read())).convert('RGB')
+ return image_file_name, data
+
+
+def preprocess_data(data):
+ image_size = (224, 224)
+ normalize = transforms.Normalize(
+ mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+ transform = transforms.Compose([
+ transforms.Resize(image_size),
+ transforms.ToTensor(),
+ normalize,
+ ])
+ return transform(data)
+
+
+class PostProcessor(beam.DoFn):
+ """Post process PredictionResult to output filename and
+ prediction using torch."""
+ def process(self, element):
+ filename, prediction_result = element
+ prediction = torch.argmax(prediction_result.inference, dim=0)
+ yield filename + ',' + str(int(prediction))
+
+
+def run_pipeline(options: PipelineOptions, args=None):
+ """Sets up PyTorch RunInference pipeline"""
+ model_class = torchvision.models.mobilenet_v2
+ model_params = {'pretrained': False}
Review Comment:
This is how we pass a torch model class definition; `model_params` are the
parameters passed to the torch model class constructor.
Note: here `torchvision.models.mobilenet_v2` is a torch Module class
definition, and the RunInference API uses it to create a torch Module object
by passing `model_params` to the Module's constructor.
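The pattern can be sketched without torch or Beam. `TinyModel` and
`build_model` below are hypothetical names used only for illustration; this
is not the loader's actual implementation, just the "class plus keyword
params" idea:

```python
from typing import Any, Callable, Dict


class TinyModel:
  """Hypothetical stand-in for a torch Module class definition."""
  def __init__(self, num_classes: int = 10, pretrained: bool = False):
    self.num_classes = num_classes
    self.pretrained = pretrained


def build_model(model_class: Callable[..., Any], model_params: Dict[str, Any]):
  # Hypothetical helper: the params dict is unpacked into keyword arguments
  # for the constructor, so the object is created only when it is needed.
  return model_class(**model_params)


model = build_model(TinyModel, {'pretrained': False})
print(type(model).__name__, model.pretrained, model.num_classes)
```

Passing the definition and its params separately, rather than an already
constructed object, lets whoever runs the model decide when and where to
instantiate it.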
##########
sdks/python/apache_beam/ml/inference/pytorch_it_test.py:
##########
@@ -0,0 +1,95 @@
+
+# pylint: skip-file
+
+"""End-to-End test for Pytorch Inference"""
+
+import logging
+import unittest
+import uuid
+
+import pytest
+
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.testing.test_pipeline import TestPipeline
+
+try:
+ import torch
+ from apache_beam.examples.inference import pytorch_image_classification
+except ImportError as e:
+ torch = None
+
+_EXPECTED_OUTPUTS = {
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005001.JPEG': '681',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005002.JPEG': '333',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005003.JPEG': '711',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005004.JPEG': '286',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005005.JPEG': '433',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005006.JPEG': '290',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005007.JPEG': '890',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005008.JPEG': '592',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005009.JPEG': '406',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005010.JPEG': '996',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005011.JPEG': '327',
+ 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/ILSVRC2012_val_00005012.JPEG': '573'
+}
+
+
+def process_outputs(filepath):
+ with FileSystems().open(filepath) as f:
+ lines = f.readlines()
+ lines = [l.decode('utf-8').strip('\n') for l in lines]
+ return lines
+
+
+@unittest.skipIf(
+ torch is None,
+ 'Missing dependencies. '
+ 'Test depends on torch, torchvision and pillow')
+class PyTorchInference(unittest.TestCase):
+ @pytest.mark.uses_pytorch
+ @pytest.mark.it_postcommit
+ def test_predictions_output_file(self):
+ test_pipeline = TestPipeline(is_integration_test=True)
+ output_file_dir = 'gs://apache-beam-ml/temp_storage_end_to_end_testing/outputs'
+ output_file = '/'.join([output_file_dir, str(uuid.uuid4()), 'result.txt'])
+ file_of_image_names = 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs/imagenet_samples.csv'
+ base_output_files_dir = 'gs://apache-beam-ml/temp_storage_end_to_end_testing/inputs'
+
+ model_path = 'gs://apache-beam-ml/temp_storage_end_to_end_testing/models/mobilenet_v2.pt'
+ extra_opts = {
+ 'input': file_of_image_names,
+ 'output': output_file,
+ 'model_path': model_path,
Review Comment:
Makes sense
Issue Time Tracking
-------------------
Worklog Id: (was: 774242)
Time Spent: 6h 20m (was: 6h 10m)
> RunInference Benchmarking tests
> -------------------------------
>
> Key: BEAM-14068
> URL: https://issues.apache.org/jira/browse/BEAM-14068
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Anand Inguva
> Assignee: Anand Inguva
> Priority: P2
> Time Spent: 6h 20m
> Remaining Estimate: 0h
>
> RunInference benchmarks will evaluate the performance of pipelines that
> represent common use cases of Beam + Dataflow with PyTorch, sklearn and
> possibly TFX. These benchmarks would be integration tests that exercise
> several software components using Beam, PyTorch, scikit-learn and TensorFlow
> Extended.
> We would use publicly available datasets (e.g., Kaggle).
> Size: small / 10 GB / 1 TB etc.
> The default execution runner would be Dataflow unless specified otherwise.
> These tests would be run infrequently (every release cycle).
--
This message was sent by Atlassian Jira
(v8.20.7#820007)