[ 
https://issues.apache.org/jira/browse/BEAM-14068?focusedWorklogId=771575&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771575
 ]

ASF GitHub Bot logged work on BEAM-14068:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/May/22 20:42
            Start Date: 17/May/22 20:42
    Worklog Time Spent: 10m 
      Work Description: AnandInguva commented on code in PR #17462:
URL: https://github.com/apache/beam/pull/17462#discussion_r875245170


##########
sdks/python/apache_beam/ml/inference/pytorch_it_test.py:
##########
@@ -0,0 +1,99 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# pylint: skip-file
+
+"""End-to-End test for Pytorch Inference"""
+
+import logging
+import unittest
+import uuid
+
+import pytest
+
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.testing.test_pipeline import TestPipeline
+
+try:
+  import torch
+  from apache_beam.ml.inference.examples import pytorch_image_classification
+except ImportError as e:
+  torch = None
+
+_EXPECTED_OUTPUTS = {

Review Comment:
   That's one way to do it, but there are only 15 samples here, which is why it 
was done this way. Once the sample size grows, I would favor your approach.
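
A minimal sketch of the trade-off discussed above (not code from the PR): for a 
handful of samples, hardcoding the expected predictions in the test module is 
simple, while for a larger sample set loading the expectations from a file 
scales better. The JSON format, the GCS path, and the helper name below are 
assumptions made for illustration only.

import json

from apache_beam.io.filesystems import FileSystems

# Small sample set: a hardcoded mapping of input file name to expected label.
_EXPECTED_OUTPUTS_SMALL = {
    'img_0001.jpg': 242,
    'img_0002.jpg': 281,
}


def load_expected_outputs(path):
  """Reads expected outputs from a JSON file, e.g. on GCS (hypothetical path)."""
  with FileSystems.open(path) as f:
    return json.load(f)


# For a larger sample set, the expectations could live next to the test data:
# expected = load_expected_outputs('gs://<bucket>/expected_outputs.json')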





Issue Time Tracking
-------------------

    Worklog Id:     (was: 771575)
    Time Spent: 3h 50m  (was: 3h 40m)

> RunInference Benchmarking tests
> -------------------------------
>
>                 Key: BEAM-14068
>                 URL: https://issues.apache.org/jira/browse/BEAM-14068
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Anand Inguva
>            Assignee: Anand Inguva
>            Priority: P2
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> RunInference benchmarks will evaluate the performance of pipelines that 
> represent common Beam + Dataflow use cases in PyTorch, scikit-learn, and 
> possibly TFX. These benchmarks would be integration tests that exercise 
> several software components together: Beam, PyTorch, scikit-learn, and 
> TensorFlow Extended.
> We would use publicly available datasets (e.g. from Kaggle), at a range of 
> sizes: small / 10 GB / 1 TB, etc.
> The default execution runner would be Dataflow unless specified otherwise.
> These tests would run infrequently (once per release cycle).
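
A minimal sketch (an assumption for illustration, not code attached to this 
issue) of how such an end-to-end test can be written against TestPipeline so 
that the runner, e.g. Dataflow, is chosen at invocation time via pipeline 
options; the pytest marker, class name, and pipeline body are hypothetical.

import unittest

import pytest

import apache_beam as beam
from apache_beam.testing.test_pipeline import TestPipeline


@pytest.mark.it_postcommit
class RunInferenceBenchmarkSmokeTest(unittest.TestCase):
  def test_smoke(self):
    # TestPipeline(is_integration_test=True) reads --test-pipeline-options,
    # so the same test runs locally or on Dataflow depending on the options
    # supplied, and is skipped when no integration-test options are given.
    with TestPipeline(is_integration_test=True) as p:
      _ = p | beam.Create([1, 2, 3]) | beam.Map(lambda x: x * 2)


if __name__ == '__main__':
  unittest.main()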



