[
https://issues.apache.org/jira/browse/BEAM-14068?focusedWorklogId=777785&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777785
]
ASF GitHub Bot logged work on BEAM-14068:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Jun/22 18:58
Start Date: 02/Jun/22 18:58
Worklog Time Spent: 10m
Work Description: AnandInguva commented on code in PR #17462:
URL: https://github.com/apache/beam/pull/17462#discussion_r888288662
##########
sdks/python/test-suites/direct/common.gradle:
##########
@@ -185,3 +185,36 @@ tasks.register("hdfsIntegrationTest") {
}
}
}
+
+// Pytorch RunInference IT tests
+task torchTests {
+ dependsOn 'installGcpTest'
+ dependsOn ':sdks:python:sdist'
+ def requirementsFile = "${rootDir}/sdks/python/apache_beam/ml/inference/torch_tests_requirements.txt"
+ doFirst {
+ exec {
+ executable 'sh'
+ args '-c', ". ${envdir}/bin/activate && pip install -r $requirementsFile"
+ }
+ }
+ doLast {
+ def testOpts = basicTestOpts
+ def argMap = [
+ "test_opts": testOpts,
+ "suite": "postCommitIT-direct-py${pythonVersionSuffix}",
+ "collect": "uses_pytorch and it_postcommit",
+ "runner": "TestDirectRunner"
+ ]
+ def cmdArgs = mapToArgString(argMap)
+ exec {
+ executable 'sh'
+ args '-c', ". ${envdir}/bin/activate && export FORCE_TORCH_IT=1 && ${runScriptsDir}/run_integration_test.sh $cmdArgs"
+ }
+ }
+}
+
+// Add all the RunInference framework IT tests to this gradle task that runs on Direct Runner Post commit suite.
+// TODO(anandinguva): Add sklearn IT test here
Review Comment:
I think it's not needed for this situation. But we can specify this in the
documentation of the RunInference developer guide (when we add it in Confluence).
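As a rough illustration of the `doLast` step in the diff above, the sketch below shows how a map of options might be flattened into the argument string passed to `run_integration_test.sh`. The function `map_to_arg_string` is a hypothetical Python stand-in for the Gradle helper `mapToArgString`, whose exact behavior is not shown in this thread. The `--key value` layout, the quoting of values that contain spaces, and the `py38` suffix are all assumptions. `test_opts` is left out because the contents of `basicTestOpts` are not shown either.

```python
import shlex


def map_to_arg_string(arg_map):
    """Hypothetical stand-in for the Gradle helper mapToArgString.

    Assumption: each entry becomes "--key value", and values containing
    spaces are shell-quoted so they survive `sh -c`.
    """
    parts = []
    for key, value in arg_map.items():
        parts.append(f"--{key} {shlex.quote(str(value))}")
    return " ".join(parts)


# Values copied from the diff, except the "38" suffix, which is an
# illustrative assumption for ${pythonVersionSuffix}.
arg_map = {
    "suite": "postCommitIT-direct-py38",
    "collect": "uses_pytorch and it_postcommit",
    "runner": "TestDirectRunner",
}

cmd_args = map_to_arg_string(arg_map)
print(cmd_args)
```

The point of the quoting is that the `collect` value contains spaces; without it, the shell would split the marker expression into three separate arguments.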
Issue Time Tracking
-------------------
Worklog Id: (was: 777785)
Time Spent: 8h (was: 7h 50m)
> RunInference Benchmarking tests
> -------------------------------
>
> Key: BEAM-14068
> URL: https://issues.apache.org/jira/browse/BEAM-14068
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Anand Inguva
> Assignee: Anand Inguva
> Priority: P2
> Time Spent: 8h
> Remaining Estimate: 0h
>
> RunInference benchmarks will evaluate the performance of pipelines that
> represent common use cases of Beam + Dataflow in PyTorch, sklearn and
> possibly TFX. These benchmarks would be integration tests that exercise
> several software components using Beam, PyTorch, scikit-learn and TensorFlow
> Extended.
> We would use publicly available datasets (e.g., Kaggle).
> Size: small / 10 GB / 1 TB etc.
> The default execution runner would be Dataflow unless specified otherwise.
> These tests would be run infrequently (every release cycle).
--
This message was sent by Atlassian Jira
(v8.20.7#820007)