tysonjh commented on a change in pull request #12435:
URL: https://github.com/apache/beam/pull/12435#discussion_r466648416
##########
File path: .test-infra/jenkins/job_LoadTests_ParDo_Python.groovy
##########
@@ -151,3 +151,35 @@
CronJobBuilder.cronJob('beam_LoadTests_Python_ParDo_Dataflow_Batch', 'H 13 * * *
]
batchLoadTestJob(delegate,
CommonTestProperties.TriggeringContext.POST_COMMIT)
}
+
+def streamingLoadTestJob = { scope, triggeringContext ->
+ scope.description('Runs Python ParDo load tests on Dataflow runner in
streaming mode')
+ commonJobProperties.setTopLevelMainJobProperties(scope, 'master', 120)
+
+ def datasetName = loadTestsBuilder.getBigQueryDataset('load_test',
triggeringContext)
+ for (testConfiguration in loadTestConfigurations("streaming", datasetName)) {
+ // Skipping case 2 in streaming because it timeouts. To be checked TODO:
kkucharc
+ if(testConfiguration.title != "ParDo Python Load test: 2GB 100 byte
records 200 times") {
Review comment:
This should be a property in the testConfiguration instead of relying on
title implicitly? That would also eliminate adding a 'streaming' property below
as well.
In fact, doing this may also allow refactoring 'batchLoadTestJob' to a more
generic 'loadTestJob' for reuse in both Streaming/Batch cases. It may
unnecessarily complicate things, I'll leave it up to your judgement.
##########
File path: .test-infra/jenkins/job_LoadTests_ParDo_Python.groovy
##########
@@ -151,3 +151,35 @@
CronJobBuilder.cronJob('beam_LoadTests_Python_ParDo_Dataflow_Batch', 'H 13 * * *
]
batchLoadTestJob(delegate,
CommonTestProperties.TriggeringContext.POST_COMMIT)
}
+
+def streamingLoadTestJob = { scope, triggeringContext ->
+ scope.description('Runs Python ParDo load tests on Dataflow runner in
streaming mode')
+ commonJobProperties.setTopLevelMainJobProperties(scope, 'master', 120)
+
+ def datasetName = loadTestsBuilder.getBigQueryDataset('load_test',
triggeringContext)
+ for (testConfiguration in loadTestConfigurations("streaming", datasetName)) {
+ // Skipping case 2 in streaming because it timeouts. To be checked TODO:
kkucharc
Review comment:
Are you planning to investigate this now, soon, or in the future? If it
isn't something you'll be looking into immediately and there is some additional
context related to the timeouts that you've found while testing it would be
helpful to create a Jira issue and link it here instead of a TODO for yourself.
##########
File path: sdks/python/apache_beam/testing/load_tests/pardo_test.py
##########
@@ -125,7 +125,9 @@ def process(self, element, state=state_param):
state.add(1)
yield element
- if self.get_option_or_default('streaming', False):
+ if self.get_option_or_default(
+ 'streaming',
+ False) and self.pipeline.get_option('runner') == "PortableRunner":
Review comment:
Why only for the PortableRunner now?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]