tysonjh commented on a change in pull request #12435:
URL: https://github.com/apache/beam/pull/12435#discussion_r466648416



##########
File path: .test-infra/jenkins/job_LoadTests_ParDo_Python.groovy
##########
@@ -151,3 +151,35 @@ 
CronJobBuilder.cronJob('beam_LoadTests_Python_ParDo_Dataflow_Batch', 'H 13 * * *
   ]
   batchLoadTestJob(delegate, 
CommonTestProperties.TriggeringContext.POST_COMMIT)
 }
+
+def streamingLoadTestJob = { scope, triggeringContext ->
+  scope.description('Runs Python ParDo load tests on Dataflow runner in 
streaming mode')
+  commonJobProperties.setTopLevelMainJobProperties(scope, 'master', 120)
+
+  def datasetName = loadTestsBuilder.getBigQueryDataset('load_test', 
triggeringContext)
+  for (testConfiguration in loadTestConfigurations("streaming", datasetName)) {
+    // Skipping case 2 in streaming because it timeouts. To be checked TODO: 
kkucharc
+    if(testConfiguration.title != "ParDo Python Load test: 2GB 100 byte 
records 200 times") {

Review comment:
       This should be a property in the testConfiguration instead of relying on 
title implicitly? That would also eliminate adding a 'streaming' property below 
as well.
   
   In fact, doing this may also allow refactoring 'batchLoadTestJob' to a more 
generic 'loadTestJob' for reuse in both Streaming/Batch cases. It may 
unnecessarily complicate things, I'll leave it up to your judgement.

##########
File path: .test-infra/jenkins/job_LoadTests_ParDo_Python.groovy
##########
@@ -151,3 +151,35 @@ 
CronJobBuilder.cronJob('beam_LoadTests_Python_ParDo_Dataflow_Batch', 'H 13 * * *
   ]
   batchLoadTestJob(delegate, 
CommonTestProperties.TriggeringContext.POST_COMMIT)
 }
+
+def streamingLoadTestJob = { scope, triggeringContext ->
+  scope.description('Runs Python ParDo load tests on Dataflow runner in 
streaming mode')
+  commonJobProperties.setTopLevelMainJobProperties(scope, 'master', 120)
+
+  def datasetName = loadTestsBuilder.getBigQueryDataset('load_test', 
triggeringContext)
+  for (testConfiguration in loadTestConfigurations("streaming", datasetName)) {
+    // Skipping case 2 in streaming because it timeouts. To be checked TODO: 
kkucharc

Review comment:
       Are you planning to investigate this now, soon, or in the future? If it 
isn't something you'll be looking into immediately and there is some additional 
context related to the timeouts that you've found while testing it would be 
helpful to create a Jira issue and link it here instead of a TODO for yourself.

##########
File path: sdks/python/apache_beam/testing/load_tests/pardo_test.py
##########
@@ -125,7 +125,9 @@ def process(self, element, state=state_param):
             state.add(1)
         yield element
 
-    if self.get_option_or_default('streaming', False):
+    if self.get_option_or_default(
+        'streaming',
+        False) and self.pipeline.get_option('runner') == "PortableRunner":

Review comment:
       Why only for the PortableRunner now?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to