[ 
https://issues.apache.org/jira/browse/BEAM-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512643#comment-17512643
 ] 

Valentyn Tymofieiev edited comment on BEAM-14163 at 3/26/22, 2:10 AM:
----------------------------------------------------------------------

Looking.

Using this command to repro:

 
{noformat}
./gradlew -PloadTest.mainClass=apache_beam.testing.load_tests.pardo_test \
 -Prunner=DataflowRunner \
 -PloadTest.args="--job_name=load-tests-python-dataflow-streaming-pardo-4-valentyn \
   --project=apache-beam-testing 
   --region=us-central1 
   --temp_location=gs://temp-storage-for-perf-tests/loadtests 
   --publish_to_big_query=false 
   --input_options='{\"num_records\":20000000,\"key_size\":10,\"value_size\":90}'
   --iterations=1 
   --number_of_counter_operations=100 
   --number_of_counters=1 
   --num_workers=5 
   --autoscaling_algorithm=NONE
   --streaming
   --experiments=use_runner_v2
   --runner=DataflowRunner" \
 -PpythonVersion=3.7 \
 --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g \
 -Dorg.gradle.jvmargs=-Xmx4g -Dorg.gradle.vfs.watch=false \
:sdks:python:apache_beam:testing:load_tests:run{noformat}
The command line was taken from the console output on Jenkins, with the syntax 
adjusted to match the example in: 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/load_tests/pardo_test.py]
 The last part is important: simply copy-pasting the command does not work, 
due to finicky details about escaping various characters.
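
To make the escaping concern concrete, here is a minimal Python sketch (not part of the Beam tooling; the helper names `build_load_test_args` and `shell_view` are hypothetical). It builds the `-PloadTest.args` text in the same style as the command above, then uses the standard-library `shlex` module to show what a POSIX shell would hand to Gradle once the outer double quotes and the `\"` escapes are processed. It makes no claim about how the Gradle build then splits that value.

```python
import json
import shlex


def build_load_test_args(input_options, extra_args):
    """Return the -PloadTest.args=... text as it would be typed in a POSIX shell."""
    # Compact JSON, matching the style of --input_options in the command above.
    options_json = json.dumps(input_options, separators=(",", ":"))
    # The whole value sits inside double quotes, so inner double quotes are
    # backslash-escaped; the JSON itself is wrapped in single quotes.
    escaped_json = options_json.replace('"', '\\"')
    inner = " ".join([f"--input_options='{escaped_json}'"] + list(extra_args))
    return f'-PloadTest.args="{inner}"'


def shell_view(typed_text):
    """Split text with POSIX shell quoting rules to see the resulting arguments."""
    return shlex.split(typed_text)


if __name__ == "__main__":
    typed = build_load_test_args(
        {"num_records": 20000000, "key_size": 10, "value_size": 90},
        ["--iterations=1", "--streaming"],
    )
    print(typed)              # what you would paste into the shell
    print(shell_view(typed))  # the single argument the shell passes on
```

Printing both forms side by side is a quick way to check whether a command copied from a console log still carries the quoting it needs.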


was (Author: tvalentyn):
Looking.

Using this command to repro:

 
{noformat}
./gradlew -PloadTest.mainClass=apache_beam.testing.load_tests.pardo_test 
-Prunner=DataflowRunner 
'-PloadTest.args=--job_name=load-tests-python-dataflow-streaming-pardo-4-valentyn
 --project=apache-beam-testing --region=us-central1 
--temp_location=gs://temp-storage-for-perf-tests/loadtests 
--publish_to_big_query=false  
--input_options={"num_records":20000000,"key_size":10,"value_size":90} 
--iterations=1 --number_of_counter_operations=100 --number_of_counters=1 
--num_workers=5 --autoscaling_algorithm=NONE  --streaming 
--experiments=use_runner_v2, shuffle_mode=appliance --runner=DataflowRunner' 
-PpythonVersion=3.7 --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g 
-Dorg.gradle.jvmargs=-Xmx4g -Dorg.gradle.vfs.watch=false -Pdocker-pull-licenses 
:sdks:python:apache_beam:testing:load_tests:run{noformat}

> Performance Regressions in streaming python ParDo and GBK Load Tests
> --------------------------------------------------------------------
>
>                 Key: BEAM-14163
>                 URL: https://issues.apache.org/jira/browse/BEAM-14163
>             Project: Beam
>          Issue Type: Bug
>          Components: community-metrics, sdk-py-core
>    Affects Versions: 2.38.0
>            Reporter: Daniel Oliveira
>            Assignee: Valentyn Tymofieiev
>            Priority: P0
>             Fix For: 2.38.0
>
>
> As specified in the [Beam Release 
> Guide|https://beam.apache.org/contribute/release-guide/#4-investigate-performance-regressions],
>  I'm investigating performance regressions. The following load test metrics 
> show a clear and persistent performance regression starting around March 17 
> and affecting version 2.38.0.
> ParDo Load Tests: 
> http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1&var-processingType=streaming&var-sdk=python
> GBK Load Tests: 
> http://metrics.beam.apache.org/d/UYZ-oJ3Zk/gbk-load-tests?orgId=1&var-processingType=streaming&var-sdk=python&from=now-30d&to=now



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
