[ https://issues.apache.org/jira/browse/BEAM-8198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928063#comment-16928063 ]
Valentyn Tymofieiev commented on BEAM-8198:
-------------------------------------------
Working with PKB (PerfKit Benchmarker) is somewhat difficult because of issues
such as https://issues.apache.org/jira/browse/BEAM-8215. It may be easier to
run the benchmark directly and measure wall time:
{noformat}
time ./gradlew :sdks:python:test-suites:dataflow:py36:integrationTest \
-Dtests=apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it \
-Dattr=IT -DpipelineOptions="--project=some_project \
--staging_location=gs://some_bucket/ \
--temp_location=gs://some_bucket/ \
--input=gs://apache-beam-samples/input_small_files/ascii_sort_1MB_input.0000* \
--output=gs://some_bucket/output \
--expect_checksum=ea0ca2e5ee4ea5f218790f28d0b9fe7d09d8d710 \
--num_workers=10 \
--autoscaling_algorithm=NONE \
--runner=TestDataflowRunner \
--sdk_location=/full/path/to/apache-beam-2.16.0.dev0.tar.gz" \
--info
{noformat}
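A single `time` measurement can be noisy, so it may help to repeat the run a few times and compare the spread. The following is only a rough sketch, not part of the Beam tooling: the helper names `time_command` and `summarize` are made up for illustration, and the script assumes the command to time (for example the ./gradlew invocation above) is passed as arguments after the script name.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall time of each run in seconds.

    Hypothetical helper for illustration; cmd is an argument list such as
    ["./gradlew", ":sdks:python:test-suites:dataflow:py36:integrationTest", ...].
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError on a non-zero exit code, so a
        # failed run is never recorded as a timing sample.
        subprocess.run(cmd, check=True)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (min, median, max) of the wall-time samples."""
    return min(durations), statistics.median(durations), max(durations)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to time.
    samples = time_command(sys.argv[1:], runs=3)
    fastest, median, slowest = summarize(samples)
    print(f"min={fastest:.1f}s median={median:.1f}s max={slowest:.1f}s")
```

Running the same script against a Py2 and a Py3 test suite and comparing the medians would give a first indication of whether the regression reproduces outside PKB.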
> Investigate possible performance regression of Wordcount 1GB batch benchmark
> on Py3.
> ------------------------------------------------------------------------------------
>
> Key: BEAM-8198
> URL: https://issues.apache.org/jira/browse/BEAM-8198
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core, testing
> Reporter: Valentyn Tymofieiev
> Assignee: Valentyn Tymofieiev
> Priority: Major
> Fix For: 2.16.0
>
>
> context:
> https://lists.apache.org/thread.html/51e000f16481451c207c00ac5e881aa4a46fa020922eddffd00ad527@%3Cdev.beam.apache.org%3E
> Setting fix version to 2.16.0 to understand the cause, hopefully before the
> vote.
> cc: [~altay] [~thw] [~markflyhigh]