[
https://issues.apache.org/jira/browse/BEAM-10527?focusedWorklogId=463669&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-463669
]
ASF GitHub Bot logged work on BEAM-10527:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 29/Jul/20 07:20
Start Date: 29/Jul/20 07:20
Worklog Time Spent: 10m
Work Description: ibzib opened a new pull request #12385:
URL: https://github.com/apache/beam/pull/12385
# Motivation
The main goals of migrating to pytest are:
1. Get JUnit structured test output (BEAM-10527).
2. Replace PortableRunnerTest's bespoke timeout mechanism with pytest's
(BEAM-9011). This also implicitly raises the timeout for all these tests from
60s to 600s, which should reduce the likelihood of timeout flakes (BEAM-8912).
# Implementation
I originally wanted to do this in incremental changes, but I gradually
realized that these tests' configuration needed a complete overhaul. The main
challenge was that `flink_runner_test.py` expected to be run as `__main__`,
which is impossible with pytest. I reworked everything except the tests
themselves, which are unchanged.
## Test parametrization
- I left worker configuration in Gradle because it changes the test
dependencies. I moved optimization and streaming into `flink_runner_test.py`
because it removes the need to set up separate tox tasks, separate test result
files, etc.
- Since `flink_job_server_driver`, `environment_type`, and
`environment_config` are all pipeline options, I decided to pass them to
`flink_runner_test.py` by introducing a global pytest option,
`--test-pipeline-options`. `nose` uses `--test-pipeline-options` for
integration tests, so I figured this would be generally useful beyond just
these tests in the future.
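
As a rough sketch of how such an option can be registered (not necessarily the exact code in this PR), pytest's `pytest_addoption` hook in `conftest.py` accepts new command-line flags. The helper `split_pipeline_options`, the empty-string default, and the help text are assumptions made for illustration.

```python
# conftest.py (sketch; helper name, default, and help text are assumptions)
import shlex


def pytest_addoption(parser):
    # pytest calls this hook at startup; parser.addoption registers a new flag.
    parser.addoption(
        '--test-pipeline-options',
        default='',
        help='Pipeline options to pass through to test pipelines.')


def split_pipeline_options(raw):
    # Hypothetical helper: split the single option string into argv-style
    # tokens, honoring shell quoting. None or '' yields an empty list.
    return shlex.split(raw or '')
```

A test could then read the raw string with `request.config.getoption('--test-pipeline-options')` and pass the split tokens on as pipeline options, for example when invoked as `pytest flink_runner_test.py --test-pipeline-options="--environment_type=LOOPBACK"`.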
# Bonus trivia
Prior to this change, we were running the exact same streaming test suite
*four times* per Jenkins run.
Every `flinkCompatibilityMatrix` task ran the entirety of
`flink_runner_test.py` which contained two classes: `FlinkRunnerTest` and
`FlinkRunnerTestOptimized`. `FlinkRunnerTestOptimized` was basically the same
thing as `FlinkRunnerTest`, but it added the `pre_optimize=all` experiment and
skipped external transform tests, since the Python optimizer breaks external
transforms (BEAM-7252). But we were *also* adding `pre_optimize=all` in Gradle,
redundantly.
The old configuration looks like this:
```groovy
dependsOn flinkCompatibilityMatrix(
    streaming: false,
    workerType: CompatibilityMatrixConfig.SDK_WORKER_TYPE.LOOPBACK)
dependsOn flinkCompatibilityMatrix(
    streaming: true,
    workerType: CompatibilityMatrixConfig.SDK_WORKER_TYPE.LOOPBACK)
dependsOn flinkCompatibilityMatrix(
    streaming: true,
    workerType: CompatibilityMatrixConfig.SDK_WORKER_TYPE.LOOPBACK,
    preOptimize: true)
```
Notice that pre-optimized batch is missing. This is because
`flinkCompatibilityMatrixBatchPreOptimize*` would run `FlinkRunnerTest` with
`pre_optimize=all` but without skipping the external transform tests, causing
failure.
What about streaming, then? Well, the optimizer *doesn't affect streaming
pipelines at all*:
https://github.com/apache/beam/blob/489cf2cbf335f372060469953ed7599f2838a591/sdks/python/apache_beam/runners/portability/portable_runner.py#L319
So in one invocation of `flinkValidatesRunner`,
`flinkCompatibilityMatrixStreamingLoopback` would run `FlinkRunnerTest`
(without `pre_optimize=all`) and `FlinkRunnerTestOptimized`, then
`flinkCompatibilityMatrixStreamingPreOptimizeLoopback` would run
`FlinkRunnerTest` (with `pre_optimize=all`) and `FlinkRunnerTestOptimized`
(with `pre_optimize=all` twice). Besides the skips in
`FlinkRunnerTestOptimized`, all four test runs did exactly the same thing.
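
The duplication can be tallied with a small model. This is only an illustrative sketch: the dictionaries restate the description above, and `streaming_runs` is a hypothetical helper, not code from the Beam repository.

```python
# Illustrative model of the old streaming setup (not Beam code).
# Value: did Gradle add pre_optimize=all for this task?
GRADLE_TASKS = {
    'flinkCompatibilityMatrixStreamingLoopback': False,
    'flinkCompatibilityMatrixStreamingPreOptimizeLoopback': True,
}

# Value: did the test class add pre_optimize=all itself?
TEST_CLASSES = {
    'FlinkRunnerTest': False,
    'FlinkRunnerTestOptimized': True,
}


def streaming_runs():
    """Return (task, test class, times pre_optimize=all was added) per run."""
    runs = []
    for task, gradle_adds in GRADLE_TASKS.items():
        for cls, class_adds in TEST_CLASSES.items():
            runs.append((task, cls, int(gradle_adds) + int(class_adds)))
    return runs
```

`streaming_runs()` returns four entries, one per task and test class pair. Because the optimizer is bypassed for streaming pipelines, the number of times `pre_optimize=all` was added does not change what any of them executes.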
Issue Time Tracking
-------------------
Worklog Id: (was: 463669)
Time Spent: 1h 50m (was: 1h 40m)
> Python2_PVR_Flink precommit should publish test results to Jenkins
> ------------------------------------------------------------------
>
> Key: BEAM-10527
> URL: https://issues.apache.org/jira/browse/BEAM-10527
> Project: Beam
> Issue Type: Improvement
> Components: testing
> Reporter: Kyle Weaver
> Assignee: Kyle Weaver
> Priority: P2
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> Right now we only have the logs, which often require scrolling up to see the
> failure (which itself often requires curl'ing the logs because they are too
> large for a browser to load comfortably). This causes frequent
> misunderstandings. For example, folks often mistake errors printed by
> pipelines that are meant to fail (e.g. test_error_message_includes_stage) for
> actual test failures.