[
https://issues.apache.org/jira/browse/BEAM-14334?focusedWorklogId=770208&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770208
]
ASF GitHub Bot logged work on BEAM-14334:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 13/May/22 14:24
Start Date: 13/May/22 14:24
Worklog Time Spent: 10m
Work Description: mosche commented on code in PR #17662:
URL: https://github.com/apache/beam/pull/17662#discussion_r872459752
##########
runners/spark/spark_runner.gradle:
##########
@@ -218,29 +218,17 @@ def validatesRunnerBatch =
 tasks.register("validatesRunnerBatch", Test) {
   group = "Verification"
   // Disable gradle cache
   outputs.upToDateWhen { false }
-  def pipelineOptions = JsonOutput.toJson([
-      "--runner=TestSparkRunner",
-      "--streaming=false",
-      "--enableSparkMetricSinks=false",
-  ])
-  systemProperty "beamTestPipelineOptions", pipelineOptions
-  systemProperty "beam.spark.test.reuseSparkContext", "true"
-  systemProperty "spark.ui.enabled", "false"
-  systemProperty "spark.ui.showConsoleProgress", "false"
+  systemProperties sparkTestProperties(["--enableSparkMetricSinks":"false"])
   classpath = configurations.validatesRunner
   testClassesDirs = files(
       project(":sdks:java:core").sourceSets.test.output.classesDirs,
       project(":runners:core-java").sourceSets.test.output.classesDirs,
   )
-  testClassesDirs += files(project.sourceSets.test.output.classesDirs)
Review Comment:
All unit tests should be run as part of the `test` task
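For context, the diff replaces four individual `systemProperty` calls with a single `sparkTestProperties(...)` helper whose definition is not shown in this hunk. The following is a hypothetical reconstruction of such a helper, inferred only from the removed lines; the actual definition lives elsewhere in PR #17662 and may differ. It assumes `groovy.json.JsonOutput` is imported, as in the original script.

```groovy
// Hypothetical reconstruction of the sparkTestProperties helper referenced in
// the diff above; the real implementation is defined elsewhere in PR #17662.
// It bundles the shared settings the removed systemProperty lines used to set
// one by one, and lets callers add or override pipeline options.
def sparkTestProperties(Map<String, String> pipelineOptions = [:]) {
  def options = ["--runner": "TestSparkRunner", "--streaming": "false"] + pipelineOptions
  [
      "beam.spark.test.reuseSparkContext": "true",
      "spark.ui.enabled"                 : "false",
      "spark.ui.showConsoleProgress"     : "false",
      "beamTestPipelineOptions"          : JsonOutput.toJson(
          options.collect { option, value -> option + "=" + value }),
  ]
}
```

With defaults only, this would reproduce the property set the old code built by hand; callers such as `validatesRunnerBatch` pass extra options like `"--enableSparkMetricSinks":"false"`.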
##########
runners/spark/spark_runner.gradle:
##########
@@ -218,29 +218,17 @@ def validatesRunnerBatch =
 tasks.register("validatesRunnerBatch", Test) {
   group = "Verification"
   // Disable gradle cache
   outputs.upToDateWhen { false }
-  def pipelineOptions = JsonOutput.toJson([
-      "--runner=TestSparkRunner",
-      "--streaming=false",
-      "--enableSparkMetricSinks=false",
-  ])
-  systemProperty "beamTestPipelineOptions", pipelineOptions
-  systemProperty "beam.spark.test.reuseSparkContext", "true"
-  systemProperty "spark.ui.enabled", "false"
-  systemProperty "spark.ui.showConsoleProgress", "false"
+  systemProperties sparkTestProperties(["--enableSparkMetricSinks":"false"])
   classpath = configurations.validatesRunner
   testClassesDirs = files(
       project(":sdks:java:core").sourceSets.test.output.classesDirs,
       project(":runners:core-java").sourceSets.test.output.classesDirs,
   )
-  testClassesDirs += files(project.sourceSets.test.output.classesDirs)
-  // Only one SparkContext may be running in a JVM (SPARK-2243)
-  forkEvery 1
   maxParallelForks 4
   useJUnit {
     includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'
-    includeCategories 'org.apache.beam.runners.spark.UsesCheckpointRecovery'
Review Comment:
Unit test!
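Read together, the two review comments point the same way: tests that are really unit tests (such as those tagged `UsesCheckpointRecovery`) belong in the ordinary `test` task, not in the ValidatesRunner suite. A minimal sketch of what that could look like, assuming the `sparkTestProperties` helper from the diff; this is illustrative only, not the actual change from the PR:

```groovy
// Illustrative sketch only: the regular `test` task picks up all Spark runner
// unit tests, with the shared Spark test properties applied and no
// category-based includeCategories filter needed.
test {
  systemProperties sparkTestProperties()
  // No includeCategories filter: unit tests such as those tagged
  // UsesCheckpointRecovery run here by default.
}
```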
Issue Time Tracking
-------------------
Worklog Id: (was: 770208)
Time Spent: 9h 20m (was: 9h 10m)
> Avoid using forkEvery in Spark runner tests
> -------------------------------------------
>
> Key: BEAM-14334
> URL: https://issues.apache.org/jira/browse/BEAM-14334
> Project: Beam
> Issue Type: Improvement
> Components: runner-spark, testing
> Reporter: Moritz Mack
> Assignee: Moritz Mack
> Priority: P2
> Time Spent: 9h 20m
> Remaining Estimate: 0h
>
> Usage of {{forkEvery 1}} is typically a strong sign of poor-quality code and
> should be avoided:
> * It significantly impacts test performance, since every test class pays the
> cost of starting a fresh JVM.
> * It often hides resource leaks, either in the tests themselves or, worse, in
> the runner itself.
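As a rough illustration of the trade-off described above (not code from the Beam build), compare per-class forking with a bounded pool of reused worker JVMs:

```groovy
// Illustrative contrast of Gradle test-forking strategies (not Beam code).
tasks.register("isolatedTests", Test) {
  // New JVM for every test class: masks leaked state (e.g. a SparkContext
  // left running, see SPARK-2243) but pays JVM startup per class.
  forkEvery 1
}
tasks.register("pooledTests", Test) {
  // Up to 4 long-lived worker JVMs reused across classes; much faster, but
  // tests must actually release shared resources.
  maxParallelForks 4
}
```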
--
This message was sent by Atlassian Jira
(v8.20.7#820007)