[ 
https://issues.apache.org/jira/browse/SPARK-19592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866367#comment-15866367
 ] 

Armin Braun commented on SPARK-19592:
-------------------------------------

[~srowen] 

{quote}
What about tests that make their own conf or need to?
{quote}

Those tests in particular made me interested in this for 
correctness/readability reasons. Maybe an example helps :)

in _org.apache.spark.streaming.InputStreamsSuite_ we have the situation that 
the conf is set up in the parent suit via just

{code}
  val conf = new SparkConf()
    .setMaster(master)
    .setAppName(framework)
{code}

now if you run that suit from the IDE one of the tests fails with an apparent 
error in the logic.

{code}
The code passed to eventually never returned normally. Attempted 664 times over 
10.012607219000001 seconds. Last failure message: 10 did not equal 5.
{code}

You debug it and find out that it's because you get some _StreamingListener_ 
added to the context twice because the tests adds one manually that is already 
on the context.
Reason for that being that it's also added by the UI when you have 
_spark.ui.enabled_ set to default _true_.

So basically you now have a seemingly redundant line of code in a bunch of 
tests:

{code}
ssc.addStreamingListener(ssc.progressListener)
{code}

...  that appears wrong with the configuration (that you see if you just read 
the code) and requires you to also consider (maintain) what Maven or SBT is 
injecting in terms of environment.
-------------------------------------------

So I think those tests that make their own config are the most troublesome 
since they have non-standard defaults injected. In my opinion it would be a lot 
easier to work with if the defaults would just be the standard production env. 
defaults when I create a new instance of the SparkConf and all deviation from 
that would be explicit in the code.

I agree it's not a big pain, still a quality issue worth fixing (imo). Reduces 
maintenance effort from drier test configs and makes tests easier to read like 
in the example above.

> Duplication in Test Configuration Relating to SparkConf Settings Should be 
> Removed
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-19592
>                 URL: https://issues.apache.org/jira/browse/SPARK-19592
>             Project: Spark
>          Issue Type: Improvement
>          Components: Tests
>    Affects Versions: 2.1.0, 2.2.0
>         Environment: Applies to all Environments
>            Reporter: Armin Braun
>            Priority: Minor
>
> This configuration for Surefire, Scalatest is duplicated in the parent POM as 
> well as the SBT build.
> While this duplication cannot be removed in general it can at least be 
> removed for all system properties that simply result in a SparkConf setting I 
> think.
> Instead of having lines like 
> {code}
> <spark.ui.enabled>false</spark.ui.enabled>
> {code}
> twice in the pom.xml
> and once in SBT as
> {code}
> javaOptions in Test += "-Dspark.ui.enabled=false",
> {code}
> it would be a lot cleaner to simply have a 
> {code}
> var conf: SparkConf 
> {code}
> field in 
> {code}
> org.apache.spark.SparkFunSuite
> {code}
>  that has SparkConf set up with all the shared configuration that 
> `systemProperties` currently provide. Obviously this cannot be done straight 
> away given that
> many subclasses of the parent suit do this, so I think it would be best to 
> simply add a method to the parent that provides this configuration for now
> and start refactoring away duplication in other suit setups from there step 
> by step until the sys properties can be removed from the pom and sbt.build.
> This makes the build a lot easier to maintain and makes tests more readable 
> by making the environment setup more explicit in the code.
> (also it would allow running more tests straight from the IDE which is always 
> a nice thing imo)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to