[ https://issues.apache.org/jira/browse/SPARK-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236540#comment-14236540 ]

Nicholas Chammas commented on SPARK-3431:
-----------------------------------------

Here's an example failure I don't understand.

I fire up {{sbt/sbt}} with {{SparkBuild.scala}} at [this version|https://github.com/nchammas/spark/blob/ab127b798dbfa9399833d546e627f9651b060918/project/SparkBuild.scala]:

{code}
  def groupBySuite(tests: Seq[TestDefinition], javaOptions: Seq[String]) = {
    tests groupBy (_.name.split('.').slice(0,4).mkString(".")) map {
      case (suite, tests) =>
        new Group(
          name = suite,
          tests = tests,
          // runPolicy = Tests.InProcess)
          runPolicy = SubProcess(javaOptions = javaOptions))
    } toSeq
  }

<snipped>

    testGrouping in Test <<= (definedTests in Test, javaOptions in Test) map groupBySuite,
{code}
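
For reference, here is a minimal standalone sketch of how the grouping key in {{groupBySuite}} is derived from a fully qualified suite name. The object {{GroupKeyDemo}} and the helper {{groupKey}} are hypothetical names used only for illustration; they are not part of {{SparkBuild.scala}}.

```scala
object GroupKeyDemo {
  // Hypothetical helper that mirrors the key expression used in groupBySuite:
  // keep only the first four dot-separated segments of the test name.
  def groupKey(testName: String): String =
    testName.split('.').slice(0, 4).mkString(".")

  def main(args: Array[String]): Unit = {
    // The suite run later in this comment.
    println(groupKey("org.apache.spark.sql.hive.execution.HiveQuerySuite"))
    // prints "org.apache.spark.sql"
  }
}
```

So, despite its name, {{groupBySuite}} keys on the four-segment package prefix rather than on the individual suite, and {{HiveQuerySuite}} ends up in the {{org.apache.spark.sql}} group.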

Then I run this at the SBT prompt:

{code}
testOnly org.apache.spark.sql.hive.execution.HiveQuerySuite
{code}

I get a lot of errors, but this one stands out:

{code}
21:53:56.662 WARN org.apache.spark.sql.hive.execution.HiveQuerySuite: Running query 1/1 with hive.
java.io.IOException: Cannot run program "/usr/bin/hadoop" (in directory "/path/to/my/copy/of/spark"): error=2, No such file or directory
{code}

If I comment out [the {{testGrouping in Test}} line|https://github.com/nchammas/spark/blob/ab127b798dbfa9399833d546e627f9651b060918/project/SparkBuild.scala#L429], the test runs fine.

So it looks like the forked JVMs are somehow not being passed the [configured paths|https://github.com/nchammas/spark/blob/ab127b798dbfa9399833d546e627f9651b060918/project/SparkBuild.scala#L403-L418]. There are some related posts about this [on Stack Overflow|http://stackoverflow.com/questions/18002205/sbt-test-only-not-picking-up-jvm-option-when-forking-a-jvm-for-tests] and in [SBT's issue tracker|https://github.com/sbt/sbt/issues/975].

I'm not sure how to proceed with SBT, or whether I've identified a legitimate blocker. I may just move on to Maven unless I make some kind of breakthrough. Any pointers would be appreciated.

> Parallelize execution of tests
> ------------------------------
>
>                 Key: SPARK-3431
>                 URL: https://issues.apache.org/jira/browse/SPARK-3431
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build
>            Reporter: Nicholas Chammas
>
> Running all the tests in {{dev/run-tests}} takes up to 2 hours. A common 
> strategy to cut test time down is to parallelize the execution of the tests. 
> Doing that may in turn require some prerequisite changes to be made to how 
> certain tests run.


