Hyukjin Kwon created SPARK-21693:

             Summary: AppVeyor tests reach the time limit, 1.5 hours, sometimes 
in SparkR tests
                 Key: SPARK-21693
                 URL: https://issues.apache.org/jira/browse/SPARK-21693
             Project: Spark
          Issue Type: Test
          Components: Build, SparkR
    Affects Versions: 2.3.0
            Reporter: Hyukjin Kwon

We now sometimes reach the AppVeyor time limit of 1.5 hours. 
I previously requested an increase from an hour to 1.5 hours, but it looks like 
we should fix this in our AppVeyor build. I have asked for this for my account 
a few times before, but it looks like we can't keep increasing the time limit 
again and again.

I could identify three things that take quite a bit of time:

1. Disabled cache feature in pull request builder, which ends up downloading 
Maven dependencies (15-20ish mins)


Note: Saving cache is disabled in Pull Request builds.

and also see 

This seems difficult to fix within Spark.

2. "MLlib classification algorithms" tests (30-35ish mins)

The test below appears to take 30-35ish mins.

MLlib classification algorithms, except for tree-based algorithms: Spark 
package found in SPARK_HOME: C:\projects\spark\bin\..

As a (I think) last resort, we could make a build matrix for this test alone, 
so that we run the other tests after one build and then run this test after 
another build. For example, I run the Scala tests with this workaround - 
https://ci.appveyor.com/project/spark-test/spark/build/757-20170716 (a matrix 
of 7 jobs, each with its own build and test).
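
To make the matrix idea a bit more concrete, here is a rough time-budget 
sketch in Python. It only uses the figures quoted in this report (the 1.5 hour 
limit, and the upper ends of the 15-20 min dependency download and the 30-35 min 
MLlib classification tests). The function names are hypothetical, and the Spark 
build time is not given here, so it is deliberately left out rather than 
guessed.

```python
# Rough time budget for one AppVeyor matrix job (all values in minutes).
TIME_LIMIT_MIN = 90  # 1.5 hour limit mentioned in this report


def job_minutes(overhead_min, test_minutes):
    """Wall time of one matrix job: per-job overhead plus its own tests."""
    return overhead_min + sum(test_minutes)


def headroom(overhead_min, test_minutes, limit=TIME_LIMIT_MIN):
    """Minutes left under the limit; a negative value means a timeout."""
    return limit - job_minutes(overhead_min, test_minutes)


# Upper ends of the ranges quoted in this report.
DEPENDENCY_DOWNLOAD_MIN = 20   # Maven download, repeated per job (no cache)
MLLIB_CLASSIFICATION_MIN = 35  # "MLlib classification algorithms" tests

# A job that runs only the MLlib classification tests. Whatever is left
# has to cover the Spark build itself, which this report does not quantify.
mllib_only_headroom = headroom(DEPENDENCY_DOWNLOAD_MIN,
                               [MLLIB_CLASSIFICATION_MIN])
print("minutes left for the build:", mllib_only_headroom)
```

Note that every matrix job pays the dependency download again while the cache 
is disabled, so splitting reduces the per-job time but increases the total.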

3. Disabled {{spark.sparkr.use.daemon}} on Windows due to the limitation of 

See [this 
We disabled this feature and currently fork processes from Java, which is 
expensive. I haven't tested this yet, but reducing 
{{spark.sql.shuffle.partitions}} may be a way to work around this. 
Currently, if I understand correctly, this is 200 by default in R tests, which 
ends up with 200 Java processes for every shuffle.
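
As a back-of-the-envelope illustration of why the partition count matters, the 
Python sketch below assumes one forked process per shuffle partition, as 
described above. The helper name and the reduced value of 5 are hypothetical 
and untested; only the default of 200 comes from this report.

```python
def forked_processes(shuffle_partitions, shuffles=1):
    """Estimate processes forked when the daemon is disabled.

    Assumes one forked process per shuffle partition, per shuffle.
    """
    if shuffle_partitions < 1 or shuffles < 0:
        raise ValueError("need at least 1 partition and 0 or more shuffles")
    return shuffle_partitions * shuffles


DEFAULT_PARTITIONS = 200  # default of spark.sql.shuffle.partitions
REDUCED_PARTITIONS = 5    # hypothetical smaller value for the R tests

for partitions in (DEFAULT_PARTITIONS, REDUCED_PARTITIONS):
    print(partitions, "partitions:",
          forked_processes(partitions), "forks per shuffle")
```

Under this assumption the fork count scales linearly with the setting, so a 
smaller value should cut the per-shuffle cost proportionally, if the tests 
still pass with it.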
