[ 
https://issues.apache.org/jira/browse/SPARK-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182442#comment-15182442
 ] 

Dongjoon Hyun commented on SPARK-12243:
---------------------------------------

Hi, [~joshrosen].

According to the recent [Running PySpark tests 
log|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52530/console],
 it seems that the long-running tests start last because the tests are 
scheduled through a FIFO queue. In that case, I think we can reduce the total 
test time by starting the long-running tests first, using a simple priority 
queue. 

{code}
...
Finished test(python3.4): pyspark.streaming.tests (213s)
Finished test(pypy): pyspark.sql.tests (92s)
Finished test(pypy): pyspark.streaming.tests (280s)
Tests passed in 962 seconds
{code}

The long-running tests are the following:
  * pyspark.tests
  * pyspark.mllib.tests
  * pyspark.streaming.tests

I'll make a PR for this as a first attempt to resolve this JIRA issue.
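
A minimal sketch of the idea, not the actual change to Spark's test runner: long-running suites get a lower priority value so that workers pick them up first, and a small simulation compares the end-to-end time against plain FIFO order. The helper names (build_queue, drain, makespan), the priority values 0 and 100, the short_N suites, and all durations except the 280s streaming figure from the log above are assumptions made for illustration.

```python
import heapq
from queue import PriorityQueue

# Suites named as long-running in the comment above.
LONG_RUNNING = {"pyspark.tests", "pyspark.mllib.tests", "pyspark.streaming.tests"}


def build_queue(suites):
    """Return a PriorityQueue in which long-running suites are dequeued first."""
    q = PriorityQueue()
    for order, name in enumerate(suites):
        # Lower value = served earlier. 0 and 100 are arbitrary illustrative values.
        priority = 0 if name in LONG_RUNNING else 100
        # 'order' breaks ties so equal-priority suites keep submission order.
        q.put((priority, order, name))
    return q


def drain(q):
    """Return suite names in the order workers would take them from the queue."""
    names = []
    while not q.empty():
        names.append(q.get()[2])
    return names


def makespan(ordered, durations, workers=4):
    """Simulate `workers` parallel workers taking suites in the given order."""
    free_at = [0] * workers  # min-heap of times at which each worker is free
    heapq.heapify(free_at)
    finish = 0
    for name in ordered:
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + durations[name]
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


# Hypothetical workload: eight short suites queued ahead of the three long ones.
suites = ["short_%d" % i for i in range(1, 9)] + [
    "pyspark.tests", "pyspark.mllib.tests", "pyspark.streaming.tests"]
durations = {name: 50 for name in suites}  # assumed durations in seconds
durations.update({"pyspark.tests": 200, "pyspark.mllib.tests": 200,
                  "pyspark.streaming.tests": 280})

print("FIFO order:    ", makespan(suites, durations), "s")
print("Priority order:", makespan(drain(build_queue(suites)), durations), "s")
```

With four workers, starting the long suites first lets the short ones fill the remaining gaps, so the longest suite no longer begins only after everything else has been handed out.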

> PySpark tests are slow in Jenkins
> ---------------------------------
>
>                 Key: SPARK-12243
>                 URL: https://issues.apache.org/jira/browse/SPARK-12243
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Project Infra, PySpark, Tests
>            Reporter: Josh Rosen
>
> In the Jenkins pull request builder, it looks like PySpark tests take around 
> 992 seconds (~16.5 minutes) of end-to-end time to run, despite the fact that 
> we run four Python test suites in parallel. We should try to figure out why 
> this is slow and see if there's any easy way to speed things up.
> Note that the PySpark streaming tests take about 5 minutes to run, so 
> best-case we're looking at a 10 minute speedup via further parallelization. 
> We should also try to see whether there are individual slow tests in those 
> Python suites which can be sped up or skipped.
> We could also consider running only the Python 2.6 tests in non-Pyspark pull 
> request builds and reserve testing of all Python versions for builds which 
> touch PySpark-related code.


