[ 
https://issues.apache.org/jira/browse/MESOS-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-1757:
-----------------------------------
    Description: 
The full test suite now exceeds the 9 minute mark (581 seconds on my machine). 
This epic tracks techniques for speeding it up:

# Now that the master and the slave have to perform sync'ed disk writes, 
consider using tmpfs (e.g. under /dev/shm) to speed up the disk writes. For the 
master, we could also consider defaulting to in-memory state rather than the 
replicated log for most tests.
# -The reaper takes a full second to reap an exited process (MESOS-1199), this 
adds a second to each slave recovery test, and possibly more for things that 
rely on Subprocess.-
# The command executor sleeps for a second when shutting down (MESOS-442); this 
adds a second to every test that uses the command executor.
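
As a rough illustration of the tmpfs idea in the first item, the sketch below picks a scratch directory under /dev/shm when that path exists and is writable, and otherwise falls back to the default temp directory. It is written in Python purely for brevity (the Mesos tests themselves are C++). The helper name {{make_test_workdir}} and the {{mesos-test-}} prefix are hypothetical, and the code assumes, without verifying, that /dev/shm is actually a tmpfs mount.

```python
import os
import shutil
import tempfile


def make_test_workdir(preferred_root="/dev/shm", prefix="mesos-test-"):
    """Create a scratch directory, preferring a (presumed) tmpfs mount."""
    # Assumption: a writable /dev/shm is backed by tmpfs; this is not checked.
    if os.path.isdir(preferred_root) and os.access(preferred_root, os.W_OK):
        root = preferred_root
    else:
        root = None  # fall back to the platform default temp directory
    return tempfile.mkdtemp(prefix=prefix, dir=root)


workdir = make_test_workdir()
try:
    # A sync'ed write, similar in spirit to what the master and slave do.
    path = os.path.join(workdir, "state")
    with open(path, "wb") as f:
        f.write(b"checkpoint")
        f.flush()
        os.fsync(f.fileno())
finally:
    shutil.rmtree(workdir)
```

On a memory-backed filesystem the fsync call should return almost immediately, which is where the hoped-for speedup would come from.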

A big improvement will come from running the tests in parallel. A few options:
# Use automake's parallel test harness to compile tests separately and run 
tests in parallel (see 
[here|http://www.gnu.org/software/automake/manual/html_node/Parallel-Test-Harness.html]).
# Continue to use one test binary, but leverage google test's ability to shard 
tests across processes/machines (see 
[here|https://code.google.com/p/googletest/wiki/AdvancedGuide#Distributing_Test_Functions_to_Multiple_Machines]).
 This entails writing our own test wrapper script to decide how many 
workers to use, etc. 
[gtest-parallel|https://github.com/google/gtest-parallel/blob/master/gtest-parallel]
 is an example of a parallel runner, but does not leverage the sharding ability.
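
For the second option, a wrapper along the following lines might be a starting point. It is only a sketch: it relies on the {{GTEST_TOTAL_SHARDS}} and {{GTEST_SHARD_INDEX}} environment variables that Google Test reads to select a shard, while the function names, the command-line shape, and the one-worker-per-CPU default are assumptions made for illustration.

```python
import os
import subprocess
import sys


def shard_environments(total_shards, base_env=None):
    """Build one environment per shard using Google Test's sharding variables."""
    base = dict(os.environ if base_env is None else base_env)
    envs = []
    for index in range(total_shards):
        env = dict(base)
        env["GTEST_TOTAL_SHARDS"] = str(total_shards)
        env["GTEST_SHARD_INDEX"] = str(index)
        envs.append(env)
    return envs


def run_sharded(test_binary, total_shards, extra_args=()):
    """Launch every shard concurrently and return their exit codes in order."""
    procs = [
        subprocess.Popen([test_binary, *extra_args], env=env)
        for env in shard_environments(total_shards)
    ]
    return [proc.wait() for proc in procs]


def main(argv):
    # Hypothetical usage: wrapper.py <test-binary> [workers]
    binary = argv[1]
    workers = int(argv[2]) if len(argv) > 2 else (os.cpu_count() or 1)
    codes = run_sharded(binary, workers)
    return 0 if all(code == 0 for code in codes) else 1


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Output from the shards is interleaved in this form, and tests that share fixed ports or directories would likely need isolating before they can run side by side.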

  was:
The full test suite is exceeding the 8 minute mark (470 seconds on my machine), 
this epic is to track techniques to improve this:

# Now that the master and the slave have to perform sync'ed disk writes, 
consider using tmpfs (e.g. under /dev/shm) to speed up the disk writes. For the 
master, we could also consider defaulting to in-memory state rather than the 
replicated log for most tests.
# -The reaper takes a full second to reap an exited process (MESOS-1199), this 
adds a second to each slave recovery test, and possibly more for things that 
rely on Subprocess.-
# The command executor sleeps for a second when shutting down (MESOS-442), this 
adds a second to every test that uses the command executor.

A big improvement will come from running the tests in parallel, a few options:
# Use automake's parallel test harness to compile tests separately and run 
tests in parallel (see 
[here|http://www.gnu.org/software/automake/manual/html_node/Parallel-Test-Harness.html]).
# Continue to use one test binary, but leverage google test's ability to shard 
tests across processes/machines (see 
[here|https://code.google.com/p/googletest/wiki/AdvancedGuide#Distributing_Test_Functions_to_Multiple_Machines]).
 This entails writing our own test wrapper script to decide how many 
workers to use, etc. 
[gtest-parallel|https://github.com/google/gtest-parallel/blob/master/gtest-parallel]
 is an example of a parallel runner, but does not leverage the sharding ability.


> Speed up the tests.
> -------------------
>
>                 Key: MESOS-1757
>                 URL: https://issues.apache.org/jira/browse/MESOS-1757
>             Project: Mesos
>          Issue Type: Epic
>          Components: technical debt, test
>            Reporter: Benjamin Mahler
>              Labels: twitter
>
> The full test suite now exceeds the 9 minute mark (581 seconds on my 
> machine). This epic tracks techniques for speeding it up:
> # Now that the master and the slave have to perform sync'ed disk writes, 
> consider using tmpfs (e.g. under /dev/shm) to speed up the disk writes. For 
> the master, we could also consider defaulting to in-memory state rather than 
> the replicated log for most tests.
> # -The reaper takes a full second to reap an exited process (MESOS-1199), 
> this adds a second to each slave recovery test, and possibly more for things 
> that rely on Subprocess.-
> # The command executor sleeps for a second when shutting down (MESOS-442); 
> this adds a second to every test that uses the command executor.
> A big improvement will come from running the tests in parallel. A few options:
> # Use automake's parallel test harness to compile tests separately and run 
> tests in parallel (see 
> [here|http://www.gnu.org/software/automake/manual/html_node/Parallel-Test-Harness.html]).
> # Continue to use one test binary, but leverage google test's ability to 
> shard tests across processes/machines (see 
> [here|https://code.google.com/p/googletest/wiki/AdvancedGuide#Distributing_Test_Functions_to_Multiple_Machines]).
>  This entails writing our own test wrapper script to decide how many 
> workers to use, etc. 
> [gtest-parallel|https://github.com/google/gtest-parallel/blob/master/gtest-parallel]
>  is an example of a parallel runner, but does not leverage the sharding 
> ability.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
