[ https://issues.apache.org/jira/browse/MESOS-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Mahler updated MESOS-1757:
-----------------------------------
    Description:
The full test suite is exceeding the 9 minute mark (581 seconds on my machine), this epic is to track techniques to improve this:

# Now that the master and the slave have to perform sync'ed disk writes, consider using tmpfs (e.g. under /dev/shm) to speed up the disk writes. For the master, we could also consider defaulting to in-memory state rather than the replicated log for most tests.
# -The reaper takes a full second to reap an exited process (MESOS-1199), this adds a second to each slave recovery test, and possibly more for things that rely on Subprocess.-
# The command executor sleeps for a second when shutting down (MESOS-442), this adds a second to every test that uses the command executor.

A big improvement will come from running the tests in parallel, a few options:

# Use automake's parallel test harness to compile tests separately and run tests in parallel (see [here|http://www.gnu.org/software/automake/manual/html_node/Parallel-Test-Harness.html]).
# Continue to use one test binary, but leverage google test's ability to shard tests across processes/machines (see [here|https://code.google.com/p/googletest/wiki/AdvancedGuide#Distributing_Test_Functions_to_Multiple_Machines]). This entails writing our own test wrapper script to decide how many workers to use, etc. [gtest-parallel|https://github.com/google/gtest-parallel/blob/master/gtest-parallel] is an example of a parallel runner, but does not leverage the sharding ability.

  was:
The full test suite is exceeding the 8 minute mark (470 seconds on my machine), this epic is to track techniques to improve this:

# Now that the master and the slave have to perform sync'ed disk writes, consider using tmpfs (e.g. under /dev/shm) to speed up the disk writes. For the master, we could also consider defaulting to in-memory state rather than the replicated log for most tests.
# -The reaper takes a full second to reap an exited process (MESOS-1199), this adds a second to each slave recovery test, and possibly more for things that rely on Subprocess.-
# The command executor sleeps for a second when shutting down (MESOS-442), this adds a second to every test that uses the command executor.

A big improvement will come from running the tests in parallel, a few options:

# Use automake's parallel test harness to compile tests separately and run tests in parallel (see [here|http://www.gnu.org/software/automake/manual/html_node/Parallel-Test-Harness.html]).
# Continue to use one test binary, but leverage google test's ability to shard tests across processes/machines (see [here|https://code.google.com/p/googletest/wiki/AdvancedGuide#Distributing_Test_Functions_to_Multiple_Machines]). This entails writing our own test wrapper script to decide how many workers to use, etc. [gtest-parallel|https://github.com/google/gtest-parallel/blob/master/gtest-parallel] is an example of a parallel runner, but does not leverage the sharding ability.


> Speed up the tests.
> -------------------
>
> Key: MESOS-1757
> URL: https://issues.apache.org/jira/browse/MESOS-1757
> Project: Mesos
> Issue Type: Epic
> Components: technical debt, test
> Reporter: Benjamin Mahler
> Labels: twitter
>
> The full test suite is exceeding the 9 minute mark (581 seconds on my
> machine), this epic is to track techniques to improve this:
> # Now that the master and the slave have to perform sync'ed disk writes,
> consider using tmpfs (e.g. under /dev/shm) to speed up the disk writes. For
> the master, we could also consider defaulting to in-memory state rather than
> the replicated log for most tests.
> # -The reaper takes a full second to reap an exited process (MESOS-1199),
> this adds a second to each slave recovery test, and possibly more for things
> that rely on Subprocess.-
> # The command executor sleeps for a second when shutting down (MESOS-442),
> this adds a second to every test that uses the command executor.
> A big improvement will come from running the tests in parallel, a few options:
> # Use automake's parallel test harness to compile tests separately and run
> tests in parallel (see
> [here|http://www.gnu.org/software/automake/manual/html_node/Parallel-Test-Harness.html]).
> # Continue to use one test binary, but leverage google test's ability to
> shard tests across processes/machines (see
> [here|https://code.google.com/p/googletest/wiki/AdvancedGuide#Distributing_Test_Functions_to_Multiple_Machines]).
> This entails writing our own test wrapper script to decide how many
> workers to use, etc.
> [gtest-parallel|https://github.com/google/gtest-parallel/blob/master/gtest-parallel]
> is an example of a parallel runner, but does not leverage the sharding
> ability.
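
For illustration only, the wrapper-script option described in the issue could look roughly like the sketch below. It relies on the two environment variables google test documents for sharding, GTEST_TOTAL_SHARDS and GTEST_SHARD_INDEX. Everything else is an assumption and not part of the issue: the function names are hypothetical, the test binary path is supplied by the caller, and using one worker per CPU is just one plausible way to pick the worker count. It also assumes the tests can run concurrently without colliding on ports or work directories, which may not hold for every test.

```python
import os
import subprocess
import sys


def shard_environments(total_shards, base_env=None):
    """Build one environment per shard using google test's sharding variables."""
    base = dict(os.environ if base_env is None else base_env)
    envs = []
    for index in range(total_shards):
        env = dict(base)
        # google test runs only the tests assigned to this shard index.
        env["GTEST_TOTAL_SHARDS"] = str(total_shards)
        env["GTEST_SHARD_INDEX"] = str(index)
        envs.append(env)
    return envs


def run_sharded(test_binary, total_shards, extra_args=()):
    """Start every shard concurrently and return the exit codes in shard order."""
    procs = [
        subprocess.Popen([test_binary, *extra_args], env=env)
        for env in shard_environments(total_shards)
    ]
    return [proc.wait() for proc in procs]


def main(argv):
    """Hypothetical entry point: argv is [TEST_BINARY, GTEST_ARGS...]."""
    if not argv:
        print("usage: TEST_BINARY [GTEST_ARGS...]", file=sys.stderr)
        return 2
    workers = os.cpu_count() or 1  # assumption: one shard per CPU
    codes = run_sharded(argv[0], workers, argv[1:])
    # Fail the whole run if any shard failed.
    return 1 if any(codes) else 0
```

A script built on this would end with `sys.exit(main(sys.argv[1:]))` and be invoked with the path to the test binary. Output from the shards is interleaved on the terminal in this sketch; a real runner would probably capture it per shard.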