Re: proposal to run Ceph tests on pull requests

John Spray Mon, 07 Dec 2015 03:30:08 -0800

On Sat, Dec 5, 2015 at 11:49 AM, Loic Dachary <l...@dachary.org> wrote:
> Hi Ceph,
>
> TL;DR: a ceph-qa-suite bot running on pull requests is sustainable and is an 
> incentive for contributors to use teuthology-openstack independently


A bot for scheduling a named suite on a named PR, and posting the
results back to the PR is definitely a good thing.

Thinking further about using commit messages to toggle the testing, I
think that this could get awkward when it's coupled to the human side
of code review.  When someone pushes a "how about this?" modification
they don't necessarily want to re-run the test suite until the
reviewer has okayed it, but then that means that they have to push
again, and the final thing that's tested would be a different SHA1
(hopefully the same code) than what the human last reviewed.  We'll
also have e.g. rebases, where there tends to be some discretion about
whether a rebase requires a re-test.

When you were talking about having the suite selected in the qa: tag,
there was the motivation to put it in the commit message so that it
would be preserved in backports.  However, if the "Needs-qa:" flag is
just a boolean, then I think it makes more sense to control it with a
github label or by posting a command in a PR comment.
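To make the label / comment idea concrete, here is a minimal sketch of how
a bot might decide whether a run was requested. The label name "needs-qa",
the "@ceph-qa-bot run <suite>" command syntax and both function names are
made up for illustration; nothing like this exists yet.

```python
import re
from typing import Iterable, Optional

# Hypothetical label name; assumption for illustration only.
QA_LABEL = "needs-qa"

# Hypothetical command syntax: "@ceph-qa-bot run <suite>" on a line of its own.
COMMAND_RE = re.compile(r"^\s*@ceph-qa-bot\s+run\s+(\S+)\s*$", re.MULTILINE)


def label_requests_qa(labels: Iterable[str]) -> bool:
    """Return True if the PR carries the (assumed) QA label."""
    return QA_LABEL in set(labels)


def requested_suite(comment_body: str) -> Optional[str]:
    """Return the suite named by a command in a PR comment, or None."""
    match = COMMAND_RE.search(comment_body)
    return match.group(1) if match else None
```

Either trigger is independent of the commit SHA1, so a reviewer could ask
for a run on exactly the commit they last looked at, without another push.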

I'm not sure how this really helps with the resource issues; for
example with the fs suite we would probably not be able to make a
finer-grained choice about what tests to run based on the diff.  The
part about randomly dropping a subset of tests when resources are low
doesn't make sense to me -- I think the bot should either give up or
enqueue itself.
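Roughly what I mean by "give up or enqueue", as a sketch only. The
Scheduler class, its fields and the slot accounting are assumptions for
illustration, not how teuthology schedules jobs today.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Scheduler:
    """Toy model: a run either starts whole, waits whole, or is refused."""
    free_slots: int   # jobs that can start right now (assumed resource model)
    max_queue: int    # how many runs we are willing to keep waiting
    waiting: deque = field(default_factory=deque)

    def submit(self, run_id: str, jobs_needed: int) -> str:
        if jobs_needed <= self.free_slots:
            # Enough resources: run the full set of tests.
            self.free_slots -= jobs_needed
            return "started"
        if len(self.waiting) < self.max_queue:
            # Not enough resources: wait, but keep the full set of tests.
            self.waiting.append((run_id, jobs_needed))
            return "queued"
        # Queue is full: give up and tell the contributor.
        return "rejected"
```

In all three outcomes the set of tests is left intact, so a green result
always means the same thing.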

Cheers,
John

> When a pull request is submitted, it is compiled, some tests are run[1] and 
> the result is added to the pull request to confirm that it does not introduce 
> a trivial problem. Such tests are however limited because they must:
>
> * run within a few minutes at most
> * not require multiple machines
> * not require root privileges
>
> More extensive tests (primarily integration tests) are needed before a 
> contribution can be merged into Ceph [2], to verify it does not introduce a 
> subtle regression. It would be ideal to run these integration tests on each 
> pull request but there are two obstacles:
>
> * each test takes ~1.5 hours
> * each test costs ~0.30 euros
>
> On the current master, running all tests would require ~1000 jobs [3]. That 
> would cost ~ 300 euros on each pull request and take ~10 hours assuming 100 
> jobs can run in parallel. We could resolve that problem by:
>
> * maintaining a ceph-qa-suite map to be used as a white list mapping a diff 
> to a set of tests. For instance, if the diff modifies the src/ceph-disk file, 
> it outputs the ceph-disk suite[4]. This would effectively trim the tests that 
> are unrelated to the contribution and reduce the number of tests to a maximum 
> of ~100 [4] and most likely a dozen.
> * tests are run if one of the commits of the pull request has the *Needs-qa: 
> true* flag in the commit message[5]
> * limiting the number of tests to fit in the allocated budget. If there was 
> enough funding for 10,000 jobs during the previous period and there was a 
> total of 1,000 test runs required (a test run is a set of tests as produced by 
> the ceph-qa-suite map), each run is trimmed to a maximum of ten tests, 
> regardless.
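The white list and the Needs-qa check described above could be sketched
along these lines. Only the src/ceph-disk -> ceph-disk entry comes from
the proposal; the librados path pattern, the function names and the exact
flag matching (case-insensitive, on a line of its own) are assumptions.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical white list: (path pattern, suite to run).
SUITE_MAP = [
    ("src/ceph-disk", "ceph-disk"),                                # from the proposal
    ("src/librados/*", "smoke/basic/tasks/rados_api_tests.yaml"),  # assumed pattern
]

# Assumed flag format: "Needs-qa: true" on its own line of a commit message.
NEEDS_QA_RE = re.compile(r"^Needs-qa:\s*true\s*$", re.MULTILINE | re.IGNORECASE)


def suites_for(changed_files):
    """Map the files touched by a diff to the suites to run, without duplicates."""
    suites = []
    for path in changed_files:
        for pattern, suite in SUITE_MAP:
            if fnmatchcase(path, pattern) and suite not in suites:
                suites.append(suite)
    return suites


def needs_qa(commit_messages):
    """Return True if at least one commit message carries the flag."""
    return any(NEEDS_QA_RE.search(message) for message in commit_messages)
```

Files that match no pattern select no suite, which is how unrelated tests
get trimmed.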
>
> Here is an example:
>
> Joe submits a pull request to fix a bug in the librados API
> The make check bot compiles and fails make check because the change introduces a bug
> Joe uses run-make-check.sh locally to reproduce the failure, fixes it and repushes
> The make check bot compiles and passes make check
> Joe amends the commit message to add *Needs-qa: true* and repushes
> The ceph-qa-suite map script finds a change on the librados API and outputs 
> smoke/basic/tasks/rados_api_tests.yaml
> The ceph-qa-suite bot runs the test smoke/basic/tasks/rados_api_tests.yaml 
> which fails
> Joe examines the logs found at http://teuthology-logs.public.ceph.com/ and 
> decides to debug by running the test himself
> Joe runs teuthology-openstack --suite smoke/basic/tasks/rados_api_tests.yaml 
> against his own OpenStack tenant [6]
> Joe repushes with a fix
> The ceph-qa-suite bot runs the test smoke/basic/tasks/rados_api_tests.yaml 
> which succeeds
> Kefu reviews the pull request and has a link to the successful test runs in 
> the comments
>
> This approach scales with the size of the Ceph developer community [7] 
> because regular contributors benefit directly from funding the ceph-qa-suite 
> bot. New contributors can focus on learning how to interpret the 
> ceph-qa-suite error logs for their contribution and learn about how to debug 
> it via teuthology-openstack if needed, which is a better user experience than 
> trying to figure out which ceph-qa-suite job to run, learning about 
> teuthology, scheduling the test and interpreting the results.
>
> The maintenance workload of a ceph-qa-suite bot probably requires one work 
> day a week, to handle funding, sysadmin of the server where the bot runs but 
> mostly to sort out the false negatives. I believe a pure self-service 
> approach where each contributor would be asked to run teuthology-openstack 
> independently would actually require more work. The ceph-qa-suite bot 
> provides a baseline on which everybody can agree to sort out the false 
> negatives. When a contributor runs teuthology-openstack by herself/himself, 
> it is difficult for her/him to figure out if a failure comes from something 
> she/he did incorrectly because she/he is not familiar with 
> teuthology-openstack or if it is related to her/his contribution. She/He will 
> ask for assistance in situations where comparing her/his run with the 
> output of the ceph-qa-suite bot would probably give her/him enough hints to 
> fix the problem herself/himself.
>
> If the ceph-qa-suite bot becomes unavailable, the contributors are not 
> blocked because they can run it by themselves on their own OpenStack tenant 
> and link the results to the pull request in the same way the bot would. 
> Debugging a failed test is essentially the same thing as running the 
> ceph-qa-suite bot.
>
> Cheers
>
> [1] run-make-check.sh 
> https://github.com/ceph/ceph/blob/master/run-make-check.sh
> [2] Ceph test suites https://github.com/ceph/ceph-qa-suite/tree/master/suites
> [3] teuthology-suite --suite .  --subset 1/40000
> [4] minimal number of tests to run all tasks at least once: 130 for rados, 76 
> for fs, 113 for upgrade, 18 for rgw, 45 for rbd.
> [5] a former proposal was to include the test suite to run in the commit 
> message, but this is more difficult to maintain than a boolean flag that 
> states a given commit needs to pass all the relevant tests
> [6] teuthology-openstack 
> https://github.com/dachary/teuthology/tree/openstack#openstack-backend
> [7] Scaling out the Ceph community lab http://dachary.org/?p=3852
> --
> Loïc Dachary, Artisan Logiciel Libre
>
