Hey all,

I've been working on test flakiness recently, and I've been trying to
come up with ways to tackle the issue top-down as well as bottom-up,
and I'm interested to hear your thoughts on an idea.

In addition to the current full-suite runs, could we trigger a smaller
test run in parallel that contains only the relevant subset of tests?
For example, if someone is working on one sub-module, this second CI
run would execute only the tests in that module.

I think this run would pass more often than the full suite, simply
because it contains fewer tests that can fail intermittently, and that
would improve the signal-to-noise ratio of the summary pass/fail marker
on GitHub. It should also take less time to execute than the full
suite, allowing a faster cycle time than the full suite does today.
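
To put rough numbers on that intuition, here is a minimal sketch. It
assumes every test flakes independently at the same rate, which is a
simplification, and the test counts and flake rate are made up for
illustration rather than measured from our build.

```python
def suite_pass_probability(num_tests: int, flake_rate: float) -> float:
    """Chance that every test passes, assuming each test fails
    independently with probability `flake_rate` (an assumption)."""
    return (1.0 - flake_rate) ** num_tests


# Hypothetical figures, not measurements from our CI.
FLAKE_RATE = 0.0001      # 1 spurious failure per 10,000 test executions
FULL_SUITE_TESTS = 5000
MODULE_TESTS = 200

full_suite = suite_pass_probability(FULL_SUITE_TESTS, FLAKE_RATE)
single_module = suite_pass_probability(MODULE_TESTS, FLAKE_RATE)

print(f"full suite passes:    {full_suite:.1%}")
print(f"single module passes: {single_module:.1%}")
```

Under those assumptions the full suite goes green roughly 61% of the
time and the single-module run roughly 98% of the time, even though
the per-test flake rate is identical.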

This would also strengthen the incentive for contributors specializing
in a module to de-flake tests, as they are rewarded with a tangible
improvement within their area of the project. Currently, even the
modules with the most reliable tests consistently see CI failures
caused by other, less reliable modules.

I believe this is possible, even if there isn't an off-the-shelf
solution for it. We can find the changed files via a git diff, map
them to the modules containing those files, and then execute the tests
for just those modules with Gradle. GitHub also permits showing
multiple "checks", so we can report both the full-suite and the
partial test results.
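
As a rough sketch of those three steps (not a finished tool): the
function names, the `origin/main` base ref, and the module list are
all placeholders I made up, and the last step assumes each module's
Gradle project path mirrors its directory, which settings.gradle can
override.

```python
import subprocess
from typing import Iterable, List, Optional, Set


def changed_files(base_ref: str = "origin/main") -> List[str]:
    """List files changed on this branch relative to `base_ref`
    (hypothetical default; use whatever the PR targets)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base_ref}...HEAD"],
        check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def modules_for(files: Iterable[str],
                module_dirs: Iterable[str]) -> Optional[Set[str]]:
    """Map each changed file to the deepest module directory containing it.

    Returns None when a file lies outside every module (for example a
    root build script), signalling that only the full suite is meaningful.
    """
    dirs = list(module_dirs)
    touched: Set[str] = set()
    for path in files:
        owners = [d for d in dirs if path.startswith(d + "/")]
        if not owners:
            return None
        touched.add(max(owners, key=len))  # deepest match wins
    return touched


def gradle_test_args(modules: Iterable[str]) -> List[str]:
    """Turn module directories into Gradle test task paths,
    assuming project paths mirror the directory layout."""
    return [":" + m.replace("/", ":") + ":test" for m in sorted(modules)]
```

For instance, with modules `core`, `streams` and `streams/test-utils`
(example names only), a change touching `core/src/Foo.java` and
`streams/test-utils/src/Bar.java` would yield `:core:test` and
`:streams:test-utils:test`, which the CI job could pass to the Gradle
wrapper. A change to a file outside every module returns None, and the
job would then skip the partial run and rely on the full suite.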

Thanks,
Greg
