Those are all helpful tips and all make complete sense to me. Thank you very much for sharing your experience! :)
On 2019/02/27 00:55:34, "d...@yahoo.com.INVALID" <d...@yahoo.com.INVALID> wrote:
> +1 to everything Jeff said. As someone who has worked on flaky tests not
> just in Cassandra's context, I know it can be hard to deal with them.
>
> However, it's best to root cause them. I have found some flaky tests were
> genuine issues that needed fixing in Cassandra. Sometimes the flakiness is
> due to underpowered VMs running low on resources or, in one case, tests
> failed due to kernel settings that differed between systems. Explore tuning
> the VM settings used for the test execution. I usually don't prefer adding
> retries, but in some cases retries can be helpful. Rewriting the tests to
> reduce dependencies on external systems or using mocks is another useful
> method of reducing the flakiness. Try breaking up tests if they're too big.
> Finally, deleting tests can also be a solution, but use it sparingly.
>
> I believe in the broken windows theory, so it is critical that you spend
> time fixing them; otherwise everyone ignores them and attributes all
> failures to "flakiness", leading to real issues sneaking in.
>
> Dinesh
>
> On Tuesday, February 26, 2019, 12:06:10 PM PST, Jeff Jirsa
> <jj...@gmail.com> wrote:
>
> > On Feb 26, 2019, at 8:26 AM, Stanislav Kozlovski <st...@outlook.com>
> > wrote:
> >
> > Hey there Cassandra community,
> >
> > I work on a fellow open-source project - Apache Kafka - and there we have
> > been fighting flaky tests a lot. We run Java 8 and Java 11 builds on every
> > Pull Request and due to test flakiness, almost all of them turn out red
> > with 1 or 2 tests (completely unrelated to the change in the PR) failing.
> > This has resulted in committers ignoring them and merging the changes
> > either way, or in the worst case - rerunning the hour-long build until it
> > becomes green.
>
> I hope most committers won't commit unless the flaky test is definitely not
> in the subsystem they touched. But yes, one of the motivations for speeding
> up tests (parallelized on a containerized hosted CI platform) was to cut
> down the time for (re-)running.
>
> > This test flakiness has also slowed down our releases significantly.
> >
> > In general, I was just curious to understand if this is a problem that
> > Cassandra faces as well.
>
> Yes
>
> > Does your project have a lot of intermittently failing tests,
>
> Sometimes more than others. There were a few big pushes to get green,
> though it naturally regresses a bit over time.
>
> > do you have any active process of addressing such tests (during the
> > initial review, after realizing it is flaky, etc.). Any pointers will be
> > greatly appreciated!
>
> I don’t think we’ve solved this convincingly. Different large (corporate)
> contributors have done long one-time passes, and that helped a ton, but I
> don’t think there are any silver bullets yet.