Thanks for the feedback. I'll prepare a PR to add this library to our test dependencies.
There is always a potential that a retry will hide legitimate bugs, and we'd have to decide whether a retry is appropriate on a case-by-case basis. The scope of the retry should be as narrow as possible (per test method, and if possible only under certain conditions). I think the key question to ask is: when do we want the failure of this test to be actionable? If the answer is close to "only if it fails all the time", then adding a retry may be better than removing the test altogether. But a better solution is to eliminate the flakiness itself.

On Fri, Jan 11, 2019 at 10:54 AM Ahmet Altay <al...@google.com> wrote:

> It makes sense to me too. I share the same concerns as the previous
> comments. For any flaky test it would be better to re-write it in a way
> that does not cause flakes (e.g. avoid depending on precise timing of
> things). If that is not possible, using this library sounds like a good idea
> and would reduce the overall time we spend triaging flaky tests.
>
> On Fri, Jan 11, 2019 at 10:43 AM Chamikara Jayalath <chamik...@google.com>
> wrote:
>
>> Makes sense in general. However, how do we make sure that this does not
>> hide legitimate flakes?
>>
>> I think these retries should be applied on a case-by-case basis.
>> Deflaking the test by updating it is better than a blanket retry. If the
>> test is flaky by design and there's no way to update it, a blanket retry
>> should be OK. Just my 2 cents.
>>
>> Thanks,
>> Cham
>>
>>
>> On Fri, Jan 11, 2019 at 4:30 AM Robert Bradshaw <rober...@google.com>
>> wrote:
>>
>>> I think that makes a lot of sense, and tenacity looks like a decent,
>>> maintained library.
>>>
>>> We should use this sparingly, but it is helpful for algorithms that
>>> have an intrinsic amount of randomness/noise (e.g. the sampling code)
>>> to reduce a 1% chance of failure to a 1 in a million.
>>>
>>> On Fri, Jan 11, 2019 at 2:50 AM Valentyn Tymofieiev <valen...@google.com>
>>> wrote:
>>> >
>>> > I have been looking at a few test flakes in the Python SDK recently, and
>>> > some of them can benefit from simple retry logic. See PR #7455 for an
>>> > example [1].
>>> >
>>> > I would not recommend retrying by default for all tests, or
>>> > mechanically adding a retry to every test that we see flaking: some
>>> > legitimate bugs may manifest in flakes that happen rarely, and retry
>>> > logic would hide them from us.
>>> >
>>> > However, in some tests we can consider adding retries, retries with a
>>> > back-off, or retries only on particular exceptions. tenacity [2] offers
>>> > several decorators for this purpose. It is available under the Apache 2.0
>>> > license on PyPI [3], and is being maintained.
>>> >
>>> > What does the community think about adding this library as a test-only
>>> > dependency to the Python SDK?
>>> >
>>> > Thanks,
>>> > Valentyn
>>> >
>>> > [1] https://github.com/apache/beam/pull/7455
>>> > [2] https://github.com/jd/tenacity
>>> > [3] https://pypi.org/project/tenacity/
>>
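For readers following along, the narrow, per-test retry discussed above can be sketched with a stdlib-only decorator. This is an illustrative sketch, not tenacity's actual API; the function names and parameters (`retry`, `attempts`, `delay`, `flaky_check`, the `calls` counter) are made up for the example. tenacity expresses the same idea declaratively with decorators such as `@retry(stop=stop_after_attempt(3), retry=retry_if_exception_type(...))`.

```python
import functools
import time


def retry(attempts=3, delay=0.0, exceptions=(Exception,)):
    """Re-run a function up to `attempts` times, re-raising the last error.

    Illustrative stand-in for tenacity's decorator; restricting
    `exceptions` keeps the retry narrow, as discussed in the thread.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of retries: surface the failure
                    time.sleep(delay)  # simple back-off between attempts
        return wrapper
    return decorator


# Hypothetical flaky test body: fails twice, then succeeds.
calls = {"count": 0}


@retry(attempts=3, exceptions=(RuntimeError,))
def flaky_check():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "passed"


print(flaky_check())  # succeeds on the third attempt
```

Note that only `RuntimeError` is retried here; any other exception still fails immediately, so the retry cannot mask unrelated bugs.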