Re: Disabling flaky tests
On Tue, Jun 4, 2013 at 6:37 PM, Carsten Ziegeler wrote:
> ...I would agree if the jenkins we have would be stable; i have the feeling
> that a lot of build failures on that instance are simply because of some
> problems of the build server itself...

There were issues in the last few weeks, but https://builds.apache.org/view/S-Z/view/Sling/ looks better now IMO.

Builds have been failing there in the last few days for valid reasons - I just fixed SLING-2903, which broke the build yesterday (shame on me ;-)

We also have or had tests that fail semi-randomly, like SLING-2818, which looks good now since I added retries.

We have quite a lot of asynchronous setup points in Sling, which means our tests must sometimes wait for things to settle after setting up scripts or configs. That complicates things a bit.

-Bertrand
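[Editorial sketch] The "retries" and "wait for things to settle" idea above could look roughly like the following. This is only an illustration in plain Java: the class `RetrySketch`, the method `retryUntilPasses` and its parameters are hypothetical names, and it is not the code that was added for SLING-2818. It repeats a check until it stops throwing or a timeout expires, then rethrows the last failure.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of Sling: retries a check until it passes or times out. */
final class RetrySketch {

    /**
     * Calls check repeatedly until it returns without throwing, or until
     * timeoutMillis has elapsed, in which case the last failure is rethrown.
     */
    static <T> T retryUntilPasses(long timeoutMillis, long intervalMillis, Callable<T> check)
            throws Exception {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            try {
                return check.call();
            } catch (Exception | AssertionError e) {
                if (System.nanoTime() >= deadline) {
                    throw e; // out of time: report the most recent failure
                }
                Thread.sleep(intervalMillis); // give asynchronous setup time to settle
            }
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] attempts = {0};
        // Simulated asynchronous setup: the check only passes from the third attempt on.
        String status = retryUntilPasses(2000, 10, () -> {
            attempts[0]++;
            if (attempts[0] < 3) {
                throw new AssertionError("error handler script not active yet");
            }
            return "active";
        });
        System.out.println(status + " after " + attempts[0] + " attempts");
    }
}
```

A helper like this hides latency, not bugs: if the check never passes, the test still fails once the timeout expires, so a genuine regression stays visible.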
Re: Disabling flaky tests
I would agree if the Jenkins we have were stable; I have the feeling that a lot of build failures on that instance are simply because of some problems of the build server itself. I have chased such issues too many times. For example, right now, every day we get build errors reported in some modules, while on the next day they pass - without any changes at all.

So I think the first step is to have a stable build env.

But I'm fine with any direction as long as we fix known bugs before a release.

Carsten

2013/6/4 Jeff Young
> I used to agree as well, but my opinion is now more nuanced. I've
> experienced projects where a test keeps failing day after day, and after a
> while developers stop looking at the test results with the same level of
> discipline.
>
> Perhaps Sling is small enough (and the developers are pro-active enough)
> that this isn't an issue. But it certainly is on some other, larger, more
> disperse projects (such as CQ). In those, moving a failing test to an
> issue (which can be assigned to an individual) can produce better results
> than everyone simply getting used to the build being red.
>
> Cheers,
> Jeff.

--
Carsten Ziegeler
cziege...@apache.org
RE: Disabling flaky tests
I used to agree as well, but my opinion is now more nuanced. I've experienced projects where a test keeps failing day after day, and after a while developers stop looking at the test results with the same level of discipline.

Perhaps Sling is small enough (and the developers are pro-active enough) that this isn't an issue. But it certainly is on some other, larger, more dispersed projects (such as CQ). In those, moving a failing test to an issue (which can be assigned to an individual) can produce better results than everyone simply getting used to the build being red.

Cheers,
Jeff.

> -Original Message-
> From: Carsten Ziegeler [mailto:cziege...@apache.org]
> Sent: 03 June 2013 07:01
> To: dev@sling.apache.org
> Subject: Re: Disabling flaky tests
>
> I agree as well, especially for the error handling as this is partially not
> a problem of the test but really a bug in Sling - we have an issue for
> that, it just needs to be done :)
>
> Carsten
Re: Disabling flaky tests
Hi,

On Mon, Jun 3, 2013 at 2:17 PM, Carsten Ziegeler wrote:
> ...SLING-2724 needs to be fixed anyway, I'll try to get this done this
> week...

Ok - note that the ErrorHandlingTest fails semi-randomly, so not sure if that's related.

I've just added retries to those tests (SLING-2818) as I suspect a different problem - you might try removing them if you find and fix the root cause, but ideally after we get a few green builds at https://builds.apache.org/view/S-Z/view/Sling/job/sling-trunk-1.6/ so we can see if that helps.

-Bertrand
Re: Disabling flaky tests
Hi,

for the error handling tests we should fix SLING-2724 and then revisit the tests. I looked at them once and I think they relied on some wrong assumptions at that time, but I can't recall the details right now.

In any case, SLING-2724 needs to be fixed; I'll try to get this done this week.

Carsten

2013/6/3 Robert Munteanu
> On Mon, 2013-06-03 at 10:00 +0200, Bertrand Delacretaz wrote:
> > Hi,
> >
> > On Fri, May 31, 2013 at 9:14 PM, Robert Munteanu wrote:
> > > ...It seems that the ErrorHandlingTest fails sporadically when run inside a
> > > full maven build...
> >
> > I fixed a related issue in SLING-2818 a few weeks ago, is it something
> > else now?
>
> Yes, the problem persists even after your fix.
>
> > > For this test, and for future flaky/failing tests, I
> > > suggest that we
> > >
> > > 1. Create an issue for the failing test
> > > 2. Disable the test and mark it with the issue key
> > > 3. Re-enable the test when it is stable/passing ( which may be
> > > considerably later than step 2)
> > > 4. Close the issue after the test is re-enabled...
> >
> > I agree with that, and I would make the next release of the module in
> > question dependent on that issue, to make sure we look at it before
> > releasing.
>
> It seems that we don't have a consensus on it, so I'd amend the process
> to be
>
> 1. Create an issue for the failing test
> 2. Target it for the next release of the affected module ( if applicable
> ) or for the next version of the launchpad
>
> Hopefully that's something which is agreeable to everyone
>
> Robert

--
Carsten Ziegeler
cziege...@apache.org
Re: Disabling flaky tests
On Mon, 2013-06-03 at 10:00 +0200, Bertrand Delacretaz wrote:
> Hi,
>
> On Fri, May 31, 2013 at 9:14 PM, Robert Munteanu wrote:
> > ...It seems that the ErrorHandlingTest fails sporadically when run inside a
> > full maven build...
>
> I fixed a related issue in SLING-2818 a few weeks ago, is it something else
> now?

Yes, the problem persists even after your fix.

> > For this test, and for future flaky/failing tests, I
> > suggest that we
> >
> > 1. Create an issue for the failing test
> > 2. Disable the test and mark it with the issue key
> > 3. Re-enable the test when it is stable/passing ( which may be
> > considerably later than step 2)
> > 4. Close the issue after the test is re-enabled...
>
> I agree with that, and I would make the next release of the module in
> question dependent on that issue, to make sure we look at it before
> releasing.

It seems that we don't have a consensus on it, so I'd amend the process to be

1. Create an issue for the failing test
2. Target it for the next release of the affected module (if applicable) or for the next version of the launchpad

Hopefully that's something which is agreeable to everyone.

Robert
Re: Disabling flaky tests
Hm, my proposal isn't winning any popularity contests, so I won't go through with it :-)
Re: Disabling flaky tests
On Mon, Jun 3, 2013 at 10:00 AM, Bertrand Delacretaz wrote:
> ...I fixed a related issue in SLING-2818 a few weeks ago, is it something
> else now?...

Looking at it again there might be some latency after uploading an error handler script, before it is activated. It's not a problem in normal use but the tests might be failing for that reason - I have reopened the above issue, looking at it now.

-Bertrand
Re: Disabling flaky tests
Hi,

On Fri, May 31, 2013 at 9:14 PM, Robert Munteanu wrote:
> ...It seems that the ErrorHandlingTest fails sporadically when run inside a
> full maven build...

I fixed a related issue in SLING-2818 a few weeks ago, is it something else now?

> For this test, and for future flaky/failing tests, I
> suggest that we
>
> 1. Create an issue for the failing test
> 2. Disable the test and mark it with the issue key
> 3. Re-enable the test when it is stable/passing ( which may be
> considerably later than step 2)
> 4. Close the issue after the test is re-enabled...

I agree with that, and I would make the next release of the module in question dependent on that issue, to make sure we look at it before releasing.

> ...This has the advantage of keeping the build green and making it easier
> to find regressions since a failing or unstable build will actually mean
> something

I agree - Jenkins flakiness and some semi-randomly failing tests lead us to neglect the "jenkins build must be green" rule and that's not good.

-Bertrand
Re: Disabling flaky tests
I agree as well, especially for the error handling as this is partially not a problem of the test but really a bug in Sling - we have an issue for that, it just needs to be done :)

Carsten

2013/6/3 Felix Meschberger
> I agree here: Disabling the test and having an issue keeps the build green
> but bears the danger of forgetting about it ...
>
> Regards
> Felix

--
Carsten Ziegeler
cziege...@apache.org
Re: Disabling flaky tests
I agree here: Disabling the test and having an issue keeps the build green but bears the danger of forgetting about it ...

Regards
Felix

On 02.06.2013 at 16:04, Eric Norman wrote:
> Personally, I'm not a big fan of hiding flaky/failing tests since it tends
> to remove some of the motivation to stabilize/fix them in a timely manner.
>
> That's my 2 cents.
>
> Regards,
> Eric
Re: Disabling flaky tests
Personally, I'm not a big fan of hiding flaky/failing tests since it tends to remove some of the motivation to stabilize/fix them in a timely manner.

That's my 2 cents.

Regards,
Eric

On Fri, May 31, 2013 at 12:14 PM, Robert Munteanu wrote:
> Hi,
>
> It seems that the ErrorHandlingTest fails sporadically when run inside a
> full maven build. I've tried locating the root cause for a couple of
> hours but failed. For this test, and for future flaky/failing tests, I
> suggest that we
>
> 1. Create an issue for the failing test
> 2. Disable the test and mark it with the issue key
> 3. Re-enable the test when it is stable/passing ( which may be
> considerably later than step 2)
> 4. Close the issue after the test is re-enabled
>
> This has the advantage of keeping the build green and making it easier
> to find regressions since a failing or unstable build will actually mean
> something.
>
> What do you think?
>
> Robert
Disabling flaky tests
Hi,

It seems that the ErrorHandlingTest fails sporadically when run inside a full maven build. I've tried locating the root cause for a couple of hours but failed. For this test, and for future flaky/failing tests, I suggest that we

1. Create an issue for the failing test
2. Disable the test and mark it with the issue key
3. Re-enable the test when it is stable/passing (which may be considerably later than step 2)
4. Close the issue after the test is re-enabled

This has the advantage of keeping the build green and making it easier to find regressions since a failing or unstable build will actually mean something.

What do you think?

Robert
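[Editorial sketch] Step 2 of the proposal, "disable the test and mark it with the issue key", might be illustrated as below. The thread does not say which mechanism would be used; in JUnit 4 the built-in `@Ignore` annotation accepts a reason string that could carry the key. To stay self-contained, this sketch uses a made-up `@DisabledFor` annotation, a made-up issue key `EXAMPLE-123` and placeholder test methods. It lists every disabled test together with its tracking issue, which speaks to the "danger of forgetting about it" raised in the replies.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: report disabled tests together with the issue that tracks them. */
final class DisabledTestReport {

    /** Hypothetical marker: the test stays disabled until the named issue is resolved. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface DisabledFor {
        String value(); // issue key
    }

    /** Stand-in for a real test class; the method names are illustrative only. */
    static class SampleTests {
        public void stableTest() { }

        @DisabledFor("EXAMPLE-123") // made-up issue key
        public void flakyErrorHandlingTest() { }
    }

    /** Maps each disabled test method name to the issue key it is marked with. */
    static Map<String, String> disabledTests(Class<?> testClass) {
        Map<String, String> result = new TreeMap<>();
        for (Method method : testClass.getDeclaredMethods()) {
            DisabledFor marker = method.getAnnotation(DisabledFor.class);
            if (marker != null) {
                result.put(method.getName(), marker.value());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Print the disabled tests so that they can be reviewed before a release.
        System.out.println(disabledTests(SampleTests.class));
    }
}
```

Keeping the issue key in the marker means a report like this can be checked against the issue tracker before a release, in line with the later suggestion in the thread to make the next release depend on the issue.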