Re: Disabling flaky tests

2013-06-05 Thread Bertrand Delacretaz
On Tue, Jun 4, 2013 at 6:37 PM, Carsten Ziegeler  wrote:
> ...I would agree if the Jenkins we have were stable; I have the feeling
> that a lot of build failures on that instance are simply because of some
> problems of the build server itself...

There were issues in the last few weeks but
https://builds.apache.org/view/S-Z/view/Sling/ looks better now IMO.

Builds have been failing there in the last few days for legitimate
reasons - I just fixed SLING-2903, which broke the build yesterday
(shame on me ;-)

We also have or had tests that fail semi-randomly, like SLING-2818,
which looks good now since I added retries. We have quite a lot of
asynchronous setup points in Sling, which means our tests must
sometimes wait for things to settle after setting up scripts or
configs, and that complicates things a bit.
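
To illustrate the kind of retry meant here, a minimal sketch in plain
Java follows. It is not the actual SLING-2818 change; the class name
RetrySketch, the method retry and its parameters are made up for the
example, and it assumes the flaky check signals failure by throwing an
AssertionError.

```java
public class RetrySketch {

    /**
     * Runs the check up to maxAttempts times, pausing between attempts.
     * Returns the attempt number that passed; rethrows the last
     * AssertionError if every attempt fails.
     * (Hypothetical helper, not part of Sling or JUnit.)
     */
    public static int retry(int maxAttempts, long delayMillis, Runnable check) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        AssertionError last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                check.run();
                return attempt;
            } catch (AssertionError e) {
                last = e; // remember the failure and possibly try again
            }
            if (attempt < maxAttempts) {
                sleep(delayMillis);
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Simulated check: fails twice (things not settled yet), then passes.
        int attempts = retry(5, 10L, () -> {
            if (++calls[0] < 3) {
                throw new AssertionError("not settled yet");
            }
        });
        System.out.println("passed on attempt " + attempts);
    }
}
```

Retries like this hide timing problems rather than remove them, so the
number of attempts is worth logging when it is greater than one.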

-Bertrand


Re: Disabling flaky tests

2013-06-04 Thread Carsten Ziegeler
I would agree if the Jenkins we have were stable; I have the feeling
that a lot of build failures on that instance are simply caused by
problems of the build server itself. I have chased such issues too many
times. For example, right now we get build errors reported in some
modules every day, while the next day they pass - without any changes at all.

So I think the first step is to have a stable build environment.

But I'm fine with any direction as long as we fix known bugs before a
release.

Carsten


2013/6/4 Jeff Young 

> I used to agree as well, but my opinion is now more nuanced.  I've
> experienced projects where a test keeps failing day after day, and after a
> while developers stop looking at the test results with the same level of
> discipline.
>
> Perhaps Sling is small enough (and the developers are proactive enough)
> that this isn't an issue.  But it certainly is on some other, larger, more
> dispersed projects (such as CQ).  In those, moving a failing test to an
> issue (which can be assigned to an individual) can produce better results
> than everyone simply getting used to the build being red.
>
> Cheers,
> Jeff.

-- 
Carsten Ziegeler
cziege...@apache.org


RE: Disabling flaky tests

2013-06-04 Thread Jeff Young
I used to agree as well, but my opinion is now more nuanced.  I've experienced 
projects where a test keeps failing day after day, and after a while developers 
stop looking at the test results with the same level of discipline.

Perhaps Sling is small enough (and the developers are proactive enough) that
this isn't an issue.  But it certainly is on some other, larger, more dispersed
projects (such as CQ).  In those, moving a failing test to an issue (which can
be assigned to an individual) can produce better results than everyone simply
getting used to the build being red.

Cheers,
Jeff.


> -----Original Message-----
> From: Carsten Ziegeler [mailto:cziege...@apache.org]
> Sent: 03 June 2013 07:01
> To: dev@sling.apache.org
> Subject: Re: Disabling flaky tests
> 
> I agree as well, especially for the error handling as this is partially not
> a problem of the test but really a bug in Sling - we have an issue for
> that, it just needs to be done :)
> 
> Carsten
> 


Re: Disabling flaky tests

2013-06-03 Thread Bertrand Delacretaz
Hi,

On Mon, Jun 3, 2013 at 2:17 PM, Carsten Ziegeler  wrote:
> ...SLING-2724 needs to be fixed anyway, I'll try to get this done this
> week...

Ok - note that the ErrorHandlingTest fails semi-randomly, so not sure
if that's related.

I've just added retries to those tests (SLING-2818) as I suspect a
different problem - you might try removing them if you find and fix
the root cause, but ideally after we get a few green builds at
https://builds.apache.org/view/S-Z/view/Sling/job/sling-trunk-1.6/ so
we can see if that helps.

-Bertrand


Re: Disabling flaky tests

2013-06-03 Thread Carsten Ziegeler
Hi,

For the error handling tests we should fix SLING-2724 and then revisit the
tests. I looked at them once and I think they relied on some wrong
assumptions, but I can't recall the details right now.
In any case, SLING-2724 needs to be fixed; I'll try to get this done this
week.

Carsten


2013/6/3 Robert Munteanu 

> On Mon, 2013-06-03 at 10:00 +0200, Bertrand Delacretaz wrote:
> > Hi,
> >
> > On Fri, May 31, 2013 at 9:14 PM, Robert Munteanu wrote:
> > > ...It seems that the ErrorHandlingTest fails sporadically when run inside a
> > > full maven build...
> >
> > I fixed a related issue in SLING-2818 a few weeks ago, is it something else now?
>
> Yes, the problem persists even after your fix.
>
> >
> > > For this test, and for future flaky/failing tests, I
> > > suggest that we
> > >
> > > 1. Create an issue for the failing test
> > > 2. Disable the test and mark it with the issue key
> > > 3. Re-enable the test when it is stable/passing ( which may be
> > > considerably later than step 2)
> > > 4. Close the issue after the test is re-enabled...
> >
> > I agree with that, and I would make the next release of the module in
> > question dependent on that issue, to make sure we look at it before
> > releasing.
>
>
> It seems that we don't have a consensus on it, so I'd amend the process
> to be
>
> 1. Create an issue for the failing test
> 2. Target it for the next release of the affected module (if applicable)
> or for the next version of the launchpad
>
> Hopefully that's something which is agreeable to everyone.
>
> Robert
>
>


-- 
Carsten Ziegeler
cziege...@apache.org


Re: Disabling flaky tests

2013-06-03 Thread Robert Munteanu
On Mon, 2013-06-03 at 10:00 +0200, Bertrand Delacretaz wrote:
> Hi,
> 
> On Fri, May 31, 2013 at 9:14 PM, Robert Munteanu  wrote:
> > ...It seems that the ErrorHandlingTest fails sporadically when run inside a
> > full maven build...
> 
> I fixed a related issue in SLING-2818 a few weeks ago, is it something else 
> now?

Yes, the problem persists even after your fix.

> 
> > For this test, and for future flaky/failing tests, I
> > suggest that we
> >
> > 1. Create an issue for the failing test
> > 2. Disable the test and mark it with the issue key
> > 3. Re-enable the test when it is stable/passing ( which may be
> > considerably later than step 2)
> > 4. Close the issue after the test is re-enabled...
> 
> I agree with that, and I would make the next release of the module in
> question dependent on that issue, to make sure we look at it before
> releasing.


It seems that we don't have a consensus on it, so I'd amend the process
to be

1. Create an issue for the failing test
2. Target it for the next release of the affected module (if applicable)
or for the next version of the launchpad

Hopefully that's something which is agreeable to everyone.

Robert



Re: Disabling flaky tests

2013-06-03 Thread Robert Munteanu
Hm, my proposal isn't winning any popularity contests, so I won't go
through with it :-)



Re: Disabling flaky tests

2013-06-03 Thread Bertrand Delacretaz
On Mon, Jun 3, 2013 at 10:00 AM, Bertrand Delacretaz wrote:
> ...I fixed a related issue in SLING-2818 a few weeks ago, is it something 
> else now?...

Looking at it again there might be some latency after uploading an
error handler script, before it is activated.

It's not a problem in normal use but the tests might be failing for
that reason - I have reopened the above issue, looking at it now.
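
If latency of that kind is the cause, one option is for the test to
poll until the script is active instead of asserting immediately. The
sketch below is a generic polling helper in plain Java, not the code
used in the Sling tests; WaitSketch, waitFor and the simulated
"activated" condition are assumptions made for the example.

```java
import java.util.function.BooleanSupplier;

public class WaitSketch {

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true in time, false otherwise.
     * (Hypothetical helper, not part of Sling.)
     */
    public static boolean waitFor(BooleanSupplier condition,
                                  long timeoutMillis, long pollMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition holding
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        final long uploaded = System.currentTimeMillis();
        // Simulated activation: the "script" becomes active 50 ms after upload.
        BooleanSupplier activated = () -> System.currentTimeMillis() - uploaded >= 50;
        boolean ok = waitFor(activated, 2000L, 10L);
        System.out.println(ok ? "handler active, safe to assert" : "timed out");
    }
}
```

In a real test the condition would be whatever observable sign shows
that the uploaded script is in effect, for example the expected
response content.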

-Bertrand


Re: Disabling flaky tests

2013-06-03 Thread Bertrand Delacretaz
Hi,

On Fri, May 31, 2013 at 9:14 PM, Robert Munteanu  wrote:
> ...It seems that the ErrorHandlingTest fails sporadically when run inside a
> full maven build...

I fixed a related issue in SLING-2818 a few weeks ago, is it something else now?

> For this test, and for future flaky/failing tests, I
> suggest that we
>
> 1. Create an issue for the failing test
> 2. Disable the test and mark it with the issue key
> 3. Re-enable the test when it is stable/passing ( which may be
> considerably later than step 2)
> 4. Close the issue after the test is re-enabled...

I agree with that, and I would make the next release of the module in
question dependent on that issue, to make sure we look at it before
releasing.

>
> ...This has the advantage of keeping the build green and making it easier
> to find regressions since a failing or unstable build will actually mean
> something

I agree - Jenkins flakiness and some semi-randomly failing tests have led
us to neglect the "jenkins build must be green" rule and that's not
good.

-Bertrand


Re: Disabling flaky tests

2013-06-02 Thread Carsten Ziegeler
I agree as well, especially for the error handling, as this is partly not
a problem of the test but really a bug in Sling - we have an issue for
that, it just needs to be done :)

Carsten


2013/6/3 Felix Meschberger 

> I agree here: Disabling the test and having an issue keeps the build green
> but bears the danger of forgetting about it ...
>
> Regards
> Felix


-- 
Carsten Ziegeler
cziege...@apache.org


Re: Disabling flaky tests

2013-06-02 Thread Felix Meschberger
I agree here: Disabling the test and having an issue keeps the build green but 
bears the danger of forgetting about it ...

Regards
Felix

On 02.06.2013 at 16:04, Eric Norman wrote:

> Personally, I'm not a big fan of hiding flaky/failing tests since it tends
> to remove some of the motivation to stabilize/fix them in a timely manner.
> 
> That's my 2 cents.
> 
> Regards,
> Eric



Re: Disabling flaky tests

2013-06-02 Thread Eric Norman
Personally, I'm not a big fan of hiding flaky/failing tests since it tends
to remove some of the motivation to stabilize/fix them in a timely manner.

That's my 2 cents.

Regards,
Eric

On Fri, May 31, 2013 at 12:14 PM, Robert Munteanu wrote:

> Hi,
>
> It seems that the ErrorHandlingTest fails sporadically when run inside a
> full maven build. I've tried locating the root cause for a couple of
> hours but failed. For this test, and for future flaky/failing tests, I
> suggest that we
>
> 1. Create an issue for the failing test
> 2. Disable the test and mark it with the issue key
> 3. Re-enable the test when it is stable/passing ( which may be
> considerably later than step 2)
> 4. Close the issue after the test is re-enabled
>
> This has the advantage of keeping the build green and making it easier
> to find regressions since a failing or unstable build will actually mean
> something.
>
> What do you think?
>
> Robert
>
>


Disabling flaky tests

2013-05-31 Thread Robert Munteanu
Hi,

It seems that the ErrorHandlingTest fails sporadically when run inside a
full maven build. I've tried locating the root cause for a couple of
hours but failed. For this test, and for future flaky/failing tests, I
suggest that we

1. Create an issue for the failing test
2. Disable the test and mark it with the issue key
3. Re-enable the test when it is stable/passing (which may be
considerably later than step 2)
4. Close the issue after the test is re-enabled

This has the advantage of keeping the build green and making it easier
to find regressions since a failing or unstable build will actually mean
something.
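
As a rough illustration of step 2, here is a self-contained sketch in
plain Java. With JUnit 4 one would normally put @Ignore on the test
with the issue key as the reason; to keep the example runnable without
a test framework it uses a stand-in annotation instead. The names
DisabledForIssue, SampleTests and disabledTests are invented for the
example, and SLING-NNNN is a placeholder, not a real issue key.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class DisabledTestSketch {

    /** Hypothetical marker: the test is disabled, the value names the tracking issue. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface DisabledForIssue {
        String value();
    }

    /** Stand-in for a test class; the method names are illustrative only. */
    public static class SampleTests {
        public void stableTest() {
            // passes reliably, stays enabled
        }

        @DisabledForIssue("SLING-NNNN") // placeholder issue key
        public void flakyTest() {
            throw new AssertionError("fails sporadically");
        }
    }

    /** Lists "method: issue" for every disabled test in the given class. */
    public static List<String> disabledTests(Class<?> testClass) {
        List<String> result = new ArrayList<>();
        for (Method m : testClass.getDeclaredMethods()) {
            DisabledForIssue marker = m.getAnnotation(DisabledForIssue.class);
            if (marker != null) {
                result.add(m.getName() + ": " + marker.value());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Print disabled tests so they stay visible while the build is green.
        System.out.println(disabledTests(SampleTests.class));
    }
}
```

Tying the marker to an issue key makes it possible to list the disabled
tests mechanically, for example before a release.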

What do you think?

Robert