Re: Disabling flaky tests

2013-06-05 Thread Bertrand Delacretaz
On Tue, Jun 4, 2013 at 6:37 PM, Carsten Ziegeler cziege...@apache.org wrote:
 ...I would agree if the Jenkins we have were stable; I have the feeling
 that a lot of build failures on that instance are simply because of some
 problems of the build server itself...

There were issues in the last few weeks but
https://builds.apache.org/view/S-Z/view/Sling/ looks better now IMO.

Builds have been failing there in the last few days for legitimate
reasons - I just fixed SLING-2903 which broke the build yesterday
(shame on me ;-)

We also have or had tests that fail semi-randomly, like SLING-2818
which looks good now since I added retries. We have quite a lot of
asynchronous setup points in Sling, which means our tests must
sometimes wait for things to settle after setting up scripts or
configs, which complicates things a bit.

-Bertrand


RE: Disabling flaky tests

2013-06-04 Thread Jeff Young
I used to agree as well, but my opinion is now more nuanced.  I've experienced 
projects where a test keeps failing day after day, and after a while developers 
stop looking at the test results with the same level of discipline.

Perhaps Sling is small enough (and the developers are proactive enough) that 
this isn't an issue.  But it certainly is on some other, larger, more dispersed 
projects (such as CQ).  In those, moving a failing test to an issue (which can 
be assigned to an individual) can produce better results than everyone simply 
getting used to the build being red.

Cheers,
Jeff.




Re: Disabling flaky tests

2013-06-04 Thread Carsten Ziegeler
I would agree if the Jenkins we have were stable; I have the feeling
that a lot of build failures on that instance are simply because of some
problems of the build server itself. I have chased such issues too many
times. For example, right now we get build errors reported in some modules
every day, while on the next day they pass without any changes at all.

So I think the first step is to have a stable build environment.

But I'm fine with any direction as long as we fix known bugs before a
release.

Carsten



-- 
Carsten Ziegeler
cziege...@apache.org


Re: Disabling flaky tests

2013-06-03 Thread Carsten Ziegeler
I agree as well, especially for the error handling, as this is partly not
a problem with the test but really a bug in Sling - we have an issue for
that, it just needs to be done :)

Carsten



-- 
Carsten Ziegeler
cziege...@apache.org


Re: Disabling flaky tests

2013-06-03 Thread Bertrand Delacretaz
Hi,

On Fri, May 31, 2013 at 9:14 PM, Robert Munteanu romb...@apache.org wrote:
 ...It seems that the ErrorHandlingTest fails sporadically when run inside a
 full maven build...

I fixed a related issue in SLING-2818 a few weeks ago, is it something else now?

 For this test, and for future flaky/failing tests, I
 suggest that we

 1. Create an issue for the failing test
 2. Disable the test and mark it with the issue key
 3. Re-enable the test when it is stable/passing ( which may be
 considerably later than step 2)
 4. Close the issue after the test is re-enabled...

I agree with that, and I would make the next release of the module in
question dependent on that issue, to make sure we look at it before
releasing.


 ...This has the advantage of keeping the build green and making it easier
 to find regressions since a failing or unstable build will actually mean
 something

I agree - Jenkins flakiness and some semi-randomly failing tests lead
us to neglect the "Jenkins build must be green" rule, and that's not
good.

-Bertrand


Re: Disabling flaky tests

2013-06-03 Thread Bertrand Delacretaz
On Mon, Jun 3, 2013 at 10:00 AM, Bertrand Delacretaz
bdelacre...@apache.org wrote:
 ...I fixed a related issue in SLING-2818 a few weeks ago, is it something 
 else now?...

Looking at it again there might be some latency after uploading an
error handler script, before it is activated.

It's not a problem in normal use but the tests might be failing for
that reason - I have reopened the above issue, looking at it now.
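
For illustration, a test could wait out that kind of activation latency by
polling instead of asserting right away. The following is only a minimal
sketch in plain Java; WaitFor, its method name and the timeout/interval
parameters are hypothetical and not taken from the Sling code base.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: poll a condition until it holds or a timeout expires. */
final class WaitFor {
    private WaitFor() {}

    /**
     * Returns true as soon as the condition holds, or false if timeoutMs
     * elapses first (or the waiting thread is interrupted).
     */
    static boolean condition(BooleanSupplier condition, long timeoutMs, long intervalMs) {
        final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Compare via subtraction, as recommended for System.nanoTime() values.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

A test would pass in a condition that checks whether the uploaded script is
active (for example by requesting the error page) and fail only if the helper
returns false.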

-Bertrand


Re: Disabling flaky tests

2013-06-03 Thread Robert Munteanu
Hm, my proposal isn't winning any popularity contests, so I won't go
through with it :-)



Re: Disabling flaky tests

2013-06-03 Thread Robert Munteanu
On Mon, 2013-06-03 at 10:00 +0200, Bertrand Delacretaz wrote:
 Hi,
 
 On Fri, May 31, 2013 at 9:14 PM, Robert Munteanu romb...@apache.org wrote:
  ...It seems that the ErrorHandlingTest fails sporadically when run inside a
  full maven build...
 
 I fixed a related issue in SLING-2818 a few weeks ago, is it something else 
 now?

Yes, the problem persists even after your fix.

 
  For this test, and for future flaky/failing tests, I
  suggest that we
 
  1. Create an issue for the failing test
  2. Disable the test and mark it with the issue key
  3. Re-enable the test when it is stable/passing ( which may be
  considerably later than step 2)
  4. Close the issue after the test is re-enabled...
 
 I agree with that, and I would make the next release of the module in
 question dependent on that issue, to make sure we look at it before
 releasing.


It seems that we don't have a consensus on it, so I'd amend the process
to be

1. Create an issue for the failing test
2. Target it for the next release of the affected module (if applicable)
or for the next version of the launchpad

Hopefully that's something which is agreeable to everyone.

Robert



Re: Disabling flaky tests

2013-06-03 Thread Carsten Ziegeler
Hi,

For the error handling tests we should fix SLING-2724 and then revisit the
tests. I looked at them once and I think they relied on some wrong assumptions
at that time, but I can't recall the details right now.
SLING-2724 needs to be fixed anyway, I'll try to get this done this
week.

Carsten



-- 
Carsten Ziegeler
cziege...@apache.org


Re: Disabling flaky tests

2013-06-03 Thread Bertrand Delacretaz
Hi,

On Mon, Jun 3, 2013 at 2:17 PM, Carsten Ziegeler cziege...@apache.org wrote:
 ...SLING-2724 needs to be fixed anyway, I'll try to get this done this
 week...

Ok - note that the ErrorHandlingTest fails semi-randomly, so not sure
if that's related.

I've just added retries to those tests (SLING-2818) as I suspect a
different problem - you might try removing them if you find and fix
the root cause, but ideally after we get a few green builds at
https://builds.apache.org/view/S-Z/view/Sling/job/sling-trunk-1.6/ so
we can see if that helps.
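
As a rough idea of what "retries" can mean here, and not the actual change
made for SLING-2818, a check can be re-run a few times before its failure is
reported. This is a minimal plain-Java sketch; RetryingCheck and its
parameters are hypothetical names.

```java
/** Hypothetical helper: re-run a check until it passes or attempts run out. */
final class RetryingCheck {
    private RetryingCheck() {}

    /**
     * Runs the check up to maxAttempts times, sleeping delayMs between tries.
     * Returns the attempt number that passed; rethrows the last AssertionError
     * if no attempt passed.
     */
    static int run(Runnable check, int maxAttempts, long delayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        AssertionError last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                check.run();
                return attempt; // passed, no further retries needed
            } catch (AssertionError e) {
                last = e; // remember the failure and try again
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying and report the last failure
                }
            }
        }
        throw last;
    }
}
```

Note that retries of this kind hide the latency rather than remove it, which
is why they are worth removing again once the root cause is fixed.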

-Bertrand


Re: Disabling flaky tests

2013-06-02 Thread Eric Norman
Personally, I'm not a big fan of hiding flaky/failing tests since it tends
to remove some of the motivation to stabilize/fix them in a timely manner.

That's my 2 cents.

Regards,
Eric





Re: Disabling flaky tests

2013-06-02 Thread Felix Meschberger
I agree here: Disabling the test and having an issue keeps the build green but 
bears the danger of forgetting about it ...

Regards
Felix




Disabling flaky tests

2013-05-31 Thread Robert Munteanu
Hi,

It seems that the ErrorHandlingTest fails sporadically when run inside a
full Maven build. I've tried locating the root cause for a couple of
hours but failed. For this test, and for future flaky/failing tests, I
suggest that we

1. Create an issue for the failing test
2. Disable the test and mark it with the issue key
3. Re-enable the test when it is stable/passing (which may be
considerably later than step 2)
4. Close the issue after the test is re-enabled
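
To make step 2 concrete, here is a rough sketch of marking a disabled test
with its issue key. With JUnit 4 this would usually be @Ignore with the issue
key in the reason string; the sketch below uses its own annotation so that it
compiles without JUnit. DisabledForIssue, ExampleFlakyTest, DisabledTestReport
and the key SLING-NNNN are all placeholders.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical marker: the test stays disabled until the named issue is fixed. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface DisabledForIssue {
    String value(); // issue key, e.g. the placeholder "SLING-NNNN"
}

/** Stand-in for a flaky test class; method names are made up. */
final class ExampleFlakyTest {
    @DisabledForIssue("SLING-NNNN")
    public void testErrorPage() {}

    public void testStable() {}
}

/** Lists disabled tests with their issue keys so they are not forgotten. */
final class DisabledTestReport {
    private DisabledTestReport() {}

    static List<String> list(Class<?> testClass) {
        List<String> out = new ArrayList<>();
        for (Method m : testClass.getDeclaredMethods()) {
            DisabledForIssue marker = m.getAnnotation(DisabledForIssue.class);
            if (marker != null) {
                out.add(marker.value() + " " + testClass.getSimpleName() + "." + m.getName());
            }
        }
        return out;
    }
}
```

Keeping the key next to the test makes it possible to list what is disabled
and check it against the open issues before step 3.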

This has the advantage of keeping the build green and making it easier
to find regressions since a failing or unstable build will actually mean
something.

What do you think?

Robert