Re: [PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Nabarun Nag
What help is needed in this effort?

Regards
Naba

On Fri, Feb 28, 2020 at 11:35 AM Mark Hanson  wrote:

> Alright, so basically it seems like people are not ok with the not
> requiring stressnewtest approach. So I guess that means that we need to
> identify -1s willing to help resolve this problem…
>
> Who would like to help?
>
> Thanks,
> Mark
>
> > On Feb 28, 2020, at 11:12 AM, Ernest Burghardt 
> wrote:
> >
> > -1 to limiting any tests... if there are issues with the tests let's fix
> > that.  we have too many commits coming in with little or no testing over
> > new/changed code, so I can't see how removing any existing test coverage
> as
> > a good idea
> >
> > Best,
> > EB
> >
> > On Fri, Feb 28, 2020 at 10:58 AM Mark Hanson  wrote:
> >
> >> Just to make sure we are clear, I am not suggesting that we disable
> >> stressnewtest, but that we make it not required. It would still run and
> >> provide feedback, but it would not give us an unwarranted green in my
> >> approach.
> >>
> >>> On Feb 28, 2020, at 10:49 AM, Ju@N  wrote:
> >>>
> >>> +1 to what Owen said, I don't think disabling *StressNewTest* is a
> >>> good idea.
> >>>
> >>> On Fri, 28 Feb 2020 at 18:35, Owen Nichols 
> wrote:
> >>>
>  -1 to making StressNew not required
> 
>  +1 to eliminating the current loophole — StressNew should never give a
>  free pass.
> 
>  Any time your PR is having trouble passing StressNew, please bring it
> up
>  on the dev list. We can review on a case-by-case basis and decide
> >> whether
>  to try increasing the timeout, changing the repeat count, refactoring
> >> the
>  PR, or as an absolute last resort requesting authorization for an
> >> override
>  (for example, a change to spotless rules might touch a huge number of
> >> files
>  but carry no risk).
> 
>  One bug we should fix is that StressNew sometimes counts more files
>  touched than really were, especially if you had many commits or merges
> >> or
>  rebases on your PR branch.  Possible workarounds there include
> squashing
>  and/or creating a new PR and/or splitting into multiple PRs.  I’ve
> spent
>  some time trying to reproduce why files are mis-counted, with no
> >> success,
>  but perhaps someone cleverer with git could provide a fix.
> 
>  Another issue is that StressNew is only in the PR pipeline, not the
> main
>  develop pipeline.  This feels like an asymmetry where PRs must pass a
>  “higher” standard.  We should consider adding some form of StressNew
> to
> >> the
>  main pipeline as well (maybe compare to the previous SHA that
> passed?).
> 
>  The original motivation for the 25-file limit was an attempt to limit
> >> how
>  long StressNew might run for.  Since concourse already applies a
> >> timeout,
>  that check is unnecessary.  However, a compromise solution might be to
> >> use
>  the number of files changed to dial back the number of repeats, e.g.
> >> stay
>  with 50 repeats if fewer than 25 files changed, otherwise compute
> 1250 /
>  <#-files-changed> and do only that many repeats (e.g. if 50 files
> >> changed,
>  run all tests 25x instead of 50x).
> 
>  While StressNew is intended to catch new flaky tests, it can also
> catch
>  poorly-designed tests that fail just by running twice in the same VM.
> >> This
>  may be a sign that the test does not clean up properly and could be
>  polluting other tests in unexpected ways?  It might be useful to run a
>  “StressNew” with repeats=2 over a much broader scope, maybe even all
> >> tests,
>  at least once in a while?
> 
> 
> > On Feb 28, 2020, at 9:51 AM, Mark Hanson  wrote:
> >
> > Hi All,
> >
> > Proposal: Force StressNewTest to fail a change with 25 or more files
>  rather than pass it without running it.
> >
> > Currently, the StressNewTest job in the pipeline will just pass a job
>  that has more than 25 files changed. It will be marked as green with
> no
>  work done. There are reasons, relating to run time being too long to
> be
>  tracked by concourse, so we just let it through as a pass. I think
> this
> >> is
>  a bad signal. I think that this should automatically be a failure in
> the
>  short term. As a result, I also think it should not be required. It
> is a
>  bad signal, and I think that by making it a fail, this will at least
> not
>  give us a false sense of security. I understand that this opens the
> >> flood
>  gates so to speak, but I don’t think as a community it is a big
> problem
>  because we can still say that you should not merge if the
> StressNewTest
>  fails because of your test.
> >
> > I personally find the false sense of security more troubling than
>  anything. Hence the reason, I would like this to be
> >
> > Let me know what you think..
> >
> > Thanks,
> > Mark
> 
> 
> >>>
> >>> 

Re: [PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Mark Hanson
Alright, so basically it seems like people are not ok with the not requiring 
stressnewtest approach. So I guess that means that we need to identify -1s 
willing to help resolve this problem…

Who would like to help?

Thanks,
Mark

> On Feb 28, 2020, at 11:12 AM, Ernest Burghardt  wrote:
> 
> -1 to limiting any tests... if there are issues with the tests let's fix
> that.  we have too many commits coming in with little or no testing over
> new/changed code, so I can't see how removing any existing test coverage as
> a good idea
> 
> Best,
> EB
> 
> On Fri, Feb 28, 2020 at 10:58 AM Mark Hanson  wrote:
> 
>> Just to make sure we are clear, I am not suggesting that we disable
>> stressnewtest, but that we make it not required. It would still run and
>> provide feedback, but it would not give us an unwarranted green in my
>> approach.
>> 
>>> On Feb 28, 2020, at 10:49 AM, Ju@N  wrote:
>>> 
>>> +1 to what Owen said, I don't think disabling *StressNewTest* is a
>>> good idea.
>>> 
>>> On Fri, 28 Feb 2020 at 18:35, Owen Nichols  wrote:
>>> 
 -1 to making StressNew not required
 
 +1 to eliminating the current loophole — StressNew should never give a
 free pass.
 
 Any time your PR is having trouble passing StressNew, please bring it up
 on the dev list. We can review on a case-by-case basis and decide
>> whether
 to try increasing the timeout, changing the repeat count, refactoring
>> the
 PR, or as an absolute last resort requesting authorization for an
>> override
 (for example, a change to spotless rules might touch a huge number of
>> files
 but carry no risk).
 
 One bug we should fix is that StressNew sometimes counts more files
 touched than really were, especially if you had many commits or merges
>> or
 rebases on your PR branch.  Possible workarounds there include squashing
 and/or creating a new PR and/or splitting into multiple PRs.  I’ve spent
 some time trying to reproduce why files are mis-counted, with no
>> success,
 but perhaps someone cleverer with git could provide a fix.
 
 Another issue is that StressNew is only in the PR pipeline, not the main
 develop pipeline.  This feels like an asymmetry where PRs must pass a
 “higher” standard.  We should consider adding some form of StressNew to
>> the
 main pipeline as well (maybe compare to the previous SHA that passed?).
 
 The original motivation for the 25-file limit was an attempt to limit
>> how
 long StressNew might run for.  Since concourse already applies a
>> timeout,
 that check is unnecessary.  However, a compromise solution might be to
>> use
 the number of files changed to dial back the number of repeats, e.g.
>> stay
 with 50 repeats if fewer than 25 files changed, otherwise compute 1250 /
 <#-files-changed> and do only that many repeats (e.g. if 50 files
>> changed,
 run all tests 25x instead of 50x).
 
 While StressNew is intended to catch new flaky tests, it can also catch
 poorly-designed tests that fail just by running twice in the same VM.
>> This
 may be a sign that the test does not clean up properly and could be
 polluting other tests in unexpected ways?  It might be useful to run a
 “StressNew” with repeats=2 over a much broader scope, maybe even all
>> tests,
 at least once in a while?
 
 
> On Feb 28, 2020, at 9:51 AM, Mark Hanson  wrote:
> 
> Hi All,
> 
> Proposal: Force StressNewTest to fail a change with 25 or more files
 rather than pass it without running it.
> 
> Currently, the StressNewTest job in the pipeline will just pass a job
 that has more than 25 files changed. It will be marked as green with no
 work done. There are reasons, relating to run time being too long to be
 tracked by concourse, so we just let it through as a pass. I think this
>> is
 a bad signal. I think that this should automatically be a failure in the
 short term. As a result, I also think it should not be required. It is a
 bad signal, and I think that by making it a fail, this will at least not
 give us a false sense of security. I understand that this opens the
>> flood
 gates so to speak, but I don’t think as a community it is a big problem
 because we can still say that you should not merge if the StressNewTest
 fails because of your test.
> 
> I personally find the false sense of security more troubling than
 anything. Hence the reason, I would like this to be
> 
> Let me know what you think..
> 
> Thanks,
> Mark
 
 
>>> 
>>> --
>>> Ju@N
>> 
>> 



Re: [PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Ernest Burghardt
-1 to limiting any tests... if there are issues with the tests let's fix
that.  we have too many commits coming in with little or no testing over
new/changed code, so I can't see how removing any existing test coverage as
a good idea

Best,
EB

On Fri, Feb 28, 2020 at 10:58 AM Mark Hanson  wrote:

> Just to make sure we are clear, I am not suggesting that we disable
> stressnewtest, but that we make it not required. It would still run and
> provide feedback, but it would not give us an unwarranted green in my
> approach.
>
> > On Feb 28, 2020, at 10:49 AM, Ju@N  wrote:
> >
> > +1 to what Owen said, I don't think disabling *StressNewTest* is a
> > good idea.
> >
> > On Fri, 28 Feb 2020 at 18:35, Owen Nichols  wrote:
> >
> >> -1 to making StressNew not required
> >>
> >> +1 to eliminating the current loophole — StressNew should never give a
> >> free pass.
> >>
> >> Any time your PR is having trouble passing StressNew, please bring it up
> >> on the dev list. We can review on a case-by-case basis and decide
> whether
> >> to try increasing the timeout, changing the repeat count, refactoring
> the
> >> PR, or as an absolute last resort requesting authorization for an
> override
> >> (for example, a change to spotless rules might touch a huge number of
> files
> >> but carry no risk).
> >>
> >> One bug we should fix is that StressNew sometimes counts more files
> >> touched than really were, especially if you had many commits or merges
> or
> >> rebases on your PR branch.  Possible workarounds there include squashing
> >> and/or creating a new PR and/or splitting into multiple PRs.  I’ve spent
> >> some time trying to reproduce why files are mis-counted, with no
> success,
> >> but perhaps someone cleverer with git could provide a fix.
> >>
> >> Another issue is that StressNew is only in the PR pipeline, not the main
> >> develop pipeline.  This feels like an asymmetry where PRs must pass a
> >> “higher” standard.  We should consider adding some form of StressNew to
> the
> >> main pipeline as well (maybe compare to the previous SHA that passed?).
> >>
> >> The original motivation for the 25-file limit was an attempt to limit
> how
> >> long StressNew might run for.  Since concourse already applies a
> timeout,
> >> that check is unnecessary.  However, a compromise solution might be to
> use
> >> the number of files changed to dial back the number of repeats, e.g.
> stay
> >> with 50 repeats if fewer than 25 files changed, otherwise compute 1250 /
> >> <#-files-changed> and do only that many repeats (e.g. if 50 files
> changed,
> >> run all tests 25x instead of 50x).
> >>
> >> While StressNew is intended to catch new flaky tests, it can also catch
> >> poorly-designed tests that fail just by running twice in the same VM.
> This
> >> may be a sign that the test does not clean up properly and could be
> >> polluting other tests in unexpected ways?  It might be useful to run a
> >> “StressNew” with repeats=2 over a much broader scope, maybe even all
> tests,
> >> at least once in a while?
> >>
> >>
> >>> On Feb 28, 2020, at 9:51 AM, Mark Hanson  wrote:
> >>>
> >>> Hi All,
> >>>
> >>> Proposal: Force StressNewTest to fail a change with 25 or more files
> >> rather than pass it without running it.
> >>>
> >>> Currently, the StressNewTest job in the pipeline will just pass a job
> >> that has more than 25 files changed. It will be marked as green with no
> >> work done. There are reasons, relating to run time being too long to be
> >> tracked by concourse, so we just let it through as a pass. I think this
> is
> >> a bad signal. I think that this should automatically be a failure in the
> >> short term. As a result, I also think it should not be required. It is a
> >> bad signal, and I think that by making it a fail, this will at least not
> >> give us a false sense of security. I understand that this opens the
> flood
> >> gates so to speak, but I don’t think as a community it is a big problem
> >> because we can still say that you should not merge if the StressNewTest
> >> fails because of your test.
> >>>
> >>> I personally find the false sense of security more troubling than
> >> anything. Hence the reason, I would like this to be
> >>>
> >>> Let me know what you think..
> >>>
> >>> Thanks,
> >>> Mark
> >>
> >>
> >
> > --
> > Ju@N
>
>


Re: [PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Mark Hanson
Just to make sure we are clear, I am not suggesting that we disable 
stressnewtest, but that we make it not required. It would still run and provide 
feedback, but it would not give us an unwarranted green in my approach.

> On Feb 28, 2020, at 10:49 AM, Ju@N  wrote:
> 
> +1 to what Owen said, I don't think disabling *StressNewTest* is a
> good idea.
> 
> On Fri, 28 Feb 2020 at 18:35, Owen Nichols  wrote:
> 
>> -1 to making StressNew not required
>> 
>> +1 to eliminating the current loophole — StressNew should never give a
>> free pass.
>> 
>> Any time your PR is having trouble passing StressNew, please bring it up
>> on the dev list. We can review on a case-by-case basis and decide whether
>> to try increasing the timeout, changing the repeat count, refactoring the
>> PR, or as an absolute last resort requesting authorization for an override
>> (for example, a change to spotless rules might touch a huge number of files
>> but carry no risk).
>> 
>> One bug we should fix is that StressNew sometimes counts more files
>> touched than really were, especially if you had many commits or merges or
>> rebases on your PR branch.  Possible workarounds there include squashing
>> and/or creating a new PR and/or splitting into multiple PRs.  I’ve spent
>> some time trying to reproduce why files are mis-counted, with no success,
>> but perhaps someone cleverer with git could provide a fix.
>> 
>> Another issue is that StressNew is only in the PR pipeline, not the main
>> develop pipeline.  This feels like an asymmetry where PRs must pass a
>> “higher” standard.  We should consider adding some form of StressNew to the
>> main pipeline as well (maybe compare to the previous SHA that passed?).
>> 
>> The original motivation for the 25-file limit was an attempt to limit how
>> long StressNew might run for.  Since concourse already applies a timeout,
>> that check is unnecessary.  However, a compromise solution might be to use
>> the number of files changed to dial back the number of repeats, e.g. stay
>> with 50 repeats if fewer than 25 files changed, otherwise compute 1250 /
>> <#-files-changed> and do only that many repeats (e.g. if 50 files changed,
>> run all tests 25x instead of 50x).
>> 
>> While StressNew is intended to catch new flaky tests, it can also catch
>> poorly-designed tests that fail just by running twice in the same VM.  This
>> may be a sign that the test does not clean up properly and could be
>> polluting other tests in unexpected ways?  It might be useful to run a
>> “StressNew” with repeats=2 over a much broader scope, maybe even all tests,
>> at least once in a while?
>> 
>> 
>>> On Feb 28, 2020, at 9:51 AM, Mark Hanson  wrote:
>>> 
>>> Hi All,
>>> 
>>> Proposal: Force StressNewTest to fail a change with 25 or more files
>> rather than pass it without running it.
>>> 
>>> Currently, the StressNewTest job in the pipeline will just pass a job
>> that has more than 25 files changed. It will be marked as green with no
>> work done. There are reasons, relating to run time being too long to be
>> tracked by concourse, so we just let it through as a pass. I think this is
>> a bad signal. I think that this should automatically be a failure in the
>> short term. As a result, I also think it should not be required. It is a
>> bad signal, and I think that by making it a fail, this will at least not
>> give us a false sense of security. I understand that this opens the flood
>> gates so to speak, but I don’t think as a community it is a big problem
>> because we can still say that you should not merge if the StressNewTest
>> fails because of your test.
>>> 
>>> I personally find the false sense of security more troubling than
>> anything. Hence the reason, I would like this to be
>>> 
>>> Let me know what you think..
>>> 
>>> Thanks,
>>> Mark
>> 
>> 
> 
> -- 
> Ju@N



Re: [PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Ju@N
+1 to what Owen said, I don't think disabling *StressNewTest* is a
good idea.

On Fri, 28 Feb 2020 at 18:35, Owen Nichols  wrote:

> -1 to making StressNew not required
>
> +1 to eliminating the current loophole — StressNew should never give a
> free pass.
>
> Any time your PR is having trouble passing StressNew, please bring it up
> on the dev list. We can review on a case-by-case basis and decide whether
> to try increasing the timeout, changing the repeat count, refactoring the
> PR, or as an absolute last resort requesting authorization for an override
> (for example, a change to spotless rules might touch a huge number of files
> but carry no risk).
>
> One bug we should fix is that StressNew sometimes counts more files
> touched than really were, especially if you had many commits or merges or
> rebases on your PR branch.  Possible workarounds there include squashing
> and/or creating a new PR and/or splitting into multiple PRs.  I’ve spent
> some time trying to reproduce why files are mis-counted, with no success,
> but perhaps someone cleverer with git could provide a fix.
>
> Another issue is that StressNew is only in the PR pipeline, not the main
> develop pipeline.  This feels like an asymmetry where PRs must pass a
> “higher” standard.  We should consider adding some form of StressNew to the
> main pipeline as well (maybe compare to the previous SHA that passed?).
>
> The original motivation for the 25-file limit was an attempt to limit how
> long StressNew might run for.  Since concourse already applies a timeout,
> that check is unnecessary.  However, a compromise solution might be to use
> the number of files changed to dial back the number of repeats, e.g. stay
> with 50 repeats if fewer than 25 files changed, otherwise compute 1250 /
> <#-files-changed> and do only that many repeats (e.g. if 50 files changed,
> run all tests 25x instead of 50x).
>
> While StressNew is intended to catch new flaky tests, it can also catch
> poorly-designed tests that fail just by running twice in the same VM.  This
> may be a sign that the test does not clean up properly and could be
> polluting other tests in unexpected ways?  It might be useful to run a
> “StressNew” with repeats=2 over a much broader scope, maybe even all tests,
> at least once in a while?
>
>
> > On Feb 28, 2020, at 9:51 AM, Mark Hanson  wrote:
> >
> > Hi All,
> >
> > Proposal: Force StressNewTest to fail a change with 25 or more files
> rather than pass it without running it.
> >
> > Currently, the StressNewTest job in the pipeline will just pass a job
> that has more than 25 files changed. It will be marked as green with no
> work done. There are reasons, relating to run time being too long to be
> tracked by concourse, so we just let it through as a pass. I think this is
> a bad signal. I think that this should automatically be a failure in the
> short term. As a result, I also think it should not be required. It is a
> bad signal, and I think that by making it a fail, this will at least not
> give us a false sense of security. I understand that this opens the flood
> gates so to speak, but I don’t think as a community it is a big problem
> because we can still say that you should not merge if the StressNewTest
> fails because of your test.
> >
> > I personally find the false sense of security more troubling than
> anything. Hence the reason, I would like this to be
> >
> > Let me know what you think..
> >
> > Thanks,
> > Mark
>
>

-- 
Ju@N


Re: [PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Owen Nichols
-1 to making StressNew not required

+1 to eliminating the current loophole — StressNew should never give a free 
pass.  

Any time your PR is having trouble passing StressNew, please bring it up on the 
dev list. We can review on a case-by-case basis and decide whether to try 
increasing the timeout, changing the repeat count, refactoring the PR, or as an 
absolute last resort requesting authorization for an override (for example, a 
change to spotless rules might touch a huge number of files but carry no risk).

One bug we should fix is that StressNew sometimes counts more files touched 
than really were, especially if you had many commits or merges or rebases on 
your PR branch.  Possible workarounds there include squashing and/or creating a 
new PR and/or splitting into multiple PRs.  I’ve spent some time trying to 
reproduce why files are mis-counted, with no success, but perhaps someone 
cleverer with git could provide a fix.

Another issue is that StressNew is only in the PR pipeline, not the main 
develop pipeline.  This feels like an asymmetry where PRs must pass a “higher” 
standard.  We should consider adding some form of StressNew to the main 
pipeline as well (maybe compare to the previous SHA that passed?).

The original motivation for the 25-file limit was an attempt to limit how long 
StressNew might run for.  Since concourse already applies a timeout, that check 
is unnecessary.  However, a compromise solution might be to use the number of 
files changed to dial back the number of repeats, e.g. stay with 50 repeats if 
fewer than 25 files changed, otherwise compute 1250 / <#-files-changed> and do 
only that many repeats (e.g. if 50 files changed, run all tests 25x instead of 
50x).

While StressNew is intended to catch new flaky tests, it can also catch 
poorly-designed tests that fail just by running twice in the same VM.  This may 
be a sign that the test does not clean up properly and could be polluting other 
tests in unexpected ways?  It might be useful to run a “StressNew” with 
repeats=2 over a much broader scope, maybe even all tests, at least once in a 
while?


> On Feb 28, 2020, at 9:51 AM, Mark Hanson  wrote:
> 
> Hi All,
> 
> Proposal: Force StressNewTest to fail a change with 25 or more files rather 
> than pass it without running it.
> 
> Currently, the StressNewTest job in the pipeline will just pass a job that 
> has more than 25 files changed. It will be marked as green with no work done. 
> There are reasons, relating to run time being too long to be tracked by 
> concourse, so we just let it through as a pass. I think this is a bad signal. 
> I think that this should automatically be a failure in the short term. As a 
> result, I also think it should not be required. It is a bad signal, and I 
> think that by making it a fail, this will at least not give us a false sense 
> of security. I understand that this opens the flood gates so to speak, but I 
> don’t think as a community it is a big problem because we can still say that 
> you should not merge if the StressNewTest fails because of your test.
> 
> I personally find the false sense of security more troubling than anything. 
> Hence the reason, I would like this to be
> 
> Let me know what you think..
> 
> Thanks,
> Mark



[PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Mark Hanson
Hi All,

Proposal: Force StressNewTest to fail a change with 25 or more files rather 
than pass it without running it.

Currently, the StressNewTest job in the pipeline will just pass a job that has 
more than 25 files changed. It will be marked as green with no work done. There 
are reasons, relating to run time being too long to be tracked by concourse, so 
we just let it through as a pass. I think this is a bad signal. I think that 
this should automatically be a failure in the short term. As a result, I also 
think it should not be required. It is a bad signal, and I think that by making 
it a fail, this will at least not give us a false sense of security. I 
understand that this opens the flood gates so to speak, but I don’t think as a 
community it is a big problem because we can still say that you should not 
merge if the StressNewTest fails because of your test.

I personally find the false sense of security more troubling than anything. 
Hence the reason, I would like this to be

Let me know what you think..

Thanks,
Mark