Re: [openstack-dev] [Neutron] How to handle blocking bugs/changes in Neutron 3rd party CI

2014-08-21 Thread Kevin Benton
I'm not sure if this is possible with a Zuul setup, but once we identify a
failure-causing commit, we change the reported job status to (skipped)
for any patches that contain the commit but not the fix. It's a relatively
straightforward way to communicate that the CI system is still operational
but voting was intentionally bypassed. This logic is handled in the script
that determines the test results and posts them to Gerrit.
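
A minimal sketch of the kind of result-reporting logic described above,
assuming the script already knows which commits a patch contains. The
function name, the commit identifiers, and the status strings other than
"skipped" are hypothetical and not taken from any actual CI script.

```python
from typing import Iterable


def reported_status(patch_commits: Iterable[str],
                    breaking_commit: str,
                    fix_commit: str,
                    tests_passed: bool) -> str:
    """Return the status string to post to Gerrit for one patch."""
    commits = set(patch_commits)
    # The patch inherits the known breakage but not its fix, so a failure
    # says nothing about the patch itself: report "skipped" without voting.
    if breaking_commit in commits and fix_commit not in commits:
        return "skipped"
    return "success" if tests_passed else "failure"
```

For example, a patch whose history includes the breaking commit but not the
fix would be reported as "skipped" regardless of the test outcome.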


-- 
Kevin Benton
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] How to handle blocking bugs/changes in Neutron 3rd party CI

2014-08-21 Thread Dane Leblanc (leblancd)
That makes sense for setups that don’t use Zuul.

But for setups using Zuul/Jenkins, and for a vendor introducing a new 
plugin whose initial hardware-enabling commits haven’t been merged 
yet, I don’t see how we can meet the Neutron 3rd party testing requirements. The 
requirements and the tools just seem to be at odds in this situation.


[openstack-dev] [Neutron] How to handle blocking bugs/changes in Neutron 3rd party CI

2014-08-20 Thread Dane Leblanc (leblancd)
Preface: I posed this problem on the #openstack-infra IRC channel. They couldn't 
offer an easy or obvious solution and suggested that I get some consensus from 
the Neutron community on how we want to handle this situation. So I'd like 
to bounce this around, get some ideas, and maybe bring this up in the 3rd party 
CI IRC.

The challenge is this: Occasionally, a blocking bug is introduced which causes 
our 3rd party CI tests to consistently fail on every change set that we're 
testing against. We can develop a fix for the problem, but until that fix gets 
merged upstream, tests against all other change sets are seen to fail.

(Note that we have a similar situation whenever we introduce a completely new 
plugin with its associated 3rd party CI: until the plugin code, or an 
enabling subset of it, is merged upstream, all other commits will typically 
fail on that CI setup.)

In the past, we've tried dynamically patching the fix(es) on top of the fetched 
code being reviewed, but this isn't always reliable due to merge conflicts, and 
we've had to monkey patch DevStack to apply the fixes after cloning Neutron but 
before installing Neutron.

So we'd prefer to enter a "throttled" or "filtering" CI mode when we hit this 
situation, where we (temporarily) test only against commits related to 
our plugin/driver which contain (or have a dependency on) the fix for the 
blocking bug, until the fix is merged.
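
A rough sketch of what such a filtering decision could look like, assuming 
the CI can see which files a change touches and which changes it depends on. 
The Change class, the should_test function, and the plugin path used in the 
example are all hypothetical; this is not how any existing Zuul or Jenkins 
component works.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Change:
    """Hypothetical, simplified view of a Gerrit change under review."""
    change_id: str
    files: List[str]
    # Change IDs this change depends on (its unmerged ancestors).
    depends_on: Set[str] = field(default_factory=set)


def should_test(change: Change, fix_change_id: str, plugin_prefix: str) -> bool:
    """Decide whether to test a change while the CI is in filtering mode."""
    # Related to our plugin/driver: at least one touched file is under it.
    touches_plugin = any(f.startswith(plugin_prefix) for f in change.files)
    # Contains the fix (it is the fix itself) or depends on the fix.
    has_fix = (change.change_id == fix_change_id
               or fix_change_id in change.depends_on)
    return touches_plugin and has_fix
```

With a made-up prefix such as "neutron/plugins/vendor/", a plugin change 
that depends on the fix would be tested, while an unrelated change, or a 
plugin change without the fix, would be filtered out.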

In an ideal world, for the sake of transparency, we would love to be able to 
have Jenkins/Zuul report back to Gerrit with a descriptive test result such as 
"N/A", "Not tested", or even "Aborted" for all other change sets, letting the 
committer know that "Yeah, we see your review, but we're unable to test it at 
the moment." Zuul does have the ability to report "Aborted" status to Gerrit, 
but this is sent, for example, when Zuul decides to abort change set 'N' for a 
review when change set 'N+1' has just been submitted, or when a Jenkins admin 
manually aborts a Jenkins job. Unfortunately, this type of status is not 
available programmatically within a Jenkins job script; the only outcomes are 
pass (zero return code) or fail (non-zero return code). (Note that we can't 
directly filter at the Zuul level in our topology, since we have one Zuul 
server servicing multiple 3rd party CI setups.)

As a second option, we'd like to not run any tests for the other changes, and 
report NOTHING to Gerrit, while continuing to run against changes related to 
our plugin (as required for the plugin changes to be approved).  This was the 
favored approach discussed in the Neutron IRC on Monday. But herein lies the 
rub. By the time our Jenkins job script discovers that the change set that is 
being tested is not in a list of preferred/allowed change sets, the script has 
2 options: pass or fail. With the current Jenkins, there is no programmatic way 
for a Jenkins script to signal to Gearman/Zuul that the job should be aborted.

There was supposedly a bug filed with Jenkins to allow it to interpret 
different exit codes from job scripts as different result values, but this 
hasn't made any progress.

There may be something that can be changed in Zuul to allow it to interpret 
result codes other than success/fail, or maybe to allow Zuul to do 
change ID filtering on a per-Jenkins-job basis, but this would require the 
infra team to make changes to Zuul.
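
To illustrate the gap, here is a small sketch contrasting the binary 
interpretation of a job script's return code with the richer mapping wished 
for above. The reserved exit codes and the result names "NOT_TESTED" and 
"ABORTED" are invented for illustration; as noted, neither Jenkins nor Zuul 
currently offers such a mapping to job scripts.

```python
# Hypothetical reserved exit codes; neither Jenkins nor Zuul defines these.
EXTENDED_RESULTS = {0: "SUCCESS", 3: "NOT_TESTED", 4: "ABORTED"}


def current_result(return_code: int) -> str:
    """Binary interpretation described above: zero passes, anything else fails."""
    return "SUCCESS" if return_code == 0 else "FAILURE"


def extended_result(return_code: int) -> str:
    """Wished-for interpretation: reserved codes map to additional results."""
    return EXTENDED_RESULTS.get(return_code, "FAILURE")
```

Under the binary scheme a script exiting with 3 is simply a failure, whereas 
the extended scheme would let the same exit code signal that the change was 
deliberately not tested.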

The bottom line is that, given the current Zuul/Jenkins infrastructure, 
whenever our 3rd party CI is blocked by a bug, I'm struggling to reconcile 
two conflicting requirements:
* Continue testing against change sets for the blocking bug (or plugin-related 
changes)
* Don't report anything to Gerrit for all other change sets, since these can't 
be meaningfully tested against the CI hardware

Let me know if I'm missing a solution to this. I appreciate any suggestions!

-Dane

