Re: [openstack-dev] [Fuel] fuel-library merge policy and Fuel CI

2014-11-01 Thread Jay Pipes

On 10/28/2014 02:15 PM, Aleksandra Fedorova wrote:

Vitaly,

though comments like this are definitely better than nothing, I think
we should address these issues in a more formal way.

For random failures we have to retrigger the build until it passes.
Yes, it could take some time (two-three rebuilds?), but it is the only
reliable way which shows that it is indeed random and hasn't suddenly
become permanent. If it fails three times in a row, this issue is
probably bigger than you think. Should we really ignore/postpone it
then?

And if it is really the known issue, we need to fix or disable this
particular test. And I think that this fix should be merged in the
repo via the general workflow.

It doesn't only make everything pass the CI properly, it also adds
this necessary step where you announce the issue publicly and it gets
approved as the official known issue. I would even add certain
keyword for the commit message to mark this temporary fixes to
simplify tracking.


Aleksandra,

You are 100% correct here. Under no circumstances should any human be 
able to merge code into a master source tree. Only the CI system, after 
a successful run of tests, should be able to merge code into master. If 
there are, as Vitaly says, issues with a nailgun test that cause random 
failures, then the test (or nailgun, whichever is the cause) should be 
fixed ASAP.


We deal with similar issues in the main OpenStack gate, and luckily 
there we don't allow humans to merge code directly into a branch. Only 
the CI system can do that, which means that although at times we get 
frustrated developers who must do the recheck dance a bit, there is a 
forcing function to have developers fix bugs in tests and server code 
that trigger false failures.


All the best, and keep up the good work.

-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Fuel] fuel-library merge policy and Fuel CI

2014-10-28 Thread Aleksandra Fedorova
Hi everyone,

with recent disruption in our CI process I'd like to discuss again the
issues in our merge workflow.

See the summary at the end.


As a starting point, here is the list of patches which were merged
into fuel-library repository without Verified +1 label from Fuel CI:

https://review.openstack.org/#/q/project:stackforge/fuel-library+AND+status:merged+AND+NOT+label:Verified%252B1%252Cuser%253Dfuel-ci,n,z

And the list of merged patches with explicit Verified -1 label:

https://review.openstack.org/#/q/project:stackforge/fuel-library+AND+status:merged+AND+label:Verified-1%252Cuser%253Dfuel-ci,n,z

There are two common reasons I know why these patchsets exist:

Case 1: Who cares about CI anyway.

Case 2: These patches can not pass CI because of some real reason,
which makes Fuel CI result irrelevant.

I am not sure, if i need to comment on the first one, but please just
remember: CI is not a devops playground made to disrupt your otherwise
clean and smooth development process. It is an extremely critical
service, providing the clear reference point for all the work we do.
And we all know how important the reference point is [1].

So let's move on to the Case 2 and talk about our CI limitations and
what could possibly make the test result irrelevant.

1) Dependencies.

Let's say you have a chain of dependent patchsets and none of them
could pass the CI on its own. How do you merge it?

Here is the trick: the leaf, i.e. last, topmost patch in the chain
should pass the CI.

The test we run for this patchset automatically pulls all dependencies
involved. Which makes Fuel CI result for this patchset perfectly
relevant for the whole chain.

2) Environment.

Fuel CI test environment usually uses slightly outdated version of
Fuel iso image and fuel-main code. Therefore it happens that you write
and test your patch against latest code via custom iso builds and it
works, but it can not pass CI. Does it make test results irrelevant?
No. It makes them even more valuable.

CI environment can be broken and can be outdated. This is the part of
the process. To deal with these situations we first need to fix the
environment, then run tests, and then merge the code.

And it helps if you contact devops team in advance  and inform us that
you soon will need the ISO with this particular features.

3) ?

Please add your examples and let's deal with them one by one.


Summary:

I'd like to propose the following merge policy:

1. any individual patchset MUST have +1 from Fuel CI;

2. any chain of dependent patchsets MUST have +1 from Fuel CI for the
topmost patch;

3. for all exceptional cases the person who does the merge MUST
explicitly contact devops team, and make sure that there will be
devops engineer available who will run additional checks before or
right after the merge. The very same person who does the merge also
MUST be available for some time after the merge to help the devops
engineer to deal with the test failures if they appear.



[1] http://www.youtube.com/watch?feature=player_embeddedv=QkCQ_-Id8zI#t=211


-- 
Aleksandra Fedorova
Fuel Devops Engineer
bookwar

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Fuel] fuel-library merge policy and Fuel CI

2014-10-28 Thread Vitaly Kramskikh
Aleksandra,

As you may know, there is a randomly failing nailgun unit test in fuel-web
repo, which fails for the major part of review requests. It's been
happening for a few days. But I need to merge some stuff and cannot wait
for the fix of this well known issue. So for every request with -1 from
Fuel CI I write an explanation why I decided to merge the request. Are you
ok with this? Here is an example: https://review.openstack.org/#/c/131079/

2014-10-28 23:10 GMT+07:00 Aleksandra Fedorova afedor...@mirantis.com:

 Hi everyone,

 with recent disruption in our CI process I'd like to discuss again the
 issues in our merge workflow.

 See the summary at the end.


 As a starting point, here is the list of patches which were merged
 into fuel-library repository without Verified +1 label from Fuel CI:


 https://review.openstack.org/#/q/project:stackforge/fuel-library+AND+status:merged+AND+NOT+label:Verified%252B1%252Cuser%253Dfuel-ci,n,z

 And the list of merged patches with explicit Verified -1 label:


 https://review.openstack.org/#/q/project:stackforge/fuel-library+AND+status:merged+AND+label:Verified-1%252Cuser%253Dfuel-ci,n,z

 There are two common reasons I know why these patchsets exist:

 Case 1: Who cares about CI anyway.

 Case 2: These patches can not pass CI because of some real reason,
 which makes Fuel CI result irrelevant.

 I am not sure, if i need to comment on the first one, but please just
 remember: CI is not a devops playground made to disrupt your otherwise
 clean and smooth development process. It is an extremely critical
 service, providing the clear reference point for all the work we do.
 And we all know how important the reference point is [1].

 So let's move on to the Case 2 and talk about our CI limitations and
 what could possibly make the test result irrelevant.

 1) Dependencies.

 Let's say you have a chain of dependent patchsets and none of them
 could pass the CI on its own. How do you merge it?

 Here is the trick: the leaf, i.e. last, topmost patch in the chain
 should pass the CI.

 The test we run for this patchset automatically pulls all dependencies
 involved. Which makes Fuel CI result for this patchset perfectly
 relevant for the whole chain.

 2) Environment.

 Fuel CI test environment usually uses slightly outdated version of
 Fuel iso image and fuel-main code. Therefore it happens that you write
 and test your patch against latest code via custom iso builds and it
 works, but it can not pass CI. Does it make test results irrelevant?
 No. It makes them even more valuable.

 CI environment can be broken and can be outdated. This is the part of
 the process. To deal with these situations we first need to fix the
 environment, then run tests, and then merge the code.

 And it helps if you contact devops team in advance  and inform us that
 you soon will need the ISO with this particular features.

 3) ?

 Please add your examples and let's deal with them one by one.


 Summary:

 I'd like to propose the following merge policy:

 1. any individual patchset MUST have +1 from Fuel CI;

 2. any chain of dependent patchsets MUST have +1 from Fuel CI for the
 topmost patch;

 3. for all exceptional cases the person who does the merge MUST
 explicitly contact devops team, and make sure that there will be
 devops engineer available who will run additional checks before or
 right after the merge. The very same person who does the merge also
 MUST be available for some time after the merge to help the devops
 engineer to deal with the test failures if they appear.



 [1]
 http://www.youtube.com/watch?feature=player_embeddedv=QkCQ_-Id8zI#t=211


 --
 Aleksandra Fedorova
 Fuel Devops Engineer
 bookwar

 --
 You received this message because you are subscribed to the Google Groups
 fuel-core-team group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to fuel-core-team+unsubscr...@mirantis.com.
 For more options, visit https://groups.google.com/a/mirantis.com/d/optout.




-- 
Vitaly Kramskikh,
Software Engineer,
Mirantis, Inc.
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Fuel] fuel-library merge policy and Fuel CI

2014-10-28 Thread Aleksandra Fedorova
Vitaly,

though comments like this are definitely better than nothing, I think
we should address these issues in a more formal way.

For random failures we have to retrigger the build until it passes.
Yes, it could take some time (two-three rebuilds?), but it is the only
reliable way which shows that it is indeed random and hasn't suddenly
become permanent. If it fails three times in a row, this issue is
probably bigger than you think. Should we really ignore/postpone it
then?

And if it is really the known issue, we need to fix or disable this
particular test. And I think that this fix should be merged in the
repo via the general workflow.

It doesn't only make everything pass the CI properly, it also adds
this necessary step where you announce the issue publicly and it gets
approved as the official known issue. I would even add certain
keyword for the commit message to mark this temporary fixes to
simplify tracking.



On Tue, Oct 28, 2014 at 8:19 PM, Vitaly Kramskikh
vkramsk...@mirantis.com wrote:
 Aleksandra,

 As you may know, there is a randomly failing nailgun unit test in fuel-web
 repo, which fails for the major part of review requests. It's been happening
 for a few days. But I need to merge some stuff and cannot wait for the fix
 of this well known issue. So for every request with -1 from Fuel CI I write
 an explanation why I decided to merge the request. Are you ok with this?
 Here is an example: https://review.openstack.org/#/c/131079/

 2014-10-28 23:10 GMT+07:00 Aleksandra Fedorova afedor...@mirantis.com:

 Hi everyone,

 with recent disruption in our CI process I'd like to discuss again the
 issues in our merge workflow.

 See the summary at the end.


 As a starting point, here is the list of patches which were merged
 into fuel-library repository without Verified +1 label from Fuel CI:


 https://review.openstack.org/#/q/project:stackforge/fuel-library+AND+status:merged+AND+NOT+label:Verified%252B1%252Cuser%253Dfuel-ci,n,z

 And the list of merged patches with explicit Verified -1 label:


 https://review.openstack.org/#/q/project:stackforge/fuel-library+AND+status:merged+AND+label:Verified-1%252Cuser%253Dfuel-ci,n,z

 There are two common reasons I know why these patchsets exist:

 Case 1: Who cares about CI anyway.

 Case 2: These patches can not pass CI because of some real reason,
 which makes Fuel CI result irrelevant.

 I am not sure, if i need to comment on the first one, but please just
 remember: CI is not a devops playground made to disrupt your otherwise
 clean and smooth development process. It is an extremely critical
 service, providing the clear reference point for all the work we do.
 And we all know how important the reference point is [1].

 So let's move on to the Case 2 and talk about our CI limitations and
 what could possibly make the test result irrelevant.

 1) Dependencies.

 Let's say you have a chain of dependent patchsets and none of them
 could pass the CI on its own. How do you merge it?

 Here is the trick: the leaf, i.e. last, topmost patch in the chain
 should pass the CI.

 The test we run for this patchset automatically pulls all dependencies
 involved. Which makes Fuel CI result for this patchset perfectly
 relevant for the whole chain.

 2) Environment.

 Fuel CI test environment usually uses slightly outdated version of
 Fuel iso image and fuel-main code. Therefore it happens that you write
 and test your patch against latest code via custom iso builds and it
 works, but it can not pass CI. Does it make test results irrelevant?
 No. It makes them even more valuable.

 CI environment can be broken and can be outdated. This is the part of
 the process. To deal with these situations we first need to fix the
 environment, then run tests, and then merge the code.

 And it helps if you contact devops team in advance  and inform us that
 you soon will need the ISO with this particular features.

 3) ?

 Please add your examples and let's deal with them one by one.


 Summary:

 I'd like to propose the following merge policy:

 1. any individual patchset MUST have +1 from Fuel CI;

 2. any chain of dependent patchsets MUST have +1 from Fuel CI for the
 topmost patch;

 3. for all exceptional cases the person who does the merge MUST
 explicitly contact devops team, and make sure that there will be
 devops engineer available who will run additional checks before or
 right after the merge. The very same person who does the merge also
 MUST be available for some time after the merge to help the devops
 engineer to deal with the test failures if they appear.



 [1]
 http://www.youtube.com/watch?feature=player_embeddedv=QkCQ_-Id8zI#t=211


 --
 Aleksandra Fedorova
 Fuel Devops Engineer
 bookwar

 --
 You received this message because you are subscribed to the Google Groups
 fuel-core-team group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to fuel-core-team+unsubscr...@mirantis.com.
 For more options, visit