Re: [openstack-dev] [Fuel] fuel-library merge policy and Fuel CI
On 10/28/2014 02:15 PM, Aleksandra Fedorova wrote: Vitaly, though comments like this are definitely better than nothing, I think we should address these issues in a more formal way. For random failures we have to retrigger the build until it passes. Yes, it could take some time (two-three rebuilds?), but it is the only reliable way which shows that it is indeed random and hasn't suddenly become permanent. If it fails three times in a row, this issue is probably bigger than you think. Should we really ignore/postpone it then? And if it is really the known issue, we need to fix or disable this particular test. And I think that this fix should be merged in the repo via the general workflow. It doesn't only make everything pass the CI properly, it also adds this necessary step where you announce the issue publicly and it gets approved as the official known issue. I would even add certain keyword for the commit message to mark this temporary fixes to simplify tracking. Aleksandra, You are 100% correct here. Under no circumstances should any human be able to merge code into a master source tree. Only the CI system, after a successful run of tests, should be able to merge code into master. If there are, as Vitaly says, issues with a nailgun test that cause random failures, then the test (or nailgun, whichever is the cause) should be fixed ASAP. We deal with similar issues in the main OpenStack gate, and luckily there we don't allow humans to merge code directly into a branch. Only the CI system can do that, which means that although at times we get frustrated developers who must do the recheck dance a bit, there is a forcing function to have developers fix bugs in tests and server code that trigger false failures. All the best, and keep up the good work. -jay ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Fuel] fuel-library merge policy and Fuel CI
Hi everyone, with recent disruption in our CI process I'd like to discuss again the issues in our merge workflow. See the summary at the end. As a starting point, here is the list of patches which were merged into fuel-library repository without Verified +1 label from Fuel CI: https://review.openstack.org/#/q/project:stackforge/fuel-library+AND+status:merged+AND+NOT+label:Verified%252B1%252Cuser%253Dfuel-ci,n,z And the list of merged patches with explicit Verified -1 label: https://review.openstack.org/#/q/project:stackforge/fuel-library+AND+status:merged+AND+label:Verified-1%252Cuser%253Dfuel-ci,n,z There are two common reasons I know why these patchsets exist: Case 1: Who cares about CI anyway. Case 2: These patches can not pass CI because of some real reason, which makes Fuel CI result irrelevant. I am not sure, if i need to comment on the first one, but please just remember: CI is not a devops playground made to disrupt your otherwise clean and smooth development process. It is an extremely critical service, providing the clear reference point for all the work we do. And we all know how important the reference point is [1]. So let's move on to the Case 2 and talk about our CI limitations and what could possibly make the test result irrelevant. 1) Dependencies. Let's say you have a chain of dependent patchsets and none of them could pass the CI on its own. How do you merge it? Here is the trick: the leaf, i.e. last, topmost patch in the chain should pass the CI. The test we run for this patchset automatically pulls all dependencies involved. Which makes Fuel CI result for this patchset perfectly relevant for the whole chain. 2) Environment. Fuel CI test environment usually uses slightly outdated version of Fuel iso image and fuel-main code. Therefore it happens that you write and test your patch against latest code via custom iso builds and it works, but it can not pass CI. Does it make test results irrelevant? No. It makes them even more valuable. CI environment can be broken and can be outdated. This is the part of the process. To deal with these situations we first need to fix the environment, then run tests, and then merge the code. And it helps if you contact devops team in advance and inform us that you soon will need the ISO with this particular features. 3) ? Please add your examples and let's deal with them one by one. Summary: I'd like to propose the following merge policy: 1. any individual patchset MUST have +1 from Fuel CI; 2. any chain of dependent patchsets MUST have +1 from Fuel CI for the topmost patch; 3. for all exceptional cases the person who does the merge MUST explicitly contact devops team, and make sure that there will be devops engineer available who will run additional checks before or right after the merge. The very same person who does the merge also MUST be available for some time after the merge to help the devops engineer to deal with the test failures if they appear. [1] http://www.youtube.com/watch?feature=player_embeddedv=QkCQ_-Id8zI#t=211 -- Aleksandra Fedorova Fuel Devops Engineer bookwar ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Fuel] fuel-library merge policy and Fuel CI
Aleksandra, As you may know, there is a randomly failing nailgun unit test in fuel-web repo, which fails for the major part of review requests. It's been happening for a few days. But I need to merge some stuff and cannot wait for the fix of this well known issue. So for every request with -1 from Fuel CI I write an explanation why I decided to merge the request. Are you ok with this? Here is an example: https://review.openstack.org/#/c/131079/ 2014-10-28 23:10 GMT+07:00 Aleksandra Fedorova afedor...@mirantis.com: Hi everyone, with recent disruption in our CI process I'd like to discuss again the issues in our merge workflow. See the summary at the end. As a starting point, here is the list of patches which were merged into fuel-library repository without Verified +1 label from Fuel CI: https://review.openstack.org/#/q/project:stackforge/fuel-library+AND+status:merged+AND+NOT+label:Verified%252B1%252Cuser%253Dfuel-ci,n,z And the list of merged patches with explicit Verified -1 label: https://review.openstack.org/#/q/project:stackforge/fuel-library+AND+status:merged+AND+label:Verified-1%252Cuser%253Dfuel-ci,n,z There are two common reasons I know why these patchsets exist: Case 1: Who cares about CI anyway. Case 2: These patches can not pass CI because of some real reason, which makes Fuel CI result irrelevant. I am not sure, if i need to comment on the first one, but please just remember: CI is not a devops playground made to disrupt your otherwise clean and smooth development process. It is an extremely critical service, providing the clear reference point for all the work we do. And we all know how important the reference point is [1]. So let's move on to the Case 2 and talk about our CI limitations and what could possibly make the test result irrelevant. 1) Dependencies. Let's say you have a chain of dependent patchsets and none of them could pass the CI on its own. How do you merge it? Here is the trick: the leaf, i.e. last, topmost patch in the chain should pass the CI. The test we run for this patchset automatically pulls all dependencies involved. Which makes Fuel CI result for this patchset perfectly relevant for the whole chain. 2) Environment. Fuel CI test environment usually uses slightly outdated version of Fuel iso image and fuel-main code. Therefore it happens that you write and test your patch against latest code via custom iso builds and it works, but it can not pass CI. Does it make test results irrelevant? No. It makes them even more valuable. CI environment can be broken and can be outdated. This is the part of the process. To deal with these situations we first need to fix the environment, then run tests, and then merge the code. And it helps if you contact devops team in advance and inform us that you soon will need the ISO with this particular features. 3) ? Please add your examples and let's deal with them one by one. Summary: I'd like to propose the following merge policy: 1. any individual patchset MUST have +1 from Fuel CI; 2. any chain of dependent patchsets MUST have +1 from Fuel CI for the topmost patch; 3. for all exceptional cases the person who does the merge MUST explicitly contact devops team, and make sure that there will be devops engineer available who will run additional checks before or right after the merge. The very same person who does the merge also MUST be available for some time after the merge to help the devops engineer to deal with the test failures if they appear. [1] http://www.youtube.com/watch?feature=player_embeddedv=QkCQ_-Id8zI#t=211 -- Aleksandra Fedorova Fuel Devops Engineer bookwar -- You received this message because you are subscribed to the Google Groups fuel-core-team group. To unsubscribe from this group and stop receiving emails from it, send an email to fuel-core-team+unsubscr...@mirantis.com. For more options, visit https://groups.google.com/a/mirantis.com/d/optout. -- Vitaly Kramskikh, Software Engineer, Mirantis, Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Fuel] fuel-library merge policy and Fuel CI
Vitaly, though comments like this are definitely better than nothing, I think we should address these issues in a more formal way. For random failures we have to retrigger the build until it passes. Yes, it could take some time (two-three rebuilds?), but it is the only reliable way which shows that it is indeed random and hasn't suddenly become permanent. If it fails three times in a row, this issue is probably bigger than you think. Should we really ignore/postpone it then? And if it is really the known issue, we need to fix or disable this particular test. And I think that this fix should be merged in the repo via the general workflow. It doesn't only make everything pass the CI properly, it also adds this necessary step where you announce the issue publicly and it gets approved as the official known issue. I would even add certain keyword for the commit message to mark this temporary fixes to simplify tracking. On Tue, Oct 28, 2014 at 8:19 PM, Vitaly Kramskikh vkramsk...@mirantis.com wrote: Aleksandra, As you may know, there is a randomly failing nailgun unit test in fuel-web repo, which fails for the major part of review requests. It's been happening for a few days. But I need to merge some stuff and cannot wait for the fix of this well known issue. So for every request with -1 from Fuel CI I write an explanation why I decided to merge the request. Are you ok with this? Here is an example: https://review.openstack.org/#/c/131079/ 2014-10-28 23:10 GMT+07:00 Aleksandra Fedorova afedor...@mirantis.com: Hi everyone, with recent disruption in our CI process I'd like to discuss again the issues in our merge workflow. See the summary at the end. As a starting point, here is the list of patches which were merged into fuel-library repository without Verified +1 label from Fuel CI: https://review.openstack.org/#/q/project:stackforge/fuel-library+AND+status:merged+AND+NOT+label:Verified%252B1%252Cuser%253Dfuel-ci,n,z And the list of merged patches with explicit Verified -1 label: https://review.openstack.org/#/q/project:stackforge/fuel-library+AND+status:merged+AND+label:Verified-1%252Cuser%253Dfuel-ci,n,z There are two common reasons I know why these patchsets exist: Case 1: Who cares about CI anyway. Case 2: These patches can not pass CI because of some real reason, which makes Fuel CI result irrelevant. I am not sure, if i need to comment on the first one, but please just remember: CI is not a devops playground made to disrupt your otherwise clean and smooth development process. It is an extremely critical service, providing the clear reference point for all the work we do. And we all know how important the reference point is [1]. So let's move on to the Case 2 and talk about our CI limitations and what could possibly make the test result irrelevant. 1) Dependencies. Let's say you have a chain of dependent patchsets and none of them could pass the CI on its own. How do you merge it? Here is the trick: the leaf, i.e. last, topmost patch in the chain should pass the CI. The test we run for this patchset automatically pulls all dependencies involved. Which makes Fuel CI result for this patchset perfectly relevant for the whole chain. 2) Environment. Fuel CI test environment usually uses slightly outdated version of Fuel iso image and fuel-main code. Therefore it happens that you write and test your patch against latest code via custom iso builds and it works, but it can not pass CI. Does it make test results irrelevant? No. It makes them even more valuable. CI environment can be broken and can be outdated. This is the part of the process. To deal with these situations we first need to fix the environment, then run tests, and then merge the code. And it helps if you contact devops team in advance and inform us that you soon will need the ISO with this particular features. 3) ? Please add your examples and let's deal with them one by one. Summary: I'd like to propose the following merge policy: 1. any individual patchset MUST have +1 from Fuel CI; 2. any chain of dependent patchsets MUST have +1 from Fuel CI for the topmost patch; 3. for all exceptional cases the person who does the merge MUST explicitly contact devops team, and make sure that there will be devops engineer available who will run additional checks before or right after the merge. The very same person who does the merge also MUST be available for some time after the merge to help the devops engineer to deal with the test failures if they appear. [1] http://www.youtube.com/watch?feature=player_embeddedv=QkCQ_-Id8zI#t=211 -- Aleksandra Fedorova Fuel Devops Engineer bookwar -- You received this message because you are subscribed to the Google Groups fuel-core-team group. To unsubscribe from this group and stop receiving emails from it, send an email to fuel-core-team+unsubscr...@mirantis.com. For more options, visit