Il 04/06/2015 08:46, Eyal Edri ha scritto: > > > ----- Original Message ----- >> From: "Sandro Bonazzola" <[email protected]> >> To: "Eyal Edri" <[email protected]>, "Max Kovgan" <[email protected]> >> Cc: [email protected], [email protected] >> Sent: Thursday, June 4, 2015 9:11:10 AM >> Subject: Re: [ovirt-devel] gerrit+ci improvement proposal >> >> Il 03/06/2015 21:46, Eyal Edri ha scritto: >>> >>> >>> ----- Original Message ----- >>>> From: "Max Kovgan" <[email protected]> >>>> To: [email protected] >>>> Cc: [email protected] >>>> Sent: Wednesday, June 3, 2015 8:22:54 PM >>>> Subject: [ovirt-devel] gerrit+ci improvement proposal >>>> >>>> Hi everyone! >>>> We really want to have reliable and snappy CI: to allow short cycles and >>>> encourage developers to write tests. >>>> >>>> # Problem >>>> >>>> Many patches are neither ready for review nor for CI upon submission, >>>> which >>>> is OK. >>>> But running all the jobs on those patches with limited resources results >>>> in: >>>> overloaded resources, slow response time, unhappy developers. >>>> >>>> # Proposed Solution >>>> >>>> To run less jobs we know we don’t need to, thus making more resources for >>>> the >>>> jobs we need to run. >>>> We have been experimenting to make our CI stabler and quicker to respond >>>> by >>>> using gerrit flags. This has improved in both directions very well >>>> internally. >>>> Now it seems a good time to let all the oVirt projects to use this. >>>> This solution indirectly promotes reviews and quick tests - “to fail >>>> early”, >>>> yet full blown static code analysis and long tests to run “when ready”. >>>> >>>> # How it works >>>> >>>> 2 new gerrit independent flags are added to gerrit. >>>> >>>> ## CI flag >>>> >>>> Will express patch CI status. Values: >>>> * +1 CI passed >>>> * 0 CI did not run yet >>>> * -1 CI failed >>>> Permissions for setting: project maintainers (for special cases) should be >>>> able to set/override (except Jenkins). >>>> >>>> ## Workflow flag >>>> >>>> Will express patch “workflow” state. Values: >>>> * 0 Work In Progress >>>> * +1 Ready For Review >>>> * +2 Ready For Merge >>>> Permissions for setting: Owner can set +1, Project Maintainers can set +2 >>>> >>>> ## Review + CI Integration: >>>> >>>> Merging [“Submit” button to appear] will require: Review+1, CI+1, >>>> Workflow+2 >>>> Patch lifecycle now is: >>>> --------------------------------------------------------------- >>>> patch state |owner |reviewer |maintainer |CI tests |pass >>>> --------------------------------------------------------------- >>>> added/updated |- |- |- |quick |CI+1 >>>> review |Workflow+1|Review+1 |- |heavy |CI+1 >>>> merge ready |- |- |Workflow+2 |gating |CI+1 >>>> merge |- |- |merge |merge |CI+1 >>>> >>>> Changes from current workflow: >>>> Owner only adds reviewers, now owner needs to set "Workflow+1" for the >>>> patch >>>> to be reviewed, and heavily auto-tested. >>>> Maintainer now needs to set "Workflow+2" and wait for "Submit" button to >>>> appear after CI has completed running gating tests. >>>> >>>> >>>> Next step will be to automate merge the change after Workflow+2 has been >>>> set >>>> by the Maintainer and gating tests passed. >>>> >>>> >>>> ## Why now? >>>> >>>> It is elimination of waste. The sooner - the better. >>>> The solution has been used for a while and it works. >>>> Resolving the problem without gerrit involved will lead to adding >>>> unreliable >>>> code into jobs, and will still be prone to problems: >>>> Just recently, 3d ago we’ve tried detecting what to run from jenkins >>>> relying only on gerrit comments so that upon Verified+1, we’d run the >>>> job. >>>> We could not use “Review+1”, because it makes no sense at all, so we >>>> left >>>> the job to set Verified+1. >>>> Meaning - re-trigger itself immediately more than 1 times. >>>> >>>> Jenkins and its visitors very unhappy, and we had to stop those jobs, >>>> clean >>>> up the queue, and spam developers. >>>> >>>> ## OK OK OK. Now what? >>>> >>>> Now we want your comments and opinions before pushing this further: >>>> Please participate in this thread, so we can start trying it out. >>>> Ask, Suggest better ideas, all this is welcome. >>>> >>>> >>>> Best Regards! >>>> >>>> >>>> N.B. >>>> Of course, this is not written in stone, in case we find a better approach >>>> on >>>> solving those issues, we will change to it. >>>> And we will keep improving so don't be afraid that it will be enforced: if >>>> this does not work out we will discard it. >>>> >>>> P.S. >>>> Kudos to dcaro, most of the work was done by him, and most of this text >>>> too. >>>> >>> >>> +1 from me, releasing CI from running non critical and un-essential jobs >>> will not only reduce load from ci, >>> and shorted response time for developers, it will allow us to add much more >>> powerful tests such as functional & system >>> tests that actually add hosts and run VMs, improving our ability to find >>> regression much more effectively. >>> >>> Another benefit to consider is saving reviewers time. I.e not only jenkins >>> benefits from Worklow+1, but also human reviewers. >>> Instead of looking at a patch that is too early to be reviewed, the author >>> can set the Workflow+1 when the code is ready to review >>> (even if he didn't verified it yet), thus saving time to other reviewers - >>> for example people can add an email rule >>> to alert them only when they are added to patches that have Workflow+1, and >>> not before. >> >> For human reviewers I suggest to keep using drafts until the patch is >> finished. > > keep using? how many developers do you know are working with drafts until > their patch is ready?
We have ~350 drafts in gerrit so someone is using them :-) https://gerrit.ovirt.org/#/q/is:draft > i agree if everyone would use drafts load on jenkins was already much lower, > unfortunately its not the case. > >> Once it's finished and humans reviewed the logic of the patch, Workflow+1 >> should be triggered allowing automation to check the correctness of the >> patch. >> IMHO there's no reason for wasting CI time on patches that will be correct >> from an automation point of view but nacked by reviewers. Especially if the >> patches are part of a big patchset. >> >> >>> >>> And one final note, for Workflow+2 -> this is a preparation for a gating >>> system, like Zuul used by openstack, that in the future >>> we might use as automatic merger pending passing a verification step. this >>> will prevent errors that happen sometimes >>> post merge due to conflicts or other issues, and will be another level of >>> validation before final merge. >>> But as max said, its all part of the plan and we'll test it of course >>> before implementing to see its value. >>> >>> >>>> >>>> >>>> >>>> Max Kovgan >>>> >>>> Senior Software Engineer >>>> Red Hat - EMEA ENG Virtualization R&D >>>> Tel.: +972 9769 2060 >>>> Email: mkovgan [at] redhat [dot] com >>>> Web: http://www.redhat.com >>>> RHT Global #: 82-72060 >>>> >>>> _______________________________________________ >>>> Devel mailing list >>>> [email protected] >>>> http://lists.ovirt.org/mailman/listinfo/devel >>> _______________________________________________ >>> Devel mailing list >>> [email protected] >>> http://lists.ovirt.org/mailman/listinfo/devel >>> >> >> >> -- >> Sandro Bonazzola >> Better technology. Faster innovation. Powered by community collaboration. >> See how it works at redhat.com >> _______________________________________________ >> Infra mailing list >> [email protected] >> http://lists.ovirt.org/mailman/listinfo/infra >> >> >> -- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. See how it works at redhat.com _______________________________________________ Infra mailing list [email protected] http://lists.ovirt.org/mailman/listinfo/infra
