Vincent, I thought that your proposal was about enforcing the rules you explained in your previous thread, not mainly about keeping the CI system working properly for its purpose, which is to check the stability and quality of XWiki. These seem really different to me. And from our recent talk on IRC, I understand that those working on these maintenance tasks would like to share the job. But this does not seem to me like a weekly job that anyone can easily undertake.
I should admit that I do not currently have enough knowledge of the CI system, not because I do not want to, but because time is lacking. This is also why I have done very little work recently, since I need to invest more time in the test part of XWiki to be able to work on something more important. Is there any document sharing your past experience of the build system that would help the Build Manager do his job? I feel completely incompetent for such a job currently. For that part of the job, I am fully +1: we need a well-maintained CI system.

For the part about hassling others to fix their last commit, this should simply not happen, and this is the reason for my previous mail. Enforcing the rules should simply be the job of all of us once we have voted for them.

As for tests that start flickering due to changes in the environment or whatever, this is like any other issue: it should be added to JIRA, probably with a high priority. The release manager could then track those responsible and ask for help if these were not fixed or were ignored for the release.

I hope that I have expressed my opinion better, and sorry for my lack of detailed knowledge of the CI system.

Denis

On Thu, Jun 9, 2011 at 14:27, Vincent Massol <[email protected]> wrote:

> Hi Denis,
>
> On Jun 9, 2011, at 2:11 PM, Denis Gervalle wrote:
>
> > Hi committers,
> >
> > Even if I completely understand and agree with the goal pursued, I
> > really dislike this way of solving it. It is the responsibility of all
> > of us to keep the build stable at all time. It should be the concerns
> > of all recent committers to check if their recent commit breaks the
> > CI, and to fix it ASAP when needed.
>
> Nobody has suggested to change this and this is still the rule.
>
> > If someone is not ready to do so
> > in the upcoming hours following its commit, he should simply refrain
> > to do that commit until he can follow it into the CI.
> > I always follow
> > these rules (even if most of the time, I commit only stuff that I have
> > in production on my side), and I should admit that I would have committed
> > more without them. But this is the necessary trade-off between
> > evolutivity and stability.
>
> Again this has always been the rule and will continue to be the rule.
>
> > Having someone to enforce such rules is admitting that some of us
> > needs another one to remind them the best practices.
>
> Not at all. The reason you don't see the need is because you've not been
> active on helping with maintaining our build I guess. I'll list some build
> issues that happen frequently:
>
> * Agents stop working. For ex this morning there were 2 jobs stuck because
> of a failure to start FF. This happens from time to time. Someone needs to
> investigate and at the very least kill the job to free the agent.
> * New versions of jenkins fixing bugs. Someone needs to check and upgrade
> the build when a new version has interesting stuff for us and when it fixes
> some of our issues.
> * The most important ones: flickering tests. I can tell for sure that you
> haven't written UI tests or you'd have introduced flickering tests for sure
> :) I have myself introduced several. Of course we always test locally before
> committing and I even check that it works on jenkins. But flickering tests
> don't fail immediately, they'll run fine for 50 or 100 iterations and
> suddenly start to fail. It's not always easy to find who's the culprit on
> flickering tests.
> * Various issues with filesystem locks, permissions on agents, memory
> settings, etc that make the build fail
>
> BTW there are also other build tasks such as:
> * Upgrade versions of maven plugins we use when there are new versions
> * Fix TODOs in pom.xml which are workarounds waiting for maven issues to be
> fixed
>
> You seem to think these are done automagically.
> The reason you think this
> is because people like Thomas or myself (and even others but I think Thomas
> and me have been the most active on this) are doing them. Personally it's
> not my role to be the Build Manager and I'm fed up of doing it. I want to
> share the workload with everyone.
>
> In order to address these problems you cannot just rely on the good will of
> everyone to work on them. You need someone responsible. Rather than having a
> single person responsible all the time, I'm proposing that committers take
> turns to do that.
>
> > This is not for
> > me the philosophy behind Open-Source project, where everyone should do
> > their best for the wellness of the project.
>
> Ok so tell me the last time you helped fix failing builds? :) IMO it's a
> long time ago which means you don't do the best for the wellness of the
> project (according to what you said) ... :)
> (Surely you agree that the specific issues I've listed above are nobody's
> faults).
>
> > So I could not admit there
> > is a real need for this, and I really hope that everyone of us will
> > understand the needs to move their cursor towards the stability of the
> > build.
>
> That's exactly what I'm trying to achieve. Ideally everyone would take care
> about the build but that doesn't work. Right now the goal is to create a
> task force to stabilize the build again, ie make sure it's stable without
> any false positive. Again the goal of the Build Manager is NOT to fix all
> issues himself/herself but to ping others to help/fix the issues and in
> general to increase the awareness of having a stable build.
>
> > So please guys, take your responsibility without a need for a build
> > policeman.
>
> It hasn't worked. Hence this new proposal. At some point we may find that
> the build manager has nothing to do anymore which will be great. When that
> happens we'll be able to remove that role.
>
> > Sorry, but I am -1 to do the policeman (but if I need to, I will do my
> > duty), and I vote -0, just because I do not consider myself active
> > enough to veto.
>
> Let's hope I've provided more info on why we need a build manager for the
> time being ;)
>
> Thanks
> -Vincent
>
> >
> > Denis
> >
> > On Thursday, June 9, 2011, Thomas Mortagne <[email protected]> wrote:
> >> On Thu, Jun 9, 2011 at 09:20, Vincent Massol <[email protected]> wrote:
> >>>
> >>> On Jun 9, 2011, at 9:08 AM, Thomas Mortagne wrote:
> >>>
> >>>> On Wed, Jun 8, 2011 at 19:40, Vincent Massol <[email protected]> wrote:
> >>>>> Hi committers,
> >>>>>
> >>>>> We're having a hard time stabilizing our build (especially the functional test part, see my previous mail entitled "[VOTE] Important: Strategy to fix failing tests and stability"). Now I believe that it's going to be hard to enforce it and thus I'd like to propose a variation:
> >>>>>
> >>>>> * The Build Manager has the *responsibility* to get the build fixed ASAP whenever it's failing. His priority #1 during the week becomes monitoring the Build
> >>>>> * By "Build" we mean the CI Build on ci.xwiki.org and by "failing" we mean anything that makes the build fail: tests, compilation, clirr, etc.
> >>>>> * Every week we have a different Build Manager chosen amongst the Committers
> >>>>
> >>>> A week seems a bit short but on the other hand it will seem pretty
> >>>> long for the Build Manager itself I'm sure ;)
> >>>>
> >>>>> * In order to fix build issues the Build Manager has several possibilities:
> >>>>> - find out who caused the build to break and ask that person to fix it. That person cannot refuse that and must consider it his/her priority to fix it (or rollback the change that caused the build to fail)
> >>>>> - rollback the issue that caused the build to fail
> >>>>> - fix it himself/herself
> >>>>> - find someone knowledgable in the failing domain and get him/her to fix the build.
> >>>>> * At the end of the Week the Build Manager hands over his duty to the next Build Manager by contacting him/her.
> >>>>> * We create a Build Manager Roster page on dev.xwiki.org to log past Build Managers (and possibly future ones if some have expressed the wish to be the Build Manager for a specific week).
> >>>>> * All committers must perform this duty and take turns
> >>>>>
> >>>>> Since I've started doing this this week, I propose to take this role for the current week. I'm also proposing to log Caleb as having been the Build Manager for the past week since he's done a lot to stabilize the build.
> >>>>>
> >>>>> If the vote is passed I'll log this on the Committership page as a Committer duty (I'll also cross reference it from the Build page).
> >>>>>
> >>>>> Here's my +1
> >>>>
> >>>> +1
> >>>>
> >>>> What do you think about designating the people who broke the build the
> >>>> most for the following week?
> >>>
> >>> An interesting idea...
> >>>
> >>> However:
> >>> 1) it's hard for flickering tests to find out the culprit
> >>> 2) it's not so much a problem of breaking the build often, it's more a problem of not fixing it immediately when broken
> >>
> >> Sure, my real proposal was actually "designate the most painful people
> >> for Build Manager as build manager" but I wanted to find a better
> >> metric :)
> >>
> >>>
> >>> However I agree that in the Roster we could log information for the past week about who broke the build, how many flicker fixed, etc
> >>>
> >>> Thanks
> >>> -Vincent
> _______________________________________________
> devs mailing list
> [email protected]
> http://lists.xwiki.org/mailman/listinfo/devs
>

--
Denis Gervalle
SOFTEC sa - CEO
eGuilde sarl - CTO
_______________________________________________
devs mailing list
[email protected]
http://lists.xwiki.org/mailman/listinfo/devs

