Re: [xwiki-devs] Fwd: [VOTE] Build Manager duty

Vincent Massol Thu, 09 Jun 2011 05:28:57 -0700

Hi Denis,

On Jun 9, 2011, at 2:11 PM, Denis Gervalle wrote:

> Hi committers,
> 
> Even if I completely understand and agree with the goal pursued, I
> really dislike this way of solving it. It is the responsibility of all
> of us to keep the build stable at all times. It should be the concern
> of all recent committers to check whether their recent commit breaks the
> CI, and to fix it ASAP when needed.

Nobody has suggested changing this; it is still the rule.

> If someone is not ready to do so
> in the hours following his commit, he should simply refrain
> from making that commit until he can follow it into the CI. I always follow
> these rules (even if most of the time, I commit only stuff that I have
> in production on my side), and I should admit that I would have committed
> more without them. But this is the necessary trade-off between
> evolvability and stability.

Again this has always been the rule and will continue to be the rule.

> Having someone to enforce such rules is admitting that some of us
> need someone else to remind them of the best practices.

Not at all. I guess the reason you don't see the need is that you've not been
active in helping maintain our build. I'll list some build issues that
happen frequently:

* Agents stop working. For example, this morning there were 2 jobs stuck because
of a failure to start FF. This happens from time to time. Someone needs to
investigate and at the very least kill the job to free the agent.
* New versions of Jenkins fixing bugs. Someone needs to check and upgrade the
build when a new version has interesting stuff for us and when it fixes some of
our issues.
* The most important ones: flickering tests. I can tell for sure that you
haven't written UI tests or you'd have introduced flickering tests for sure :)
I have myself introduced several. Of course we always test locally before
committing and I even check that it works on Jenkins. But flickering tests don't
fail immediately; they'll run fine for 50 or 100 iterations and suddenly start
to fail. It's not always easy to find the culprit for a flickering test.
* Various issues with filesystem locks, permissions on agents, memory settings,
etc. that make the build fail

BTW there are also other build tasks such as:
* Upgrade versions of the Maven plugins we use when there are new versions
* Fix TODOs in pom.xml which are workarounds waiting for Maven issues to be fixed

You seem to think these are done automagically. The reason you think this is
that people like Thomas or myself (and even others, but I think Thomas and I
have been the most active on this) are doing them. Personally it's not my role
to be the Build Manager and I'm fed up with doing it. I want to share the
workload with everyone.

In order to address these problems you cannot just rely on the good will of
everyone to work on them. You need someone responsible. Rather than having a
single person responsible all the time, I'm proposing that committers take turns
doing that.

> This is not for
> me the philosophy behind an Open-Source project, where everyone should do
> their best for the wellness of the project.

Ok so tell me, when was the last time you helped fix failing builds? :) IMO it
was a long time ago, which means you don't do the best for the wellness of the
project (according to what you said) ... :)
(Surely you agree that the specific issues I've listed above are nobody's
fault).

> So I cannot admit there
> is a real need for this, and I really hope that every one of us will
> understand the need to move their cursor towards the stability of the
> build.

That's exactly what I'm trying to achieve. Ideally everyone would take care
of the build but that doesn't work. Right now the goal is to create a task
force to stabilize the build again, i.e. make sure it's stable without any false
positives. Again the goal of the Build Manager is NOT to fix all issues
himself/herself but to ping others to help/fix the issues and in general to
increase the awareness of having a stable build.

> So please guys, take your responsibility without a need for a build
> policeman.

It hasn't worked. Hence this new proposal. At some point we may find that the 
build manager has nothing to do anymore which will be great. When that happens 
we'll be able to remove that role.

> Sorry, but I am -1 to do the policeman (but if I need to, I will do my
> duty), and I vote -0, just because I do not consider myself active
> enough to veto.

Let's hope I've provided more info on why we need a build manager for the time 
being ;)

Thanks
-Vincent

> 
> Denis
> 
> On Thursday, June 9, 2011, Thomas Mortagne <[email protected]>
> wrote:
>> On Thu, Jun 9, 2011 at 09:20, Vincent Massol <[email protected]> wrote:
>>> 
>>> On Jun 9, 2011, at 9:08 AM, Thomas Mortagne wrote:
>>> 
>>>> On Wed, Jun 8, 2011 at 19:40, Vincent Massol <[email protected]> wrote:
>>>>> Hi committers,
>>>>> 
>>>>> We're having a hard time stabilizing our build (especially the
>>>>> functional test part, see my previous mail entitled "[VOTE] Important:
>>>>> Strategy to fix failing tests and stability"). Now I believe that it's going
>>>>> to be hard to enforce it and thus I'd like to propose a variation:
>>>>> 
>>>>> * The Build Manager has the *responsibility* to get the build fixed
>>>>> ASAP whenever it's failing. His priority #1 during the week becomes
>>>>> monitoring the Build
>>>>> * By "Build" we mean the CI Build on ci.xwiki.org and by "failing" we
>>>>> mean anything that makes the build fail: tests, compilation, clirr, etc.
>>>>> * Every week we have a different Build Manager chosen amongst the
>>>>> Committers
>>>> 
>>>> A week seems a bit short but on the other hand it will seem pretty
>>>> long for the Build Manager himself I'm sure ;)
>>>> 
>>>>> * In order to fix build issues the Build Manager has several
>>>>> possibilities:
>>>>> - find out who caused the build to break and ask that person to fix it.
>>>>> That person cannot refuse that and must consider it his/her priority to fix
>>>>> it (or roll back the change that caused the build to fail)
>>>>> - roll back the change that caused the build to fail
>>>>> - fix it himself/herself
>>>>> - find someone knowledgeable in the failing domain and get him/her to
>>>>> fix the build.
>>>>> * At the end of the week the Build Manager hands over his duty to the
>>>>> next Build Manager by contacting him/her.
>>>>> * We create a Build Manager Roster page on dev.xwiki.org to log past
>>>>> Build Managers (and possibly future ones if some have expressed the wish to
>>>>> be the Build Manager for a specific week).
>>>>> * All committers must perform this duty and take turns
>>>>> 
>>>>> Since I've started doing this this week, I propose to take this role
>>>>> for the current week. I'm also proposing to log Caleb as having been the
>>>>> Build Manager for the past week since he's done a lot to stabilize the
>>>>> build.
>>>>> 
>>>>> If the vote is passed I'll log this on the Committership page as a
>>>>> Committer duty (I'll also cross reference it from the Build page).
>>>>> 
>>>>> Here's my +1
>>>> 
>>>> +1
>>>> 
>>>> What do you think about designating the people who broke the build the
>>>> most for the following week?
>>> 
>>> An interesting idea...
>>> 
>>> However:
>>> 1) for flickering tests it's hard to find the culprit
>>> 2) it's not so much a problem of breaking the build often, it's more a
>>> problem of not fixing it immediately when broken
>> 
>> Sure, my real proposal was actually "designate the people most painful
>> for the Build Manager as Build Manager" but I wanted to find a better
>> metric :)
>> 
>>> 
>>> However I agree that in the Roster we could log information for the past
>>> week about who broke the build, how many flickering tests were fixed, etc.
>>> 
>>> Thanks
>>> -Vincent