(Cross-posted to mozilla.dev.tree-management; please reply to mozilla.dev.platform)

TL;DR: I propose that instead of using inbound as a "linear commit; rebase as needed" tree, we use a tree that allows multiple heads (e.g try) to land patches on. Each patch queue would be based off a known-green m-c changeset, and result in a new "head" on the try tree. If the developer is satisfied with the green-ness of the head, they can flag it for merging into m-c. I believe this will result in both less resource usage and less frustration for developers/sheriffs than using inbound. If this sounds interesting to you, please read on...

Long version:

There have been long closures on inbound recently for various reasons, mostly various kinds of bustage and having to wait for the builders to catch up. I'm sure everybody feels the frustration of not being able to land their work because the tree is closed for something unrelated. There was a thread on some mobile mailing lists about the possibility of creating a mobile-central branch similar to fx-team, services-central, etc. to land mobile patches in isolation, and then merge that in to m-c. There are advantages and disadvantages to this solution, but while thinking about it I had a different idea that I think might work better.

The developer workflow I'm proposing requires there be a tree that is allowed to have multiple heads. The "try" tree we have now satisfies this requirement, so I'm using that in my proposal, but we could just as easily use inbound or some other tree provided it accepts multiple heads. The process for landing a patch from the developer's point of view would look like this:

1. Take the latest green m-c change, commit your patch(es) on top of it, and push it to try. 2. If your try push is green, flag it for eventual merge to m-c and you're done. 3. If your try push is not green, update your patch(es) and go back to step 1.

The "eventual merge" step would be resolved by the sheriffs picking up green flagged heads from try and merging them into m-c. If the resulting m-c is broken, the tree is closed until fixed (see Scenario D below) but people can continue landing on try while this happens.

I'm going to go through some common scenarios of patches that currently cause bustage on inbound and show how they would be different with my suggested process. Except for the last scenario they all show a clear win. Even in the last one, I would consider it a win although it's debatable depending on what you value. If you feel this proposal would be worse in any non-pathological scenario, please do reply with a description of said scenario.

Note that in the scenarios below I'm not assuming people test their patches on try unless I explicitly state that (e.g. in Scenario C). Also I'm assuming that each named patch is independent of the other named patches (i.e. is not part of the same "patch queue") unless otherwise stated.

Scenario A
----------
In this scenario, there are a bunch of patches, let's call them patches A, B, C, D, and E. Patch C contains a bug that causes bustage on tbpl, but is easily fixed by the developer once the bustage is visible.

With the current process:

Patch A lands on inbound, tree is green.
Patch B lands on inbound, tree is green.
Patch C lands on inbound, causes bustage.
Patch D lands on inbound, bustage continues.
Patch E lands on inbound, bustage continues.
Patch C is backed out, tree is green.
Patch C' (fixed) lands on inbound, tree is green.
Eventually inbound is merged to m-c, tree is green.

In this process we used 8 units of tbpl resources, but the backout investigation was straightforward and did not consume much developer/sheriff time.

With my suggested process, all of these patches would be based on the latest green m-c patch and land on try first:

Patch A lands on try, tree is green.
Patch B lands on try, tree is green.
Patch C lands on try, causes bustage.
Patch D lands on try, tree is green.
Patch E lands on try, tree is green.
Patch C' (fixed) lands on try, tree is green.
A, B, D, E, and C' are merged to m-c, tree is green.

This process uses 7 units of tbpl resources, and there was no backout involved which keeps the final set of patches in m-c cleaner (helps with hg blame). Again no significant developer/sheriff time was used.

So even with one bad patch, my suggested process is winning here. Let's look at another scenario:

----------
Scenario B - the one with tree closure
----------
This is the same as scenario A, except now both B and C are bad patches and require one iteration on tbpl to fix.

With the current process:

Patch A lands on inbound, tree is green.
Patch B lands on inbound, causes bustage.
Patch C lands on inbound, bustage continues.
Patch D lands on inbound, bustage continues.
Patch E lands on inbound, bustage continues.
Patch B is backed out, bustage continues.
Tree is closed.
Investigation to determine which of {C,D,E} is causing bustage.
Eventually C is backed out, tree is green.
Tree is reopened.
Patch B' lands on inbound, tree is green.
Patch C' lands on inbound, tree is green.
Inbound is merged to m-c, tree is green.

This process used 10 units of tbpl time, plus had a tree closure (significant developer frustration) and some amount of developer/sheriff time was used to figure out what happened.

Suggested process:

Patch A lands on try, tree is green.
Patch B lands on try, causes bustage.
Patch C lands on try, causes bustage.
Patch D lands on try, tree is green.
Patch E lands on try, tree is green.
Patch B' lands on try, tree is green.
Patch C' lands on try, tree is green.
A, D, E, B', and C' and merged to m-c, tree is green.

This process used 8 units of tbpl time, and no tree closure or sheriff time was involved. The only developer frustration was to the author(s) of the bad patches. Doesn't get much fairer than that. The hg blame in m-c is also about as clean as possible.

----------
Scenario C - the one with model citizens
----------
In this scenario, people are model citizens and push to try before pushing to inbound to ensure their patches are great. Patches A and B are perfect, but C has a bug that requires one iteration to fix.

Current process:

Patch A lands on try, tree is green.
Patch A lands on inbound, tree is green.
Patch B lands on try, tree is green.
Patch B lands on inbound, tree is green.
Patch C lands on try, causes bustage.
Patch C' lands on try, tree is green.
Patch C' lands on inbound, tree is green.
Inbound is merged to m-c, tree is green.

8 units of tbpl time are used to land these 3 patches, but at least inbound was never broken or closed. That's not bad at all.

In my suggested process there is no inbound, so it's much simplified:

Patch A lands on try, tree is green.
Patch B lands on try, tree is green.
Patch C lands on try, causes bustage.
Patch C' lands on try, tree is green.
Patches A, B, and C' are merged to m-c, tree is green.

This uses 5 units of tbpl time, also with no tree closure or breakage. Clear win.

----------
Scenario D - the one with patch interactions
----------
In this scenario we have our 5 patches A-E, but unbeknownst to the authors, patches B and C have a bad interaction and so cause bustage when combined. Patch C' is a modified version of C that accounts for the interaction, and patch F is the diff between C and C' (i.e. it can be applied on top of C to get C').

Current process:

Patch A lands on inbound, tree is green.
Patch B lands on inbound, tree is green.
Patch C lands on inbound, causes bustage.
Patch D lands on inbound, bustage continues.
Patch E lands on inbound, bustage continues.
Patch C is backed out, tree is green.
Patch C' lands on inbound, tree is green.
Inbound is merged to m-c, tree is green.

8 units of tbpl time are used, assuming the author of patch C can figure out exactly what happened (I don't know how often this is the case). For one more unit of tbpl time, you can buy patch C going green on try *before* the inbound landing, which means the author will know with more certainty it was a bad interaction with either patch A or B. In more subtle interactions, though, the author of patch C will spend 2-3 units of tbpl time on try iterations to figure out what happened before pushing C' back to inbound.

Suggested process:

Patch A lands on try, tree is green.
Patch B lands on try, tree is green.
Patch C lands on try, tree is green.
Patch D lands on try, tree is green.
Patch E lands on try, tree is green.
Patches A, B, C, D, and E are merged to m-c, causes bustage.
Tree is closed.
Patch F (based on new m-c) lands on try, tree is green.
Patch F is merged to m-c, tree is green.
Tree is opened.

Again here 8 units of tbpl time are used, assuming the author of patch C can figure out exactly what happened. If the interactions are more subtle, it won't be obvious and the sheriffs (or somebody) will need to spend 2-3 units of tbpl time to bisect the merge and figure out what happened before pushing F. You could argue that this is slightly worse than the current process since there's no clear "owner" for the tree closure. Note that this is a problem with our current process in scenario B as well, and that we can solve this by using policy (e.g. "the author of the last change in the merge must do the bisection").

Most importantly, the tree closure doesn't prevent people from continuing to land things on try *without any modification to their workflow*. This part, IMO, is critical, and outweighs the potential disadvantage to the person stuck with bisecting the bustage.

--- End of scenarios

My suggested process *requires* a tree which allows multiple heads, which is why I suggest "try" instead of "inbound". It has been suggested that hg does not deal well with tree with large numbers of heads, which is why try is reset every so often. In my proposal, we can still reset try periodically, as long as the changes have been merged to m-c. M-c itself will always have only one head.

Another potential problem with this approach is that we will have more merge changes in m-c, which generally screws with hg bisect. Personally I already have enough trouble with hg bisect to the point where I don't use it because I can't trust it. This may be a legitimate problem for some, but it's not for me.

Cheers,
kats
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform

Reply via email to