(Cross-posted to mozilla.dev.tree-management; please reply to
mozilla.dev.platform)
TL;DR: I propose that instead of using inbound as a "linear commit;
rebase as needed" tree, we use a tree that allows multiple heads (e.g
try) to land patches on. Each patch queue would be based off a
known-green m-c changeset, and result in a new "head" on the try tree.
If the developer is satisfied with the green-ness of the head, they can
flag it for merging into m-c. I believe this will result in both less
resource usage and less frustration for developers/sheriffs than using
inbound. If this sounds interesting to you, please read on...
Long version:
There have been long closures on inbound recently for various reasons,
mostly various kinds of bustage and having to wait for the builders to
catch up. I'm sure everybody feels the frustration of not being able to
land their work because the tree is closed for something unrelated.
There was a thread on some mobile mailing lists about the possibility of
creating a mobile-central branch similar to fx-team, services-central,
etc. to land mobile patches in isolation, and then merge that in to m-c.
There are advantages and disadvantages to this solution, but while
thinking about it I had a different idea that I think might work better.
The developer workflow I'm proposing requires there be a tree that is
allowed to have multiple heads. The "try" tree we have now satisfies
this requirement, so I'm using that in my proposal, but we could just as
easily use inbound or some other tree provided it accepts multiple
heads. The process for landing a patch from the developer's point of
view would look like this:
1. Take the latest green m-c change, commit your patch(es) on top of it,
and push it to try.
2. If your try push is green, flag it for eventual merge to m-c and
you're done.
3. If your try push is not green, update your patch(es) and go back to
step 1.
The "eventual merge" step would be resolved by the sheriffs picking up
green flagged heads from try and merging them into m-c. If the resulting
m-c is broken, the tree is closed until fixed (see Scenario D below) but
people can continue landing on try while this happens.
I'm going to go through some common scenarios of patches that currently
cause bustage on inbound and show how they would be different with my
suggested process. Except for the last scenario they all show a clear
win. Even in the last one, I would consider it a win although it's
debatable depending on what you value. If you feel this proposal would
be worse in any non-pathological scenario, please do reply with a
description of said scenario.
Note that in the scenarios below I'm not assuming people test their
patches on try unless I explicitly state that (e.g. in Scenario C). Also
I'm assuming that each named patch is independent of the other named
patches (i.e. is not part of the same "patch queue") unless otherwise
stated.
Scenario A
----------
In this scenario, there are a bunch of patches, let's call them patches
A, B, C, D, and E. Patch C contains a bug that causes bustage on tbpl,
but is easily fixed by the developer once the bustage is visible.
With the current process:
Patch A lands on inbound, tree is green.
Patch B lands on inbound, tree is green.
Patch C lands on inbound, causes bustage.
Patch D lands on inbound, bustage continues.
Patch E lands on inbound, bustage continues.
Patch C is backed out, tree is green.
Patch C' (fixed) lands on inbound, tree is green.
Eventually inbound is merged to m-c, tree is green.
In this process we used 8 units of tbpl resources, but the backout
investigation was straightforward and did not consume much
developer/sheriff time.
With my suggested process, all of these patches would be based on the
latest green m-c patch and land on try first:
Patch A lands on try, tree is green.
Patch B lands on try, tree is green.
Patch C lands on try, causes bustage.
Patch D lands on try, tree is green.
Patch E lands on try, tree is green.
Patch C' (fixed) lands on try, tree is green.
A, B, D, E, and C' are merged to m-c, tree is green.
This process uses 7 units of tbpl resources, and there was no backout
involved which keeps the final set of patches in m-c cleaner (helps with
hg blame). Again no significant developer/sheriff time was used.
So even with one bad patch, my suggested process is winning here. Let's
look at another scenario:
----------
Scenario B - the one with tree closure
----------
This is the same as scenario A, except now both B and C are bad patches
and require one iteration on tbpl to fix.
With the current process:
Patch A lands on inbound, tree is green.
Patch B lands on inbound, causes bustage.
Patch C lands on inbound, bustage continues.
Patch D lands on inbound, bustage continues.
Patch E lands on inbound, bustage continues.
Patch B is backed out, bustage continues.
Tree is closed.
Investigation to determine which of {C,D,E} is causing bustage.
Eventually C is backed out, tree is green.
Tree is reopened.
Patch B' lands on inbound, tree is green.
Patch C' lands on inbound, tree is green.
Inbound is merged to m-c, tree is green.
This process used 10 units of tbpl time, plus had a tree closure
(significant developer frustration) and some amount of developer/sheriff
time was used to figure out what happened.
Suggested process:
Patch A lands on try, tree is green.
Patch B lands on try, causes bustage.
Patch C lands on try, causes bustage.
Patch D lands on try, tree is green.
Patch E lands on try, tree is green.
Patch B' lands on try, tree is green.
Patch C' lands on try, tree is green.
A, D, E, B', and C' and merged to m-c, tree is green.
This process used 8 units of tbpl time, and no tree closure or sheriff
time was involved. The only developer frustration was to the author(s)
of the bad patches. Doesn't get much fairer than that. The hg blame in
m-c is also about as clean as possible.
----------
Scenario C - the one with model citizens
----------
In this scenario, people are model citizens and push to try before
pushing to inbound to ensure their patches are great. Patches A and B
are perfect, but C has a bug that requires one iteration to fix.
Current process:
Patch A lands on try, tree is green.
Patch A lands on inbound, tree is green.
Patch B lands on try, tree is green.
Patch B lands on inbound, tree is green.
Patch C lands on try, causes bustage.
Patch C' lands on try, tree is green.
Patch C' lands on inbound, tree is green.
Inbound is merged to m-c, tree is green.
8 units of tbpl time are used to land these 3 patches, but at least
inbound was never broken or closed. That's not bad at all.
In my suggested process there is no inbound, so it's much simplified:
Patch A lands on try, tree is green.
Patch B lands on try, tree is green.
Patch C lands on try, causes bustage.
Patch C' lands on try, tree is green.
Patches A, B, and C' are merged to m-c, tree is green.
This uses 5 units of tbpl time, also with no tree closure or breakage.
Clear win.
----------
Scenario D - the one with patch interactions
----------
In this scenario we have our 5 patches A-E, but unbeknownst to the
authors, patches B and C have a bad interaction and so cause bustage
when combined. Patch C' is a modified version of C that accounts for the
interaction, and patch F is the diff between C and C' (i.e. it can be
applied on top of C to get C').
Current process:
Patch A lands on inbound, tree is green.
Patch B lands on inbound, tree is green.
Patch C lands on inbound, causes bustage.
Patch D lands on inbound, bustage continues.
Patch E lands on inbound, bustage continues.
Patch C is backed out, tree is green.
Patch C' lands on inbound, tree is green.
Inbound is merged to m-c, tree is green.
8 units of tbpl time are used, assuming the author of patch C can figure
out exactly what happened (I don't know how often this is the case). For
one more unit of tbpl time, you can buy patch C going green on try
*before* the inbound landing, which means the author will know with more
certainty it was a bad interaction with either patch A or B. In more
subtle interactions, though, the author of patch C will spend 2-3 units
of tbpl time on try iterations to figure out what happened before
pushing C' back to inbound.
Suggested process:
Patch A lands on try, tree is green.
Patch B lands on try, tree is green.
Patch C lands on try, tree is green.
Patch D lands on try, tree is green.
Patch E lands on try, tree is green.
Patches A, B, C, D, and E are merged to m-c, causes bustage.
Tree is closed.
Patch F (based on new m-c) lands on try, tree is green.
Patch F is merged to m-c, tree is green.
Tree is opened.
Again here 8 units of tbpl time are used, assuming the author of patch C
can figure out exactly what happened. If the interactions are more
subtle, it won't be obvious and the sheriffs (or somebody) will need to
spend 2-3 units of tbpl time to bisect the merge and figure out what
happened before pushing F. You could argue that this is slightly worse
than the current process since there's no clear "owner" for the tree
closure. Note that this is a problem with our current process in
scenario B as well, and that we can solve this by using policy (e.g.
"the author of the last change in the merge must do the bisection").
Most importantly, the tree closure doesn't prevent people from
continuing to land things on try *without any modification to their
workflow*. This part, IMO, is critical, and outweighs the potential
disadvantage to the person stuck with bisecting the bustage.
--- End of scenarios
My suggested process *requires* a tree which allows multiple heads,
which is why I suggest "try" instead of "inbound". It has been suggested
that hg does not deal well with tree with large numbers of heads, which
is why try is reset every so often. In my proposal, we can still reset
try periodically, as long as the changes have been merged to m-c. M-c
itself will always have only one head.
Another potential problem with this approach is that we will have more
merge changes in m-c, which generally screws with hg bisect. Personally
I already have enough trouble with hg bisect to the point where I don't
use it because I can't trust it. This may be a legitimate problem for
some, but it's not for me.
Cheers,
kats
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform