[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
On Tue, Nov 3, 2009 at 10:38 PM, Drew Wilson atwil...@chromium.org wrote:
> Do the trybots build the release version? Because I had a build break last week that passed the 3 basic trybots, but failed to compile on the release buildbots because of a missing include which was apparently pulled in through other means in the debug version.

No, they do not currently build the release version.

Nicolas

--~--~-~--~~~---~--~~
Chromium Developers mailing list: chromium-dev@googlegroups.com
View archives, change email options, or unsubscribe: http://groups.google.com/group/chromium-dev
-~--~~~~--~~--~--~---
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
On Wed, Nov 4, 2009 at 11:40 AM, Peter Kasting pkast...@google.com wrote:
> On Tue, Nov 3, 2009 at 6:05 PM, John Abd-El-Malek j...@chromium.org wrote:
>> But this means that the person didn't use the trybot. I think we need to be harsher on people who commit with changes that didn't complete or failed on the trybot.
>
> There are a large number of reasons why the trybots can have false negatives or false positives, or why it's easy to break things even when you're trying to make use of them. Trybots are great, they're not a panacea. I strongly oppose any kind of immediate auto-revert policy.

I don't think anyone suggested immediate auto revert.

Try bots are not perfect. They won't get all the failures. But even if it's not entirely your fault, it does not mean that your change deserves to be in the tree.

It sucks to have your change reverted, and it will add a 10-minute overhead for you to un-revert the change on your working copy, but at least it won't keep the tree closed until you try to figure out what the problem is. And you can blame this 10-minute time wasted on the try bots. We always keep adding new things to it... but unless we get thousands of machines (which we don't have the capacity of handling right now), we won't be able to test everything and build all possible configurations.

Nicolas

> PK
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
On Wed, Nov 4, 2009 at 11:56 AM, Nicolas Sylvain nsylv...@chromium.org wrote:
> I don't think anyone suggested immediate auto revert.

Ben Goodger: "I am supportive of auto-revert as long as we apply it universally"
Kenneth Russell: "I completely support immediate backouts of changes that break the tree"

> Try bots are not perfect. They won't get all the failures. But even if it's not entirely your fault, it does not mean that your change deserves to be in the tree.

Irrelevant to the argument I'm making. I claim that irrespective of what happened on the trybots, authors who break something should have a brief grace period to fix their problems.

> at least it won't keep the tree closed until you try to figure out what the problem is.

This is why I suggested reverting if the author hasn't immediately jumped on the problem and determined the fix. I think everyone agrees that we don't want breaking changes to sit clogging the pipeline for a long period of time.

PK
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
Meant to hit reply-all

I want to caveat this on having less flakiness. ;-) See my comments on IRC about some tests randomly making a mess of themselves then going green again.

-Ben

I'll also add this could be done automatically based on the set of tests we define as not flaky. I always feel less bad about being reverted when it's a mindless script doing it. ;D

-Ben
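The suggestion above, that reverts could be automated based on the set of tests defined as not flaky, might look roughly like the following. This is only a sketch under stated assumptions: the set `NON_FLAKY_TESTS`, the function `should_auto_revert`, and the test names are all hypothetical and do not correspond to any real buildbot API.

```python
# Hypothetical sketch: only failures in tests believed to be stable
# can trigger an automatic revert. All names are illustrative.

# Assumed, hand-maintained set of tests the team trusts not to flake.
NON_FLAKY_TESTS = frozenset({"stable_suite_a", "stable_suite_b"})


def should_auto_revert(failed_tests, non_flaky=NON_FLAKY_TESTS):
    """Return True if at least one failed test is in the non-flaky set."""
    return any(test in non_flaky for test in failed_tests)


if __name__ == "__main__":
    # A failure in a trusted suite would trigger the revert.
    print(should_auto_revert(["stable_suite_a", "flaky_suite"]))
    # A failure confined to an untrusted suite would not.
    print(should_auto_revert(["flaky_suite"]))
```

Keeping the trusted set explicit means a test that starts "randomly making a mess of itself" can be dropped from the set without touching the revert logic.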
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
On Wed, Nov 4, 2009 at 12:02 PM, Peter Kasting pkast...@google.com wrote:
> Irrelevant to the argument I'm making. I claim that irrespective of what happened on the trybots, authors who break something should have a brief grace period to fix their problems.

I don't see who this benefits - assuming that a given patch is broken and needs a small delta to be correct, it's just as easy to submit a patch with a small delta as it is to submit the small delta. Leaving the broken patch in the tree for any period of time after it's known to be bad is a waste of everybody's time.

- James
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
On Wed, Nov 4, 2009 at 12:08 PM, James Robinson jam...@google.com wrote:
> I don't see who this benefits - assuming that a given patch is broken and needs a small delta to be correct, it's just as easy to submit a patch with a small delta as it is to submit the small delta. Leaving the broken patch in the tree for any period of time after it's known to be bad is a waste of everybody's time.

For one, the patch isn't always broken. I've seen a number of cases as both sheriff and committer where something breaks because the bots needed a clobber, not because the patch was wrong. Or when a test started failing, but it was due to some other problem than the patch at hand. The author is frequently in the best position to determine if this is the case.

For another, because if the period of time between breaking and fixing is, say, two minutes, then the inconvenience to the rest of the team is minuscule (based on our commit frequency, it's likely to be zero), whereas the inconvenience to the author of reverting, retesting, reapplying is not. This equation reverses extremely rapidly, which is why I am OK with reverts for anything that isn't a trivial, immediate fix.

Finally, because I have seen many cases where the additional cycle time of the trivial patch-to-fix was lower than the cycle time of the revert. The revert itself may be fast to perform, but if it touches some core header file and causes the bots to rebuild half the world, you're not actually saving time in the end.

But I think all this could be summed up in "Try to balance courtesy to the author and courtesy to the team; judgment calls are better than unilateral statements."

Not that this bikeshed debate matters anyway. Whoever is sheriff runs the tree how they want to, in the end.

PK
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
On Wed, Nov 4, 2009 at 12:16 PM, Peter Kasting pkast...@google.com wrote:
> But I think all this could be summed up in "Try to balance courtesy to the author and courtesy to the team; judgment calls are better than unilateral statements."

I agree. But for someone who is new to the team and is sheriffing for the first time, it might be hard to apply this rule. I think we should still try to have a specific rule, and make it clear that good judgement is always more important.

Nicolas
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
it seems that the time limit should depend on the type of bustage

break winxp compile --- you get 2 minutes
break linux views compile --- you get 30 minutes

-- Evan Stade

On Wed, Nov 4, 2009 at 1:10 PM, Nicolas Sylvain nsylv...@chromium.org wrote:
> I agree. But for someone who is new to the team and is sheriffing for the first time, it might be hard to apply this rule. I think we should still try to have a specific rule, and make it clear that good judgement is always more important.
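The per-breakage time limits suggested in this message could be written down as a small lookup table, which might also give a first-time sheriff the "specific rule" asked for. This is a minimal sketch, not an agreed policy: the two entries are the examples from the message, while the key strings, the function name `grace_minutes`, and the fallback value are assumptions.

```python
# Hypothetical sketch of a per-breakage grace period, using the two
# examples given in the message. Keys and the default are assumptions.

GRACE_MINUTES = {
    "winxp compile": 2,
    "linux views compile": 30,
}

# Assumed fallback for breakage types the thread does not mention.
DEFAULT_GRACE_MINUTES = 2


def grace_minutes(breakage_type):
    """Return how long the author gets before the change is reverted."""
    return GRACE_MINUTES.get(breakage_type, DEFAULT_GRACE_MINUTES)


if __name__ == "__main__":
    for kind in ("winxp compile", "linux views compile", "something else"):
        print(kind, "->", grace_minutes(kind), "minutes")
```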
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
But this means that the person didn't use the trybot. I think we need to be harsher on people who commit with changes that didn't complete or failed on the trybot. They need to have a really good reason as to why they want to try their change on the buildbot and possibly delay many other engineers.

On Tue, Nov 3, 2009 at 3:11 PM, Ben Goodger (Google) b...@chromium.org wrote:
> The most common case of 5 minute bustage fix is "file was omitted from changelist".
>
> -Ben
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
On Tue, Nov 3, 2009 at 7:08 PM, Ben Goodger (Google) b...@chromium.org wrote:
> I am supportive of auto-revert as long as we apply it universally. So many times the tree has been busted forever because of a vacuum of action by the sheriff.
>
> Also FYI - the trybots never work for me on my home system. No idea why.

From home you need to type something like:

    gcl try CHANGENAME --use_svn --svn_repo=svn://svn.chromium.org/chrome-try/try

Nicolas

> -Ben
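For anyone scripting their try jobs, the command lines quoted in this thread could be assembled by a small helper like the one below. It is a sketch only: the function `gcl_try_command` and its parameters are hypothetical, it uses only the flags that appear in the thread (`-c` for a clobber build comes from another message), and whether the clobber and from-home flags can be combined is an assumption that has not been checked against `gcl` itself.

```python
# Sketch: assemble the `gcl try` command lines quoted in this thread.
# Only flags that appear in the thread are used; the helper name and
# its parameters are hypothetical.

TRY_SVN_REPO = "svn://svn.chromium.org/chrome-try/try"


def gcl_try_command(change_name, clobber=False, from_home=False):
    """Return the argv list for a try job; nothing is executed here."""
    argv = ["gcl", "try", change_name]
    if clobber:
        # -c is described in the thread as requesting a full build.
        argv.append("-c")
    if from_home:
        # Flags quoted in the thread for submitting from a home system.
        argv += ["--use_svn", "--svn_repo=" + TRY_SVN_REPO]
    return argv


if __name__ == "__main__":
    print(" ".join(gcl_try_command("CHANGENAME", from_home=True)))
```

The helper only builds the argument list, so it can be inspected or logged before being handed to something like `subprocess.run`.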
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
The most common case of 5 minute bustage fix is "file was omitted from changelist".

-Ben

On Tue, Nov 3, 2009 at 2:34 PM, Peter Kasting pkast...@google.com wrote:
> On Tue, Nov 3, 2009 at 2:08 PM, Ojan Vafai o...@chromium.org wrote:
>> To be clear, here's the proposed policy: Any change that would close the tree can be reverted if it can't be fixed in 2 minutes.
>
> How about: If a change closes the tree, the change author has 1 or 2 minutes to respond to a ping. The change should be reverted if the author doesn't respond, if he says to revert, or if he does not say he has a fix within the next 5 minutes.
>
> I can't fix _any_ problem in 2 minutes. But I can fix most of them in 5. The goal is to allow the author a reasonable chance to fix trivial problems before we revert. And I think the tree should go ahead and close during that interval.
>
> PK
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
On Tue, Nov 3, 2009 at 6:05 PM, John Abd-El-Malek j...@chromium.org wrote:
> But this means that the person didn't use the trybot. I think we need to be harsher on people who commit with changes that didn't complete or failed on the trybot. They need to have a really good reason as to why they want to try their change on the buildbot and possibly delay many other engineers.

For the record, I completely support immediate backouts of changes that break the tree, and agree that all changes should go through the trybots -- but sometimes the trybots don't work. I don't know anything about the architectural differences between the trybots and buildbots, but from recent experience I think the trybots are trying to do incremental builds, when that isn't guaranteed to always work.

If it's just a matter of throwing hardware at the problem of making the trybots nearly 100% reliable I think we should make that investment.

-Ken
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
On Tue, Nov 3, 2009 at 2:34 PM, Peter Kasting pkast...@google.com wrote:
> If a change closes the tree, the change author has 1 or 2 minutes to respond to a ping. The change should be reverted if the author doesn't respond, if he says to revert, or if he does not say he has a fix within the next 5 minutes.

Pinging the author is important, especially if there are special circumstances (like it only fails on the bot and they want to ssh in to examine something).
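The proposal quoted above can be read as a small decision procedure. The sketch below is one possible reading, not an agreed rule: the function `should_revert` and its parameters are hypothetical, the ping window is taken as the upper bound of "1 or 2 minutes", and the 5-minute window is assumed to start when the author responds.

```python
# Hypothetical sketch of the proposed revert policy. The thresholds
# come from the quoted proposal; every name here is illustrative.

PING_WINDOW_MINUTES = 2   # "1 or 2 minutes to respond to a ping"
FIX_WINDOW_MINUTES = 5    # "a fix within the next 5 minutes"


def should_revert(minutes_since_ping, responded, asked_for_revert,
                  minutes_since_response=0.0, claims_fix=False):
    """Decide whether the sheriff should revert a tree-closing change."""
    if not responded:
        # Revert once the ping window has passed with no answer.
        return minutes_since_ping > PING_WINDOW_MINUTES
    if asked_for_revert:
        return True
    if claims_fix:
        # The author says a fix is in hand, so hold off on the revert.
        return False
    # Responded but has not claimed a fix: revert once the window expires.
    return minutes_since_response > FIX_WINDOW_MINUTES


if __name__ == "__main__":
    # Author never answered the ping and three minutes have passed.
    print(should_revert(3, responded=False, asked_for_revert=False))
```

The special circumstances mentioned above (for example, wanting to ssh into the bot) are deliberately left out; they are the kind of case a sheriff would handle by judgement rather than by rule.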
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
IMO, I wouldn't mind draconian reverts in the interest of keeping the tree open and allowing the sheriffs some semblance of productivity. OTOH, git makes it really easy for me to un-revert and try again, so maybe I'm biased there.

- a

On Tue, Nov 3, 2009 at 3:03 PM, Eric Seidel esei...@chromium.org wrote:
> Could we just automate rollouts and this 5-minute timer? If we have the tools to do automated rollouts, would it be reasonable to add them as a phase in the buildbot?
>
> On Tue, Nov 3, 2009 at 2:52 PM, Nicolas Sylvain nsylv...@chromium.org wrote:
>> +1
>>
>> On Tue, Nov 3, 2009 at 3:38 PM, Avi Drissman a...@chromium.org wrote:
>>> I'm OK with that. Just make it clear that the sheriff does have authority. One time when I was sheriff I wanted to revert a broken patch. The author insisted on patching it over and over. He finally got it working after about seven patches and nearly three hours or so, when I was insisting on backing it out after the first 30m.
>>
>> Yes, this is exactly what we want to avoid.
>>
>> The 2-minute rule usually includes:
>> "Oops, I forgot to commit a file"
>> "Let me disable the test I just added, it clearly does not work"
>> "Oops, before committing I renamed a variable and forgot to change it at one place"
>>
>> It also used to mean: "Oops, I forgot an include". But this one has been biting us too much in the past, so I leave it at the discretion of the sheriff.
>>
>> I think people need to use their good judgement too. The length of a minute should be inversely proportional to the number of people trying to commit during this time of the day.
>>
>> Nicolas
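The remark quoted above, that the length of a minute should be inversely proportional to the number of people trying to commit, could be made concrete along these lines. This is a rough sketch under stated assumptions: the function `scaled_grace_minutes` is hypothetical, and treating an idle tree as a single committer is an arbitrary choice made here to avoid dividing by zero.

```python
# Hypothetical sketch: scale the author's grace period inversely with
# the number of people currently trying to commit. The baseline of one
# committer is an assumption.

def scaled_grace_minutes(base_minutes, committers_waiting):
    """Return base_minutes divided by the number of waiting committers."""
    # Treat an idle tree as a single committer so the grace never grows
    # beyond the base value and we never divide by zero.
    return base_minutes / max(1, committers_waiting)


if __name__ == "__main__":
    for waiting in (0, 1, 4):
        print(waiting, "waiting ->", scaled_grace_minutes(2, waiting), "minutes")
```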
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
On Tue, Nov 3, 2009 at 7:46 PM, Kenneth Russell k...@chromium.org wrote:
> For the record, I completely support immediate backouts of changes that break the tree, and agree that all changes should go through the trybots -- but sometimes the trybots don't work. I don't know anything about the architectural differences between the trybots and buildbots, but from recent experience I think the trybots are trying to do incremental builds, when that isn't guaranteed to always work.

Even the bots on the main waterfall do incremental builds (except some of them). If the change requires a clobber, use gcl try CHANGENAME -c to run the code on the try bot doing a full build.

> If it's just a matter of throwing hardware at the problem of making the trybots nearly 100% reliable I think we should make that investment.
>
> -Ken
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
I'm OK with that. Just make it clear that the sheriff does have authority.

One time when I was sheriff I wanted to revert a broken patch. The author insisted on patching it over and over. He finally got it working after about seven patches and nearly three hours or so, when I was insisting on backing it out after the first 30m.

Avi

On Tue, Nov 3, 2009 at 5:34 PM, Peter Kasting pkast...@google.com wrote:
> How about: If a change closes the tree, the change author has 1 or 2 minutes to respond to a ping. The change should be reverted if the author doesn't respond, if he says to revert, or if he does not say he has a fix within the next 5 minutes.
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
+100. This is very similar to getting paged about a production problem. Sometimes you get sucked into wasting an hour on easy fixes which don't fix anything. That sets you up for stupid mistakes.

So, you broke the build. Take it like a man/woman, revert your change, and land it again when you've got it right. It happens to everyone. The goal here isn't to save YOU ten minutes of time, the goal is to save 5 minutes for each of the couple dozen co-workers who will be impacted by the broken tree. Only ask for grace to fix the breakage if you're 100% certain you can fix it right the first time.

-scott

On Tue, Nov 3, 2009 at 3:52 PM, Nicolas Sylvain nsylv...@chromium.org wrote:
> +1
>
> On Tue, Nov 3, 2009 at 3:38 PM, Avi Drissman a...@chromium.org wrote:
>> I'm OK with that. Just make it clear that the sheriff does have authority. One time when I was sheriff I wanted to revert a broken patch. The author insisted on patching it over and over. He finally got it working after about seven patches and nearly three hours or so, when I was insisting on backing it out after the first 30m.
>
> Yes, this is exactly what we want to avoid.
>
> The 2-minute rule usually includes:
> "Oops, I forgot to commit a file"
> "Let me disable the test I just added, it clearly does not work"
> "Oops, before committing I renamed a variable and forgot to change it at one place"
>
> It also used to mean: "Oops, I forgot an include". But this one has been biting us too much in the past, so I leave it at the discretion of the sheriff.
>
> I think people need to use their good judgement too. The length of a minute should be inversely proportional to the number of people trying to commit during this time of the day.
>
> Nicolas
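The trade-off in this message, ten minutes saved for the author against five minutes lost by each affected co-worker, is easy to put into numbers. The sketch below is only illustrative: "a couple dozen" is read as 24, which is an assumption, and the constant and function names are hypothetical.

```python
# Worked arithmetic for the trade-off described in this message.
# "A couple dozen" is taken as 24, which is an assumption; the 5 and
# 10 minute figures are the ones quoted in the message.

COWORKERS_AFFECTED = 24        # assumed reading of "couple dozen"
MINUTES_LOST_PER_COWORKER = 5
AUTHOR_MINUTES_SAVED = 10


def team_minutes_lost(coworkers=COWORKERS_AFFECTED,
                      minutes_each=MINUTES_LOST_PER_COWORKER):
    """Total minutes lost across the team while the tree stays broken."""
    return coworkers * minutes_each


if __name__ == "__main__":
    lost = team_minutes_lost()
    print(lost, "team minutes lost vs", AUTHOR_MINUTES_SAVED,
          "author minutes saved")
```

Under these assumed figures the team loses an order of magnitude more time than the author saves, which is the point the message is making.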
[chromium-dev] Re: revert now, ask questions later? WAS: Reverting a change, the fast way
Do the trybots build the release version? Because I had a build break last week that passed the 3 basic trybots, but failed to compile on the release buildbots because of a missing include which was apparently pulled in through other means in the debug version.

-atw

On Tue, Nov 3, 2009 at 7:30 PM, Nicolas Sylvain nsylv...@chromium.org wrote:
> even the bots on the main waterfall do incremental builds (except some of them). If the change requires a clobber, use gcl try CHANGENAME -c to run the code on the try bot doing a full build.