On 19/06/14 13:22, Mark McLoughlin wrote: > On Thu, 2014-06-19 at 09:34 +0100, Matthew Booth wrote: >> On 19/06/14 08:32, Mark McLoughlin wrote: >>> Hi Armando, >>> >>> On Tue, 2014-06-17 at 14:51 +0200, Armando M. wrote: >>>> I wonder what the turnaround of trivial patches actually is, I bet you >>>> it's very very small, and as Daniel said, the human burden is rather >>>> minimal (I would be more concerned about slowing them down in the >>>> gate, but I digress). >>> >>>> >>>> I think that introducing a two-tier level for patch approval can only >>>> mitigate the problem, but I wonder if we'd need to go a lot further, >>>> and rather figure out a way to borrow concepts from queueing theory so >>>> that they can be applied in the context of Gerrit. For instance >>>> Little's law [1] says: >>>> >>>> "The long-term average number of customers (in this context reviews) >>>> in a stable system L is equal to the long-term average effective >>>> arrival rate, λ, multiplied by the average time a customer spends in >>>> the system, W; or expressed algebraically: L = λW." >>>> >>>> L can be used to determine the number of core reviewers that a project >>>> will need at any given time, in order to meet a certain arrival rate >>>> and average time spent in the queue. If the number of core reviewers >>>> is a lot less than L then that core team is understaffed and will need >>>> to increase. >>>> >>>> If we figured out how to model and measure Gerrit as a queuing system, >>>> then we could improve its performance a lot more effectively; for >>>> instance, this idea of privileging trivial patches over longer patches >>>> has roots in a popular scheduling policy [3] for M/G/1 queues, but >>>> that does not really help aging of 'longer service time' patches and >>>> does not have a preemption mechanism built-in to avoid starvation. >>>> >>>> Just a crazy opinion... >>>> Armando >>>> >>>> [1] - http://en.wikipedia.org/wiki/Little's_law >>>> [2] - http://en.wikipedia.org/wiki/Shortest_job_first >>>> [3] - http://en.wikipedia.org/wiki/M/G/1_queue >>> >>> This isn't crazy at all. We do have a problem that surely could be >>> studied and solved/improved by applying queueing theory or lessons from >>> fields like lean manufacturing. Right now, we're simply applying our >>> intuition and the little I've read about these sorts of problems is that >>> your intuition can easily take you down the wrong path. >>> >>> There's a bunch of things that occur just glancing through those >>> articles: >>> >>> - Do we have an unstable system? Would it be useful to have arrival >>> and exit rate metrics to help highlight this? Over what time period >>> would those rates need to be averaged to be useful? Daily, weekly, >>> monthly, an entire release cycle? >>> >>> - What are we trying to optimize for? The length of time in the >>> queue? The number of patches waiting in the queue? The response >>> time to a new patch revision? >>> >>> - We have a single queue, with a bunch of service nodes with a wide >>> variance between their service rates, very little in the way of >>> scheduling policy, a huge rate of service nodes sending jobs back >>> for rework, a cost associated with maintaining a job while it sits >>> in the queue, the tendency for some jobs to disrupt many other jobs >>> with merge conflicts ... not simple. >>> >>> - Is there any sort of natural limit in our queue size that makes the >>> system stable - e.g. do people naturally just stop submitting >>> patches at some point? >>> >>> My intuition on all of this lately is that we need some way to model and >>> experiment with this queue, and I think we could make some interesting >>> progress if we could turn it into a queueing network rather than a >>> single, extremely complex queue. >>> >>> Say we had a front-end for gerrit which tracked which queue a patch is >>> in, we could experiment with things like: >>> >>> - a triage queue, with non-cores signed up as triagers looking for >>> obvious mistakes and choosing the next queue for a patch to enter >>> into >>> >>> - queues having a small number of cores signed up as owners - e.g. >>> high priority bugfix, API, scheduler, object conversion, libvirt >>> driver, vmware driver, etc. >>> >>> - we'd allow for a large number of queues so that cores could aim for >>> an "inbox zero" approach on individual queues, something that would >>> probably help keep cores motivated. >>> >>> - we could apply different scheduling policies to each of the >>> different queues - i.e. explicit guidance for cores about which >>> patches they should pick off the queue next. >>> >>> - we could track metrics on individual queues as well as the whole >>> network, identifying bottlenecks and properly recognizing which >>> reviewers are doing a small number of difficult reviews versus >>> those doing a high number of trivial reviews. >>> >>> - we could require some queues to feed into a final approval queue >>> where some people are responsible for giving an approved patch a >>> final sanity check - i.e. there would be a class of reviewer with >>> good instincts who quickly churn through already-reviewed patches >>> looking for the kind of mistakes people tend to mistake when >>> they're down in the weeds. >>> >>> - explicit queues for large, cross-cutting changes like coding style >>> changes. Perhaps we could stop servicing these queues at certain >>> points in the cycles, or reduce the rate at which they are >>> serviced. >>> >>> - we could include specs and client patches in the same network so >>> that they prioritized in the same way. >>> >>> Lots of ideas, none of it is trivial ... but perhaps it'll spark >>> someone's interest :) >> >> This is all good stuff, but by the sounds of it experimenting in gerrit >> isn't likely to be simple. > > Right, which is why I said "a front-end for gerrit" ... this would sit > in front of gerrit, at least to begin with. > >> Remember, though, that the relevant metric is code quality, not review >> rate. > > Not sure what conclusion you're driving towards there - obviously that's > true, but ... ignore review rate? measure code quality how?
I mean that there are other ways to improve code quality than review. If we find ourselves putting a lot of effort into this, we may be better served by looking for low hanging fruit elsewhere. Matt -- Matthew Booth Red Hat Engineering, Virtualisation Team Phone: +442070094448 (UK) GPG ID: D33C3490 GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490 _______________________________________________ OpenStack-dev mailing list [email protected] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
