Re: [openstack-dev] [nova] Is the BP approval process broken?

Sylvain Bauza Fri, 29 Aug 2014 18:39:28 -0700

<life>Sorry folks, I just had a new daughter since Thursday so I'm onPTO until Monday, so thanks to the people who discussed about theblueprint I created and how we can avoid the problem raised by Don for Kilo.

</life>

Answers inline.


Le 29/08/2014 19:42, John Garbutt a écrit :

I think this is now more about code reviews, but this is important...

On 29 August 2014 10:30, Daniel P. Berrange <berra...@redhat.com> wrote:

On Fri, Aug 29, 2014 at 11:07:33AM +0200, Thierry Carrez wrote:

Joe Gordon wrote:

On Thu, Aug 28, 2014 at 2:43 PM, Alan Kavanagh
<alan.kavan...@ericsson.com <mailto:alan.kavan...@ericsson.com>> wrote:

     I share Donald's points here, I believe what would help is to
     clearly describe in the Wiki the process and workflow for the BP
     approval process and build in this process how to deal with
     discrepancies/disagreements and build timeframes for each stage and
     process of appeal etc.
     The current process would benefit from some fine tuning and helping
     to build safe guards and time limits/deadlines so folks can expect
     responses within a reasonable time and not be left waiting in the cold.

This is a resource problem, the nova team simply does not have enough
people doing enough reviews to make this possible.

I think Nova lacks core reviewers more than it lacks reviewers, though.
Just looking at the ratio of core developers vs. patchsets proposed,
it's pretty clear that the core team is too small:

Nova: 750 patchsets/month for 21 core = 36
Heat: 230/14 = 16
Swift: 50/16 = 3

Neutron has the same issue (550/14 = 39). I think above 20, you have a
dysfunctional setup. No amount of process, spec, or runway will solve
that fundamental issue.

+1

+1. I can really understand that there is a reviewers bandwidth issuewithin Nova which can't just be answered by adding new cores, because itwould create some parliament issues.

The problem is, you can't just add core reviewers, they have to actually
understand enough of the code base to be trusted with that +2 power. All
potential candidates are probably already in. In Nova, the code base is
so big it's difficult to find people that know enough of it. In Neutron,
the contributors are often focused on subsections of the code base so
they are not really interested in learning enough of the rest. That
makes the pool of core candidates quite dry.

The other point is keeping the reviews consistent. Making the team
larger makes that harder.

If we did a better job of discussing core disagreements more in the
nova-meeting, maybe that would help keep consistency between a larger
group of people. But it boils down to trusting each other, and a group
bigger than 20, is a lot of people to get to know.

+1 to John, I think adding 5 new cores now would create some problems ifwe do that before discussing how the specs model and the Kilo bps can bedone. I don't want to do kind of parralelism to what's happening withsome politics model here but I think we need to have anyway a smallnumber of deciders within a big project to make it successful (anddelegate, but I will explain further below)

I fear the only solution is smaller groups being experts on smaller
codebases. There is less to review, and more candidates that are likely
to be experts in this limited area.

Applied to Nova, that means modularization -- having strong internal
interfaces and trusting subteams to +2 the code they are experts on.
Maybe VMWare driver people should just +2 VMware-related code. We've had
that discussion before, and I know there is a dangerous potential
quality slope there -- I just fail to see any other solution to bring
that 750/21=36 figure down to a bearable level, before we burn out all
of the Nova core team.

This worked really well for Cinder, and I hope Gantt will do the same
kind of thing for Scheduling.

It certainly feels like we really need to split things up, maybe:
* API (talks to compute api to creates tasks and gets objects)
* core task orchestration and persistence (compute api, db objects,
conductor, talks to compute manager api, scheduler api, network api)
* compute manager + "drivers" (gets instance objects)
* Scheduling (models resources, gets )
* nova-network

But clearly, that will make evolving those interfaces much harder, the
separate they become.

Certainly we fee a few release away from some of those splits.

I broadly agree - I think that unless Nova moves more towards something
that is closer to the Linux style subsystem maintainer model we are
doomed. I know in Linux, the maintainers actually use separate git trees,
and that isn't what I mean - I think using a single git tree is still
desirable (at least for now). What I mean is that we should place more
trust on the opinion of the people who are experts for a particular
area of code. Let those experts take on a greater burden of the code
review so core team can put more focus on actual merge approval.

I know some of the core team try to do this implicitly - eg we know who
some of the main people involved in hyperv or vmware are, so will tend
to treat their +1 as an effective +2 from the POV of their driver code,
but our rules still require two actual +2s from core, so it doesn't
entirely help us right now. I think we need to do some work in tooling
to make this more of an explicit process though.

I do prefer a distinction between core and sub-core-team.

Just because we want multiple sub-core-team members, but still make it
easier to spot the core reviewer.

As the Nova project is becoming bigger and bigger and as IMHO I thinkNova should not have more than the current number of cores, my personalopinion is that cores should be represented by subteams people forhelping them to know if a patch is good or not, so we would only needone core for validating it.

So, wrt what proposed Daniel above, if some sub-core members would say+1.5, they could just ping or discuss during Nova meetings the cores tosay they would just have to sign off.

The problem is that gerrit does not allow us to say person X has +2 for
code that touches directory path /foo/bar. The +2 is global to the entire
repository. We could try to deal with this problem outside of gerrit
though. As a starting point, each virt driver (or major functional area
of nova codebase) should have an explicit list of people who are considered
to be the "core team" for that area of code.  From such a list, tools like
gerrymander (or anything else that can query gerrit), could see when a
person in those lists +1'd a change touching their area of responsibility
and change that to be presented as a "+1.5".

Or we create a +3 for core, and +2 for sub team, but I like the distinction.

I'm just thinking it could confuse people if +2 would then become anon-validation.If creating a mid-level note would be tricky, I could also propose thatanother label could be created like "Subteam Verified" and whose grouppeople could just check it ?

This would make it very explicit to the reviewers that they should consider
the change to be (almost) equivalent to a +2.  We could potentially then
relax the rule of

      "+A requires two +2s"

to be

      "+A requires (two +2s) or (one +2 and one +1.5)"

I think that could work, but maybe:

      "+A requires (two +2s) or (one +2 and two +1.5)"

Only because it would be good for the sub-core teams to see each
others +1.5s. But maybe thats silly, and just makes things worse
again.

Well, right point. If we need two 1.5, then the velocity issue would bethen on subteam approvers.If we assume that the subteams are big and worldwide enough for having agood reviewing ratio, I'm OK with that. If not, then the issue would notdisappear.

I think this would significantly improve our review throughput. I think
it could also help us get people to gain greater responsibility. The
jump from regular contributor to core team member is quite a high bar.
If we had the intermediate step of subsystem team member that would ease
the progression. It would also give the subsystem teams a greater sense
of engagement & value in the nova community

This keeps being suggested. I always used to worry about this because
of things like:

* code consistency is easier when more reviews know more of nova

* makes it harder to get the natural alignment of core reviews, as we
would less regularly see each others code reviews

* almost encourages people to work in a silo of nova, where ideally we
want help in the core of nova too

* would be bad if people got stuck in one part of nova by this, even
if they wanted to go work in other areas of nova

But honestly, we are running out of options here. Maybe we should try this out?

It is risky (in terms of keeping the code base consistent), but we do
need to scale out our review capacity. I am wondering what other ideas
people have...?

Thanks,
John

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [nova] Is the BP approval process broken?

Reply via email to