On 13 March 2014 10:09, Matthew Booth <mbo...@redhat.com> wrote:
> On 12/03/14 18:28, Matt Riedemann wrote:
>> On 2/25/2014 6:36 AM, Matthew Booth wrote:
>>> I'm new to Nova. After some frustration with the review process,
>>> specifically in the VMware driver, I decided to try to visualise how
>>> the review process is working across Nova. To that end, I've created
>>> 2 graphs, both attached to this mail.
>>>
>>> Both graphs show a nova directory tree pruned at the point that a
>>> directory contains less than 2% of total LOCs. Additionally, /tests
>>> and /locale are pruned as they make the resulting graph much busier
>>> without adding a great deal of useful information. The data for both
>>> graphs was generated from the most recent 1000 changes in gerrit on
>>> Monday 24th Feb 2014. This includes all pending changes, just over
>>> 500, and just under 500 recently merged changes.
>>>
>>> pending.svg shows the percentage of LOCs which have an outstanding
>>> change against them. This is one measure of how hard it is to write
>>> new code in Nova.
>>>
>>> merged.svg shows the average length of time between the
>>> ultimately-accepted version of a change being pushed and being
>>> approved.
>>>
>>> Note that there are inaccuracies in these graphs, but they should be
>>> mostly good. Details of generation here:
>>> https://github.com/mdbooth/heatmap. This code is obviously
>>> single-purpose, but is free for re-use if anyone feels so inclined.
>>>
>>> The first graph above (pending.svg) is the one I was most interested
>>> in, and shows exactly what I expected it to. Note the size of
>>> 'vmwareapi'. If you check out Nova master, 24% of the vmwareapi
>>> driver has an outstanding change against it. It is practically
>>> impossible to write new code in vmwareapi without stomping on an
>>> outstanding patch. Compare that to the libvirt driver at a much
>>> healthier 3%.
>>>
>>> The second graph (merged.svg) is an attempt to look at why that is.
>>> Again comparing the VMware driver with the libvirt driver, we can
>>> see that at 12 days, it takes much longer for a change to be
>>> approved in the VMware driver than in the libvirt driver. I suspect
>>> this isn't the whole story: the delay is likely a combination of a
>>> much longer review time and very active development.
>>>
>>> What's the impact of this? As I said above, it obviously makes it
>>> very hard to come in as a new developer of the VMware driver when
>>> almost a quarter of it has been rewritten, but you can't see it. I
>>> am very new to this and others should validate my conclusions, but
>>> I also believe this is having a detrimental impact on code quality.
>>> Remember that the above 12 day approval is only the time for the
>>> final version to be approved. If a change goes through multiple
>>> versions, each of those also has an increased review period, meaning
>>> that the time from first submission to final inclusion is typically
>>> very, very protracted. The VMware driver has its fair share of high
>>> priority issues and functionality gaps, and the developers are
>>> motivated to get it in the best possible shape as quickly as
>>> possible. However, it is my impression that when problems stem from
>>> structural issues, the developers choose to add metaphorical gaffer
>>> tape rather than fix them, because fixing both creates a dependency
>>> chain which pushes the user-visible fix months into the future. In
>>> this respect the review process is dysfunctional, and is actively
>>> detrimental to code quality.
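
As an aside, for anyone wanting to reproduce the numbers: as I read
the description above, the pending.svg metric boils down to something
like the rough sketch below. The names and the data plumbing are my
own invention, not what the tool actually does -- the real logic is in
the heatmap repo linked above.

    # Rough sketch of the pending.svg metric, not the actual heatmap
    # code -- see https://github.com/mdbooth/heatmap for that.
    from collections import defaultdict

    def pending_percentages(total_locs, pending_changes):
        """total_locs: {directory: total LOCs in that directory}
        pending_changes: one {directory: set(touched line numbers)}
        dict per outstanding gerrit change.
        """
        touched = defaultdict(set)
        for change in pending_changes:
            for directory, lines in change.items():
                # Union, so a line touched by several outstanding
                # changes is only counted once.
                touched[directory] |= lines
        return dict((d, 100.0 * len(touched[d]) / locs)
                    for d, locs in total_locs.items() if locs)

    # e.g. ~24% for nova/virt/vmwareapi vs ~3% for nova/virt/libvirt
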
>>>
>>> Unfortunately I'm not yet sufficiently familiar with the project to
>>> offer a solution. A core reviewer who regularly looks at it is an
>>> obvious fix. A less obvious fix might involve a process which
>>> allows developers to work on a fork which is periodically merged,
>>> rather like the kernel.
>>>
>>> Matt
>>
>> When I originally read this I had some ideas in mind for a response
>> regarding review latency with the vmware driver patches, but felt
>> like anything I said, however honest, would sound bad/offensive in
>> some way, and I didn't want to convey that.
>>
>> But this came up in IRC today:
>>
>> https://blueprints.launchpad.net/nova/+spec/vmware-spawn-refactor
>>
>> That spurred some discussion around this same topic, and I think it
>> highlights one of the major issues, which is code quality and the
>> design of the driver.
>>
>> For example, the driver's spawn method is huge and there are a lot
>> of nested methods within it. There are a lot of vmware patches and
>> a lot of blueprints, and a lot of them touch spawn. When I'm
>> reviewing them, I'm looking for new conditions and checking to see
>> if those are unit tested (positive/error cases), and a lot of the
>> time I find it very difficult to tell if they are or not. I think a
>> lot of that has to do with how the vmware tests are scaffolded with
>> fakes to simulate a vcenter backend, but it makes it very difficult
>> to know whether something is covered if you're not extremely
>> familiar with that code.
>>
>> And I've actually asked in bp reviews before, 'you have this
>> new/changed condition, where is it tested', and the response is
>> more or less 'we plan on refactoring this later and will add unit
>> tests then' or 'it's hard to test this path without a lot of
>> changes to how the tests are working'. That's unacceptable to me,
>> and I generally give up on the review after that.
>
> I think part of this is a (hopefully temporary) catch-22. Major
> refactors are put off because review process latency would
> complicate and delay user-visible features, and user-visible
> features are delayed in review because they highlight the urgent
> need for major refactors. I think Juno will be an excellent
> opportunity to jump out of this cycle.
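
On the testability point above: the shape I'd hope the spawn()
refactor drives us towards is the nested functions pulled out into
small module-level helpers that can be tested directly, without
standing up the whole fake vcenter. A deliberately simplified sketch,
with every name invented rather than taken from the real driver:

    # Deliberately simplified illustration -- invented names, not the
    # actual vmwareapi driver code.

    _SUPPORTED_DISK_TYPES = ('thin', 'thick', 'eagerZeroedThick')

    def _get_disk_type(instance):
        """One small decision point, testable in isolation."""
        disk_type = instance.get('disk_type', 'thin')
        if disk_type not in _SUPPORTED_DISK_TYPES:
            raise ValueError('Unsupported disk type: %s' % disk_type)
        return disk_type

    def _build_config_spec(instance):
        """Pure function: a unit test just asserts on the result."""
        return {'name': instance['name'],
                'disk_type': _get_disk_type(instance),
                'memory_mb': instance['memory_mb']}

    def spawn(session, instance):
        """Thin orchestration; each step can be mocked separately."""
        session.create_vm(_build_config_spec(instance))

With that shape, 'where is this condition tested' has a cheap answer:
a couple of direct tests against the helper, positive and error
cases, with no fake vcenter required.
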
+1 on Juno being the opportunity to jump out of that cycle.

One caution, though: despite the review latency, please don't be
tempted to think one big change will get in quicker than a series of
little, easy-to-review changes. Not all changes are equal. A large
change often scares me away towards easier-to-review patches.

Seems like, for Juno-1, it would be worth cancelling all non-urgent
bug fixes and doing the refactoring we need.

I think the aim here should be better (and easier to understand) unit
test coverage. That's a great way to drive good code structure.

>> So to move this all forward, I think that bp above should be top
>> priority for the vmware team in Juno if bp patches are to keep
>> moving at the pace they do, because the features and refactoring
>> just keep coming, and at least for me it's very hard not to burn
>> out looking at those reviews.
>
> As I think has been mentioned elsewhere, the other big one is the
> move to the common vSphere API code in oslo, which is a huge source
> of cruft in the driver. This is going to require mechanical changes
> across the whole driver, so is going to be quite disruptive. As this
> work is largely done, I would put it before even the spawn()
> refactor.

+1, sounds like a good first step is to move to oslo.vmware.

A big step forward towards the end of Juno is the driver-specific CI.
That really does help give reviewers more confidence that changes
don't break everything.

John

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev