Hi all,

This is an issue that has been discussed quite a few times. As I feared, the bottleneck effect is getting worse with each release.

Nova simply grew too much, and even though features like networking and block storage were spun off at some point, it still lacks the cohesion necessary for a successful long-term lifecycle. In other words, it’s just too big to be properly maintained by a handful of amazing and overworked people.

Compute drivers are easy to identify as decoupled sub-projects and are among those that suffer the most from the lack of an independent development process. Nova is a mature project (at least relative to the OpenStack context), and as such new features and bug fixes need to go through a very thorough screening and review before being approved and merged. This does not work well with sub-projects that need to grow faster, especially when introduced later in the lifecycle (e.g. the current Hyper-V driver, introduced in Folsom) or when pushed by more aggressive market requirements.

Just as an example, only 3 out of 8 Hyper-V blueprint specs have been approved and implemented in Juno. The rest will simply get bumped to Kilo, which means that new additional specs will need to be bumped to L and so on, introducing further delays. We ended up prioritizing feature parity blueprints, delaying almost anything else.

Bug fix landing time in stable releases is another issue for the user base, since merging in master takes a long time and backporting requires another long review process, e.g. more than four months in some cases [1]. As a result, we ended up releasing the fixes in a project fork that became our de facto stable release in place of upstream, while waiting for the upstream merge.

We never experienced similar issues in smaller projects like Neutron, Cinder, Ceilometer or Horizon, where we are involved as well, which can serve as a practical example of the potential benefits of splitting Nova.

OpenStack has a clear process for incubation, letting new projects grow as fast as they need during their youth and integrating them into core only when a mature stage is reached [2]. Unfortunately this process applies to projects, but not to subprojects (the Hyper-V and VMWare drivers in particular, but not only), resulting in a much slower development pace than a project led by an independent team could have achieved. On the other hand, Docker is an example of a driver going the StackForge way, but its potential eventual inclusion in Nova would just add to the current pain points.

From a Hyper-V team perspective, in the late Havana cycle the same reasons highlighted in this thread almost led us to ask for the removal of the driver from Nova in order to improve our development process, even at the cost of the subsequent fall from (core) grace and a StackForge incubation Purgatory period, so I’m definitely happy that the conversation has been resumed with a bigger consensus.

The main factor that blocked the Hyper-V driver’s exit from Nova was the introduction of the Hyper-V CI during the same cycle. Regressions are a very sensitive topic when you run OpenStack components on an operating system that is not Linux, and the CI helped a lot in blocking or discovering issues in a timely fashion. Besides that, the size of the Hyper-V team increased considerably during Icehouse and Juno [3], so the Hyper-V CI became a mandatory and almost irreplaceable tool in our review process, helping us reach an excellent level of stability of the driver on every supported version of Hyper-V (and progressive CI voting stability as well, but that’s another topic [4]).

This means that if we reach a point at which we agree to spin off the drivers into separate core projects, we need to consider how driver-related CIs will still be included in the Nova review process, possibly with voting rights when the individual CI stability allows it.
Having each third-party CI vote only on its spin-off driver project is not an option IMO, as it won’t catch regressions introduced in Nova that affect the drivers, including race conditions [5].

An interesting area of discussion is who is going to be part of the initial core teams for each new subproject. I truly appreciated the experience and help of the Nova core guys, so in order to allow a smoother transition I’d suggest that each new project (e.g. nova-compute-hyperv, nova-compute-vmware, etc.) start with an initial core team consisting of one or two members of the current Nova sub-team and one Nova core, with ideally each patch reviewed by both the domain experts and the Nova core. The team could then go its own way, voting in its own members as any other OpenStack project does.

Another point of discussion is the stabilization and documentation of the driver interface. There are simply too many areas where behavior differs between drivers, and in too many cases looking at some other driver’s behavior was the only source of documentation we had.

As an alternative, I’m personally not against the option of keeping Nova the way it is and giving more autonomy (aka +2 rights) on selected subtrees to sub-teams, but that sounds to me like a half-baked solution that will lead to problems around blueprint spec approval rights, among other issues.

One of the additional very positive side effects that I can foresee is that, by freeing up our resources from the continuous "submit-code -> wait-long-time-for-review -> rebase -> repeat" cycle that defines current Nova development, we’ll be able to assign more people to review patches in Nova itself, hopefully resulting in a long-term benefit for everybody. As another clear advantage, separate projects would allow growth and new features in the drivers, while allowing Nova to focus on stability and concentrate on reducing its technical debt, another related and recurring topic of discussion.

To recap this post, if this proposal goes on:

1) As the Hyper-V Nova sub-team we are definitely in favour of the spin-off of Nova core sub-projects, including the compute drivers.
2) We need to agree on how to handle the driver CIs reporting in Nova.
3) We need to define the members of the new projects’ core teams.
4) We need to better document the compute driver interface.

We are fully aware of the great efforts made by the Nova core team, and the points above do not imply in any way that they are not doing their best. It’s just a matter of removing unnecessary bottlenecks for the benefit of the entire OpenStack community.

Thanks Dan for raising this topic!

Alessandro

[1] https://review.openstack.org/#/c/83143/
[2] http://robhirschfeld.com/2014/08/18/ugly-baby/
[3] http://stackalytics.com/?company=cloudbase%20solutions&module=nova&metric=commits&release=all
[4] http://ci-stats.cloudbase.it
[5] https://bugs.launchpad.net/nova/+bug/1276772

On 04 Sep 2014, at 15:59, Thierry Carrez <thie...@openstack.org> wrote:

> Like I mentioned before, I think the only way out of the Nova death
> spiral is to split code and give control over it to smaller dedicated
> review teams. This is one way to do it. Thanks Dan for pulling this
> together :)
>
> A couple comments inline:
>
> Daniel P. Berrange wrote:
>> [...]
>> This is a crisis. A large crisis. In fact, if you got a moment, it's
>> a twelve-storey crisis with a magnificent entrance hall, carpeting
>> throughout, 24-hour portage, and an enormous sign on the roof,
>> saying 'This Is a Large Crisis'. A large crisis requires a large
>> plan.
>> [...]
>
> I totally agree. We need a plan now, because we can't go through another
> cycle without a solution in sight.
>
>> [...]
>> This has quite a few implications for the way development would
>> operate.
>>
>> - The Nova core team at least, would be voluntarily giving up a big
>>   amount of responsibility over the evolution of virt drivers.
>>   Due to human nature, people are not good at giving up power, so this
>>   may be painful to swallow. Realistically current nova core are
>>   not experts in most of the virt drivers to start with, and more
>>   important we clearly do not have sufficient time to do a good job
>>   of review with everything submitted. Much of the current need
>>   for core review of virt drivers is to prevent the mis-use of a
>>   poorly defined virt driver API...which can be mitigated - See
>>   later point(s)
>>
>> - Nova core would/should not have automatic +2 over the virt driver
>>   repositories since it is unreasonable to assume they have the
>>   suitable domain knowledge for all virt drivers out there. People
>>   would of course be able to be members of multiple core teams. For
>>   example John G would naturally be nova-core and nova-xen-core. I
>>   would aim for nova-core and nova-libvirt-core, and so on. I do not
>>   want any +2 responsibility over VMWare/HyperV/Docker drivers since
>>   they're not my area of expertize - I only look at them today because
>>   they have no other nova-core representation.
>>
>> - Not sure if it implies the Nova PTL would be solely focused on
>>   Nova common. eg would there continue to be one PTL over all virt
>>   driver implementation projects, or would each project have its
>>   own PTL. Maybe this is irrelevant if a Czars approach is chosen
>>   by virt driver projects for their work. I'd be inclined to say
>>   that a single PTL should stay as a figurehead to represent all
>>   the virt driver projects, acting as a point of contact to ensure
>>   we keep communication / co-operation between the drivers in sync.
>> [...]
>
> At this point it may look like our current structure (programs, one PTL,
> single core teams...) prevents us from implementing that solution. I
> just want to say that in OpenStack, organizational structure reflects
> how we work, not the other way around.
> If we need to reorganize
> "official" project structure to work in smarter and long-term healthy
> ways, that's a really small price to pay.
>
> --
> Thierry Carrez (ttx)
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev