On Mon, Aug 11, 2014 at 03:25:39PM -0700, Joe Gordon wrote: > I am not sure if I follow. The gate issue with live snapshots has been > worked around by turning it off [0], so presumably this patch is forward > facing. I fail to see how this patch is needed to help the gate in the > future. Wouldn't it just delay the issues until we change the version_cap?
Consider that we have a feature already in tree that is not currently tested by the gate. Now we update libvirt in the gate and so tempest suddenly starts exercising the feature. Now if there is a bug, every single review submitted to the gate is potentially going to crash and burn causing major pain for everyone trying to get tests to pass. With this version cap, you can update libvirt in the gate in knowledge that we won't turn on new previously untested feature patches, so you have lower risk of causing gate instability. Once the gate is updated to new libvirt, we submit a patch to update version cap. If there is a bug in the new features enabled it only affects that one patch under review instead of killing the entire CI system for anyone. Only once we have passing tests for the new version cap value and that is merged would the gate as a whole be impact. Of course sometimes the bugs are very non-deterministic and rare so things might still sneak through, but at least some portion of bugs will be detected this way and help the gate reliability during updates of libvirt. > The issue I see with the libvirt version_cap [1] is best captured in its > commit message: "The end user can override the limit if they wish to opt-in > to use of untested features via the 'version_cap' setting in the 'libvirt' > group." This goes against the very direction nova has been moving in for > some time now. We have been moving away from merging untested (re: no > integration testing) features. This patch changes the very direction the > project is going in over testing without so much as a discussion. While I > think it may be time that we revisited this discussion, the discussion > needs to happen before any patches are merged. Like it or not we have a number of features in Nova that we don't have test coverage for, due to a variety of reasons, some short term, some long term, some permanently unavoidable. One of the reasons is due to the gate having too old libvirt for a feature. As mentioned elsewhere people are looking at addressing that, by trying to figure out how to do a gate job with newer libvirt. Blocking feature development during Juno until the gate issues are addressed is not going to help the work to get new gate jobs, but will discourage our contributors and further the (somewhat valid) impression that we're not a very welcoming project to work with. The version cap setting is *not* encouraging us to add more features that lack testing. It is about recognising that we're *already* accepting such features and so taking steps to ensure our end users don't exercise the untested code paths unless they explicitly choose to. This ensures that what the user tests out of the box actually meets our "Tier 1" status. > I am less concerned about the contents of this patch, and more concerned > with how such a big de facto change in nova policy (we accept untested code > sometimes) without any discussion or consensus. In your comment on the > revert [2], you say the 'whether not-CI-tested features should be allowed > to be merged' debate is 'clearly unresolved.' How did you get to that > conclusion? This was never brought up in the mid-cycles as a unresolved > topic to be discussed. In our specs template we say "Is this untestable in > gate given current limitations (specific hardware / software configurations > available)? If so, are there mitigation plans (3rd party testing, gate > enhancements, etc)" [3]. We have been blocking untested features for some > time now. That last lines are nonsense. We have never unconditionally blocked untested features nor do I recommend that we do so. The specs template testing allows the contributor to *justify* why they think the feature is worth accepting despite lack of testing. The reviewers make a judgement call on whether the justification is valid or not. This is a pragmmatic approach to the problem. > I am further perplexed by what Daniel Berrange, the patch author, meant > when he commented [2] "Regardless of the outcome of the testing discussion > we believe this is a useful feature to have." Who is 'we'? Because I don't > see how that can be nova-core or even nova-specs-core, especially > considering how many members of those groups are +2 on the revert. So if > 'we' is neither of those groups then who is 'we'? By 'we' I'm referring to the people who submitted & approved the patch. As explained soooooo many times now, this version cap concept is something that is useful to end users even if this debate of testing was not happening and libvirt had 100% testing coverage. ie consider we test on libvirt 1.2.0 but a cloud admin has deployed on libvirt 1.0.0. Now the admin wants to pull in libvirt 1.1.5 to get some critical bug fix, but they don't want their nova deployment to start exercising new feature paths which can result in a change in behaviour of their cloud. The version cap settings allows them to update from 1.0.0 to 1.1.5, while telling nova to only use features from 1.0.0. This patch has *no* bearing/impact on what we decide wrt policy for testing of virt drivers and is demonstratably useful in production deployments. So the idea that we must revert this due to lack of decision wrt testing policy is bogus. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| _______________________________________________ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev