Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-19 Thread Kashyap Chamarthy
On Tue, Aug 12, 2014 at 03:54:37PM -0400, Russell Bryant wrote:
 On 08/12/2014 03:40 PM, Kashyap Chamarthy wrote:

[. . .]

(Sorry for the late response, was off for a week.)

  So, effectively, you're trying to add a minimal Fedora image w/
  virt-preview repo (as part of some post-install kickstart script).
  If so, where would the image be stored? I'm asking because,
  previously Sean Dague mentioned mirroring issues with Fedora images
  (which later turned out to be intermittent network issues with
  OpenStack infra cloud providers), and floated the idea of storing an
  updated image on tarballs.openstack.org, the way Trove[1] does. But
  OpenStack infra folks (fungi) raised some valid points on why not to
  do that.
  
  IIUC, if you intend to run tests w/ this CI job with this new image,
  there has to be a mechanism in place to ensure the cached copy (on
  tarballs.o.o) is updated.
  
  If I misunderstood what you said, please correct me.
 
 Patches for this here:
 
 https://review.openstack.org/#/c/113349/
 https://review.openstack.org/#/c/113350/
 
 The first one is the important part about how the image is created.
 nodepool runs some prep scripts against the cloud's distro image and
 then snapshots it.  That's the image stored to be used later for
 testing.
 
 In this case, it enables the virt-preview repo and then calls out to
 the regular devstack prep scripts to cache all packages needed for the
 test locally on the image.

Cool, the first change looks fine to me. And I see you marked the second
one as WIP.

 If there are issues with the reliability of fedorapeople.org, it will
 indeed cause problems, 

In my past three years of using it, I haven't seen any noticeable
reliability issues, so we're fine there.

 but at least it's local to image creation and not every test run.

True.

Thanks for this. It'll surely make it easier to test Nova with more
bleeding edge libvirt/QEMU.

-- 
/kashyap

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-12 Thread Mark McLoughlin
On Mon, 2014-08-11 at 15:25 -0700, Joe Gordon wrote:
 
 
 
 On Sun, Aug 10, 2014 at 11:59 PM, Mark McLoughlin mar...@redhat.com
 wrote:
 On Fri, 2014-08-08 at 09:06 -0400, Russell Bryant wrote:
  On 08/07/2014 08:06 PM, Michael Still wrote:
   It seems to me that the tension here is that there are
 groups who
   would really like to use features in newer libvirts that
 we don't CI
   on in the gate. Is it naive to think that a possible
 solution here is
   to do the following:
  
- revert the libvirt version_cap flag
 
  I don't feel strongly either way on this.  It seemed useful
 at the time
  for being able to decouple upgrading libvirt and enabling
 features that
  come with that.
 
 
 Right, I suggested the flag as a more deliberate way of
 avoiding the
 issue that was previously seen in the gate with live
 snapshots. I still
 think it's a pretty elegant and useful little feature, and
 don't think
 we need to use it as a proxy battle over testing requirements
 for new
 libvirt features.
 
 
 Mark,
 
 
 I am not sure if I follow.  The gate issue with live snapshots has
 been worked around by turning it off [0], so presumably this patch is
 forward facing.  I fail to see how this patch is needed to help the
 gate in the future.

On the live snapshot issue specifically, we disabled it by requiring
1.3.0 for the feature. With the version cap set to 1.2.2, we won't
automatically enable this code path again if we update to 1.3.0. No
question that's a bit of a mess, though.

The point was a more general one - we learned from the live snapshot
issue that having a libvirt upgrade immediately enable new code paths
was a bad idea. The patch is a simple, elegant way of avoiding that.

  Wouldn't it just delay the issues until we change the version_cap?

Yes, that's the idea. Rather than having to scramble when the new
devstack-gate image shows up, we'd be able to work on any issues in the
context of a patch series to bump the version_cap.
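For reference, the setting under discussion lives in the 'libvirt' group
of nova.conf. A minimal sketch, taking the option name and group from the
patch's commit message as quoted in this thread, and the 1.2.2 value from
the cap mentioned above:

```ini
[libvirt]
# Nova will not enable code paths that require a libvirt newer than
# this, even if the host actually runs a newer libvirt. Operators can
# raise it to opt in to untested features.
version_cap = 1.2.2
```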

 The issue I see with the libvirt version_cap [1] is best captured in
 its commit message: "The end user can override the limit if they wish
 to opt-in to use of untested features via the 'version_cap' setting in
 the 'libvirt' group." This goes against the very direction nova has
 been moving in for some time now. We have been moving away from
 merging untested (read: no integration testing) features.  This patch
 changes the very direction the project is going in over testing
 without so much as a discussion. While I think it may be time that we
 revisited this discussion, the discussion needs to happen before any
 patches are merged.

You put it well - some apparently see us moving towards a zero-tolerance
policy of not having any code which isn't functionally tested in the
gate. That obviously is not the case right now.

The sentiment is great, but any zero-tolerance policy is dangerous. I'm
very much in favor of discussing this further. We should have some
principles and goals around this, but rather than argue this in the
abstract we should be open to discussing the tradeoffs involved with
individual patches.

 I am less concerned about the contents of this patch, and more
 concerned with how such a big de facto change in nova policy (we
 accept untested code sometimes) was made without any discussion or
 consensus. In your comment on the revert [2], you say the 'whether
 not-CI-tested features should be allowed to be merged' debate is
 'clearly unresolved.' How did you get to that conclusion? This was
 never brought up in the mid-cycles as an unresolved topic to be
 discussed. In our specs template we say "Is this untestable in gate
 given current limitations (specific hardware / software configurations
 available)? If so, are there mitigation plans (3rd party testing, gate
 enhancements, etc)" [3].  We have been blocking untested features for
 some time now.

Asking "is this tested" in a spec template makes a tonne of sense.
Requiring some thought to be put into mitigation where a feature is
untestable in the gate makes sense. Requiring that the code is tested
where possible makes sense. It's a zero-tolerance "get your code
functionally tested or GTFO" policy that I'm concerned about.

 I am further perplexed by what Daniel Berrange, the patch author,
 meant when he commented [2] "Regardless of the outcome of the testing
 discussion we believe this is a useful feature to have." Who is 'we'?
 Because I don't see how that can be nova-core or even nova-specs-core,
 especially considering how many members of those groups are +2 on the
 revert. So if 'we' is neither of those groups then who is 'we'?

That's for Dan to answer, but I think you're either nitpicking or have a
very serious concern.

If nitpicking, Dan could just be using the Royal 'We' :) Or he could
just mean 

Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-12 Thread Daniel P. Berrange
On Mon, Aug 11, 2014 at 03:25:39PM -0700, Joe Gordon wrote:
 I am not sure if I follow.  The gate issue with live snapshots has been
 worked around by turning it off [0], so presumably this patch is forward
 facing.  I fail to see how this patch is needed to help the gate in the
 future. Wouldn't it just delay the issues until we change the version_cap?

Consider that we have a feature already in tree that is not currently
tested by the gate. Now we update libvirt in the gate and tempest
suddenly starts exercising the feature. If there is a bug, every
single review submitted to the gate is potentially going to crash and
burn, causing major pain for everyone trying to get tests to pass. With
this version cap, you can update libvirt in the gate in the knowledge
that we won't turn on previously untested code paths, so you have a
lower risk of causing gate instability. Once the gate is updated to the
new libvirt, we submit a patch to update the version cap. If there is a
bug in the newly enabled features, it only affects that one patch under
review instead of killing the entire CI system for everyone. Only once
we have passing tests for the new version cap value, and that change is
merged, would the gate as a whole be impacted. Of course sometimes the
bugs are very non-deterministic and rare, so things might still sneak
through, but at least some portion of bugs will be detected this way,
helping gate reliability during updates of libvirt.
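The gating behaviour described above can be sketched in a few lines of
Python. This is illustrative only: the function names and the integer
packing here are assumptions, not nova's actual driver API.

```python
def version_to_int(ver):
    """Pack a (major, minor, micro) tuple into one comparable integer."""
    major, minor, micro = ver
    return major * 1000000 + minor * 1000 + micro


def has_min_version(detected, required, version_cap=None):
    """Decide whether a feature gated on `required` may be enabled.

    The detected libvirt version is clamped to `version_cap` when one is
    configured, so upgrading libvirt in the gate does not silently switch
    on untested code paths until the cap itself is raised.
    """
    effective = version_to_int(detected)
    if version_cap is not None:
        effective = min(effective, version_to_int(version_cap))
    return effective >= version_to_int(required)


# The live-snapshot case from this thread: the feature requires 1.3.0.
# With the cap at 1.2.2, upgrading the gate to libvirt 1.3.0 changes
# nothing; only a patch raising the cap turns the feature on.
print(has_min_version((1, 3, 0), (1, 3, 0), version_cap=(1, 2, 2)))  # False
print(has_min_version((1, 3, 0), (1, 3, 0)))                         # True
```

In this sketch, bumping the cap becomes an ordinary reviewable patch,
which is exactly the point: any breakage surfaces on that one change.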

 The issue I see with the libvirt version_cap [1] is best captured in its
 commit message: "The end user can override the limit if they wish to opt-in
 to use of untested features via the 'version_cap' setting in the 'libvirt'
 group." This goes against the very direction nova has been moving in for
 some time now. We have been moving away from merging untested (read: no
 integration testing) features.  This patch changes the very direction the
 project is going in over testing without so much as a discussion. While I
 think it may be time that we revisited this discussion, the discussion
 needs to happen before any patches are merged.

Like it or not we have a number of features in Nova that we don't have
test coverage for, due to a variety of reasons, some short term, some
long term, some permanently unavoidable. One of the reasons is due to
the gate having too old a libvirt for a feature. As mentioned elsewhere,
people are looking at addressing that, by trying to figure out how to
do a gate job with newer libvirt. Blocking feature development during
Juno until the gate issues are addressed is not going to help the work
to get new gate jobs, but will discourage our contributors and further
the (somewhat valid) impression that we're not a very welcoming project
to work with.

The version cap setting is *not* encouraging us to add more features that
lack testing. It is about recognising that we're *already* accepting such
features and so taking steps to ensure our end users don't exercise the
untested code paths unless they explicitly choose to. This ensures that
what the user tests out of the box actually meets our Tier 1 status.

 I am less concerned about the contents of this patch, and more concerned
 with how such a big de facto change in nova policy (we accept untested code
 sometimes) was made without any discussion or consensus. In your comment on
 the revert [2], you say the 'whether not-CI-tested features should be allowed
 to be merged' debate is 'clearly unresolved.' How did you get to that
 conclusion? This was never brought up in the mid-cycles as an unresolved
 topic to be discussed. In our specs template we say "Is this untestable in
 gate given current limitations (specific hardware / software configurations
 available)? If so, are there mitigation plans (3rd party testing, gate
 enhancements, etc)" [3].  We have been blocking untested features for some
 time now.

Those last lines are nonsense. We have never unconditionally blocked untested
features, nor do I recommend that we do so. The specs template's testing
section allows the contributor to *justify* why they think the feature is
worth accepting despite the lack of testing. The reviewers make a judgement
call on whether the justification is valid or not. This is a pragmatic
approach to the problem.

 I am further perplexed by what Daniel Berrange, the patch author, meant
 when he commented [2] "Regardless of the outcome of the testing discussion
 we believe this is a useful feature to have." Who is 'we'? Because I don't
 see how that can be nova-core or even nova-specs-core, especially
 considering how many members of those groups are +2 on the revert. So if
 'we' is neither of those groups then who is 'we'?

By 'we' I'm referring to the people who submitted and approved the patch. As
explained so many times now, this version cap concept is something that
is useful to end users even if this debate over testing was not happening and
libvirt had 100% testing coverage, i.e. consider we test on libvirt 1.2.0 but
a cloud admin has deployed on libvirt 

Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-12 Thread Russell Bryant
On 08/12/2014 05:54 AM, Daniel P. Berrange wrote:
 I am less concerned about the contents of this patch, and more concerned
 with how such a big de facto change in nova policy (we accept untested code
 sometimes) was made without any discussion or consensus. In your comment on
 the revert [2], you say the 'whether not-CI-tested features should be allowed
 to be merged' debate is 'clearly unresolved.' How did you get to that
 conclusion? This was never brought up in the mid-cycles as an unresolved
 topic to be discussed. In our specs template we say "Is this untestable in
 gate given current limitations (specific hardware / software configurations
 available)? If so, are there mitigation plans (3rd party testing, gate
 enhancements, etc)" [3].  We have been blocking untested features for some
 time now.
 
 Those last lines are nonsense. We have never unconditionally blocked untested
 features, nor do I recommend that we do so. The specs template's testing
 section allows the contributor to *justify* why they think the feature is
 worth accepting despite the lack of testing. The reviewers make a judgement
 call on whether the justification is valid or not. This is a pragmatic
 approach to the problem.

That has been my interpretation and approach as well: we strongly prefer
functional testing for everything, but take a pragmatic approach and
evaluate proposals on a case by case basis.  It's clear we need to be a
bit more explicit here.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-12 Thread Kashyap Chamarthy
On Mon, Aug 11, 2014 at 08:05:26AM -0400, Russell Bryant wrote:
 On 08/11/2014 07:58 AM, Russell Bryant wrote:
  On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
  There is work to add support for this in devstack already, which I
  prefer since it makes it easy for developers to get an environment
  which matches the build system:
 
https://review.openstack.org/#/c/108714/
  
  Ah, cool.  Devstack is indeed a better place to put the build scripting.
   So, I think we should:
  
  1) Get the above patch working, and then merged.
  
  2) Get an experimental job going to use the above while we work on #3
  
  3) Before the job can move into the check queue and potentially become
  voting, it needs to not rely on downloading the source on every run.
  IIRC, we can have nodepool build an image to use for these jobs that
  includes the bits already installed.
  
  I'll switch my efforts over to helping get the above completed.
  
 
 I still think the devstack patch is good, but after some more thought, I
 think a better long term CI job setup would just be a fedora image with
 the virt-preview repo. 

So, effectively, you're trying to add a minimal Fedora image w/
virt-preview repo (as part of some post-install kickstart script). If
so, where would the image be stored? I'm asking because, previously Sean
Dague mentioned mirroring issues with Fedora images (which later turned
out to be intermittent network issues with OpenStack infra cloud
providers), and floated the idea of storing an updated image on
tarballs.openstack.org, the way Trove[1] does. But OpenStack infra
folks (fungi) raised some valid points on why not to do that.

IIUC, if you intend to run tests w/ this CI job with this new image,
there has to be a mechanism in place to ensure the cached copy (on
tarballs.o.o) is updated.

If I misunderstood what you said, please correct me.


[1] http://tarballs.openstack.org/trove/images/

 I think I'll try that ...

 

-- 
/kashyap



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-12 Thread Russell Bryant
On 08/12/2014 03:40 PM, Kashyap Chamarthy wrote:
 On Mon, Aug 11, 2014 at 08:05:26AM -0400, Russell Bryant wrote:
 On 08/11/2014 07:58 AM, Russell Bryant wrote:
 On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
 There is work to add support for this in devstack already, which I
 prefer since it makes it easy for developers to get an environment
 which matches the build system:

   https://review.openstack.org/#/c/108714/

 Ah, cool.  Devstack is indeed a better place to put the build scripting.
  So, I think we should:

 1) Get the above patch working, and then merged.

 2) Get an experimental job going to use the above while we work on #3

 3) Before the job can move into the check queue and potentially become
 voting, it needs to not rely on downloading the source on every run.
 IIRC, we can have nodepool build an image to use for these jobs that
 includes the bits already installed.

 I'll switch my efforts over to helping get the above completed.


 I still think the devstack patch is good, but after some more thought, I
 think a better long term CI job setup would just be a fedora image with
 the virt-preview repo. 
 
 So, effectively, you're trying to add a minimal Fedora image w/
 virt-preview repo (as part of some post-install kickstart script). If
 so, where would the image be stored? I'm asking because, previously Sean
 Dague mentioned mirroring issues with Fedora images (which later turned
 out to be intermittent network issues with OpenStack infra cloud
 providers), and floated the idea of storing an updated image on
 tarballs.openstack.org, the way Trove[1] does. But OpenStack infra
 folks (fungi) raised some valid points on why not to do that.
 
 IIUC, if you intend to run tests w/ this CI job with this new image,
 there has to be a mechanism in place to ensure the cached copy (on
 tarballs.o.o) is updated.
 
 If I misunderstood what you said, please correct me.

Patches for this here:

https://review.openstack.org/#/c/113349/
https://review.openstack.org/#/c/113350/

The first one is the important part about how the image is created.
nodepool runs some prep scripts against the cloud's distro image and
then snapshots it.  That's the image stored to be used later for testing.

In this case, it enables the virt-preview repo and then calls out to the
regular devstack prep scripts to cache all packages needed for the test
locally on the image.
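A rough sketch of what such a prep step could look like. The file paths,
repo baseurl, and helper names below are illustrative, not the actual
openstack-infra scripts; the real yum/devstack steps are left as comments
since they need root on the image being built.

```shell
# Write to a scratch dir so this sketch runs without root; on a real
# nodepool image the repo file would go in /etc/yum.repos.d/.
REPO_DIR="$(mktemp -d)"

# Step 1: enable the Fedora virt-preview repository (baseurl is an
# assumption; see the virt-preview wiki page for the real one).
cat > "$REPO_DIR/fedora-virt-preview.repo" <<'EOF'
[virt-preview]
name=Fedora Virtualization Preview
baseurl=https://fedorapeople.org/groups/virt/virt-preview/fedora-$releasever/$basearch/
enabled=1
gpgcheck=0
EOF

# Step 2 (needs root on the real image): pull the newer virt stack and
# let the regular devstack prep scripts cache every package the test run
# needs, so nothing is fetched over the network after nodepool snapshots
# the image.
#   yum --enablerepo=virt-preview -y update libvirt qemu-kvm
#   ./devstack-prep.sh   # hypothetical stand-in for the devstack prep step

echo "wrote $REPO_DIR/fedora-virt-preview.repo"
```

The point of baking this into the image is that fedorapeople.org is only
contacted at image-creation time, never during individual test runs.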

If there are issues with the reliability of fedorapeople.org, it will
indeed cause problems, but at least it's local to image creation and not
every test run.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-12 Thread Joe Gordon
On Tue, Aug 12, 2014 at 12:23 AM, Mark McLoughlin mar...@redhat.com wrote:

 On Mon, 2014-08-11 at 15:25 -0700, Joe Gordon wrote:
 
 
 
  On Sun, Aug 10, 2014 at 11:59 PM, Mark McLoughlin mar...@redhat.com
  wrote:
  On Fri, 2014-08-08 at 09:06 -0400, Russell Bryant wrote:
   On 08/07/2014 08:06 PM, Michael Still wrote:
It seems to me that the tension here is that there are
  groups who
would really like to use features in newer libvirts that
  we don't CI
on in the gate. Is it naive to think that a possible
  solution here is
to do the following:
   
 - revert the libvirt version_cap flag
  
   I don't feel strongly either way on this.  It seemed useful
  at the time
   for being able to decouple upgrading libvirt and enabling
  features that
   come with that.
 
 
  Right, I suggested the flag as a more deliberate way of
  avoiding the
  issue that was previously seen in the gate with live
  snapshots. I still
  think it's a pretty elegant and useful little feature, and
  don't think
  we need to use it as a proxy battle over testing requirements
  for new
  libvirt features.
 
 
  Mark,
 
 
  I am not sure if I follow.  The gate issue with live snapshots has
  been worked around by turning it off [0], so presumably this patch is
  forward facing.  I fail to see how this patch is needed to help the
  gate in the future.

 On the live snapshot issue specifically, we disabled it by requiring
 1.3.0 for the feature. With the version cap set to 1.2.2, we won't
 automatically enable this code path again if we update to 1.3.0. No
 question that's a bit of a mess, though.


Agreed



 The point was a more general one - we learned from the live snapshot
 issue that having a libvirt upgrade immediately enable new code paths
 was a bad idea. The patch is a simple, elegant way of avoiding that.

   Wouldn't it just delay the issues until we change the version_cap?

 Yes, that's the idea. Rather than having to scramble when the new
 devstack-gate image shows up, we'd be able to work on any issues in the
 context of a patch series to bump the version_cap.


So the version_cap flag only possibly helps with bugs in libvirt that are
triggered by new nova code paths, and not bugs that are triggered by
existing nova code paths that hit a libvirt regression. Furthermore it
can only catch libvirt bugs that trigger frequently enough to be caught on
the patch to bump the version_cap, and we commonly have bugs that are 1 in
a 1000 these days. This sounds like a potential solution for a very
specific case, whereas I would rather see a more general solution.




  The issue I see with the libvirt version_cap [1] is best captured in
  its commit message: "The end user can override the limit if they wish
  to opt-in to use of untested features via the 'version_cap' setting in
  the 'libvirt' group." This goes against the very direction nova has
  been moving in for some time now. We have been moving away from
  merging untested (read: no integration testing) features.  This patch
  changes the very direction the project is going in over testing
  without so much as a discussion. While I think it may be time that we
  revisited this discussion, the discussion needs to happen before any
  patches are merged.

 You put it well - some apparently see us moving towards a zero-tolerance
 policy of not having any code which isn't functionally tested in the
 gate. That obviously is not the case right now.

 The sentiment is great, but any zero-tolerance policy is dangerous. I'm
 very much in favor of discussing this further. We should have some
 principles and goals around this, but rather than argue this in the
 abstract we should be open to discussing the tradeoffs involved with
 individual patches.


Too bad the mid-cycle just passed; this would have been a great discussion
topic for it.



  I am less concerned about the contents of this patch, and more
  concerned with how such a big de facto change in nova policy (we
  accept untested code sometimes) was made without any discussion or
  consensus. In your comment on the revert [2], you say the 'whether
  not-CI-tested features should be allowed to be merged' debate is
  'clearly unresolved.' How did you get to that conclusion? This was
  never brought up in the mid-cycles as an unresolved topic to be
  discussed. In our specs template we say "Is this untestable in gate
  given current limitations (specific hardware / software configurations
  available)? If so, are there mitigation plans (3rd party testing, gate
  enhancements, etc)" [3].  We have been blocking untested features for
  some time now.

 Asking "is this tested" in a spec template makes a tonne of sense.
 Requiring some thought to be put into mitigation where a feature is
 untestable in the gate makes sense. Requiring that 

Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Mark McLoughlin
On Fri, 2014-08-08 at 09:06 -0400, Russell Bryant wrote:
 On 08/07/2014 08:06 PM, Michael Still wrote:
  It seems to me that the tension here is that there are groups who
  would really like to use features in newer libvirts that we don't CI
  on in the gate. Is it naive to think that a possible solution here is
  to do the following:
  
   - revert the libvirt version_cap flag
 
 I don't feel strongly either way on this.  It seemed useful at the time
 for being able to decouple upgrading libvirt and enabling features that
 come with that.

Right, I suggested the flag as a more deliberate way of avoiding the
issue that was previously seen in the gate with live snapshots. I still
think it's a pretty elegant and useful little feature, and don't think
we need to use it as a proxy battle over testing requirements for new
libvirt features.

Mark.




Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Daniel P. Berrange
On Mon, Aug 11, 2014 at 08:57:31AM +1000, Michael Still wrote:
 On Sun, Aug 10, 2014 at 2:33 AM, Jeremy Stanley fu...@yuggoth.org wrote:
  On 2014-08-08 09:06:29 -0400 (-0400), Russell Bryant wrote:
  [...]
  We've seen several times that building and maintaining 3rd party
  CI is a *lot* of work.
 
  Building and maintaining *any* CI is a *lot* of work, not the least
  of which is the official OpenStack project CI (I believe Monty
  mentioned in #openstack-infra last night that our CI is about twice
  the size of Travis-CI now, not sure what metric he's comparing there
  though).
 
  Like you said in [1], doing this in infra's CI would be ideal. I
  think 3rd party should be reserved for when running it in the
  project's infrastructure is not an option for some reason
  (requires proprietary hw or sw, for example).
 
  Add to the "not an option for some reason" list: software which is
  not easily obtainable through typical installation channels (PyPI,
  Linux distro-managed package repositories for their LTS/server
  releases, et cetera) or which requires gyrations that destabilize
  or significantly complicate maintenance of the overall system as
  well as reproducibility for developers. It may be possible to work
  around some of these concerns via access from multiple locations
  coupled with heavy caching, but for a one-off source the additional
  complexity is hard to justify.
 
 My understanding is that Fedora has a PPA equivalent which ships the
 latest and greatest libvirt. So it would be packages if we went the
 Fedora route, which should be less work.

Yes, there is the 'virt preview' repository

http://fedoraproject.org/wiki/Virtualization_Preview_Repository

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Daniel P. Berrange
On Fri, Aug 08, 2014 at 09:06:29AM -0400, Russell Bryant wrote:
 On 08/07/2014 08:06 PM, Michael Still wrote:
  It seems to me that the tension here is that there are groups who
  would really like to use features in newer libvirts that we don't CI
  on in the gate. Is it naive to think that a possible solution here is
  to do the following:
  
   - revert the libvirt version_cap flag
 
 I don't feel strongly either way on this.  It seemed useful at the time
 for being able to decouple upgrading libvirt and enabling features that
 come with that.  I'd like to let Dan get back from vacation and weigh in
 on it, though.

Yes, I think that version cap feature is valuable no matter what we
do about CI testing, which is why I +2'd it originally.

 
 I wonder if the job could be as simple as one with an added step in the
 config to install latest libvirt from source.  Dan, do you think someone
 could add a libvirt-current.tar.gz to http://libvirt.org/sources/ ?
 Using the latest release seems better than master from git.

I'd strongly recommend against using git master. It will cause OpenStack
CI more pain than benefit. Using the latest stable release is a better
bet.

 I'll mess around and see if I can spin up an experimental job.

There is work to add support for this in devstack already, which I
prefer since it makes it easy for developers to get an environment
which matches the build system:

  https://review.openstack.org/#/c/108714/

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Gary Kotton
In the past there was SmokeStack, which used a Fedora version of libvirt.
Any idea what happened to that?

On 8/11/14, 12:53 PM, Daniel P. Berrange berra...@redhat.com wrote:

On Fri, Aug 08, 2014 at 09:06:29AM -0400, Russell Bryant wrote:
 On 08/07/2014 08:06 PM, Michael Still wrote:
  It seems to me that the tension here is that there are groups who
  would really like to use features in newer libvirts that we don't CI
  on in the gate. Is it naive to think that a possible solution here is
  to do the following:
  
   - revert the libvirt version_cap flag
 
 I don't feel strongly either way on this.  It seemed useful at the time
 for being able to decouple upgrading libvirt and enabling features that
 come with that.  I'd like to let Dan get back from vacation and weigh in
 on it, though.

Yes, I think that version cap feature is valuable no matter what we
do about CI testing, which is why I +2'd it originally.

 
 I wonder if the job could be as simple as one with an added step in the
 config to install latest libvirt from source.  Dan, do you think someone
 could add a libvirt-current.tar.gz to
http://libvirt.org/sources/ ?
 Using the latest release seems better than master from git.

I'd strongly recommend against using git master. It will cause OpenStack
CI more pain than benefit. Using the latest stable release is a better
bet.

 I'll mess around and see if I can spin up an experimental job.

There is work to add support for this in devstack already, which I
prefer since it makes it easy for developers to get an environment
which matches the build system:

  https://review.openstack.org/#/c/108714/

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Russell Bryant
On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
 There is work to add support for this in devstack already which I
 prefer since it makes it easy for developers to get an environment
 which matches the build system:
 
   https://review.openstack.org/#/c/108714/

Ah, cool.  Devstack is indeed a better place to put the build scripting.
 So, I think we should:

1) Get the above patch working, and then merged.

2) Get an experimental job going to use the above while we work on #3

3) Before the job can move into the check queue and potentially become
voting, it needs to not rely on downloading the source on every run.
IIRC, we can have nodepool build an image to use for these jobs that
includes the bits already installed.

I'll switch my efforts over to helping get the above completed.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Daniel P. Berrange
On Mon, Aug 11, 2014 at 07:58:41AM -0400, Russell Bryant wrote:
 On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
  There is work to add support for this in devstack already which I
  prefer since it makes it easy for developers to get an environment
  which matches the build system:
  
https://review.openstack.org/#/c/108714/
 
 Ah, cool.  Devstack is indeed a better place to put the build scripting.
  So, I think we should:
 
 1) Get the above patch working, and then merged.
 
 2) Get an experimental job going to use the above while we work on #3
 
 3) Before the job can move into the check queue and potentially become
 voting, it needs to not rely on downloading the source on every run.

Don't we have the ability to mirror downloads locally to the build
system for Python?  The proposed patch allows an alternate download
URL to be set via an env variable, so it could point to a local mirror
instead of libvirt.org / qemu.org.
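
As a sketch, the override Daniel describes might be used along these
lines; the variable names below are my assumptions, not the actual
option names defined by change 108714:

```shell
# Hypothetical invocation of the proposed devstack support; the
# LIBVIRT_DOWNLOAD_URL / QEMU_DOWNLOAD_URL names are illustrative
# assumptions, and mirror.example.org stands in for a local mirror.
export LIBVIRT_DOWNLOAD_URL="http://mirror.example.org/libvirt/sources"
export QEMU_DOWNLOAD_URL="http://mirror.example.org/qemu"
./stack.sh
```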

 IIRC, we can have nodepool build an image to use for these jobs that
 includes the bits already installed.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Russell Bryant
On 08/09/2014 12:33 PM, Jeremy Stanley wrote:
 On 2014-08-08 09:06:29 -0400 (-0400), Russell Bryant wrote:
 [...]
 We've seen several times that building and maintaining 3rd party
 CI is a *lot* of work.
 
 Building and maintaining *any* CI is a *lot* of work, not the least
 of which is the official OpenStack project CI (I believe Monty
 mentioned in #openstack-infra last night that our CI is about twice
 the size of Travis-CI now, not sure what metric he's comparing there
 though).

Dang, I'd love to see those numbers.  :-)

 Like you said in [1], doing this in infra's CI would be ideal. I
 think 3rd party should be reserved for when running it in the
 project's infrastructure is not an option for some reason
 (requires proprietary hw or sw, for example).
 
 Add to the "not an option for some reason" list: software which is
 not easily obtainable through typical installation channels (PyPI,
 Linux distro-managed package repositories for their LTS/server
 releases, et cetera), or which requires gyrations that destabilize
 or significantly complicate maintenance of the overall system as
 well as reproducibility for developers. It may be possible to work
 around some of these concerns via access from multiple locations
 coupled with heavy caching, but for a one-off source the additional
 complexity is hard to justify.

Understood.  Some questions ... is building an image that has libvirt
and qemu pre-installed from source good enough?  It avoids the
dependency on job runs, but moves it to image build time though, so it
still exists.

If the above still doesn't seem like a workable setup, then I think we
should just go straight to an image with fedora + virt-preview repo,
which kind of sounds easier, anyway.
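
Concretely, enabling virt-preview on a Fedora image amounts to dropping
a yum repo definition like the following into /etc/yum.repos.d/ during
image prep. The baseurl layout here is my assumption based on the
fedorapeople.org hosting; check the repo's published .repo file before
relying on it:

```ini
# Hypothetical /etc/yum.repos.d/fedora-virt-preview.repo; the baseurl
# layout is an assumption, not copied from the official repo file.
[fedora-virt-preview]
name=Fedora virt-preview packages
baseurl=https://fedorapeople.org/groups/virt/virt-preview/fedora-$releasever/$basearch/
enabled=1
gpgcheck=0
```

A subsequent "yum -y update libvirt qemu-kvm" (or letting the devstack
prep scripts pull packages) would then pick up the newer virt stack.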

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Russell Bryant
On 08/11/2014 07:58 AM, Russell Bryant wrote:
 On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
 There is work to add support for this in devstack already which I
 prefer since it makes it easy for developers to get an environment
 which matches the build system:

   https://review.openstack.org/#/c/108714/
 
 Ah, cool.  Devstack is indeed a better place to put the build scripting.
  So, I think we should:
 
 1) Get the above patch working, and then merged.
 
 2) Get an experimental job going to use the above while we work on #3
 
 3) Before the job can move into the check queue and potentially become
 voting, it needs to not rely on downloading the source on every run.
 IIRC, we can have nodepool build an image to use for these jobs that
 includes the bits already installed.
 
 I'll switch my efforts over to helping get the above completed.
 

I still think the devstack patch is good, but after some more thought, I
think a better long term CI job setup would just be a fedora image with
the virt-preview repo.  I think I'll try that ...

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Russell Bryant
On 08/11/2014 08:01 AM, Daniel P. Berrange wrote:
 On Mon, Aug 11, 2014 at 07:58:41AM -0400, Russell Bryant wrote:
 On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
 There is work to add support for this in devstack already which I
 prefer since it makes it easy for developers to get an environment
 which matches the build system:

   https://review.openstack.org/#/c/108714/

 Ah, cool.  Devstack is indeed a better place to put the build scripting.
  So, I think we should:

 1) Get the above patch working, and then merged.

 2) Get an experimental job going to use the above while we work on #3

 3) Before the job can move into the check queue and potentially become
 voting, it needs to not rely on downloading the source on every run.
 
 Don't we have the ability to mirror downloads locally to the build
 system for Python?  The proposed patch allows an alternate download
 URL to be set via an env variable, so it could point to a local mirror
 instead of libvirt.org / qemu.org.

There's a pypi mirror at least.  I'm not sure about mirroring other things.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Jeremy Stanley
On 2014-08-11 08:04:34 -0400 (-0400), Russell Bryant wrote:
 Dang, I'd love to see those numbers.  :-)

Me too. Now that I'm not travelling I'll see if I can find out what
he meant by that.

 Understood.  Some questions ... is building an image that has libvirt
 and qemu pre-installed from source good enough?  It avoids the
 dependency on job runs, but moves it to image build time though, so it
 still exists.

Moving complex stability risks to image creation time means a failure
there can keep us from updating our worker images as often, so tests
randomly run on increasingly stale systems in some providers/regions
until the issue is noticed, identified and addressed. That said, we do
already compile some things during job runs today (in particular,
library bindings which get install-time linked by some Python modules).

In reality, depending on more things gathered from different places
on the Internet (be it Git repository sites like GitHub/Bitbucket,
or private package collections) decreases our overall stability far
more than compiling things does.

 If the above still doesn't seem like a workable setup, then I think we
 should just go straight to an image with fedora + virt-preview repo,
 which kind of sounds easier, anyway.

If it's published from EPEL or whatever Fedora's equivalent is, then
that's probably fine. If it's served from a separate site, then that
increases the chances that we run into network issues either at
image build time or job run time. Also, we would want to make sure
whatever solution we settle on is well integrated within DevStack
itself, so that individual developers can recreate these conditions
themselves without a lot of additional work.

One other thing to keep in mind... Fedora's lifecycle is too short
for us to support outside of jobs for our master branches, so this
would not be a solution beyond release time (we couldn't continue to
run these jobs for Juno once released if the solution hinges on
Fedora). Getting the versions we want developers and deployers to
use into Ubuntu 14.04 Cloud Archive and CentOS (RHEL) 7 EPEL on the
other hand would be a much more viable long-term solution.
-- 
Jeremy Stanley



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Russell Bryant
On 08/11/2014 09:17 AM, Jeremy Stanley wrote:
 On 2014-08-11 08:04:34 -0400 (-0400), Russell Bryant wrote:
 Dang, I'd love to see those numbers.  :-)
 
 Me too. Now that I'm not travelling I'll see if I can find out what
 he meant by that.
 
 Understood.  Some questions ... is building an image that has libvirt
 and qemu pre-installed from source good enough?  It avoids the
 dependency on job runs, but moves it to image build time though, so it
 still exists.
 
 Moving complex stability risks to image creation time means a failure
 there can keep us from updating our worker images as often, so tests
 randomly run on increasingly stale systems in some providers/regions
 until the issue is noticed, identified and addressed. That said, we do
 already compile some things during job runs today (in particular,
 library bindings which get install-time linked by some Python modules).
 
 In reality, depending on more things gathered from different places
 on the Internet (be it Git repository sites like GitHub/Bitbucket,
 or private package collections) decreases our overall stability far
 more than compiling things does.
 
 If the above still doesn't seem like a workable setup, then I think we
 should just go straight to an image with fedora + virt-preview repo,
 which kind of sounds easier, anyway.
 
 If it's published from EPEL or whatever Fedora's equivalent is, then
 that's probably fine. If it's served from a separate site, then that
 increases the chances that we run into network issues either at
 image build time or job run time. Also, we would want to make sure
 whatever solution we settle on is well integrated within DevStack
 itself, so that individual developers can recreate these conditions
 themselves without a lot of additional work.

EPEL is a repo produced by the Fedora project for RHEL and its
derivatives.  The virt-preview repo is hosted on fedorapeople.org, which
is where custom repos live.  I'd say it's more analogous to Ubuntu's PPAs.

https://fedorapeople.org/groups/virt/virt-preview/

 One other thing to keep in mind... Fedora's lifecycle is too short
 for us to support outside of jobs for our master branches, so this
 would not be a solution beyond release time (we couldn't continue to
 run these jobs for Juno once released if the solution hinges on
 Fedora). Getting the versions we want developers and deployers to
 use into Ubuntu 14.04 Cloud Archive and CentOS (RHEL) 7 EPEL on the
 other hand would be a much more viable long-term solution.

Yep, makes sense.

For testing bleeding edge, I've also got my eye on how we could do this
with CentOS.  There is a virt SIG in CentOS that I'm hoping will produce
something similar to Fedora's virt-preview repo, but it's not there yet.
 I'm going to go off and discuss this with the SIG there.

http://wiki.centos.org/SpecialInterestGroup/Virtualization

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Jeremy Stanley
On 2014-08-11 08:07:02 -0400 (-0400), Russell Bryant wrote:
 There's a pypi mirror at least.  I'm not sure about mirroring
 other things.

Right, that's a specific solution for mirroring the pypi.python.org
cheeseshop. We've got our (Infra) sights set on mirroring Ubuntu and
CentOS package repositories to similarly reduce the incidence of job
run-time failures we see getting updated packages and indexes from
distro sites and cloud provider mirrors, but those too are
nontrivial efforts and will be a while yet before we have them in
place. Mirroring things is generally complex, since different kinds
of files/data need widely differing retrieval, indexing and caching
solutions--there's no one-size-fits-all option really.

Perhaps another good example is the Fedora qcow2 image we download
and cache on DevStack worker images so that Heat can perform some of
its more complex integration tests... failures encountered when
obtaining that image from dl.fedoraproject.org are (last time I
checked anyway) our most frequent cause of nodepool update problems.
We could set up our own mirror of that file of course, but to some
extent that's still just moving the problem--each additional
mirroring solution is something new we have to monitor, maintain and
troubleshoot so we must ask ourselves whether the increased
management burden from that new complexity is balanced by potential
decreases in management burden found by improving stability in other
parts of the system.
-- 
Jeremy Stanley



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Daniel P. Berrange
On Mon, Aug 11, 2014 at 01:45:56PM +, Jeremy Stanley wrote:
 On 2014-08-11 08:07:02 -0400 (-0400), Russell Bryant wrote:
  There's a pypi mirror at least.  I'm not sure about mirroring
  other things.
 
 Right, that's a specific solution for mirroring the pypi.python.org
 cheeseshop. We've got our (Infra) sights set on mirroring Ubuntu and
 CentOS package repositories to similarly reduce the incidence of job
 run-time failures we see getting updated packages and indexes from
 distro sites and cloud provider mirrors, but those too are
 nontrivial efforts and will be a while yet before we have them in
 place. Mirroring things is generally complex, since different kinds
 of files/data need widely differing retrieval, indexing and caching
 solutions--there's no one-size-fits-all option really.

If there are specific things we could do to libvirt.org / qemu.org
download sites, to make mirroring easier or more reliable for OpenStack,
we could certainly explore options there, since we know the right people
involved in both projects. 

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Jeremy Stanley
On 2014-08-11 14:56:27 +0100 (+0100), Daniel P. Berrange wrote:
 If there are specific things we could do to libvirt.org / qemu.org
 download sites, to make mirroring easier or more reliable for
 OpenStack, we could certainly explore options there, since we know
 the right people involved in both projects.

Mirroring is generally a reaction to stability issues more than
anything. Providing those files from multiple locations behind a
global cache/load-balancing solution (maybe a CDN) to ensure
stability would likely help prevent us from needing to have yet
another one-off mirror for a handful of files. It's worth adding to
https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
and discussing tomorrow, if you're around, so we can be sure to
get input from more of the Infra team.
-- 
Jeremy Stanley



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Joe Gordon
On Sun, Aug 10, 2014 at 11:59 PM, Mark McLoughlin mar...@redhat.com wrote:

 On Fri, 2014-08-08 at 09:06 -0400, Russell Bryant wrote:
  On 08/07/2014 08:06 PM, Michael Still wrote:
   It seems to me that the tension here is that there are groups who
   would really like to use features in newer libvirts that we don't CI
   on in the gate. Is it naive to think that a possible solution here is
   to do the following:
  
- revert the libvirt version_cap flag
 
  I don't feel strongly either way on this.  It seemed useful at the time
  for being able to decouple upgrading libvirt and enabling features that
  come with that.

 Right, I suggested the flag as a more deliberate way of avoiding the
 issue that was previously seen in the gate with live snapshots. I still
 think it's a pretty elegant and useful little feature, and don't think
 we need to use it as proxy battle over testing requirements for new
 libvirt features.


Mark,

I am not sure if I follow.  The gate issue with live snapshots has been
worked around by turning it off [0], so presumably this patch is forward
facing.  I fail to see how this patch is needed to help the gate in the
future. Wouldn't it just delay the issues until we change the version_cap?

The issue I see with the libvirt version_cap [1] is best captured in its
commit message: "The end user can override the limit if they wish to opt-in
to use of untested features via the 'version_cap' setting in the 'libvirt'
group." This goes against the very direction nova has been moving in for
some time now. We have been moving away from merging untested (re: no
integration testing) features.  This patch changes the very direction the
project is going in over testing without so much as a discussion. While I
think it may be time that we revisited this discussion, the discussion
needs to happen before any patches are merged.
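
For context, the opt-in described in that commit message is a nova.conf
setting along these lines; the cap value shown is purely illustrative,
not the default from change 107119:

```ini
# Illustrative nova.conf fragment; the version shown is an example.
[libvirt]
version_cap = 1.2.2
```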

I am less concerned about the contents of this patch, and more concerned
with how such a big de facto change in nova policy (we accept untested code
sometimes) was made without any discussion or consensus. In your comment on
the revert [2], you say the 'whether not-CI-tested features should be allowed
to be merged' debate is 'clearly unresolved.' How did you get to that
conclusion? This was never brought up in the mid-cycles as an unresolved
topic to be discussed. In our specs template we say "Is this untestable in
gate given current limitations (specific hardware / software configurations
available)? If so, are there mitigation plans (3rd party testing, gate
enhancements, etc)" [3].  We have been blocking untested features for some
time now.

I am further perplexed by what Daniel Berrange, the patch author, meant
when he commented [2] "Regardless of the outcome of the testing discussion
we believe this is a useful feature to have." Who is 'we'? Because I don't
see how that can be nova-core or even nova-specs-core, especially
considering how many members of those groups are +2 on the revert. So if
'we' is neither of those groups, then who is 'we'?

[0] https://review.openstack.org/#/c/102643/4/nova/virt/libvirt/driver.py
[1] https://review.openstack.org/#/c/107119/
[2] https://review.openstack.org/#/c/110754/
[3]
http://specs.openstack.org/openstack/nova-specs/specs/template.html#testing





 Mark.





Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-10 Thread Michael Still
On Sun, Aug 10, 2014 at 2:33 AM, Jeremy Stanley fu...@yuggoth.org wrote:
 On 2014-08-08 09:06:29 -0400 (-0400), Russell Bryant wrote:
 [...]
 We've seen several times that building and maintaining 3rd party
 CI is a *lot* of work.

 Building and maintaining *any* CI is a *lot* of work, not the least
 of which is the official OpenStack project CI (I believe Monty
 mentioned in #openstack-infra last night that our CI is about twice
 the size of Travis-CI now, not sure what metric he's comparing there
 though).

 Like you said in [1], doing this in infra's CI would be ideal. I
 think 3rd party should be reserved for when running it in the
 project's infrastructure is not an option for some reason
 (requires proprietary hw or sw, for example).

 Add to the "not an option for some reason" list: software which is
 not easily obtainable through typical installation channels (PyPI,
 Linux distro-managed package repositories for their LTS/server
 releases, et cetera), or which requires gyrations that destabilize
 or significantly complicate maintenance of the overall system as
 well as reproducibility for developers. It may be possible to work
 around some of these concerns via access from multiple locations
 coupled with heavy caching, but for a one-off source the additional
 complexity is hard to justify.

My understanding is that Fedora has a PPA equivalent which ships the
latest and greatest libvirt. So, it would be packages if we went the
Fedora route, which should be less work.

Michael

-- 
Rackspace Australia



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-09 Thread Jeremy Stanley
On 2014-08-08 09:06:29 -0400 (-0400), Russell Bryant wrote:
[...]
 We've seen several times that building and maintaining 3rd party
 CI is a *lot* of work.

Building and maintaining *any* CI is a *lot* of work, not the least
of which is the official OpenStack project CI (I believe Monty
mentioned in #openstack-infra last night that our CI is about twice
the size of Travis-CI now, not sure what metric he's comparing there
though).

 Like you said in [1], doing this in infra's CI would be ideal. I
 think 3rd party should be reserved for when running it in the
 project's infrastructure is not an option for some reason
 (requires proprietary hw or sw, for example).

Add to the "not an option for some reason" list: software which is
not easily obtainable through typical installation channels (PyPI,
Linux distro-managed package repositories for their LTS/server
releases, et cetera), or which requires gyrations that destabilize
or significantly complicate maintenance of the overall system as
well as reproducibility for developers. It may be possible to work
around some of these concerns via access from multiple locations
coupled with heavy caching, but for a one-off source the additional
complexity is hard to justify.

 I wonder if the job could be as simple as one with an added step
 in the config to install latest libvirt from source.  Dan, do you
 think someone could add a libvirt-current.tar.gz to
 http://libvirt.org/sources/ ? Using the latest release seems
 better than master from git.
[...]

Would getting it into EPEL for CentOS 7 or UCA for Ubuntu 14.04 LTS
hopefully be an option?
-- 
Jeremy Stanley



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-08 Thread Russell Bryant
On 08/07/2014 08:06 PM, Michael Still wrote:
 It seems to me that the tension here is that there are groups who
 would really like to use features in newer libvirts that we don't CI
 on in the gate. Is it naive to think that a possible solution here is
 to do the following:
 
  - revert the libvirt version_cap flag

I don't feel strongly either way on this.  It seemed useful at the time
for being able to decouple upgrading libvirt and enabling features that
come with that.  I'd like to let Dan get back from vacation and weigh in
on it, though.

  - instead implement a third party CI with the latest available
 libvirt release [1]

As for the general idea of doing CI, absolutely.  That was discussed
earlier in the thread, though nobody has picked up the ball yet.  I can
work on it, though.  We just need to figure out a sensible approach.

We've seen several times that building and maintaining 3rd party CI is a
*lot* of work.  Like you said in [1], doing this in infra's CI would be
ideal.  I think 3rd party should be reserved for when running it in the
project's infrastructure is not an option for some reason (requires
proprietary hw or sw, for example).

I wonder if the job could be as simple as one with an added step in the
config to install latest libvirt from source.  Dan, do you think someone
could add a libvirt-current.tar.gz to http://libvirt.org/sources/ ?
Using the latest release seems better than master from git.

I'll mess around and see if I can spin up an experimental job.

  - document clearly in the release notes the versions of dependencies
 that we tested against in a given release: hypervisor versions (gate
 and third party), etc etc

Sure, that sounds like a good thing to document in release notes.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-08 Thread Russell Bryant
On 08/08/2014 01:46 AM, Luke Gorrie wrote:
 On 8 August 2014 02:06, Michael Still mi...@stillhq.com wrote:
 
 1: I think that ultimately should live in infra as part of check, but
 I'd be ok with it starting as a third party if that delivers us
 something faster. I'd be happy enough to donate resources to get that
 going if we decide to go with this plan.
 
 
 Can we cooperate somehow?
 
 We are already working on bringing up a third party CI covering QEMU 2.1
 and Libvirt 1.2.7. The intention of this CI is to test the software
 configuration that we are recommending for NFV deployments (including
 vhost-user feature which appeared in those releases), and to provide CI
 cover for the code we are offering for Neutron.
 
 Michele Paolino is working on this and the relevant nova/devstack changes.

It sounds like what you're working on is a separate thing.  You're
targeting coverage for a specific set of use cases, while this is a
flavor of the general CI coverage we're already doing, but with the
latest (not pegged) libvirt (and maybe qemu).

By all means, more testing is useful though.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-08 Thread Luke Gorrie
On 8 August 2014 15:27, Russell Bryant rbry...@redhat.com wrote:

 It sounds like what you're working on is a separate thing.


Roger. Just wanted to check if our work could have some broader utility,
but as you say we do have a specific use case in mind.

Cheers!
-Luke


Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-08 Thread Russell Bryant
On 08/08/2014 09:06 AM, Russell Bryant wrote:
  - instead implement a third party CI with the latest available
 libvirt release [1]
 
 As for the general idea of doing CI, absolutely.  That was discussed
 earlier in the thread, though nobody has picked up the ball yet.  I can
 work on it, though.  We just need to figure out a sensible approach.
 
 We've seen several times that building and maintaining 3rd party CI is a
 *lot* of work.  Like you said in [1], doing this in infra's CI would be
 ideal.  I think 3rd party should be reserved for when running it in the
 project's infrastructure is not an option for some reason (requires
 proprietary hw or sw, for example).
 
 I wonder if the job could be as simple as one with an added step in the
 config to install latest libvirt from source.  Dan, do you think someone
 could add a libvirt-current.tar.gz to http://libvirt.org/sources/ ?
 Using the latest release seems better than master from git.
 
 I'll mess around and see if I can spin up an experimental job.

Here's a first stab at it:

https://review.openstack.org/113020

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-07 Thread Michael Still
It seems to me that the tension here is that there are groups who
would really like to use features in newer libvirts that we don't CI
on in the gate. Is it naive to think that a possible solution here is
to do the following:

 - revert the libvirt version_cap flag
 - instead implement a third party CI with the latest available
libvirt release [1]
 - document clearly in the release notes the versions of dependencies
that we tested against in a given release: hypervisor versions (gate
and third party), etc etc

Michael

1: I think that ultimately should live in infra as part of check, but
I'd be ok with it starting as a third party if that delivers us
something faster. I'd be happy enough to donate resources to get that
going if we decide to go with this plan.

On Fri, Aug 8, 2014 at 12:38 AM, Matt Riedemann
mrie...@linux.vnet.ibm.com wrote:


 On 7/18/2014 2:55 AM, Daniel P. Berrange wrote:

 On Thu, Jul 17, 2014 at 12:13:13PM -0700, Johannes Erdfelt wrote:

 On Thu, Jul 17, 2014, Russell Bryant rbry...@redhat.com wrote:

 On 07/17/2014 02:31 PM, Johannes Erdfelt wrote:

 It kind of helps. It's still implicit in that you need to look at what
 features are enabled at what version and determine if it is being
 tested.

 But the behavior is still broken since code is still getting merged
 that
 isn't tested. Saying that is by design doesn't help the fact that
 potentially broken code exists.


 Well, it may not be tested in our CI yet, but that doesn't mean it's not
 tested some other way, at least.


 I'm skeptical. Unless it's tested continuously, it'll likely break at
 some time.

 We seem to be selectively choosing the continuous part of CI. I'd
 understand if it was reluctantly because of immediate problems but
 this reads like it's acceptable long-term too.

 I think there are some good ideas in other parts of this thread to look
 at how we can more regularly rev libvirt in the gate to mitigate this.

 There's also been work going on to get Fedora enabled in the gate, which
 is a distro that regularly carries a much more recent version of libvirt
 (among other things), so that's another angle that may help.


 That's an improvement, but I'm still not sure I understand what the
 workflow will be for developers.


 That's exactly why we want to have the CI system using newer libvirt
 than it does today. The patch to cap the version doesn't change what
 is tested - it just avoids users hitting untested paths by default
 so they're not exposed to any potential instability until we actually
 get a more updated CI system.

 Do they need to now wait for Fedora to ship a new version of libvirt?
 Fedora is likely to help the problem because of how quickly it generally
 ships new packages and their release schedule but it would still hold
 back some features?


 Fedora has an add-on repository (virt-preview) which contains the
 latest QEMU + libvirt RPMs for current stable release - this is lags
 upstream by a matter of days, so there would be no appreciable delay
 in getting access to newest possible releases.

 Also, this explanation doesn't answer my question about what happens
 when the gate finally gets around to actually testing those potentially
 broken code paths.


 I think we would just test out the bump and make sure it's working fine
 before it's enabled for every job.  That would keep potential breakage
 localized to people working on debugging/fixing it until it's ready to
 go.


 The downside is that new features for libvirt could be held back by
 needing to fix other unrelated features. This is certainly not a bigger
 problem than users potentially running untested code simply because they
 are on a newer version of libvirt.

 I understand we have an immediate problem and I see the short-term value
 in the libvirt version cap.

 I try to look at the long-term and unless it's clear to me that a
 solution is proposed to be short-term and there are some understood
 trade-offs then I'll question the long-term implications of it.


 Once the CI system is regularly tracking upstream releases within a matter of
 days, then the version cap is a total non-issue from a feature
 availability POV. It is none the less useful in the long term, for
 example,
 if there were a problem we miss in testing, which a deployer then hits in
 the field, the version cap would allow them to get their deployment to
 avoid use of the newer libvirt feature, which could be a useful workaround
 for them until a fix is available.

 Regards,
 Daniel


 FYI, there is a proposed revert of the libvirt version cap change mentioned
 previously in this thread [1].

 Just bringing it up again here since the discussion should happen in the ML
 rather than gerrit.

 [1] https://review.openstack.org/#/c/110754/

 --

 Thanks,

 Matt Riedemann






-- 
Rackspace 

Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-07 Thread Luke Gorrie
On 8 August 2014 02:06, Michael Still mi...@stillhq.com wrote:

 1: I think that ultimately should live in infra as part of check, but
 I'd be ok with it starting as a third party if that delivers us
 something faster. I'd be happy enough to donate resources to get that
 going if we decide to go with this plan.


Can we cooperate somehow?

We are already working on bringing up a third party CI covering QEMU 2.1
and Libvirt 1.2.7. The intention of this CI is to test the software
configuration that we are recommending for NFV deployments (including
vhost-user feature which appeared in those releases), and to provide CI
cover for the code we are offering for Neutron.

Michele Paolino is working on this and the relevant nova/devstack changes.


Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-07 Thread Matt Riedemann



On 7/18/2014 2:55 AM, Daniel P. Berrange wrote:

On Thu, Jul 17, 2014 at 12:13:13PM -0700, Johannes Erdfelt wrote:

On Thu, Jul 17, 2014, Russell Bryant rbry...@redhat.com wrote:

On 07/17/2014 02:31 PM, Johannes Erdfelt wrote:

It kind of helps. It's still implicit in that you need to look at what
features are enabled at what version and determine if it is being
tested.

But the behavior is still broken since code is still getting merged that
isn't tested. Saying that is by design doesn't help the fact that
potentially broken code exists.


Well, it may not be tested in our CI yet, but that doesn't mean it's not
tested some other way, at least.


I'm skeptical. Unless it's tested continuously, it'll likely break at
some time.

We seem to be selectively choosing the continuous part of CI. I'd
understand if it was reluctantly because of immediate problems but
this reads like it's acceptable long-term too.


I think there are some good ideas in other parts of this thread to look
at how we can more regularly rev libvirt in the gate to mitigate this.

There's also been work going on to get Fedora enabled in the gate, which
is a distro that regularly carries a much more recent version of libvirt
(among other things), so that's another angle that may help.


That's an improvement, but I'm still not sure I understand what the
workflow will be for developers.


That's exactly why we want to have the CI system using newer libvirt
than it does today. The patch to cap the version doesn't change what
is tested - it just avoids users hitting untested paths by default
so they're not exposed to any potential instability until we actually
get a more updated CI system.


Do they need to now wait for Fedora to ship a new version of libvirt?
Fedora is likely to help the problem because of how quickly it generally
ships new packages and their release schedule but it would still hold
back some features?


Fedora has an add-on repository (virt-preview) which contains the
latest QEMU + libvirt RPMs for current stable release - this lags
upstream by a matter of days, so there would be no appreciable delay
in getting access to newest possible releases.


Also, this explanation doesn't answer my question about what happens
when the gate finally gets around to actually testing those potentially
broken code paths.


I think we would just test out the bump and make sure it's working fine
before it's enabled for every job.  That would keep potential breakage
localized to people working on debugging/fixing it until it's ready to go.


The downside is that new features for libvirt could be held back by
needing to fix other unrelated features. This is certainly not a bigger
problem than users potentially running untested code simply because they
are on a newer version of libvirt.

I understand we have an immediate problem and I see the short-term value
in the libvirt version cap.

I try to look at the long-term and unless it's clear to me that a
solution is proposed to be short-term and there are some understood
trade-offs then I'll question the long-term implications of it.


Once the CI system is regularly tracking upstream releases within a matter of
days, then the version cap is a total non-issue from a feature
availability POV. It is none the less useful in the long term, for example,
if there were a problem we miss in testing, which a deployer then hits in
the field, the version cap would allow them to get their deployment to
avoid use of the newer libvirt feature, which could be a useful workaround
for them until a fix is available.

Regards,
Daniel



FYI, there is a proposed revert of the libvirt version cap change 
mentioned previously in this thread [1].


Just bringing it up again here since the discussion should happen in the 
ML rather than gerrit.


[1] https://review.openstack.org/#/c/110754/

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-18 Thread Daniel P. Berrange
On Thu, Jul 17, 2014 at 12:13:13PM -0700, Johannes Erdfelt wrote:
 On Thu, Jul 17, 2014, Russell Bryant rbry...@redhat.com wrote:
  On 07/17/2014 02:31 PM, Johannes Erdfelt wrote:
   It kind of helps. It's still implicit in that you need to look at what
   features are enabled at what version and determine if it is being
   tested.
   
   But the behavior is still broken since code is still getting merged that
   isn't tested. Saying that is by design doesn't help the fact that
   potentially broken code exists.
  
  Well, it may not be tested in our CI yet, but that doesn't mean it's not
  tested some other way, at least.
 
 I'm skeptical. Unless it's tested continuously, it'll likely break at
 some time.
 
 We seem to be selectively choosing the continuous part of CI. I'd
 understand if it was reluctantly because of immediate problems but
 this reads like it's acceptable long-term too.
 
  I think there are some good ideas in other parts of this thread to look
  at how we can more regularly rev libvirt in the gate to mitigate this.
  
  There's also been work going on to get Fedora enabled in the gate, which
  is a distro that regularly carries a much more recent version of libvirt
  (among other things), so that's another angle that may help.
 
 That's an improvement, but I'm still not sure I understand what the
 workflow will be for developers.

That's exactly why we want to have the CI system using newer libvirt
than it does today. The patch to cap the version doesn't change what
is tested - it just avoids users hitting untested paths by default
so they're not exposed to any potential instability until we actually
get a more updated CI system.

 Do they need to now wait for Fedora to ship a new version of libvirt?
 Fedora is likely to help the problem because of how quickly it generally
 ships new packages and their release schedule but it would still hold
 back some features?

Fedora has an add-on repository (virt-preview) which contains the
latest QEMU + libvirt RPMs for current stable release - this lags
upstream by a matter of days, so there would be no appreciable delay
in getting access to newest possible releases.
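For readers unfamiliar with virt-preview, enabling it amounts to dropping a
repo definition under /etc/yum.repos.d/ and updating the virt packages. The
fragment below is a hedged sketch only - the section name and baseurl layout
are assumptions for illustration; check fedorapeople.org for the actual repo
file:

```ini
# Hypothetical /etc/yum.repos.d/fedora-virt-preview.repo
# The baseurl layout below is an assumption, not the verified path.
[fedora-virt-preview]
name=Virtualization preview packages for Fedora
baseurl=https://fedorapeople.org/groups/virt/virt-preview/fedora-$releasever/$basearch/
enabled=1
gpgcheck=0
```

With a file like that in place, a plain `yum update libvirt qemu-kvm` pulls
in the near-upstream builds.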

   Also, this explanation doesn't answer my question about what happens
   when the gate finally gets around to actually testing those potentially
   broken code paths.
  
  I think we would just test out the bump and make sure it's working fine
  before it's enabled for every job.  That would keep potential breakage
  localized to people working on debugging/fixing it until it's ready to go.
 
 The downside is that new features for libvirt could be held back by
 needing to fix other unrelated features. This is certainly not a bigger
 problem than users potentially running untested code simply because they
 are on a newer version of libvirt.
 
 I understand we have an immediate problem and I see the short-term value
 in the libvirt version cap.
 
 I try to look at the long-term and unless it's clear to me that a
 solution is proposed to be short-term and there are some understood
 trade-offs then I'll question the long-term implications of it.

Once the CI system is regularly tracking upstream releases within a matter of
days, then the version cap is a total non-issue from a feature
availability POV. It is none the less useful in the long term, for example,
if there were a problem we miss in testing, which a deployer then hits in
the field, the version cap would allow them to get their deployment to
avoid use of the newer libvirt feature, which could be a useful workaround
for them until a fix is available.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Daniel P. Berrange
On Thu, Jul 17, 2014 at 08:46:12AM +1000, Michael Still wrote:
 Top posting to the original email because I want this to stand out...
 
 I've added this to the agenda for the nova mid cycle meetup, I think
 most of the contributors to this thread will be there. So, if we can
 nail this down here then that's great, but if we think we'd be more
 productive in person chatting about this then we have that option too.

FYI, I'm afraid I won't be at the mid-cycle meetup since it clashed with
my being on holiday. So I'd really prefer if we keep the discussion on
this mailing list where everyone has a chance to participate.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Sean Dague
On 07/17/2014 12:45 AM, Michael Still wrote:
 On Thu, Jul 17, 2014 at 3:27 AM, Vishvananda Ishaya
 vishvana...@gmail.com wrote:
 On Jul 16, 2014, at 8:28 AM, Daniel P. Berrange berra...@redhat.com wrote:
 On Wed, Jul 16, 2014 at 08:12:47AM -0700, Clark Boylan wrote:

 I am worried that we would just regress to the current process because
 we have tried something similar to this previously and were forced to
 regress to the current process.

 IMHO the longer we wait between updating the gate to new versions
 the bigger the problems we create for ourselves. eg we were switching
 from 0.9.8 released Dec 2011, to  1.1.1 released Jun 2013, so we
 were exposed to over 1 + 1/2 years worth of code churn in a single
 event. The fact that we only hit a couple of bugs in that, is actually
 remarkable given the amount of feature development that had gone into
 libvirt in that time. If we had been tracking each intervening libvirt
 release I expect the majority of updates would have had no ill effect
 on us at all. For the couple of releases where there was a problem we
 would not be forced to rollback to a version years older again, we'd
 just drop back to the previous release at most 1 month older.

 This is a really good point. As someone who has to deal with packaging
 issues constantly, it is odd to me that libvirt is one of the few places
 where we depend on upstream packaging. We constantly pull in new python
 dependencies from pypi that are not packaged in ubuntu. If we had to
 wait for packaging before merging the whole system would grind to a halt.

 I think we should be updating our libvirt version more frequently by
 installing from source or our own ppa instead of waiting for the ubuntu
 team to package it.
 
 I agree with Vish here, although I do recognise its a bunch of work
 for someone. One of the reasons we experienced bugs in the gate is
 that we jumped 18 months in libvirt versions in a single leap. If we
 had flexibility of packaging, we could have stepped through each major
 version along the way, and that would have helped us identify problems
 in a more controlled manner.

We've talked about the 'CI the world plan' for a while, which this would
be part of. That's a ton of work that no one is signed up for.

But more importantly, setting up and running the tests is < 10% of the
time cost. Triage and fixing bugs long term is a real cost. As we've
seen with the existing gate bugs we can't even close the bugs that are
preventing ourselves from merging code -
http://status.openstack.org/elastic-recheck/, so I'm not sure which band
of magical elves we'd expect to debug and fix these things. :)

-Sean

-- 
Sean Dague
http://dague.net





Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Daniel P. Berrange
On Wed, Jul 16, 2014 at 12:38:44PM -0600, Chris Friesen wrote:
 On 07/16/2014 11:59 AM, Monty Taylor wrote:
 On 07/16/2014 07:27 PM, Vishvananda Ishaya wrote:
 
 This is a really good point. As someone who has to deal with packaging
 issues constantly, it is odd to me that libvirt is one of the few places
 where we depend on upstream packaging. We constantly pull in new python
 dependencies from pypi that are not packaged in ubuntu. If we had to
 wait for packaging before merging the whole system would grind to a halt.
 
  I think we should be updating our libvirt version more frequently by
 installing from source or our own ppa instead of waiting for the ubuntu
 team to package it.
 
 Shrinking in terror from what I'm about to say ... but I actually agree
  with this. There are SEVERAL logistical issues we'd need to sort, not
  the least of which involve the actual mechanics of us doing that and
  properly gating, etc. But I think that, like the python depends where we
 tell distros what version we _need_ rather than using what version they
 have, libvirt, qemu, ovs and maybe one or two other things are areas in
 which we may want or need to have a strongish opinion.
 
 I'll bring this up in the room tomorrow at the Infra/QA meetup, and will
 probably be flayed alive for it - but maybe I can put forward a
 straw-man proposal on how this might work.
 
 How would this work...would you have them uninstall the distro-provided
 libvirt/qemu and replace them with newer ones?  (In which case what happens
 if the version desired by OpenStack has bugs in features that OpenStack
 doesn't use, but that some other software that the user wants to run does
 use?)

Having upstream testing the latest version of libvirt doesn't mean that
the latest version of libvirt is a mandatory requirement for distros. We
already have places where we use a feature of libvirt from say, 1.1.0, but
our reported min libvirt is still 0.9.6. In some cases Nova will take
alternative code paths for compat, in other cases attempts to use the
feature will just be reported as an error to the caller.
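That compat pattern can be sketched in a few lines of Python. This is
illustrative only - the constants and function names below are made up for
the example and are not nova's actual driver API:

```python
# Illustrative sketch of version-gated feature selection: the driver
# records the minimum libvirt version each feature needs and compares
# it against the version it actually connected to at runtime.

MIN_LIBVIRT_VERSION = (0, 9, 6)           # oldest version the driver claims to support
MIN_LIBVIRT_BLOCKJOB_VERSION = (1, 1, 0)  # a feature needing something newer

def has_min_version(connected_version, wanted):
    """Return True if the libvirt we connected to is at least `wanted`."""
    return tuple(connected_version) >= tuple(wanted)

def swap_volume(connected_version):
    # Take the new code path only when the feature is known to exist;
    # otherwise fall back (or report a clean error to the caller).
    if has_min_version(connected_version, MIN_LIBVIRT_BLOCKJOB_VERSION):
        return "blockjob-based swap"
    return "fallback path"
```

This is how the reported minimum (0.9.6 here) can stay far below the newest
feature's requirement without breaking older deployments.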

 Or would you have OpenStack versions of them installed in parallel in an
 alternate location?

If the distros do not have the latest version of libvirt though, they are
responsible for running the OpenStack CI tests against their version and
figuring out whether it is still functional to the level they require to
satisfy their users/customers demands.

We're already in this situation today - eg gate now tests Ubuntu 14.04
with libvirt 1.2.2, but many people shipping OpenStack are certainly not
running on libvirt 1.2.2. So this is just business as usual for distros
and downstream vendors

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Sean Dague
On 07/16/2014 05:08 PM, Mark McLoughlin wrote:
 On Wed, 2014-07-16 at 16:15 +0200, Sean Dague wrote:
 ..
 Based on these experiences, libvirt version differences seem to be as
 substantial as major hypervisor differences. There is a proposal here -
 https://review.openstack.org/#/c/103923/ to hold newer versions of
 libvirt to the same standard we hold xen, vmware, hyperv, docker,
 ironic, etc.
 
 That's a bit of a mis-characterization - in terms of functional test
 coverage, the libvirt driver is the bar that all the other drivers
 struggle to meet.
 
 And I doubt any of us pay too close attention to the feature coverage
 that the 3rd party CI test jobs have.
 
 I'm somewhat concerned that the -2 pile on in this review is a double
 standard of libvirt features, and features exploiting really new
 upstream features. I feel like a lot of the language being used here
 about the burden of doing this testing is exactly the same as was
 presented by the docker team before their driver was removed, which was
 ignored by the Nova team at the time.
 
 Personally, I wasn't very comfortable with the docker driver move. It
 certainly gave an outward impression that we're an unfriendly community.
 The mitigating factor was that a lot of friendly, collaborative,
 coaching work went on in the background for months. Expectations were
 communicated well in advance.
 
 Kicking the docker driver out of the tree has resulted in an uptick in
 the amount of work happening on it, but I suspect most people involved
 have a bad taste in their mouths. I guess there's incentives at play
 which mean they'll continue plugging away at it, but those incentives
 aren't always at play.

I agree. The whole history of the docker driver is sordid. The fact that
it was rushed in, was broken about 4 weeks later, the pleas for getting
the install path in devstack fixed were ignored, and it remained basically
broken until it was under threat of removal. I think there is a bad taste
in everyone's mouth around it.

 The same concern was raised by the FreeBSD
 team, which was also ignored; they were told to go land libvirt
 patches instead.

 I'm ok with us as a project changing our mind and deciding that the test
 bar needs to be taken down a notch or two because it's too burdensome to
 contributors and vendors, but if we are doing that, we need to do it for
 everyone. A lot of other organizations have put a ton of time and energy
 into this, and are carrying a maintenance cost of running these systems
 to get results back in a timely basis.
 
 I don't agree that we need to apply the same rules equally to everyone.
 
 At least part of the reasoning behind the emphasis on 3rd party CI
 testing was that projects (Neutron in particular) were being overwhelmed
 by contributions to drivers from developers who never contributed in any
 way to the core. The corollary of that is the contributors who do
 contribute to the core should be given a bit more leeway in return.
 
 There's a natural building of trust and element of human relationships
 here. As a reviewer, you learn to trust contributors with a good track
 record and perhaps prioritize contributions from them.

I agree with this. However, I'm not sure the current 3rd party CI
model fixed the issue. The folks doing CI work at most of these entities
aren't the developers in the core, and are not often even in the same
groups as the developers in the projects.

 As we seem deadlocked in the review, I think the mailing list is
 probably a better place for this.

 If we want to reduce the standards for libvirt we should reconsider
 what's being asked of 3rd party CI teams, and things like the docker
 driver, as well as the A, B, C driver classification. Because clearly
 libvirt 1.2.5+ isn't actually class A supported.
 
 No, there are features or code paths of the libvirt 1.2.5+ driver that
 aren't as well tested as the class A designation implies. And we have
 a proposal to make sure these aren't used by default:
 
   https://review.openstack.org/107119

That's interesting, I had not seen that go through. There is also automatic
feature selection by qemu; do we need that as well?

 i.e. to stray off the class A path, an operator has to opt into it by
 changing a configuration option that explains they will be enabling code
 paths which aren't yet tested upstream.
 
 These features have value to some people now, they don't risk regressing
 the class A driver and there's a clear path to them being elevated to
 class A in time. We should value these contributions and nurture these
 contributors.
 
 Appending some of my comments from the review below. The tl;dr is that I
 think we're losing sight of the importance of welcoming and nurturing
 contributors, and valuing whatever contributions they can make. That
 terrifies me. 

Honestly, I agree, which is why I started this thread mostly on a level
playing field. Because we're now 6 months into the 3rd Party CI
requirements experiment for Nova, and while some things 

Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Daniel P. Berrange
On Thu, Jul 17, 2014 at 10:59:27AM +0200, Sean Dague wrote:
 We've talked about the 'CI the world plan' for a while, which this would
 be part of. That's a ton of work that no one is signed up for.
 
 But more importantly, setting up and running the tests is < 10% of the
 time cost. Triage and fixing bugs long term is a real cost. As we've
 seen with the existing gate bugs we can't even close the bugs that are
 preventing ourselves from merging code -
 http://status.openstack.org/elastic-recheck/, so I'm not sure which band
 of magical elves we'd expect to debug and fix these things. :)

Yep, this is really a critical blocking problem we need to figure out and
resolve, before we attempt any plan to make our testing requirements
stricter. Making our testing requirements stricter without first improving
our overall test reliability will inflict untold pain & misery on both our
code contributors and people who are maintaining the CI systems.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Mark McLoughlin
On Thu, 2014-07-17 at 09:58 +0100, Daniel P. Berrange wrote:
 On Thu, Jul 17, 2014 at 08:46:12AM +1000, Michael Still wrote:
  Top posting to the original email because I want this to stand out...
  
  I've added this to the agenda for the nova mid cycle meetup, I think
  most of the contributors to this thread will be there. So, if we can
  nail this down here then that's great, but if we think we'd be more
  productive in person chatting about this then we have that option too.
 
 FYI, I'm afraid I won't be at the mid-cycle meetup since it clashed with
 my being on holiday. So I'd really prefer if we keep the discussion on
 this mailing list where everyone has a chance to participate.

Same here. Pre-arranged vacation, otherwise I'd have been there.

Mark.




Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Sean Dague
On 07/17/2014 02:13 PM, Mark McLoughlin wrote:
 On Thu, 2014-07-17 at 09:58 +0100, Daniel P. Berrange wrote:
 On Thu, Jul 17, 2014 at 08:46:12AM +1000, Michael Still wrote:
 Top posting to the original email because I want this to stand out...

 I've added this to the agenda for the nova mid cycle meetup, I think
 most of the contributors to this thread will be there. So, if we can
 nail this down here then that's great, but if we think we'd be more
 productive in person chatting about this then we have that option too.

 FYI, I'm afraid I won't be at the mid-cycle meetup since it clashed with
 my being on holiday. So I'd really prefer if we keep the discussion on
 this mailing list where everyone has a chance to participate.
 
 Same here. Pre-arranged vacation, otherwise I'd have been there.

I'll be there, but I agree that we should do this somewhere we have a
record for later. Recorded memory is important so that in a years time
whatever reasoning we come to is somewhere we can look at the archives.

Which is also why I think this ought to remain on email and not IRC, as
we have a record of it here.

-Sean

-- 
Sean Dague
http://dague.net





Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Sean Dague
On 07/16/2014 08:15 PM, Eric Windisch wrote:
 
 
 
 On Wed, Jul 16, 2014 at 12:55 PM, Roman Bogorodskiy
 rbogorods...@mirantis.com wrote:
 
   Eric Windisch wrote:
 
  This thread highlights more deeply the problems for the FreeBSD folks.
  First, I still disagree with the recommendation that they
 contribute to
  libvirt. It's a classic example of creating two or more problems
 from one.
  Once they have support in libvirt, how long before their code is in a
  version of libvirt acceptable to Nova? When they hit edge-cases or
 bugs,
  requiring changes in libvirt, how long before those fixes are
 accepted by
  Nova?
 
 Could you please elaborate why you disagree on the contributing patches
 to libvirt approach and what the alternative approach do you propose?
 
 
 I don't necessarily disagree with contributing patches to libvirt. I
 believe that the current system makes it difficult to perform quick,
 iterative development. I wish to see this thread attempt to solve that
 problem and reduce the barrier to getting stuff done.
  
 
 Also, could you please elaborate on what is 'version of libvirt
 acceptable to Nova'? Cannot we just say that e.g. Nova requires libvirt
 X.Y to be deployed on FreeBSD?
 
 
 This is precisely my point, that we need to support different versions
 of libvirt and to test those versions. If we're going to support
  different versions of libvirt on FreeBSD, Ubuntu, and RedHat - those
 should be tested, possibly as third-party options.
 
 The primary testing path for libvirt upstream should be with the latest
 stable release with a non-voting test against trunk. There might be
 value in testing against a development snapshot as well, where we know
 there are features we want in an unreleased version of libvirt but where
 we cannot trust trunk to be stable enough for gate.
  
 
  Anyway, speaking about FreeBSD support, I assume we're actually talking
  about Bhyve support. I think it'd be good to break the task down and
  implement FreeBSD support for libvirt/Qemu first
 
 
   I believe Sean was referring to Bhyve support; this is how I
  interpreted it.

Yes, I meant Bhyve.

-Sean

-- 
Sean Dague
http://dague.net



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Daniel P. Berrange
On Wed, Jul 16, 2014 at 09:44:55AM -0700, Johannes Erdfelt wrote:
 On Wed, Jul 16, 2014, Mark McLoughlin mar...@redhat.com wrote:
  No, there are features or code paths of the libvirt 1.2.5+ driver that
  aren't as well tested as the class A designation implies. And we have
  a proposal to make sure these aren't used by default:
  
https://review.openstack.org/107119
  
  i.e. to stray off the class A path, an operator has to opt into it by
  changing a configuration option that explains they will be enabling code
  paths which aren't yet tested upstream.
 
 So that means the libvirt driver will be a mix of tested and untested
 features, but only the tested code paths will be enabled by default?
 
 The gate not only tests code as it gets merged, it tests to make sure it
 doesn't get broken in the future by other changes.
 
 What happens when it comes time to bump the default version_cap in the
 future? It looks like there could potentially be a scramble to fix code
 that has been merged but doesn't work now that it's being tested. Which
 potentially further slows down development since now unrelated code
 needs to be fixed.
 
 This sounds like we're actively weakening the gate we currently have.

If the gate has libvirt 1.2.2 and a feature is added to Nova that
depends on libvirt 1.2.5, then the gate is already not testing that
codepath since it lacks the libvirt version necessary to test it.
The version cap should not be changing that; it is just making it
more explicit that it hasn't been tested.
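Daniel's point can be made concrete with a small sketch. This is not Nova's actual driver code; the helper name, the feature, and the version numbers below are illustrative assumptions in the spirit of the version-cap proposal being discussed:

```python
# Hedged sketch of version-gated feature selection: a feature only runs
# when the connected libvirt is new enough, and an optional version cap
# keeps newer, not-yet-gate-tested paths off by default. Names and
# version numbers are illustrative, not Nova's real code.

LIVE_SNAPSHOT_MIN = (1, 2, 5)   # assumed minimum version for the feature

def has_min_version(connection_version, minimum, version_cap=None):
    """True if the feature's minimum is satisfied after applying the cap."""
    effective = connection_version
    if version_cap is not None:
        # Behave as if libvirt were no newer than the cap.
        effective = min(effective, version_cap)
    return effective >= minimum

# Gate runs libvirt 1.2.2: the 1.2.5 codepath is simply never exercised.
assert not has_min_version((1, 2, 2), LIVE_SNAPSHOT_MIN)

# A deployment on 1.2.9 with a default cap of 1.2.2 also skips it, making
# "not tested upstream" an explicit opt-in rather than an implicit surprise.
assert not has_min_version((1, 2, 9), LIVE_SNAPSHOT_MIN, version_cap=(1, 2, 2))
assert has_min_version((1, 2, 9), LIVE_SNAPSHOT_MIN)
```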

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Chuck Short
On Thu, Jul 17, 2014 at 8:13 AM, Mark McLoughlin mar...@redhat.com wrote:

 On Thu, 2014-07-17 at 09:58 +0100, Daniel P. Berrange wrote:
  On Thu, Jul 17, 2014 at 08:46:12AM +1000, Michael Still wrote:
   Top posting to the original email because I want this to stand out...
  
   I've added this to the agenda for the nova mid cycle meetup, I think
   most of the contributors to this thread will be there. So, if we can
   nail this down here then that's great, but if we think we'd be more
   productive in person chatting about this then we have that option too.
 
  FYI, I'm afraid I won't be at the mid-cycle meetup since it clashed with
  my being on holiday. So I'd really prefer if we keep the discussion on
  this mailing list where everyone has a chance to participate.

 Same here. Pre-arranged vacation, otherwise I'd have been there.

 Mark.


I'll be there.





Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Clint Byrum
Excerpts from Chris Friesen's message of 2014-07-16 11:38:44 -0700:
 On 07/16/2014 11:59 AM, Monty Taylor wrote:
  On 07/16/2014 07:27 PM, Vishvananda Ishaya wrote:
 
  This is a really good point. As someone who has to deal with packaging
  issues constantly, it is odd to me that libvirt is one of the few places
  where we depend on upstream packaging. We constantly pull in new python
  dependencies from pypi that are not packaged in ubuntu. If we had to
  wait for packaging before merging, the whole system would grind to a halt.
 
  I think we should be updating our libvirt version more frequently by
  installing from source or our own ppa instead of waiting for the ubuntu
  team to package it.
 
  Shrinking in terror from what I'm about to say ... but I actually agree
  with this. There are SEVERAL logistical issues we'd need to sort, not
  the least of which involve the actual mechanics of us doing that and
  properly gating, etc. But I think that, like the python depends where we
  tell distros what version we _need_ rather than using what version they
  have, libvirt, qemu, ovs and maybe one or two other things are areas in
  which we may want or need to have a strongish opinion.
 
  I'll bring this up in the room tomorrow at the Infra/QA meetup, and will
  probably be flayed alive for it - but maybe I can put forward a
  straw-man proposal on how this might work.
 
 How would this work...would you have them uninstall the distro-provided 
 libvirt/qemu and replace them with newer ones?  (In which case what 
 happens if the version desired by OpenStack has bugs in features that 
 OpenStack doesn't use, but that some other software that the user wants 
 to run does use?)
 
 Or would you have OpenStack versions of them installed in parallel in an 
 alternate location?

Yes. See: docker, lxc, chroot. (Listed in descending hipsterness order).



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Russell Bryant
On 07/17/2014 11:40 AM, Daniel P. Berrange wrote:
 On Wed, Jul 16, 2014 at 09:44:55AM -0700, Johannes Erdfelt wrote:
 On Wed, Jul 16, 2014, Mark McLoughlin mar...@redhat.com wrote:
 No, there are features or code paths of the libvirt 1.2.5+ driver that
 aren't as well tested as the class A designation implies. And we have
 a proposal to make sure these aren't used by default:

   https://review.openstack.org/107119

 i.e. to stray off the class A path, an operator has to opt into it by
 changing a configuration option that explains they will be enabling code
 paths which aren't yet tested upstream.

 So that means the libvirt driver will be a mix of tested and untested
 features, but only the tested code paths will be enabled by default?

 The gate not only tests code as it gets merged, it tests to make sure it
 doesn't get broken in the future by other changes.

 What happens when it comes time to bump the default version_cap in the
 future? It looks like there could potentially be a scramble to fix code
 that has been merged but doesn't work now that it's being tested. Which
 potentially further slows down development since now unrelated code
 needs to be fixed.

 This sounds like we're actively weakening the gate we currently have.
 
 If the gate has libvirt 1.2.2 and a feature is added to Nova that
 depends on libvirt 1.2.5, then the gate is already not testing that
 codepath since it lacks the libvirt version necessary to test it.
 The version cap should not be changing that; it is just making it
 more explicit that it hasn't been tested.

And hopefully it will make future updates a little smoother.  We can
turn on the new features in only a subset of jobs to minimize potential
disruption.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Johannes Erdfelt
On Thu, Jul 17, 2014, Daniel P. Berrange berra...@redhat.com wrote:
 On Wed, Jul 16, 2014 at 09:44:55AM -0700, Johannes Erdfelt wrote:
  So that means the libvirt driver will be a mix of tested and untested
  features, but only the tested code paths will be enabled by default?
  
  The gate not only tests code as it gets merged, it tests to make sure it
  doesn't get broken in the future by other changes.
  
  What happens when it comes time to bump the default version_cap in the
  future? It looks like there could potentially be a scramble to fix code
  that has been merged but doesn't work now that it's being tested. Which
  potentially further slows down development since now unrelated code
  needs to be fixed.
  
  This sounds like we're actively weakening the gate we currently have.
 
 If the gate has libvirt 1.2.2 and a feature is added to Nova that
 depends on libvirt 1.2.5, then the gate is already not testing that
 codepath since it lacks the libvirt version necessary to test it.
 The version cap should not be changing that; it is just making it
 more explicit that it hasn't been tested.

It kind of helps. It's still implicit in that you need to look at what
features are enabled at what version and determine if it is being
tested.

But the behavior is still broken since code is still getting merged that
isn't tested. Saying that it is "by design" doesn't help the fact that
potentially broken code exists.

Also, this explanation doesn't answer my question about what happens
when the gate finally gets around to actually testing those potentially
broken code paths.

JE




Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Russell Bryant
On 07/17/2014 02:31 PM, Johannes Erdfelt wrote:
 On Thu, Jul 17, 2014, Daniel P. Berrange berra...@redhat.com wrote:
 On Wed, Jul 16, 2014 at 09:44:55AM -0700, Johannes Erdfelt wrote:
 So that means the libvirt driver will be a mix of tested and untested
 features, but only the tested code paths will be enabled by default?

 The gate not only tests code as it gets merged, it tests to make sure it
 doesn't get broken in the future by other changes.

 What happens when it comes time to bump the default version_cap in the
 future? It looks like there could potentially be a scramble to fix code
 that has been merged but doesn't work now that it's being tested. Which
 potentially further slows down development since now unrelated code
 needs to be fixed.

 This sounds like we're actively weakening the gate we currently have.

 If the gate has libvirt 1.2.2 and a feature is added to Nova that
 depends on libvirt 1.2.5, then the gate is already not testing that
 codepath since it lacks the libvirt version necessary to test it.
 The version cap should not be changing that; it is just making it
 more explicit that it hasn't been tested.
 
 It kind of helps. It's still implicit in that you need to look at what
 features are enabled at what version and determine if it is being
 tested.
 
 But the behavior is still broken since code is still getting merged that
 isn't tested. Saying that it is "by design" doesn't help the fact that
 potentially broken code exists.

Well, it may not be tested in our CI yet, but that doesn't mean it's not
tested some other way, at least.

I think there are some good ideas in other parts of this thread to look
at how we can more regularly rev libvirt in the gate to mitigate this.

There's also been work going on to get Fedora enabled in the gate, which
is a distro that regularly carries a much more recent version of libvirt
(among other things), so that's another angle that may help.

 Also, this explanation doesn't answer my question about what happens
 when the gate finally gets around to actually testing those potentially
 broken code paths.

I think we would just test out the bump and make sure it's working fine
before it's enabled for every job.  That would keep potential breakage
localized to people working on debugging/fixing it until it's ready to go.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-17 Thread Johannes Erdfelt
On Thu, Jul 17, 2014, Russell Bryant rbry...@redhat.com wrote:
 On 07/17/2014 02:31 PM, Johannes Erdfelt wrote:
  It kind of helps. It's still implicit in that you need to look at what
  features are enabled at what version and determine if it is being
  tested.
  
  But the behavior is still broken since code is still getting merged that
  isn't tested. Saying that is by design doesn't help the fact that
  potentially broken code exists.
 
 Well, it may not be tested in our CI yet, but that doesn't mean it's not
 tested some other way, at least.

I'm skeptical. Unless it's tested continuously, it'll likely break at
some point.

We seem to be selectively choosing the "continuous" part of CI. I'd
understand if it were done reluctantly because of immediate problems, but
this reads like it's acceptable long-term too.

 I think there are some good ideas in other parts of this thread to look
 at how we can more reguarly rev libvirt in the gate to mitigate this.
 
 There's also been work going on to get Fedora enabled in the gate, which
 is a distro that regularly carries a much more recent version of libvirt
 (among other things), so that's another angle that may help.

That's an improvement, but I'm still not sure I understand what the
workflow will be for developers.

Do they need to now wait for Fedora to ship a new version of libvirt?
Fedora is likely to help the problem because of how quickly it generally
ships new packages and its release schedule, but it would still hold
back some features?

  Also, this explanation doesn't answer my question about what happens
  when the gate finally gets around to actually testing those potentially
  broken code paths.
 
 I think we would just test out the bump and make sure it's working fine
 before it's enabled for every job.  That would keep potential breakage
 localized to people working on debugging/fixing it until it's ready to go.

The downside is that new features for libvirt could be held back by
needing to fix other unrelated features. This is certainly not a bigger
problem than users potentially running untested code simply because they
are on a newer version of libvirt.

I understand we have an immediate problem and I see the short-term value
in the libvirt version cap.

I try to look at the long-term and unless it's clear to me that a
solution is proposed to be short-term and there are some understood
trade-offs then I'll question the long-term implications of it.

JE




Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Daniel P. Berrange
On Wed, Jul 16, 2014 at 04:15:40PM +0200, Sean Dague wrote:
 Recently the main gate updated from Ubuntu 12.04 to 14.04, and in doing
 so we started executing the livesnapshot code in the nova libvirt
 driver. Which fails about 20% of the time in the gate, as we're bringing
 computes up and down while doing a snapshot. Dan Berrange did a bunch of
 debug on that and thinks it might be a qemu bug. We disabled these code
 paths, so live snapshot has now been ripped out.
 
 In January we also triggered a libvirt bug, and had to carry a private
 build of libvirt for 6 weeks in order to let people merge code in OpenStack.
 
 We never were able to switch to libvirt 1.1.1 in the gate using the
 Ubuntu Cloud Archive during Icehouse development, because it has a
 different set of failures that would have prevented people from merging
 code.
 
 Based on these experiences, libvirt version differences seem to be as
 substantial as major hypervisor differences.

I think that is a pretty dubious conclusion to draw from just a
couple of bugs. The reason they really caused pain is because
the CI test system was based on an old version for too long. If it
were tracking the current upstream version of libvirt/KVM we'd have
seen the problem much sooner and been able to resolve it during
review of the change introducing the feature, as we do with any
other bugs we encounter in software, such as the breakage we see
with my stuff off pypi.

 There is a proposal here -
 https://review.openstack.org/#/c/103923/ to hold newer versions of
 libvirt to the same standard we hold xen, vmware, hyperv, docker,
 ironic, etc.

That is a rather misleading statement you're making there. Libvirt is
in fact held to *higher* standards than xen/vmware/hyperv because it
is actually gating all commits. The 3rd party CI systems can be
broken for days or weeks and we still happily accept code for those
virt. drivers.

AFAIK there has never been any statement that every feature added
to xen/vmware/hyperv must be tested by the 3rd party CI system.
All of the CI systems, for whatever driver, are currently testing
some arbitrary subset of the overall features of that driver, and
by no means every new feature being approved in review has coverage.

 I'm somewhat concerned that the -2 pile on in this review is a double
 standard of libvirt features, and features exploiting really new
 upstream features. I feel like a lot of the language being used here
 about the burden of doing this testing is exactly the same as was
 presented by the docker team before their driver was removed, which was
 ignored by the Nova team at the time. It was the concern by the freebsd
 team, which was also ignored and they were told to go land libvirt
 patches instead.

As above, the only double standard is that libvirt tests are all gating
and 3rd party tests are non-gating.

 If we want to reduce the standards for libvirt we should reconsider
 what's being asked of 3rd party CI teams, and things like the docker
 driver, as well as the A, B, C driver classification. Because clearly
 libvirt 1.2.5+ isn't actually class A supported.

AFAIK the requirement for 3rd party CI is merely that it has to exist,
running some arbitrary version of the hypervisor in question. We've
not said that 3rd party CI has to be covering every version or every
feature, as is trying to be pushed on libvirt here.

The Class A, Class B, Class C classifications were always only
ever going to be a crude approximation. Unless you define them to be
wrt the explicit version of every single deb/pypi package installed
wrt the explicit version of every single deb/pypi package installed
in the gate system (which I don't believe anyone has ever suggested)
there is always risk that a different version of some package has a
bug that Nova tickles.

IMHO the classification we do for drivers provides an indication as 
to the quality of the *Nova* code. IOW class A indicates that we've
thoroughly tested the Nova code and believe it to be free of bugs for
the features we've tested. If there is a bug in a 3rd party package
that doesn't imply that the Nova code is any less well tested or
more buggy. Replace libvirt with mysql in your example above. A new
version of mysql with a bug does not imply that Nova is suddenly not
class A tested.

IMHO it is up to the downstream vendors to run testing to ensure that
what they give to their customers, still achieves the quality level
indicated by the tests upstream has performed on the Nova code.

 Anyway, discussion welcomed. My primary concern right now isn't actually
 where we set the bar, but that we set the same bar for everyone.

As above, aside from the question of gating vs non-gating, the bar is
already set at the same level of everyone. There has to be a CI system
somewhere testing some arbitrary version of the software. Everyone meets
that requirement.

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|

Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Mark McLoughlin
On Wed, 2014-07-16 at 16:15 +0200, Sean Dague wrote:
..
 Based on these experiences, libvirt version differences seem to be as
 substantial as major hypervisor differences. There is a proposal here -
 https://review.openstack.org/#/c/103923/ to hold newer versions of
 libvirt to the same standard we hold xen, vmware, hyperv, docker,
 ironic, etc.

That's a bit of a mis-characterization - in terms of functional test
coverage, the libvirt driver is the bar that all the other drivers
struggle to meet.

And I doubt any of us pay too close attention to the feature coverage
that the 3rd party CI test jobs have.

 I'm somewhat concerned that the -2 pile on in this review is a double
 standard of libvirt features, and features exploiting really new
 upstream features. I feel like a lot of the language being used here
 about the burden of doing this testing is exactly the same as was
 presented by the docker team before their driver was removed, which was
 ignored by the Nova team at the time.

Personally, I wasn't very comfortable with the docker driver move. It
certainly gave an outward impression that we're an unfriendly community.
The mitigating factor was that a lot of friendly, collaborative,
coaching work went on in the background for months. Expectations were
communicated well in advance.

Kicking the docker driver out of the tree has resulted in an uptick in
the amount of work happening on it, but I suspect most people involved
have a bad taste in their mouths. I guess there are incentives at play
which mean they'll continue plugging away at it, but those incentives
aren't always at play.

 It was the concern by the freebsd
 team, which was also ignored and they were told to go land libvirt
 patches instead.
 
 I'm ok with us as a project changing our mind and deciding that the test
 bar needs to be taken down a notch or two because it's too burdensome to
 contributors and vendors, but if we are doing that, we need to do it for
 everyone. A lot of other organizations have put a ton of time and energy
 into this, and are carrying a maintenance cost of running these systems
 to get results back in a timely basis.

I don't agree that we need to apply the same rules equally to everyone.

At least part of the reasoning behind the emphasis on 3rd party CI
testing was that projects (Neutron in particular) were being overwhelmed
by contributions to drivers from developers who never contributed in any
way to the core. The corollary of that is the contributors who do
contribute to the core should be given a bit more leeway in return.

There's a natural building of trust and element of human relationships
here. As a reviewer, you learn to trust contributors with a good track
record and perhaps prioritize contributions from them.

 As we seem deadlocked in the review, I think the mailing list is
 probably a better place for this.
 
 If we want to reduce the standards for libvirt we should reconsider
 what's being asked of 3rd party CI teams, and things like the docker
 driver, as well as the A, B, C driver classification. Because clearly
 libvirt 1.2.5+ isn't actually class A supported.

No, there are features or code paths of the libvirt 1.2.5+ driver that
aren't as well tested as the class A designation implies. And we have
a proposal to make sure these aren't used by default:

  https://review.openstack.org/107119

i.e. to stray off the class A path, an operator has to opt into it by
changing a configuration option that explains they will be enabling code
paths which aren't yet tested upstream.
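As a rough illustration of the opt-in Mark describes - the option name, group, and default below are assumptions sketched from the proposal, not necessarily what review 107119 merges:

```ini
[libvirt]
# Illustrative sketch: treat the connected libvirt as if it were at most
# this version, so code paths newer than what the gate tests stay off by
# default. Operators opt into newer, not-yet-gate-tested paths by raising
# or clearing the cap.
version_cap = 1.2.2
```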

These features have value to some people now, they don't risk regressing
the class A driver and there's a clear path to them being elevated to
class A in time. We should value these contributions and nurture these
contributors.

Appending some of my comments from the review below. The tl;dr is that I
think we're losing sight of the importance of welcoming and nurturing
contributors, and valuing whatever contributions they can make. That
terrifies me. 

Mark.

---

Compared to other open source projects, we have done an awesome job in
OpenStack of having good functional test coverage. Arguably, given the
complexity of the system, we couldn't have got this far without it. I
can take zero credit for any of it.

However, not everything is tested now, nor are the tests we have
foolproof. When you consider the number of configuration options we
have, the supported distros, the ranges of library versions we claim to
support, etc., etc. I don't think we can ever get to an everything is
tested point.

In the absence of that, I think we should aim to be more clear what *is*
tested. The config option I suggest does that, which is a big part of
its merit IMHO.

We've had some success with the "be nasty enough to driver contributors
and they'll do what we want" approach so far, but IMHO that was an
exceptional approach for an exceptional situation - drivers that were
completely broken, and driver developers who didn't contribute to the
core 

Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Clark Boylan
On Wed, Jul 16, 2014 at 7:50 AM, Daniel P. Berrange berra...@redhat.com wrote:
 On Wed, Jul 16, 2014 at 04:15:40PM +0200, Sean Dague wrote:
 Recently the main gate updated from Ubuntu 12.04 to 14.04, and in doing
 so we started executing the livesnapshot code in the nova libvirt
 driver. Which fails about 20% of the time in the gate, as we're bringing
 computes up and down while doing a snapshot. Dan Berrange did a bunch of
 debug on that and thinks it might be a qemu bug. We disabled these code
 paths, so live snapshot has now been ripped out.

 In January we also triggered a libvirt bug, and had to carry a private
 build of libvirt for 6 weeks in order to let people merge code in OpenStack.

 We never were able to switch to libvirt 1.1.1 in the gate using the
 Ubuntu Cloud Archive during Icehouse development, because it has a
 different set of failures that would have prevented people from merging
 code.

 Based on these experiences, libvirt version differences seem to be as
 substantial as major hypervisor differences.

 I think that is a pretty dubious conclusion to draw from just a
 couple of bugs. The reason they really caused pain is because
 the CI test system was based on an old version for too long. If it
 were tracking the current upstream version of libvirt/KVM we'd have
 seen the problem much sooner and been able to resolve it during
 review of the change introducing the feature, as we do with any
 other bugs we encounter in software, such as the breakage we see
 with my stuff off pypi.

How do you suggest we do this effectively with libvirt? In the past we
have tried to use newer versions of libvirt and they completely broke.
And the time to fixing that was non trivial. For most of our pypi
stuff we attempt to fix upstream and if that does not happen quickly
we pin (arguably we don't do this well either, see the sqlalchemy=0.7
issues of the past).

I am worried that we would just regress to the current process because
we have tried something similar to this previously and were forced to
regress to the current process.

 There is a proposal here -
 https://review.openstack.org/#/c/103923/ to hold newer versions of
 libvirt to the same standard we hold xen, vmware, hyperv, docker,
 ironic, etc.

 That is a rather misleading statement you're making there. Libvirt is
 in fact held to *higher* standards than xen/vmware/hyperv because it
 is actually gating all commits. The 3rd party CI systems can be
 broken for days or weeks and we still happily accept code for those
 virt. drivers.

 AFAIK there has never been any statement that every feature added
 to xen/vmware/hyperv must be tested by the 3rd party CI system.
 All of the CI systems, for whatever driver, are currently testing
 some arbitrary subset of the overall features of that driver, and
 by no means every new feature being approved in review has coverage.

 I'm somewhat concerned that the -2 pile on in this review is a double
 standard of libvirt features, and features exploiting really new
 upstream features. I feel like a lot of the language being used here
 about the burden of doing this testing is exactly the same as was
 presented by the docker team before their driver was removed, which was
 ignored by the Nova team at the time. It was the concern by the freebsd
 team, which was also ignored and they were told to go land libvirt
 patches instead.

 As above the only double standard is that libvirt tests are all gating
 and 3rd party tests are non-gating.

 If we want to reduce the standards for libvirt we should reconsider
 what's being asked of 3rd party CI teams, and things like the docker
 driver, as well as the A, B, C driver classification. Because clearly
 libvirt 1.2.5+ isn't actually class A supported.

 AFAIK the requirement for 3rd party CI is merely that it has to exist,
 running some arbitrary version of the hypervisor in question. We've
 not said that 3rd party CI has to be covering every version or every
 feature, as is trying to be pushed on libvirt here.

 The Class A, Class B, Class C classifications were always only
 ever going to be a crude approximation. Unless you define them to be
 wrt the explicit version of every single deb/pypi package installed
 in the gate system (which I don't believe anyone has ever suggested)
 there is always risk that a different version of some package has a
 bug that Nova tickles.

 IMHO the classification we do for drivers provides an indication as
 to the quality of the *Nova* code. IOW class A indicates that we've
 thoroughly tested the Nova code and believe it to be free of bugs for
 the features we've tested. If there is a bug in a 3rd party package
 that doesn't imply that the Nova code is any less well tested or
 more buggy. Replace libvirt with mysql in your example above. A new
 version of mysql with a bug does not imply that Nova is suddenly not
 class A tested.

 IMHO it is up to the downstream vendors to run testing to ensure that
 what 

Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Daniel P. Berrange
On Wed, Jul 16, 2014 at 08:12:47AM -0700, Clark Boylan wrote:
 On Wed, Jul 16, 2014 at 7:50 AM, Daniel P. Berrange berra...@redhat.com 
 wrote:
  On Wed, Jul 16, 2014 at 04:15:40PM +0200, Sean Dague wrote:
  Recently the main gate updated from Ubuntu 12.04 to 14.04, and in doing
  so we started executing the livesnapshot code in the nova libvirt
  driver. Which fails about 20% of the time in the gate, as we're bringing
  computes up and down while doing a snapshot. Dan Berrange did a bunch of
  debug on that and thinks it might be a qemu bug. We disabled these code
  paths, so live snapshot has now been ripped out.
 
  In January we also triggered a libvirt bug, and had to carry a private
  build of libvirt for 6 weeks in order to let people merge code in 
  OpenStack.
 
  We never were able to switch to libvirt 1.1.1 in the gate using the
  Ubuntu Cloud Archive during Icehouse development, because it has a
  different set of failures that would have prevented people from merging
  code.
 
  Based on these experiences, libvirt version differences seem to be as
  substantial as major hypervisor differences.
 
  I think that is a pretty dubious conclusion to draw from just a
  couple of bugs. The reason they really caused pain is because
  the CI test system was based on an old version for too long. If it
  were tracking the current upstream version of libvirt/KVM we'd have
  seen the problem much sooner and been able to resolve it during
  review of the change introducing the feature, as we do with any
  other bugs we encounter in software, such as the breakage we see
  with my stuff off pypi.
 
 How do you suggest we do this effectively with libvirt? In the past we
 have tried to use newer versions of libvirt and they completely broke.
 And the time to fixing that was non trivial. For most of our pypi
 stuff we attempt to fix upstream and if that does not happen quickly
 we pin (arguably we don't do this well either, see the sqlalchemy=0.7
 issues of the past).

The real big problem we had was the firewall deadlock problem. When
I was made aware of that problem I worked on fixing that in upstream
libvirt immediately. IIRC we had a solution in a week or two which
was added to a libvirt stable release update. Much of the further
delay was in waiting for the fixes to make their way into the
Ubuntu repositories. If the gate were ignoring Ubuntu repos and
pulling latest upstream libvirt, then we could have just pinned
to an older libvirt until the fix was pushed out to a stable
libvirt release. The libvirt community release process is flexible
enough to push out priority bug fix releases in a matter of days,
or less,  if needed. So temporarily pinning isn't the end of the
world in that respect.
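For illustration, temporarily pinning in an Ubuntu-based gate could be as small as an APT preferences fragment holding libvirt at the last known-good build until a fixed stable release lands; the package names are the usual Ubuntu ones, but the version string here is purely illustrative:

```
# /etc/apt/preferences.d/libvirt-pin  (illustrative version number)
Package: libvirt-bin libvirt0 python-libvirt
Pin: version 1.1.1-*
Pin-Priority: 1001
```

A Pin-Priority above 1000 even forces a downgrade if a newer, broken build has already been installed on the image.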

 I am worried that we would just regress to the current process because
 we have tried something similar to this previously and were forced to
 regress to the current process.

IMHO the longer we wait between updating the gate to new versions
the bigger the problems we create for ourselves. eg we were switching
from 0.9.8 released Dec 2011, to 1.1.1 released Jun 2013, so we
were exposed to over 1 + 1/2 years worth of code churn in a single
event. The fact that we only hit a couple of bugs in that, is actually
remarkable given the amount of feature development that had gone into
libvirt in that time. If we had been tracking each intervening libvirt
release I expect the majority of updates would have had no ill effect
on us at all. For the couple of releases where there was a problem we
would not be forced to rollback to a version years older again, we'd
just drop back to the previous release at most 1 month older.

Ultimately, thanks to us identifying and fixing those previously seen
bugs, we did just switch from 0.9.8 to 1.2.2 which is a 2+1/2 year
jump, and the only problem we've hit is the live snapshot problem
which appears to be a QEMU bug.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Dan Smith
 Based on these experiences, libvirt version differences seem to be as
 substantial as major hypervisor differences.
 
 I think that is a pretty dubious conclusion to draw from just a
 couple of bugs. The reason they really caused pain is that
 the CI test system was based on an old version for too long.

I think the conclusion being made is that libvirt versions two years
apart are effectively like different major versions of a hypervisor. I
don't think that's wrong.

 That is a rather misleading statement you're making there. Libvirt is
 in fact held to *higher* standards than xen/vmware/hyperv because it
 is actually gating all commits. The 3rd party CI systems can be
 broken for days, weeks and we still happily accept code for those
 virt. drivers.

Right, and we've talked about raising that bar as well, by tracking
their status more closely, automatically -2'ing patches that touch the
subdirectory but don't get a passing vote from the associated CI system,
etc.

You're definitely right that libvirt is held to a higher bar in terms of
it being required to pass tests before we can even mechanically land a
patch. However, there is a lot of function in the driver that we don't
test right now because of the version we're tied to in the gate nodes.
It's actually *easier* for a 3rd party system like vmware to roll their
environment and enable tests of newer features, so I don't think that
this requirement would cause existing 3rd party CI systems any trouble.

 AFAIK there has never been any statement that every feature added
 to xen/vmware/hyperv must be tested by the 3rd party CI system.

On almost every spec that doesn't already call it out, a reviewer asks
"how are you going to test this beyond just unit tests?" I think the
assumption and feeling among most reviewers is that they are wary of
approving new features, especially ones that depend on new things (be it
storage drivers, hypervisor versions, etc.), without testing.

 AFAIK the requirement for 3rd party CI is merely that it has to exist,
 running some arbitrary version of the hypervisor in question. We've
 not said that 3rd party CI has to be covering every version or every
 feature, as is trying to be pushed on libvirt here.

The requirement in the past has been that it has to exist. At the last
summit, we had a discussion about how to raise the bar on what we
currently have. We made a lot of progress getting those systems
established (only because we had a requirement, by the way) in the last
cycle. Going forward, we need to have new levels of expectations in
terms of coverage and reliability of those things, IMHO.

 As above, aside from the question of gating vs non-gating, the bar is
 already set at the same level of everyone. There has to be a CI system
 somewhere testing some arbitrary version of the software. Everyone meets
 that requirement.

Wording our current requirement as you have here makes it sound like an
arbitrary ticky mark, which saddens and kind of offends me. What we
currently have was a step in the right direction. It was a lot of work,
but it's by no means arbitrary nor sufficient, IMHO.

--Dan





Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Eric Windisch
On Wed, Jul 16, 2014 at 10:15 AM, Sean Dague s...@dague.net wrote:

 Recently the main gate updated from Ubuntu 12.04 to 14.04, and in doing
 so we started executing the livesnapshot code in the nova libvirt
 driver. Which fails about 20% of the time in the gate, as we're bringing
 computes up and down while doing a snapshot. Dan Berrange did a bunch of
 debug on that and thinks it might be a qemu bug. We disabled these code
 paths, so live snapshot has now been ripped out.

 In January we also triggered a libvirt bug, and had to carry a private
 build of libvirt for 6 weeks in order to let people merge code in
 OpenStack.

 We never were able to switch to libvirt 1.1.1 in the gate using the
 Ubuntu Cloud Archive during Icehouse development, because it has a
 different set of failures that would have prevented people from merging
 code.

 Based on these experiences, libvirt version differences seem to be as
 substantial as major hypervisor differences. There is a proposal here -
 https://review.openstack.org/#/c/103923/ to hold newer versions of
 libvirt to the same standard we hold xen, vmware, hyperv, docker,
 ironic, etc.

 I'm somewhat concerned that the -2 pile on in this review is a double
 standard of libvirt features, and features exploiting really new
 upstream features. I feel like a lot of the language being used here
 about the burden of doing this testing is exactly the same as was
 presented by the docker team before their driver was removed, which was
 ignored by the Nova team at the time. It was the concern by the freebsd
 team, which was also ignored and they were told to go land libvirt
 patches instead.


For running our own CI, the burden was largely a matter of resource and
time constraints for individual contributors and/or startups to setup and
maintain 3rd-party CI, especially in light of a parallel requirement to
pass the CI itself. I received community responses that equated to, "if
you were serious, you'd dedicate several full-time developers and/or
infrastructure engineers to OpenStack development, plus several
thousand a month in infrastructure itself." For Docker, these were simply
not options. Back in January, putting 2-3 engineers fulltime toward
OpenStack would have been a contribution of 10-20% of our engineering
force. OpenStack is not more important to us than Docker itself.

This thread highlights more deeply the problems for the FreeBSD folks.
First, I still disagree with the recommendation that they contribute to
libvirt. It's a classic example of creating two or more problems from one.
Once they have support in libvirt, how long before their code is in a
version of libvirt acceptable to Nova? When they hit edge-cases or bugs,
requiring changes in libvirt, how long before those fixes are accepted by
Nova?

I concur with thoughts in the Gerrit review which suggest there should be a
non-voting gate for testing against the latest libvirt.

I think the ideal situation would be to functionally test against multiple
versions of libvirt. We'd have at least two versions: trunk,
latest-stable. We might want trunk, trunk-snapshot-XYZ, latest-stable,
version-in-ubuntu, version-in-rhel, or any number of back-versions
included in the gate. The version-in-rhel and version-in-ubuntu might be
good candidates for 3rd-party CI.


Regards,
Eric Windisch


Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Daniel P. Berrange
On Wed, Jul 16, 2014 at 08:29:26AM -0700, Dan Smith wrote:
  Based on these experiences, libvirt version differences seem to be as
  substantial as major hypervisor differences.
  
  I think that is a pretty dubious conclusion to draw from just a
  couple of bugs. The reason they really caused pain is that
  the CI test system was based on an old version for too long.
 
 I think the conclusion being made is that libvirt versions two years
 apart are effectively like different major versions of a hypervisor. I
 don't think that's wrong.
 
  That is a rather misleading statement you're making there. Libvirt is
  in fact held to *higher* standards than xen/vmware/hyperv because it
  is actually gating all commits. The 3rd party CI systems can be
  broken for days, weeks and we still happily accept code for those
  virt. drivers.
 
 Right, and we've talked about raising that bar as well, by tracking
 their status more closely, automatically -2'ing patches that touch the
 subdirectory but don't get a passing vote from the associated CI system,
 etc.
 
 You're definitely right that libvirt is held to a higher bar in terms of
 it being required to pass tests before we can even mechanically land a
 patch. However, there is a lot of function in the driver that we don't
 test right now because of the version we're tied to in the gate nodes.
 It's actually *easier* for a 3rd party system like vmware to roll their
 environment and enable tests of newer features, so I don't think that
 this requirement would cause existing 3rd party CI systems any trouble.
 
  AFAIK there has never been any statement that every feature added
  to xen/vmware/hyperv must be tested by the 3rd party CI system.
 
 On almost every spec that doesn't already call it out, a reviewer asks
 "how are you going to test this beyond just unit tests?" I think the
 assumption and feeling among most reviewers is that they are wary of
 approving new features, especially ones that depend on new things (be it
 storage drivers, hypervisor versions, etc.), without testing.

Expecting new functionality to have testing coverage in the common
case is entirely reasonable. What I disagree with is the proposal
to say it is mandatory, when the current CI system is not able to
test it for any given reason. In some cases it might be reasonable
to expect the contributor to setup 3rd party CI, but we absolutely
cannot make that a fixed rule or we'll kill contributions from
people who are not backed by vendors in a position to spend the
significant resources it takes to set up and maintain CI. IMHO the
burden is on the maintainer of the CI to ensure it is able to
follow the needs of the contributors. ie if the feature needs a
newer libvirt version in order to test with, the CI maintainer(s)
should deal with that. We should not turn away the contributor
for a problem that is outside their control.

  AFAIK the requirement for 3rd party CI is merely that it has to exist,
  running some arbitrary version of the hypervisor in question. We've
  not said that 3rd party CI has to be covering every version or every
  feature, as is trying to be pushed on libvirt here.
 
 The requirement in the past has been that it has to exist. At the last
 summit, we had a discussion about how to raise the bar on what we
 currently have. We made a lot of progress getting those systems
 established (only because we had a requirement, by the way) in the last
 cycle. Going forward, we need to have new levels of expectations in
 terms of coverage and reliability of those things, IMHO.

IMHO we need to maintain a balance between ensuring code quality
and being welcoming and accepting of new contributors. 

New features have a certain value $NNN to the project  our users.
The lack of CI testing does not automatically imply that the value
of that work is erased to $0 or negative $MMM. Of course the lack
of CI will create uncertainty in how valuable it is, and potentially
imply costs for us if we have to deal with resolving bugs later.
We must be careful not to overly obsess on the problems of work
that might have bugs, to the detriment of all the many submissions
that work well.

We need to take a pragmatic view of this tradeoff based on the risk
implied by the new feature. If the new work is impacting existing
functional codepaths then this clearly exposes existing users to
risk of regressions, so if that codepath is not tested this is
something to be very wary of. If the new work is adding new code
paths that existing deployments wouldn't exercise unless they 
explicitly opt in to the feature, the risk is significantly lower.
The existence of unit tests will also serve to limit the risk in
many, but not all, situations. If something is not CI tested then
I'd also expect it to get greater attention during review, with
the reviewers actually testing it functionally themselves as well
as code inspection. Finally we should also have some good faith in
our contributors that they are not in fact just submitting 

Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Kashyap Chamarthy
On Wed, Jul 16, 2014 at 04:15:40PM +0200, Sean Dague wrote:

[. . .]

 Anyway, discussion welcomed. My primary concern right now isn't actually
 where we set the bar, but that we set the same bar for everyone.

As someone who tries to test Nova w/ upstream libvirt/QEMU, here are a
couple of points on why I disagree with your above comments:


  - From time to time I find myself frustrated due to older versions of
libvirt on CI infra systems: I try to investigate a bug, 2 hours
into debugging, it turns out that CI system is using very old
libvirt, alas - it's not in my control. Consequence: The bug
needlessly got bumped up in priority for investigation, while
it's already solved in an existing upstream release, just waiting to
    be picked up by CI infra.

  - Also, as a frequent tester of libvirt upstream, and a participant
in debugging the recent Nova snapshots issue mentioned here, the
    comment[1] (by Daniel Berrange) debunks the illusion of "the
    required version of libvirt should have been released for at least
    30 days" very convincingly, in crystal clear language.

  - FWIW, I feel the libvirt version cap[2] is a fine idea to alleviate
this.

[1] https://review.openstack.org/#/c/103923/ (Comment:Jul 14 9:24 PM)
  -
  The kind of new features we're depending on in Nova (looking at specs
  proposed for Juno) are not the kind of features that users in any distro
  are liable to test themselves, outside of the context of Nova (or
  perhaps oVirt) applications. eg Users in a distro aren't likely to
  seriously test the NUMA/Hugepages stuff in libvirt until it is part of
  Nova and that Nova release is in their distro, which creates a
  chicken+egg problem wrt your proposal. In addition I have not seen any
  evidence of significant libvirt testing by the distro maintainers
  themselves either, except for the enterprise distros, and if we wait
  for enterprise distros to pick up a new libvirt we'd be talking 1 year+
  of delay. Finally if just having it in a distro is your benchmark,
  then this is satisfied by Fedora rawhide inclusion, but there's
  basically no user testing of that. So if you instead set the
  benchmark to be a released distro, then saying this is a 1 month
  delay is rather misleading, because distros only release once every
  6 months, so you'd really be talking about a 7 month delay on using
  new features. For all these reasons, tying Nova acceptance to
  distro inclusion of libvirt is a fundamentally flawed idea that does
  not achieve what it purports to achieve and is detrimental to Nova.
  
  I think the key problem here is that our testing is inadequate and we
  need to address that aspect of it rather than crippling our development
  process.
  -

 [2] https://review.openstack.org/#/c/107119/ -- libvirt: add version
 cap tied to gate CI testing

-- 
/kashyap



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Johannes Erdfelt
On Wed, Jul 16, 2014, Mark McLoughlin mar...@redhat.com wrote:
 No, there are features or code paths of the libvirt 1.2.5+ driver that
 aren't as well tested as the class A designation implies. And we have
 a proposal to make sure these aren't used by default:
 
   https://review.openstack.org/107119
 
 i.e. to stray off the class A path, an operator has to opt into it by
 changing a configuration option that explains they will be enabling code
 paths which aren't yet tested upstream.

So that means the libvirt driver will be a mix of tested and untested
features, but only the tested code paths will be enabled by default?

The gate not only tests code as it gets merged, it tests to make sure it
doesn't get broken in the future by other changes.

What happens when it comes time to bump the default version_cap in the
future? It looks like there could potentially be a scramble to fix code
that has been merged but doesn't work now that it's being tested. Which
potentially further slows down development since now unrelated code
needs to be fixed.
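For illustration, the cap being discussed can be thought of as a second gate layered on the driver's usual minimum-version checks. This is only a sketch of the idea, not the actual Nova patch; the class, method, and version numbers here are all illustrative:

```python
# Sketch of a libvirt version cap layered on top of a minimum-version
# check. All names and version numbers are illustrative, not Nova's API.

def version_to_int(ver):
    """Pack a (major, minor, micro) tuple into a single comparable int."""
    major, minor, micro = ver
    return major * 1000000 + minor * 1000 + micro

class CappedDriver:
    # Highest libvirt version the upstream gate actually tests against
    # (illustrative value).
    TESTED_VERSION_CAP = (1, 1, 1)

    def __init__(self, conn_version, version_cap=None):
        # version_cap would come from a config option; operators opt in
        # to newer code paths by raising it past the tested default.
        self.conn_version = conn_version
        self.version_cap = version_cap or self.TESTED_VERSION_CAP

    def has_min_version(self, required):
        """Feature gate: the connected libvirt must be new enough AND
        the requirement must not exceed the configured cap."""
        need = version_to_int(required)
        return (version_to_int(self.conn_version) >= need and
                version_to_int(self.version_cap) >= need)

# A gate node connected to libvirt 1.2.2, with a feature needing 1.2.0:
driver = CappedDriver(conn_version=(1, 2, 2))
print(driver.has_min_version((1, 2, 0)))   # False: capped at tested 1.1.1

# An operator who raises the cap explicitly steps off the tested path:
opted_in = CappedDriver((1, 2, 2), version_cap=(1, 2, 2))
print(opted_in.has_min_version((1, 2, 0)))  # True
```

Under this sketch the untested code paths exist in the tree but stay dormant until the cap is raised, which is the opt-in behavior described above.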

This sounds like we're actively weakening the gate we currently have.

 However, not everything is tested now, nor is the tests we have
 foolproof. When you consider the number of configuration options we
 have, the supported distros, the ranges of library versions we claim to
 support, etc., etc. I don't think we can ever get to an "everything is
 tested" point.
 
 In the absence of that, I think we should aim to be more clear what *is*
 tested. The config option I suggest does that, which is a big part of
 its merit IMHO.

I like the sound of this, especially since it's not clear right now at
all.

JE




Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Roman Bogorodskiy
  Eric Windisch wrote:

 This thread highlights more deeply the problems for the FreeBSD folks.
 First, I still disagree with the recommendation that they contribute to
 libvirt. It's a classic example of creating two or more problems from one.
 Once they have support in libvirt, how long before their code is in a
 version of libvirt acceptable to Nova? When they hit edge-cases or bugs,
 requiring changes in libvirt, how long before those fixes are accepted by
 Nova?

Could you please elaborate on why you disagree with the "contributing
patches to libvirt" approach, and what alternative approach you propose?

Also, could you please elaborate on what is 'version of libvirt
acceptable to Nova'? Cannot we just say that e.g. Nova requires libvirt
X.Y to be deployed on FreeBSD?

Anyway, speaking about FreeBSD support, I assume we are actually talking
about Bhyve support. I think it'd be good to break up the task and
implement FreeBSD support for libvirt/Qemu first.

The Qemu driver of libvirt has worked fine on FreeBSD for quite some time
already, and adding support for that in Nova will let us do all the
groundwork before we move on to libvirt/bhyve support.

I'm planning to start with adding networking support. Unfortunately, it
seems I was too late with the spec for Juno, though:

https://review.openstack.org/#/c/95328/

Roman Bogorodskiy




Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Vishvananda Ishaya

On Jul 16, 2014, at 8:28 AM, Daniel P. Berrange berra...@redhat.com wrote:

 On Wed, Jul 16, 2014 at 08:12:47AM -0700, Clark Boylan wrote:
 
 I am worried that we would just regress to the current process because
 we have tried something similar to this previously and were forced to
 regress to the current process.
 
 IMHO the longer we wait between updating the gate to new versions
 the bigger the problems we create for ourselves. eg we were switching
 from 0.9.8 released Dec 2011, to  1.1.1 released Jun 2013, so we
 were exposed to over 1 + 1/2 years worth of code churn in a single
 event. The fact that we only hit a couple of bugs in that, is actually
 remarkable given the amount of feature development that had gone into
 libvirt in that time. If we had been tracking each intervening libvirt
 release I expect the majority of updates would have had no ill effect
 on us at all. For the couple of releases where there was a problem we
 would not be forced to rollback to a version years older again, we'd
 just drop back to the previous release at most 1 month older.

This is a really good point. As someone who has to deal with packaging
issues constantly, it is odd to me that libvirt is one of the few places
where we depend on upstream packaging. We constantly pull in new python
dependencies from pypi that are not packaged in ubuntu. If we had to
wait for packaging before merging the whole system would grind to a halt.

I think we should be updating our libvirt version more frequently by
installing from source or from our own PPA instead of waiting for the
Ubuntu team to package it.

Vish




Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Monty Taylor
On 07/16/2014 07:27 PM, Vishvananda Ishaya wrote:
 
 On Jul 16, 2014, at 8:28 AM, Daniel P. Berrange berra...@redhat.com wrote:
 
 On Wed, Jul 16, 2014 at 08:12:47AM -0700, Clark Boylan wrote:

 I am worried that we would just regress to the current process because
 we have tried something similar to this previously and were forced to
 regress to the current process.

 IMHO the longer we wait between updating the gate to new versions
 the bigger the problems we create for ourselves. eg we were switching
 from 0.9.8 released Dec 2011, to  1.1.1 released Jun 2013, so we
 were exposed to over 1 + 1/2 years worth of code churn in a single
 event. The fact that we only hit a couple of bugs in that, is actually
 remarkable given the amount of feature development that had gone into
 libvirt in that time. If we had been tracking each intervening libvirt
 release I expect the majority of updates would have had no ill effect
 on us at all. For the couple of releases where there was a problem we
 would not be forced to rollback to a version years older again, we'd
 just drop back to the previous release at most 1 month older.
 
 This is a really good point. As someone who has to deal with packaging
 issues constantly, it is odd to me that libvirt is one of the few places
 where we depend on upstream packaging. We constantly pull in new python
 dependencies from pypi that are not packaged in ubuntu. If we had to
 wait for packaging before merging the whole system would grind to a halt.
 
 I think we should be updating our libvirt version more frequently by
 installing from source or our own ppa instead of waiting for the ubuntu
 team to package it.

Shrinking in terror from what I'm about to say ... but I actually agree
with this. There are SEVERAL logistical issues we'd need to sort, not
the least of which involve the actual mechanics of us doing that and
properly gating, etc. But I think that, like the python depends where we
tell distros what version we _need_ rather than using what version they
have, libvirt, qemu, ovs and maybe one or two other things are areas in
which we may want or need to have a strongish opinion.

I'll bring this up in the room tomorrow at the Infra/QA meetup, and will
probably be flayed alive for it - but maybe I can put forward a
straw-man proposal on how this might work.

Monty





Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Eric Windisch
On Wed, Jul 16, 2014 at 12:55 PM, Roman Bogorodskiy 
rbogorods...@mirantis.com wrote:

   Eric Windisch wrote:

  This thread highlights more deeply the problems for the FreeBSD folks.
  First, I still disagree with the recommendation that they contribute to
  libvirt. It's a classic example of creating two or more problems from
 one.
  Once they have support in libvirt, how long before their code is in a
  version of libvirt acceptable to Nova? When they hit edge-cases or bugs,
  requiring changes in libvirt, how long before those fixes are accepted by
  Nova?

 Could you please elaborate on why you disagree with the "contributing
 patches to libvirt" approach, and what alternative approach you propose?


I don't necessarily disagree with contributing patches to libvirt. I
believe that the current system makes it difficult to perform quick,
iterative development. I wish to see this thread attempt to solve that
problem and reduce the barrier to getting stuff done.


 Also, could you please elaborate on what is 'version of libvirt
 acceptable to Nova'? Cannot we just say that e.g. Nova requires libvirt
 X.Y to be deployed on FreeBSD?


This is precisely my point, that we need to support different versions of
libvirt and to test those versions. If we're going to support  different
versions of libvirt on FreeBSD, Ubuntu, and RedHat - those should be
tested, possibly as third-party options.

The primary testing path for libvirt upstream should be with the latest
stable release with a non-voting test against trunk. There might be value
in testing against a development snapshot as well, where we know there are
features we want in an unreleased version of libvirt but where we cannot
trust trunk to be stable enough for gate.


 Anyway, speaking about FreeBSD support, I assume we are actually talking
 about Bhyve support. I think it'd be good to break up the task and
 implement FreeBSD support for libvirt/Qemu first.


I believe Sean was referring to Bhyve support; this is how I interpreted
it.


-- 
Regards,
Eric Windisch


Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-07-16 Thread Chris Friesen

On 07/16/2014 11:59 AM, Monty Taylor wrote:

On 07/16/2014 07:27 PM, Vishvananda Ishaya wrote:



This is a really good point. As someone who has to deal with packaging
issues constantly, it is odd to me that libvirt is one of the few places
where we depend on upstream packaging. We constantly pull in new python
dependencies from pypi that are not packaged in ubuntu. If we had to
wait for packaging before merging the whole system would grind to a halt.

I think we should be updating our libvirt version more frequently by
installing from source or our own ppa instead of waiting for the ubuntu
team to package it.


Shrinking in terror from what I'm about to say ... but I actually agree
with this. There are SEVERAL logistical issues we'd need to sort, not
the least of which involve the actual mechanics of us doing that and
properly gating, etc. But I think that, like the python depends where we
tell distros what version we _need_ rather than using what version they
have, libvirt, qemu, ovs and maybe one or two other things are areas in
which we may want or need to have a strongish opinion.

I'll bring this up in the room tomorrow at the Infra/QA meetup, and will
probably be flayed alive for it - but maybe I can put forward a
straw-man proposal on how this might work.


How would this work... would you have them uninstall the distro-provided
libvirt/qemu and replace them with newer ones? (In which case, what
happens if the version desired by OpenStack has bugs in features that
OpenStack doesn't use, but that some other software the user wants to
run does use?)

Or would you have OpenStack versions of them installed in parallel in an
alternate location?


Chris
