On Tue, Jun 14, 2016 at 12:19:54PM -0700, Chris Hoge wrote:
> 
> > On Jun 14, 2016, at 11:21 AM, Matthew Treinish <[email protected]> wrote:
> > 
> > On Tue, Jun 14, 2016 at 10:57:05AM -0700, Chris Hoge wrote:
> >> Last year, in response to Nova micro-versioning and extension updates[1], the QA team added strict API schema checking to Tempest to ensure that no additional properties were added to Nova API responses[2][3]. In the last year, at least three vendors participating in the OpenStack Powered Trademark program have been impacted by this change, two of which reported this to the DefCore Working Group mailing list earlier this year[4].
> >> 
> >> The DefCore Working Group determines guidelines for the OpenStack Powered program, which includes capabilities with associated functional tests from Tempest that must be passed, and designated sections with associated upstream code[5][6]. In determining these guidelines, the working group attempts to balance the future direction of development with lagging indicators of deployments and user adoption.
> >> 
> >> After a tremendous amount of consideration, I believe that the DefCore Working Group needs to implement a temporary waiver for the strict API checking requirements that were introduced last year, to give downstream deployers more time to catch up with the strict micro-versioning requirements determined by the Nova/Compute team and enforced by the Tempest/QA team.
> > 
> > I'm very much opposed to this being done. If we're actually concerned with interoperability and with verifying that things behave in the same manner between multiple clouds, then doing this would be a big step backwards. The fundamental disconnect here is that the vendors who have implemented out-of-band extensions, or were taking advantage of previously available places to inject extra attributes, believe that doing so means they're interoperable, which is quite far from reality. **The API is not a place for vendor differentiation.**
> 
> Yes, it’s bad practice, but it’s also a reality, and I honestly believe that vendors have received the message and are working on changing.
They might be working on this, but this change had been coming for quite some time; it shouldn't be a surprise to anyone at this point. I mean seriously, it's been in Tempest for a year, and it took 6 months to land. Also, let's say we set a hard deadline on this new option to disable the enforcement, and then enforce it. When we implement a similar change for Keystone, are we going to have to do the same thing again when vendors who have custom things running there fail?

> > As a user of several clouds myself I can say that having random gorp in a response makes it much more difficult to use my code against multiple clouds. I have to determine which properties being returned are specific to that vendor's cloud, and if I actually need to depend on them for anything it makes whatever code I'm writing incompatible with any other cloud (unless I special-case that block for each cloud). Sean Dague wrote a good post where a lot of this was covered a year ago when microversions were starting to pick up steam:
> > 
> > https://dague.net/2015/06/05/the-nova-api-in-kilo-and-beyond-2
> > 
> > I'd recommend giving it a read, he explains the user-first perspective more clearly there.
> > 
> > I believe Tempest in this case is doing the right thing from an interoperability perspective and ensuring that the API is actually the API. Not an API with extra bits a vendor decided to add.
> 
> A few points on this, though. Right now, Nova is the only API that is enforcing this, and the clients. While this may change in the future, I don’t think it accurately represents the reality of what’s happening in the ecosystem.

This in itself doesn't make a difference. There is a disparity in the level of testing across the projects. Nova happens to be further along in regards to API stability and testing than a lot of projects, so it's not really a surprise that this came up there first. It's only a matter of time before other projects follow Nova's example and implement similar enforcement.

> As mentioned before, we also need to balance the lagging nature of DefCore as an interoperability guideline with the needs of testing upstream changes. I’m not asking for a permanent change that undermines the goals of Tempest for QA, rather a temporary upstream modification that recognizes the challenges faced by vendors in the market right now, and gives them room to continue to align themselves with upstream. Without this, the two other alternatives are to:
> 
> * Have some vendors leave the Powered program unnecessarily, weakening it.
> * Force DefCore to adopt non-upstream testing, either as a fork or an independent test suite.
> 
> Neither seems ideal to me.

It might not be ideal for a vendor to leave the program, but I think it's a necessary consequence of evolving the guidelines to become stricter over time. What we define as the minimum requirements for interoperability, and by extension use of the trademark, will continue to evolve. Every time we add additional tests, more stringent checking, or change something, inevitably someone is going to fail, no matter how slowly we roll it out. There's a limit to how accommodating we should be here. This change has been in the wild for a year, and it also took 6 months to land. The check in question will only ever cause a failure if you add something extra, something that is not OpenStack, to the API.
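For anyone who hasn't looked at what the enforcement actually is, here's a rough sketch of the shape of the response schemas Tempest applies (simplified and purely illustrative; this isn't the real schema definition from the tempest tree, and the field list is trimmed):

    # Illustrative sketch only -- not the actual Tempest schema code.
    # Tempest's compute response schemas follow roughly this shape, with
    # additionalProperties set to False so undocumented fields are rejected.
    get_server = {
        'status_code': [200],
        'response_body': {
            'type': 'object',
            'properties': {
                'server': {
                    'type': 'object',
                    'properties': {
                        'id': {'type': 'string'},
                        'name': {'type': 'string'},
                        'status': {'type': 'string'},
                        # ... the rest of the documented fields ...
                    },
                    # The part vendors keep tripping over: anything not in
                    # the documented API fails validation.
                    'additionalProperties': False,
                },
            },
        },
    }

A stock deployment sails through that; the only way to trip it is to bolt something extra onto the response.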
All versions of the Nova API (except maybe really old releases like <= Folsom) should pass this check without any issue. I still fail to see how a vendor failing the guidelines here is a bad thing. Isn't this what we're supposed to be doing? Also, DefCore already has a mechanism for slowly rolling out changes like this. The guidelines contain a Tempest sha1 (for better or worse):

https://git.openstack.org/cgit/openstack/defcore/tree/2016.01.json#n113

If the DefCore committee still feels there needs to be a more gradual roll-out of more than a year (which I strongly disagree with), then the minimum sha1 should be set more conservatively, to a point before the change in question. Yes, that means old bugs will still be present in Tempest, but I don't think we can have it both ways here. Either we say you have to pass stricter requirements or we don't. We added idempotent ids to Tempest for exactly this reason, so you can keep track of tests as things change.

> One of my goals is to transparently strengthen the ties between upstream and downstream development. There is a deadline built into this proposal, and my intention is to enforce it.

My argument is that the deadline has already passed. We've been enforcing this in Tempest for a year already. It's only coming up now because some vendors didn't pay attention to what was happening in the community or to the changes that were coming in the testing guidelines, and now they're stuck. From my perspective this will always happen, no matter how gradually we make changes and how much we advertise them.

> > I don't think a cloud or product that does this to the API should be considered an interoperable OpenStack cloud, and failing the tests is the correct behavior.

> I think it’s more nuanced than this, especially right now. Only additions to responses will be considered, not changes. These additions will be clearly labelled as variations, signaling the differences to users. Existing clients in use will not break. Correct behavior will eventually be enforced, and this would be clearly signaled by both the test tool and through the administrative program.

You're making large assumptions about how the APIs are actually consumed here. You can't assume that the only things talking to the APIs are the clients you know about. For example, I have a bunch of code I wrote a while ago that uses the Tempest clients with [1] to interact with clouds. That code would fail the second I talked to a cloud with these extra bits enabled. Granted, that's a bit of a contrived example, but if I'm dealing with the API at a lower level (using my hypothetical hand-built Fortran client) it's perfectly reasonable to assume that I start on vendor A's "openstack" cloud, see the extra params in the response, assume they're everywhere, and make my code depend on them. Then, when I use a cloud deployed from the latest release tarballs on the 3 spare machines in my basement, everything starts failing without any indication of where that extra parameter went. That's the kind of experience we're trying to avoid.

There is also no guarantee that the extra fields are clearly marked. If we disable this checking, literally anything can be added to the responses from Nova and still pass, because we're no longer explicitly checking for it. For example, I could add a top-level field to the server response, "useful": true, for servers running on my proprietary hypervisor and "useful": false for libvirt guests.
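To make that concrete, here's a hypothetical snippet (plain python-jsonschema with made-up field names, not Tempest code) showing how a field like that sails through the moment the strict check is relaxed:

    # Hypothetical demo of why relaxing the strict check means nothing
    # catches a vendor-invented field. Needs python-jsonschema installed.
    import jsonschema

    server_schema = {
        'type': 'object',
        'properties': {
            'id': {'type': 'string'},
            'name': {'type': 'string'},
            'status': {'type': 'string'},
        },
        'required': ['id', 'name', 'status'],
        # 'additionalProperties': False,  # <-- the strict check
    }

    response = {
        'id': 'abc123',
        'name': 'my-server',
        'status': 'ACTIVE',
        'useful': True,  # the made-up vendor field
    }

    # With additionalProperties left out (strict checking disabled),
    # this validates without complaint:
    jsonschema.validate(response, server_schema)

    # Turn the strict check back on and the same response is rejected:
    server_schema['additionalProperties'] = False
    jsonschema.validate(response, server_schema)  # raises ValidationError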
There is nothing stopping me from writing an extension that does exactly that, adding it to the API, and then passing all the tests. Nothing would catch it if we disable the strict validation.

My fundamental concern here is that we're optimizing for the wrong set of priorities. As a community, do we want to prioritize enforcing interoperability, with guidelines we define and develop in the open, so that the things we say are OpenStack behave for users the way the community developed them? Or do we want to optimize for ensuring that vendors who are continually slow to adapt never fail guidelines they've passed in the past?

I'm all for doing a slow roll-out of changes, to give people a chance to adapt as new constraints are added; doing otherwise would be reckless. But I feel that in this case the time for that has passed. I also don't think we should add workarounds to avoid adding constraints as things move forward; instead, we should set a reasonable minimum version of Tempest to use.

-Matt Treinish

[1] https://github.com/mtreinish/mesocyclone
