Zuul just caused my brain to overload ;) Thanks for the detailed explanation.

Sent from my really tiny device...

> On Oct 29, 2013, at 3:42 AM, "Sean Dague" <s...@dague.net> wrote:
>
> Andrew Laski correctly called us out for not really providing enough
> information in the python-novaclient revert yesterday -
> https://review.openstack.org/#/c/54108/. Apologies there. At the time we
> were dealing with a gate in which grenade was failing every change (for the
> prior 6 hours), we were all on our first cup of coffee, and while we got to
> resolution, we did so with an entirely unhelpful commit message to explain
> it.
>
> Here's what happened. python-novaclient landed a change that changed the
> user interface. This change meant that devstack exercises failed on
> validating the details on getting aggregates.
>
> However, upgrade testing is hard, and we had a loophole that led us to a
> wedge in the gate.
>
> For the grenade jobs we prep 2 versions of the OpenStack codebase, grizzly
> and master (yes, still grizzly and master, we're working on that). The
> grizzly tree is grizzly devstack, which means it's grizzly on all the core
> servers, but master on all the clients. However, the grizzly tree doesn't
> get "zuulified", which was the crux of the issue.
>
> By zuulified I mean think about the zuul queue. How do we actually test a
> change 15 deep in the gate? We aren't testing just that change, but all the
> gerrit proposed changes above it. That means that zuul needs to go through
> and update the relevant git trees not just to master, but to the proposed
> change sets for all the jobs in front of it. This is across projects, and
> should be across branches.
>
> But we'd not gotten the system to do this correctly on the "old" side yet.
> Which means that python-novaclient landed a breaking change, but the "old"
> side built a grizzly cloud with only master, not master + gerrit. It passed
> the verification of the "old" cloud, then moved to the new cloud, then ran a
> different set of tests to verify the new cloud, which passed.
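
To check my understanding of "zuulified", here is a rough sketch of the idea in Python. It is only a toy model, not Zuul's real code or data structures: the Change tuple, the function names, and the change numbers are all made up for illustration. It shows which proposed changes each tree should carry when testing a change at a given depth in the gate, and what a tree that is not zuulified ends up with instead.

```python
from collections import namedtuple

# Hypothetical model of a queued change; not Zuul's real data structures.
Change = namedtuple("Change", ["number", "project", "branch"])


def speculative_state(queue, position):
    """Map (project, branch) to the proposed changes that must be applied on
    top of that branch when testing the change at `position` in the queue.

    This covers every change ahead in the queue plus the change itself.
    """
    state = {}
    for change in queue[: position + 1]:
        key = (change.project, change.branch)
        state.setdefault(key, []).append(change.number)
    return state


def tree_contents(state, project, branch, zuulified):
    """Proposed changes present in a checked-out tree.

    A tree that is not zuulified only has the branch tip, so none of the
    proposed changes are applied to it.
    """
    if not zuulified:
        return []
    return list(state.get((project, branch), []))


# Made-up queue: change 102 plays the role of the python-novaclient change.
queue = [
    Change(101, "nova", "stable/grizzly"),
    Change(102, "python-novaclient", "master"),
    Change(103, "keystone", "master"),
    Change(104, "nova", "master"),
]

state = speculative_state(queue, 1)

# "New" side: zuulified, so the client tree carries the proposed change.
new_side = tree_contents(state, "python-novaclient", "master", zuulified=True)

# "Old" side: not zuulified, so the grizzly cloud is built with plain master
# clients and never sees the proposed change.
old_side = tree_contents(state, "python-novaclient", "master", zuulified=False)

print("new side client changes:", new_side)
print("old side client changes:", old_side)
```

In this toy model the old side comes back empty, which matches the loophole described above: the breaking client change was never exercised against the grizzly cloud before it merged.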
>
> However, by threading the needle in this way, it meant no one else could
> ever pass grenade again. The quick fix was the python-novaclient revert. The
> real fix is probably this - https://review.openstack.org/#/c/53940/ which we
> were actually working on last week, to both update the set of trees we are
> using, and update the zuul refs on the "old" side of the equation. Once that
> lands I'll attempt to revert the revert, and ensure that it actually gets
> caught in the system. Then we can work on updating tests so it can get
> through. But right now it's a perfect test case to prove that we did this
> right, so leaving it in the reverted state is critical.
>
> This also highlights one of the reasons I've been hard on folks recently
> about some alternative upgrade or mixed version testing models, and doing it
> outside of grenade. Everything is simple when you talk about a single
> change. But when you are 15 or 20 deep in the zuul gate, and have to handle
> 3 proposed stable nova changes, 5 proposed master nova changes, a keystone
> stable, a keystone master, and a few cinder master changes in front of you
> to build the environments you need to test in the gate... this gets
> complicated fast. Basically you aren't allowed to use git inside your
> upgrade tool for this reason, because your tool has no idea what it's
> supposed to actually test, only ZUUL knows. And, as you can see, we've yet
> to get this whole thing mapped out the first time. :)
>
> -Sean
>
> --
> Sean Dague
> http://dague.net
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev