Hi Gema,

Thanks for starting this thread off. :)
On Fri, 2012-08-31 at 11:38 +0100, Gema Gomez wrote:
> On 30/08/12 19:53, Stéphane Graber wrote:
> > The release team is in charge of releasing a pre-defined set of
> > images, for a given list of media at a given date. That's how
> > things are.
> >
> > When we unfortunately hit a bug at the last minute, like happened
> > last week, the release team needs to check how critical it's. If
> > it's considered as a show-stopper, like was the case here, the
> > only action to take is to fix it as soon as possible, re-test and
> > then release.
> >
> > If we know it's technically impossible to get it re-tested in
> > time, then we need to release a day later, but that's a very last
> > resort as releasing on a Friday brings its own set of problems.
>
> Which kind of problems do you face when releasing on Friday? I think
> it'd be good for us to know the consequences as well.

The problem with releasing on a Friday is that we don't have as good
coverage available to react to problems if they occur. This is standard
policy for stable release updates as well as releases, and has become so
as a result of lessons learned the hard way. Exceptions do occur, but
they are very special cases, and contingency/monitoring plans need to be
figured out in advance.

> > In the case of 12.04.1, we noticed on release day that an image
> > didn't actually fit on its target media and apparently no tester
> > bothered to actually burn it to a standard CD...
>
> You could use du next time right after the image is built to satisfy
> yourself that the size is good, it could be a standard check that you
> guys do.

Certain mandatory manual tests can only be run if a CD is burned;
specifically, the AMD64+Mac based systems don't work with USB.

> We have added some static validation tests to jenkins and are
> in the process of publishing them.

The information was already published on
http://cdimage.ubuntu.com/precise/daily-live/current/ (and related
pages).
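As an aside, the kind of automated size check being discussed is
straightforward to script. Here is a minimal sketch (not our actual
jenkins job; the function name and the dummy image path are
illustrative, and the 737280000-byte limit assumes a standard 703 MiB
CD-R — a DVD would use a different figure):

```shell
#!/bin/sh
# Sketch of a pre-test image size check: compare a built image against
# the capacity of its target media before manual testing begins.
# 737280000 bytes = capacity of a standard 703 MiB CD-R.

check_image_size() {
    image="$1"
    limit="$2"
    # stat -c %s works on GNU/Linux; fall back to BSD-style stat -f %z.
    size=$(stat -c %s "$image" 2>/dev/null || stat -f %z "$image")
    if [ "$size" -gt "$limit" ]; then
        echo "OVERSIZE: $image is $size bytes (limit $limit)"
        return 1
    fi
    echo "OK: $image is $size bytes (limit $limit)"
    return 0
}

# Example: create a small dummy "image" and check it against the CD limit.
dd if=/dev/zero of=/tmp/dummy.iso bs=1024 count=10 2>/dev/null
check_image_size /tmp/dummy.iso 737280000
```

A check like this only catches the raw file size, of course; it doesn't
replace actually burning the image, which is why the manual media tests
still matter.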
It's standard practice that an oversize indicator in bold red is
published on that page when an image exceeds the size specified by the
development teams. It was a failure of both the QA and release teams
that no one looked at the page before Thursday. The release team had
been looking at those pages pretty heavily the prior week and thought we
had all the issues solved. Based on discussions with Stéphane, there are
now plans to add an indicator to the ISO tracker to make oversize issues
more visible in the future, as that is where some folks are now
focusing, rather than the original publishing pages.

> I don't think we need to burn a CD
> to know if the image is going to fit or not. But if you want us to
> validate things manually, adding a test case to the current set in the
> iso tracker will help track that someone has bothered. Unfortunately I
> don't feel confident enough yet with the admin mode of the iso tracker
> to change anything, so your expertise there would be appreciated.

Some of the existing test cases already imply that a CD is burned, so
I'm not sure that another test case needs to be added; rather, an
existing one should be split to make it explicit that a CD (or, when
appropriate, a DVD) be burned as part of the test. If you have specific
questions on admin'ing the iso tracker, please feel free to join us in
#ubuntu-iso-tracker. There are multiple folks available (me,
Jean-Baptiste, Nick) who can help as well.

> Anyway, looking forward rather than backwards, for Quantal the size is
> 800MB so, what media do you suggest we test on for size next week?
>
> https://blueprints.launchpad.net/ubuntu/+spec/desktop-q-one-iso-for-q

For Desktop, please test on both USB "and" DVD. For Ubuntu Server,
please test on both USB "and" CD.
We want to make sure both paths work, since both are likely to be common
based on what hardware folks have access to, and we'll be manufacturing
CDs for Server and DVDs for Desktop, so making sure there are no
significant problems is important.

> > We found an obvious way of fixing it (removing a langpack) within
> > just a couple of hours, got the change reviewed, tested, the image
> > rebuilt, the content checked and then fully re-tested by 3 testers
> > in less than 3 hours. Leaving us a good 10-12h before we actually
> > released the set.
>
> In my opinion, it is not possible for 3 people to do 10 installs + 3
> upgrades each to a good level of details in less than 3 hours. Yes,
> you can rush through things or split the test cases between the three
> of you, or consider some tests done because one test case is sort of a
> subset of another, and do some risk based testing, but the level of
> risk we are accepting is not clear nor understood by all the parties.

It's a case of testing smart, which we should all be aiming for. I had
good confidence after Stéphane, Jean-Baptiste, and Nick had exercised
the tests, having discussed their methodology with them. We understood
the scope of the changes and the possible impact. Perhaps we should look
at getting their best practices more widely understood, so we can help
the newer QA team members become more efficient?

> > So we had more peer review than required at every step, I'm really
> > not understanding what you're complaining about.
>
> See above for explanation. We clearly have different views on what is
> required/acceptable, and we need to reach an agreement, something that
> works for all of us.

The QA contact should always feel free to ask questions in
#ubuntu-release if there are concerns while we're discussing issues that
might motivate a respin.
> > We always respin responsibly, believe it or not, respinning is at
> > least as much pain for the release team as it's for the testers,
> > we never take such action lightly.
>
> I believe it, I'd like to reduce the pain for everyone.

We all want that.

> > Critical installation bugs, security bugs, immediate post-install
> > bugs and CD size problems are usually considered show-stoppers as
> > these can't be worked around by the user. It'd be wrong not to
> > respin for these.
>
> This is good information. What do you mean by usually? do you mean
> always? What would be an example case of those showstoppers that
> doesn't grant a respin?

We discussed the fact that the Chinese 12.04.1 images were oversize with
the PES management responsible for those images. They decided that it
was better for that market to release them oversize than to have to live
with the bugs that would have been present if we hadn't.

> > The "corner case" in [4] is a supported upgrade path, used by
> > governments and other internet-less environments. Not fixing that
> > bug was resulting in completely broken, unbootable systems, and as
> > such definitely fits respin criteria. Our alternative would have
> > been to drop support for these, which we considered and decided
> > not to for 12.04. However 12.04 is going to be the last release
> > where such an upgrade path is supported.
>
> Useful information, thanks, we will add this test case earlier in the
> testing next time, so that we don't find ourselves in that situation
> again. Does this only apply to LTSs or does it also apply to
> intermediate releases? Are such customers likely to upgrade to
> releases in between LTSs?

Internet-less environments are a reality in a large number of
corporations, due to firewalls etc., as well as for those without a
reliable connection. The extent to which this is a priority for a
specific release is determined by the development teams and marketing.
> > I suppose we can do that, ultimately it's always going to be up to
> > the release team to do a go/no-go on case by case basis, but
> > writting some generic guidelines can't hurt.
> >
> > At least for me, anything that fits one of the following is release
> > critical:
> > - Security issues affecting the live/install environment
> > - Kernel bugs preventing the boot of the image for commonly
> >   available hardware
> > - Installer bugs leading to installation failure or broken
> >   post-install experience without obvious workaround
> > - Upgrade bugs leading to broken/non-working systems that can't be
> >   fixed post-upgrade through SRU.
> > - Critical bugs affecting common software used immediately
> >   post-installation
>
> This is a good starting point, and will help us focus our testing
> going forward. Any other show-stopper kind of issue you can think of?
> Or someone else?

I've started off a page at:
https://wiki.ubuntu.com/ReleaseManagement/ImageRespinCriteria
If others spot things missing, please feel free to add them.

> >> - Let's improve the static analysis of images so that we don't
> >> have the image size problem again, we are adding a job for this
> >> to Jenkins this week.
> >
> > Can you also make sure someone actually burns the image on the
> > supported media?
>
> As I asked before, what media is the preferred one for the 800MB
> Quantal images? We'll be happy to procure those and make sure we burn
> it. I'd like to be able to track this has been done with a test case
> on the tracker.

See my answer above; it depends on the product.

> The static validation is added to jenkins and results will start to be
> published to the public instance today.
>
> > I'm still amazed that for a whole week, nobody even tried to burn
> > a CD with our image...
>
> I am amazed that you expect things to happen without actually having
> this documented anywhere, we don't have a test case for this.
I think there has been some test case drift in certain areas, and they
could benefit from a scrub to make the media a bit more explicit. For
instance, http://testcases.qa.ubuntu.com/Install/ServerWhole clearly
references the expectation of it being a CD. However,
http://testcases.qa.ubuntu.com/Install/DesktopWhole permits an "or",
i.e. USB or CD. These should probably become separate tests, as both
cases are important to verify. Possibly a case should be added for DVD
as well; then when planning a release, knowing what image size
development is targeting, the appropriate test could be marked as
mandatory.

> This is
> a problem for us, because the team is growing and not everyone has the
> years of experience that jibel has, nor has seen things fail in so
> many different ways as to use intuition. Jibel is trying to move on to
> do upstream testing, so it is our responsibility as a team to be able
> to do a good job and we need transparency from the release team to
> achieve that.

Transparency within the QA team, and effective mechanisms for
transferring institutional knowledge from experienced testers to new
ones, are also important. If a new tester has questions, I'd expect them
to ask the experienced testers first, and then seek clarification from
the development or release team as appropriate. The Friday meeting and
the ubuntu-release mailing list are good forums for asking for
clarification on ambiguities. We're all working toward the same goal
here: ensuring customers are not unpleasantly surprised by the images.

> As I said in a different email, jibel won't be the QA single point of
> contact anymore, plars will be leading the milestone testing efforts
> for the last milestones in Quantal with psivaa and babyface's help,
> and then we will work out a schedule or a plan for R that we will
> communicate in due time, so that we are all clear what is going
> to be tested and can tell us if anything is missing.
If there is no longer going to be a single point of contact, it's going
to be important to track this as well. I've added a column to:
https://wiki.ubuntu.com/QuantalQuetzal/ReleaseTaskSignup
for the QA contact. Please update it with the plans for Beta 2 and
Release as they are known, so others know the "go-to" QA person to
consult if they see concerns during their testing.

> >> - Let's require more than just one run of the test cases to
> >> validate an image. What is reasonable in terms of ensuring
> >> reasonable HW coverage? I'd like to see at least 3 x 100% run
> >> rate with 100% pass rate on the current test cases, from people
> >> different from the release engineer.
> >
> > For final images I usually look for at least 2 people testing the
> > various code paths. Unfortunately these code paths can't be easily
> > represented in the UI, so ultimately it's a release team decision
> > to know whether the threshold has been met.
>
> I'd like someone from QA at least involved in the decision process,
> even if only as an assessor, to voice our concerns.

We have had someone from QA involved in the past and plan to continue
doing so in the future. Jean-Baptiste was the contact point for QA for
the last several releases, and his input was part of our decision
process, so I'm a bit confused why you imply that QA has not been
involved.

> > Sorry for the rather long e-mail, but I hope it's explaining a bit
> > more how things work.
>
> It is very helpful, thanks. This email contains information we can use
> to change things preemptively rather than reactively, like we've done
> in the past.

We do make changes preemptively, based on discussion, so I'm not sure I
agree with this judgment. That is the main purpose of our feedback
sessions at UDS, and the release team is pretty good about adding to our
process pages on an ongoing basis when new changes occur during the
cycle that affect our processes.
See the pages under:
https://wiki.ubuntu.com/ReleaseManagement/#Release_Processes
Is there a similar set for the QA team that the Release and Development
teams can consult, so we can all figure out how to work better together?

Anyhow, thanks again for starting this thread off. :) It's healthy to
question our institutional and historical assumptions and strive to
improve. Reading the documentation that folks have taken the time to
write, based on historical lessons learned, and asking for clarification
when something is unclear, or suggesting specific improvements, is
always welcomed.

Thanks,
Kate

-- 
Ubuntu-release mailing list
[email protected]
Modify settings or unsubscribe at:
https://lists.ubuntu.com/mailman/listinfo/ubuntu-release
