Re: [ASFCS42] Proposed schedule for our next release

David Nalley Thu, 18 Apr 2013 18:42:14 -0700

On Thu, Apr 18, 2013 at 6:26 PM, Will Chan <[email protected]> wrote:
>
> > -----Original Message-----
> > From: Chip Childers [mailto:[email protected]]
> > Sent: Monday, April 15, 2013 7:22 AM
> > To: [email protected]
> > Cc: [email protected]
> > Subject: Re: [ASFCS42] Proposed schedule for our next release
> >
> > On Thu, Apr 11, 2013 at 02:50:02PM -0700, Animesh Chaturvedi wrote:
> > >
> > > I want to call out my concern on technical debt we have accumulated so
> > > far.
> > >
> > >  I did an analysis on JIRA bugs yesterday night PST on "Affects
> > > Version = 4.1" and created since Dec 2012
> > >
> > > Total records : 429
> > > Resolution Type (Invalid, Duplicate, Cannot reproduce etc.) : 87 (30
> > > Blockers, 27 Critical, 27 Major, 4 Minor)
> > > Valid Defects : 429-87 = 342
> > > Fixed : 246 (60 Blockers, 70 Critical, 99 Majors), out of which 217
> > > were fixed since Feb
> > > Unresolved : 96 (1 Blocker, 8 Critical, 64 Major)
> > >
> > > With this data it looks like we have fixed 2/3 of valid defects in a
> > > little over 2 months and are pretty much deferring around 1/3 of the
> > > issues to a future release.
> > >
> > > I also looked at overall backlog of bugs (Critical, Major and Blockers
> > > only) as of 4/10/2013 - 10:0PM PST.
> > >
> > > 284 open (18 Blocker, 38 Critical, 228 Major) ; By Fix version
> > >     -  Release 4.0.x and prior: 13
> > >     -  4.1: 70
> > >     -  4.2 : 97
> > >     -  Future: 8
> > >     -  No version: 107
> > >
> > > Given that we fixed 217 bugs in roughly 2 months during the 4.1 cycle,
> > > fixing the backlog of bugs will probably take us 2 months.  Should we
> > > extend the 4.2 test cycle by 2 months [Original Schedule: 6/1 - 7/22,
> > > Extended Schedule: 6/1-9/22] to reduce the technical debt significantly?
> > > I would like to hear how the community wants to address technical debt.
> > > Based on the input and consensus I will publish the agreed schedule next
> > > week.
> > >
> > >
> >
> > I don't think that an extension of time changes bug counts really.  IMO, we
> > need to pull together to have some bug-fix focused effort applied to the
> > code-base.  It's also another reason that I'm so big on making sure that
> > automated tests come in with the new features.  That doesn't address test
> > scenarios that human testers can come up with, but if a developer spends
> > the time to think about testing the basic feature and codifies that, we
> > should at least avoid the "this actually doesn't work at all" types of bugs.
> >
> > There's a school of thought that says, don't build another feature until you
> > have sorted out the known bugs in the current features.  I don't think we
> > could really pull that off, but perhaps a different thread to rally people
> > around the bug backlog is in order?
> >
> > -chip
>
> Sorry to chime in so late to this thread as I've been offsite for the better 
> part of this week.  I was one of the original 4 month release crowd but after 
> the recent two releases of ACS, I'm starting to wonder if we shouldn't start 
> moving this to a 6 month cycle instead of two.  Here are some high level 
> observations based on the previous two releases:
>
> 1. It doesn't seem like we are on a true 4 month time based release schedule. 
>  Both 4.0 and 4.1 were delayed more than several weeks past the
> original proposed GA date.  4.0 was released 11/6 and let's assume that 4.1 
> will ship within a week or two.  That's almost a 6 month release cycle.


So both 4.0 and 4.1 strike me as extraordinary. 4.0 was our first
release - and we had lots of issues to resolve. 4.1 introduced a ton
of packaging and name changes that I also consider to be hopefully one
time. Really - we've only been through our release cycle once, so I am
not ready to declare it perpetually behind schedule.


> Every release incurs a fixed cost of release notes, upgrade testing, etc. 
> that I suspect at least eats a month worth of time depending on people's
> schedule.  That's 3 months out of the year rather than two if we can get a 6 
> months cycle.  We can use that extra month for other purposes if need
> be.  I suppose if we want to continue to release past the proposed hard GA 
> date, then I guess it doesn't matter if it's 4 or 6 months.  It's basically a
> release when the release mgmt. team feels it's right to release based on 
> current bugs, etc.
>

Having seen the point releases twice now, which still need upgrade
testing, release notes, etc., I don't get the feeling that the
'overhead' referred to above is the problem. Joe may disagree with
me.

> 2. As more and more features/development go in, it just means more 
> destabilization of the code.  4.0 was delayed and the majority of that work
> was licensing files.  4.1 got just a bit more complicated with new feature
> development and the delay is now much longer.  Not all features are created
> equal in terms of testing.  Some may require more time to develop but may not 
> impact the entire system like for example, adding a new hypervisor.
> However, work like refactoring vm sync or other more internal code could 
> affect the entire stack and require more QA time.  We need extra time for
> new code to settle in.
>

I wonder why we would merge a feature when we can't prove that it works
and that it doesn't break the entire stack. Some of this is the missing
automation you talk about below. Essentially we have no way, sometimes
until months after the merge, to tell if something works or not
because we rely on manual QA to test it.

> 3. ACS is still dependent largely on manual QA.  Let's face it, our automated 
> testing/unit testing isn't mature enough quite yet and we cannot always 
> expect manual QA to be there and on ACS schedule.  CloudStack releases have 
> some type of quality expectations as well as support for upgrades.  Upgrades 
> and migration scripts aren't that easily automatable.  Chip and others have 
> been very diligent on ensuring that code check in has the appropriate tests 
> but it's not there yet.
>
> 4. ACS development is based on volunteer work and many of us have a $dayjob 
> and may not be able to assist with fixing bugs in ACS schedule.  Having only 
> a couple of months to fix bugs and expect others to follow our ACS schedule 
> seems a bit rushed.  Wearing my Citrix hat now, I can tell you that 2 months 
> of QA and bug fixing  is not enough to release quality GA release.  And that 
> is with me breathing down the necks of many of the engineers to get them 
> fixed on time.  ACS does not have this type of culture and nor should it.   
> Given that, we should be a bit more flexible in terms of allowing people 
> eventually to act on issues.
>


So a couple of other comments.
We have folks clamoring for the awesome new features, to the point
that they are creating derivative works (which tells me we are doing
some things right, as folks are finding it easy enough to do).

What I gathered from reading the above doesn't really have anything to
do with schedule:
* New development destabilizes our code base, and is a threat to
quality and the release schedule
* We cannot depend on the current level of manual QA to be present
going forward.

This brings me to the conclusion that as a community we should seriously
temper our inclusion of new features and make automated testing our
focus until such time as pushing a release out is less about months of
manual QA processes and more of a decision. This makes me want to
raise the barrier for merges even higher. Perhaps running the entire
Marvin suite with the proposed merge is what we need to begin
mandating.

--David "who wishes he had kept working on Automated QA tasks" Nalley :)
