Re: [openstack-dev] Stats on blueprint design info / creation times

2013-08-22 Thread Daniel P. Berrange
On Wed, Aug 21, 2013 at 11:20:08AM -0400, Doug Hellmann wrote:
 How many of those blueprints without links or expansive documentation are
 related to some other blueprint that does have documentation? In
 ceilometer, we have several blueprint clusters where one main blueprint
 has some documentation and the others are present for assigning and
 scheduling the work of a multi-part feature, or vice versa. For example,
 https://blueprints.launchpad.net/ceilometer/+spec/alarming has no real doc
 because it's an umbrella blueprint for a lot of other pieces, many of
 which *do* have documentation.

I don't know about ceilometer, but for Nova I don't think there is a
large enough number of linked blueprints to make any significant
difference to the stats here.

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Stats on blueprint design info / creation times

2013-08-21 Thread Doug Hellmann
On Mon, Aug 19, 2013 at 11:47 AM, Daniel P. Berrange berra...@redhat.com wrote:

 That 3 in 4 blueprints lack any link to a design doc and have only 6 lines
 of text description is a cause for concern IMHO. The blueprints should be
 giving code reviewers useful background on the motivation of the dev work &
 any design planning that took place. While there are no doubt some simple
 features where 6 lines of text is sufficient info in the blueprint, I don't
 think that holds true for the majority.


How many of those blueprints without links or expansive documentation are
related to some other blueprint that does have documentation? In
ceilometer, we have several blueprint clusters where one main blueprint
has some documentation and the others are present for assigning and
scheduling the work of a multi-part feature, or vice versa. For example,
https://blueprints.launchpad.net/ceilometer/+spec/alarming has no real doc
because it's an umbrella blueprint for a lot of other pieces, many of
which *do* have documentation.

Doug




Re: [openstack-dev] Stats on blueprint design info / creation times

2013-08-21 Thread Mike Spreitzer
For the case of an item that has no significant doc of its own but is 
related to an extensive blueprint, how about linking to that extensive 
blueprint?


Re: [openstack-dev] Stats on blueprint design info / creation times

2013-08-21 Thread Rochelle.Grober
+100

If one blueprint points to another, then the pointers should be present and 
available in both blueprints.  Dependency linking, folks.

--Rocky

From: Mike Spreitzer [mailto:mspre...@us.ibm.com]
Sent: Wednesday, August 21, 2013 9:04 AM
To: Daniel P. Berrange
Cc: OpenStack Development Mailing List
Subject: Re: [openstack-dev] Stats on blueprint design info / creation times

For the case of an item that has no significant doc of its own but is related 
to an extensive blueprint, how about linking to that extensive blueprint?


Re: [openstack-dev] Stats on blueprint design info / creation times

2013-08-20 Thread Anne Gentle
On Mon, Aug 19, 2013 at 10:47 AM, Daniel P. Berrange berra...@redhat.com wrote:

 Looking at this data, there are 4 key takeaway points:

   - We're creating more blueprints in every release.

   - Less than 1 in 4 blueprints has a link to a design document.

   - The description text for blueprints is consistently short
 (6 lines) across releases.


Thanks for running the numbers. My instinct told me this was the case, but
the data is especially helpful. Sometimes six lines is enough, but mostly I
rely on the linked spec for writing docs. If those links are at 25% that's
a bad trend.


   - Less than 1 in 4 blueprints is created before the devel
 period starts for a release.


I find this date mismatch especially intriguing, because the Foundation and
member company sponsors spend millions on Design Summits annually and
cater so much to getting people together in person. Yet the blueprints
aren't created in enough detail for discussion before the Summit dates? Is
that really what the data says? Is any one project skewing this (as in,
they haven't been at a Summit or they don't follow integrated release
dates?)

I'll dig in further to the data set below.



 You can view the full data set + the script to generate the
 data which you can look at to see if I made any logic mistakes:

   http://berrange.fedorapeople.org/openstack-blueprints/



I wouldn't think to include marconi in the dataset as they've just asked
about incubation in June 2013. I think you excluded keystone. You also want
ceilometer and oslo if you already included heat. Is it fairly easy to
re-run? I'd like to see it re-run with the correct program listings.

Also please rerun with Swift excluded as their release dates are not on the
mark with the other projects. I'd like more info around the actual timing.




Re: [openstack-dev] Stats on blueprint design info / creation times

2013-08-20 Thread Thierry Carrez
Anne Gentle wrote:
   - Less than 1 in 4 blueprints is created before the devel
 period starts for a release.
 
 I find this date mismatch especially intriguing, because the Foundation
 and member company sponsors spend millions on Design Summits annually
 and cater so much to getting people together in person. Yet the
 blueprints aren't created in enough detail for discussion before the
 Summit dates? Is that really what the data says? Is any one project
 skewing this (as in, they haven't been at a Summit or they don't follow
 integrated release dates?)

That does not surprise me. A lot of people do not link a blueprint to
their session proposal on the design summit session suggestion system --
sometimes it's the discussion itself which allows to formulate the right
blueprints, and those are filed in the weeks just after the summit. And
I think that's fine.

It would be more interesting to check how many blueprints are created
more than two weeks after the design summit. Those would be the late
blueprints (or the ones created as a tickbox), which escape the release
planning process.

-- 
Thierry Carrez (ttx)



Re: [openstack-dev] Stats on blueprint design info / creation times

2013-08-20 Thread Mark McLoughlin
On Mon, 2013-08-19 at 14:38 -0300, Thierry Carrez wrote:
 Note that in some cases, some improvements that do not clearly fall
 into the bug category are landed without a blueprint link (or a bug
 link). So a first step could be to require that a review always
 references a bug or a blueprint before it's landed. Then, improve the
 quality of the information present in said bug/blueprint.

I think that an "every review must reference a bug or blueprint" rule
would encourage more of this "process checkbox" behaviour.

Blueprints are useful for some things:

  - where a longer design rationale than is appropriate for a commit 
message is required
  - where it's a feature we want to raise awareness around
  - where it's something that's going to take a while to bake and we 
need to track its progress
  - etc.

(We've seen already how DocImpact can be used to obviate the need for
the "docs folks should look at this" use case. We could do similarly for
QA.)

And for bugs:

  - where the person with info about a problem isn't the person fixing 
it
  - where there's important background information (like detailed logs) 
which can't be summarized appropriately in a commit message
  - where we want it tracked as a must-have for a given release
  - etc.

So, I'm totally fine with someone showing up with a standalone commit
(i.e. no bug or blueprint) and a nice explanatory commit message, if the
bug or blueprint would not provide any of the value listed above.

Where a bug or blueprint doesn't provide such value, you often see
people with terse commit messages referencing a process checkbox bug or
blueprint ... and that isn't helping anything.

Cheers,
Mark.




Re: [openstack-dev] Stats on blueprint design info / creation times

2013-08-20 Thread Daniel P. Berrange
On Tue, Aug 20, 2013 at 10:36:39AM -0500, Anne Gentle wrote:
 I wouldn't think to include marconi in the dataset as they've just asked
 about incubation in June 2013. I think you excluded keystone. You also want
 ceilometer and oslo if you already included heat. Is it fairly easy to
 re-run? I'd like to see it re-run with the correct program listings.
 
 Also please rerun with Swift excluded as their release dates are not on the
 mark with the other projects. I'd like more info around the actual timing.

Ok, I've changed the projects it analyses per your recommendation.
Also I've made it print the cut-off date I used to assign blueprints
to the early or late creation date buckets.

The overall results are approximately the same though:

Series: folsom all
  Specs: 177
  Specs (no URL): 145
  Specs (w/ URL): 32
  Specs (Before 2012-04-05 14:43:29.870782+00:00): 39
  Specs (After 2012-04-05 14:43:29.870782+00:00): 138
  Average lines: 5
  Average words: 54


Series: grizzly all
  Specs: 255
  Specs (no URL): 187
  Specs (w/ URL): 68
  Specs (Before 2012-09-27 00:00:00+00:00): 47
  Specs (After 2012-09-27 00:00:00+00:00): 208
  Average lines: 6
  Average words: 61


Series: havana all
  Specs: 470
  Specs (no URL): 379
  Specs (w/ URL): 91
  Specs (Before 2013-04-04 12:59:00+00:00): 137
  Specs (After 2013-04-04 12:59:00+00:00): 333
  Average lines: 6
  Average words: 69


I also produced summary stats per project. Showing Nova:


Series: folsom nova
  Specs: 54
  Specs (no URL): 37
  Specs (w/ URL): 17
  Specs (Before 2012-04-05 14:43:29.870782+00:00): 20
  Specs (After 2012-04-05 14:43:29.870782+00:00): 34
  Average lines: 6
  Average words: 61


Series: grizzly nova
  Specs: 68
  Specs (no URL): 51
  Specs (w/ URL): 17
  Specs (Before 2012-09-27 00:00:00+00:00): 17
  Specs (After 2012-09-27 00:00:00+00:00): 51
  Average lines: 6
  Average words: 65


Series: havana nova
  Specs: 131
  Specs (no URL): 107
  Specs (w/ URL): 24
  Specs (Before 2013-04-04 12:59:00+00:00): 31
  Specs (After 2013-04-04 12:59:00+00:00): 100
  Average lines: 7
  Average words: 72


And keystone:



Series: folsom keystone
  Specs: 9
  Specs (no URL): 8
  Specs (w/ URL): 1
  Specs (Before 2012-04-05 14:43:29.870782+00:00): 3
  Specs (After 2012-04-05 14:43:29.870782+00:00): 6
  Average lines: 4
  Average words: 37


Series: grizzly keystone
  Specs: 16
  Specs (no URL): 9
  Specs (w/ URL): 7
  Specs (Before 2012-09-27 00:00:00+00:00): 7
  Specs (After 2012-09-27 00:00:00+00:00): 9
  Average lines: 11
  Average words: 117


Series: havana keystone
  Specs: 25
  Specs (no URL): 13
  Specs (w/ URL): 12
  Specs (Before 2013-04-04 12:59:00+00:00): 5
  Specs (After 2013-04-04 12:59:00+00:00): 20
  Average lines: 9
  Average words: 95


I won't include the other projects in this mail, but you can see them in
the blueprint-summary-XXX.txt files here:

  http://berrange.fedorapeople.org/openstack-blueprints/v2/

There is a bit of variance between projects, but

Re: [openstack-dev] Stats on blueprint design info / creation times

2013-08-20 Thread Daniel P. Berrange
On Tue, Aug 20, 2013 at 05:18:21PM +0100, Daniel P. Berrange wrote:
 On Tue, Aug 20, 2013 at 12:53:25PM -0300, Thierry Carrez wrote:
  That does not surprise me. A lot of people do not link a blueprint to
  their session proposal on the design summit session suggestion system --
  sometimes it's the discussion itself which allows to formulate the right
  blueprints, and those are filed in the weeks just after the summit. And
  I think that's fine.
  
  It would be more interesting to check how many blueprints are created
  more than two weeks after the design summit. Those would be the late
  blueprints (or the ones created as a tickbox), which escape the release
  planning process.
 
 I'll look up the historic dates for each summit, and try to generate
 some stats based on blueprint creation date vs summit date + 2 weeks.

Re-running using the summit date + 2 weeks shifts things a little bit.
Here is the summary for the 3 most recent series:


Series: folsom all
  Specs: 177
  Specs (no URL): 145
  Specs (w/ URL): 32
  Specs (Before Mon, 30 Apr 2012 00:00:00 +): 62
  Specs (After Mon, 30 Apr 2012 00:00:00 +): 115
  Average lines: 5
  Average words: 54


Series: grizzly all
  Specs: 255
  Specs (no URL): 187
  Specs (w/ URL): 68
  Specs (Before Sun, 28 Oct 2012 23:00:00 +): 81
  Specs (After Sun, 28 Oct 2012 23:00:00 +): 174
  Average lines: 6
  Average words: 61


Series: havana all
  Specs: 470
  Specs (no URL): 378
  Specs (w/ URL): 92
  Specs (Before Mon, 29 Apr 2013 00:00:00 +): 197
  Specs (After Mon, 29 Apr 2013 00:00:00 +): 273
  Average lines: 6
  Average words: 69


Full data set for version 3 of the stats is now here

  http://berrange.fedorapeople.org/openstack-blueprints/v3/

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] Stats on blueprint design info / creation times

2013-08-20 Thread Thierry Carrez
Daniel P. Berrange wrote:
 On Tue, Aug 20, 2013 at 05:18:21PM +0100, Daniel P. Berrange wrote:
 On Tue, Aug 20, 2013 at 12:53:25PM -0300, Thierry Carrez wrote:
 It would be more interesting to check how many blueprints are created
 more than two weeks after the design summit. Those would be the late
 blueprints (or the ones created as a tickbox), which escape the release
 planning process.

 I'll look up the historic dates for each summit, and try to generate
 some stats based on blueprint creation date vs summit date + 2 weeks.
 
 Re-running using the summit date + 2 weeks shifts things a little bit.
 Here is the summary for the 3 most recent series:
 
 Series: folsom all
   Specs (Before Mon, 30 Apr 2012 00:00:00 +): 62
   Specs (After Mon, 30 Apr 2012 00:00:00 +): 115
 
 Series: grizzly all
   Specs (Before Sun, 28 Oct 2012 23:00:00 +): 81
   Specs (After Sun, 28 Oct 2012 23:00:00 +): 174
 
 Series: havana all
   Specs (Before Mon, 29 Apr 2013 00:00:00 +): 197
   Specs (After Mon, 29 Apr 2013 00:00:00 +): 273

Interesting, looks like we actually did a better job with havana,
jumping from 31% to 42% of planned blueprints. That may be Nova and
Neutron PTLs enforcing more rules to retain their sanity.
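
The percentages can be reproduced from the "Before" counts quoted above. The following is only a minimal Python sketch: the dictionary layout and the function name are illustrative, and the counts are copied from the "summit date + 2 weeks" figures in this thread.

```python
# (blueprints created before the "summit + 2 weeks" cut-off, total blueprints)
# per series, copied from the figures quoted in this thread.
BEFORE_CUTOFF = {
    "folsom": (62, 177),
    "grizzly": (81, 255),
    "havana": (197, 470),
}


def planned_share(before: int, total: int) -> float:
    """Percentage of blueprints filed before the cut-off date."""
    return 100.0 * before / total


for series, (before, total) in BEFORE_CUTOFF.items():
    print(f"{series}: {planned_share(before, total):.1f}% planned")
```

With these inputs grizzly comes out at roughly 31.8% and havana at roughly 41.9%, in line with the 31% and 42% mentioned above; folsom is about 35%.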

-- 
Thierry Carrez (ttx)



[openstack-dev] Stats on blueprint design info / creation times

2013-08-19 Thread Daniel P. Berrange
In this thread about code review:

  http://lists.openstack.org/pipermail/openstack-dev/2013-August/013701.html

I mentioned that I thought there were too many blueprints created without
sufficient supporting design information and were being used for tickbox
process compliance only. I based this assertion on a gut feeling I have
from experience in reviewing.

To try and get a handle on whether there is truly a problem, I used the
launchpadlib API to extract some data on blueprints [1].

In particular I was interested in seeing:

  - What portion of blueprints have a URL pointing to an associated
    design doc,

  - How long the descriptive text was in typical blueprints

  - Whether a blueprint was created before or after the dev period
started for that major release.


The first two items are easy to get data on. On the second point, I redid
line wrapping on description text to normalize the line count across all
blueprints. This is because many blueprints had all their text on one
giant long line, which would skew results. I thus wrapped all blueprints
at 70 characters.
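
The normalization described above could look roughly like the following Python sketch. It is not the script referenced in [1]; the function name is hypothetical, and the only detail taken from the text is the 70-character wrap width.

```python
import textwrap
from typing import Tuple


def description_stats(description: str, width: int = 70) -> Tuple[int, int]:
    """Return (line_count, word_count) after re-wrapping the text at `width`."""
    lines = []
    # Wrap each original line separately; empty lines contribute nothing,
    # so only lines that carry text are counted.
    for paragraph in description.splitlines():
        lines.extend(textwrap.wrap(paragraph, width=width))
    return len(lines), len(description.split())


# A description typed as one giant 149-character line is counted as three
# wrapped lines rather than one.
one_long_line = " ".join(["word"] * 30)
print(description_stats(one_long_line))
```

Re-wrapping before counting means a description typed as one long line and the same text typed with manual line breaks end up with comparable line counts.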

The blueprint creation date vs release cycle dev start date is a little
harder. I inferred the start date of each release, by using the end date
of the previous release. This is probably a little out but hopefully not
by enough to totally invalidate the usefulness of the stats below. Below,
Early means created before start of devel, Late means created after
the start of devel period.
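
As an illustration of the Early / Late split, here is a minimal Python sketch. The function names and the example dates are made up for illustration; the only assumption carried over from the text is that a single cut-off (the inferred dev start date) separates the two buckets.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable


def bucket(created: datetime, dev_start: datetime) -> str:
    """Classify a blueprint by its creation date relative to the dev start."""
    return "Early" if created < dev_start else "Late"


def count_buckets(creation_dates: Iterable[datetime],
                  dev_start: datetime) -> Dict[str, int]:
    """Count how many blueprints fall into each bucket."""
    counts = {"Early": 0, "Late": 0}
    for created in creation_dates:
        counts[bucket(created, dev_start)] += 1
    return counts


# Hypothetical cut-off and creation dates, for illustration only.
dev_start = datetime(2013, 4, 4, tzinfo=timezone.utc)
created = [
    datetime(2013, 3, 1, tzinfo=timezone.utc),
    datetime(2013, 5, 20, tzinfo=timezone.utc),
    datetime(2013, 7, 2, tzinfo=timezone.utc),
]
print(count_buckets(created, dev_start))
```

A blueprint created exactly at the cut-off is counted as Late in this sketch; that boundary choice is an assumption.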

The data for the last 3 releases is:

  Series: folsom
Specs: 178
Specs (no URL): 144
Specs (w/ URL): 34
Specs (Early): 38
Specs (Late): 140
Average lines: 5
Average words: 55


  Series: grizzly
Specs: 227
Specs (no URL): 175
Specs (w/ URL): 52
Specs (Early): 42
Specs (Late): 185
Average lines: 5
Average words: 56


  Series: havana
Specs: 415
Specs (no URL): 336
Specs (w/ URL): 79
Specs (Early): 117
Specs (Late): 298
Average lines: 6
Average words: 68


Looking at this data, there are 4 key takeaway points:

  - We're creating more blueprints in every release.

  - Less than 1 in 4 blueprints has a link to a design document. 

  - The description text for blueprints is consistently short
(6 lines) across releases.

  - Less than 1 in 4 blueprints is created before the devel
period starts for a release.
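
The "1 in 4" figures can be checked against the per-series counts above. This is a small Python sketch with hypothetical names; the numbers are copied from the table in this message.

```python
# Per-series counts copied from the data above.
SERIES = {
    "folsom": {"specs": 178, "with_url": 34, "early": 38},
    "grizzly": {"specs": 227, "with_url": 52, "early": 42},
    "havana": {"specs": 415, "with_url": 79, "early": 117},
}


def share(part: int, total: int) -> float:
    """Percentage that `part` represents of `total`."""
    return 100.0 * part / total


for name, s in SERIES.items():
    url_pct = share(s["with_url"], s["specs"])
    early_pct = share(s["early"], s["specs"])
    print(f"{name}: {url_pct:.1f}% with URL, {early_pct:.1f}% early")
```

On these numbers the share of blueprints with a URL stays between about 19% and 23% in all three series. The early share is about 21% for folsom and 18.5% for grizzly, while havana comes out at roughly 28%, slightly above 1 in 4.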


You can view the full data set + the script to generate the
data which you can look at to see if I made any logic mistakes:

  http://berrange.fedorapeople.org/openstack-blueprints/


There's only so much you can infer from stats like this, but IMHO the
stats show that we ought to think about how well we are using blueprints as
design / feature approval / planning tools.


That 3 in 4 blueprints lack any link to a design doc and have only 6 lines of
text description is a cause for concern IMHO. The blueprints should be giving
code reviewers useful background on the motivation of the dev work & any
design planning that took place. While there are no doubt some simple features
where 6 lines of text is sufficient info in the blueprint, I don't think that
holds true for the majority.

In addition to helping code reviewers, the blueprints are also arguably a
source of info for QA people testing OpenStack and for the docs teams
documenting new features in each release. I'm not convinced that there is
enough info in many of the blueprints to be of use to QA / docs people.


The creation dates of the blueprints are also an interesting data point.
If the design summit is our place for reviewing blueprints and 3 in 4
blueprints in a release are created after the summit, that's a lot of
blueprints potentially missing summit discussions. On the other hand many
blueprints will have corresponding discussions on mailing lists too,
which is arguably just as good, or even better than, summit discussions.

Based on the creation dates & terseness of design info though, I think
there is a valid concern here that blueprints are being created just for
reasons of tickbox process compliance.

In theory we have an approval process for blueprints, but are we ever
rejecting code submissions for blueprints which are not yet approved?
I've only noticed that happen a couple of times in Nova for things that
were pretty clearly controversial.

I don't intend to suggest that we have strict rules that all blueprints
must be min X lines of text, or be created by date Y. It is important
to keep the flexibility there to avoid development being drowned in
process without benefits.

I do think we have scope for being more rigorous in our review of
blueprints, asking people to expand on the design info associated with
a blueprint. Perhaps also require that a blueprint is actually approved
by the core team before we go to the trouble of reviewing & approving
the code implementing a blueprint in Gerrit.

Regards,
Daniel

[1] http://berrange.fedorapeople.org/openstack-blueprints/blueprint.py
-- 

Re: [openstack-dev] Stats on blueprint design info / creation times

2013-08-19 Thread Thierry Carrez
Daniel P. Berrange wrote:
 In this thread about code review:
 
   http://lists.openstack.org/pipermail/openstack-dev/2013-August/013701.html
 
 I mentioned that I thought there were too many blueprints created without
 sufficient supporting design information and were being used for tickbox
 process compliance only. I based this assertion on a gut feeling I have
 from experience in reviewing.
 [...]

Nice analysis, Daniel.

One side of this issue is that the blueprints tool no longer matches our
needs (can't have a blueprint that affects multiple projects, can't
discuss in blueprints the same way we do with bugs...).

So I suspect part of the tickbox effect is due to people not getting
enough value from blueprints. They are essential for project management
types (think PTLs or me), but feel like a process tickbox for everyone
else. I hope that StoryBoard will one day fix that for us.

 I do think we have scope for being more rigorous in our review of
 blueprints, asking people to expand on the design info associated with
 a blueprint. Perhaps also require that a blueprint is actually approved
 by the core team before we go to the trouble of reviewing & approving
 the code implementing a blueprint in Gerrit.

The approval process has been simplified lately: if a blueprint is
targeted to a milestone and has a priority set (not Undefined) then it
is considered approved. I agree you could require that the blueprint was
reviewed/prioritized before landing a feature associated with it.
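
Expressed as code, the simplified approval rule is a two-condition check. This Python sketch is illustrative only: the function and parameter names are hypothetical, and the only priority value taken from the text is "Undefined".

```python
from typing import Optional


def is_considered_approved(milestone: Optional[str], priority: str) -> bool:
    """A blueprint counts as approved when it is targeted to a milestone
    and its priority is anything other than "Undefined"."""
    return bool(milestone) and priority != "Undefined"


# Hypothetical milestone and priority values, for illustration only.
print(is_considered_approved("havana-3", "High"))
print(is_considered_approved(None, "High"))
print(is_considered_approved("havana-3", "Undefined"))
```

A reviewer-side check along these lines would only need the blueprint's milestone target and priority, not a separate approval flag.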

Note that in some cases, some improvements that do not clearly fall
into the bug category are landed without a blueprint link (or a bug
link). So a first step could be to require that a review always
references a bug or a blueprint before it's landed. Then, improve the
quality of the information present in said bug/blueprint.
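
As a rough idea of what such a first step could check, here is a Python sketch that looks for a bug or blueprint reference in a commit message. The pattern, the function name, and the example messages (including the bug number) are assumptions for illustration; the tag conventions a real check would need to match are not specified in this thread.

```python
import re

# Hypothetical pattern: the exact tag conventions are an assumption.
REFERENCE_RE = re.compile(
    r"\bbug[:\s#]*\d+"            # e.g. "Fixes bug 1214176"
    r"|\bblueprint[:\s]+[\w-]+",  # e.g. "Implements blueprint some-name"
    re.IGNORECASE,
)


def references_bug_or_blueprint(commit_message: str) -> bool:
    """Return True if the message mentions a bug number or a blueprint name."""
    return REFERENCE_RE.search(commit_message) is not None


# Made-up commit messages, for illustration only.
print(references_bug_or_blueprint("Fixes bug 1214176"))
print(references_bug_or_blueprint("Implements blueprint some-feature"))
print(references_bug_or_blueprint("Tidy up logging in the scheduler"))
```

A check like this only establishes that a reference exists; it says nothing about the quality of the information in the referenced bug or blueprint.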

-- 
Thierry Carrez (ttx)
