Hi Steve (and sc-l),

I'll invoke my skiing with Eli excuse again on this thread as well...

On Tue, 2 Feb 2010, Wall, Kevin wrote:
> To study something scientifically goes _beyond_ simply gathering
> observable and measurable evidence. Not only does data need to be
> collected, but it also needs to be tested against a hypothesis that offers
> a tentative *explanation* of the observed phenomena;
> i.e., the hypothesis should offer some predictive value.

On 2/2/10 4:12 PM, "Steven M. Christey" <co...@linus.mitre.org> wrote:
>>I believe that the cross-industry efforts like BSIMM, ESAPI, top-n lists,
>>SAMATE, etc. are largely at the beginning of the data collection phase.

I agree 100%.  It's high time we gathered some data to back up our claims.  I 
would love to see the top-n lists do more with data.

Here's an example.  In the BSIMM, 10 of 30 firms have built top-N bug lists 
based on their own data culled from their own code.  I would love to see how 
those top-N lists compare to the OWASP Top Ten or the CWE-25.  I would also 
love to see whether the union of these lists is even remotely interesting.  One 
of my (many) worries about top-N lists that are NOT bound to a particular code 
base is that they are so generic as to be useless, and maybe even unhelpful, if 
adopted wholesale without understanding what's actually going on in a 
codebase [see <http://www.informit.com/articles/article.aspx?p=1322398>].

Note for the record that "asking lots of people what they think should be in 
the top 10" is not quite the same as taking the union of particular top-N lists 
that are tied to particular code bases.  Popularity contests are not the kind 
of data we should count on.  But maybe we'll make some progress on that one day.
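To make the comparison concrete, here is a minimal sketch (in Python) of what 
measuring the overlap between firm-specific top-N lists and a published generic 
list might look like.  All of the firm names, CWE IDs, and list contents below 
are made up purely for illustration; they are not from the BSIMM data or any 
real list:

```python
# Hypothetical firm-specific top-N lists, keyed by (fictional) firm.
# Values are invented CWE ID numbers, used only to illustrate the set math.
firm_lists = {
    "firm_a": {79, 89, 120, 22, 352},
    "firm_b": {89, 287, 79, 434, 798},
    "firm_c": {120, 131, 416, 79, 190},
}

# A published generic top-10 list (again, illustrative IDs only).
published = {79, 89, 352, 287, 22, 434, 798, 120, 306, 862}

# The union tells you how diverse the firm lists are; the intersection
# tells you what every firm sees regardless of codebase.
union = set().union(*firm_lists.values())
intersection = set.intersection(*firm_lists.values())

# How much of what firms actually see is covered by the generic list?
overlap_with_published = len(union & published) / len(union)

print(f"union size: {len(union)}")
print(f"seen by every firm: {sorted(intersection)}")
print(f"fraction of union covered by the published list: {overlap_with_published:.0%}")
```

A small intersection combined with a large union would be exactly the "so 
generic as to be useless" worry above: no single generic list can represent 
what is going on in any particular codebase.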

>Ultimately, I would love to see the kind of linkage between the collected
>data ("evidence") and some larger goal ("higher security" whatever THAT
>means in quantitative terms) but if it's out there, I don't see it

Neither do I, and that is a serious issue with models like the BSIMM that 
measure "second-order" effects like activities.  Do the activities actually do 
any good?  Important question!

>The 2010 OWASP Top 10 RC1 is more data-driven than previous versions; same
>with the 2010 Top 25 (whose release has been delayed to Feb 16, btw).
>Unlike last year's Top 25 effort, this time I received several sources of
>raw prevalence data, but unfortunately it wasn't in sufficiently
>consumable form to combine.

I was with you up until that last part.  Combining the prevalence data is 
something you guys should definitely do.  BTW, how is the 2010 CWE-25 (which 
doesn't yet exist) more data-driven??
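For what it's worth, one simple way to combine raw prevalence data from sources 
that report on different scales is to normalize each source's counts to shares 
before averaging.  The sketch below is purely illustrative: the source names 
and counts are invented, and a real effort would first have to reconcile 
differing taxonomies, sampling bias, and what each source actually measured:

```python
# Invented raw prevalence counts from three hypothetical sources.
# Each source uses a different scale, so raw counts cannot be summed directly.
sources = {
    "scanner_vendor": {"CWE-79": 4200, "CWE-89": 1800, "CWE-120": 600},
    "pen_test_firm":  {"CWE-89": 300,  "CWE-79": 250,  "CWE-352": 120},
    "incident_data":  {"CWE-79": 90,   "CWE-287": 70,  "CWE-89": 60},
}

# Normalize each source's counts to shares of that source's total,
# so a high-volume source doesn't drown out the others.
combined = {}
for counts in sources.values():
    total = sum(counts.values())
    for cwe, n in counts.items():
        combined[cwe] = combined.get(cwe, 0.0) + n / total

# Average the shares across sources (a CWE unseen by a source contributes 0).
for cwe in combined:
    combined[cwe] /= len(sources)

ranking = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
for cwe, share in ranking:
    print(f"{cwe}: {share:.3f}")
```

Even this toy version makes the methodological questions visible: should a 
source that saw only three CWE types count as much as one that saw thirty, and 
does "not reported" mean "not present"?  That is the kind of argument the data 
finally lets us have.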

>I for one am pretty satisfied with the rate at which things are
>progressing and am delighted to see that we're finally getting some raw
>data, as good (or as bad) as it may be.  The data collection process,
>source data, metrics, and conclusions associated with the 2010 Top 25 will
>probably be controversial, but at least there's some data to argue about.

Cool!

>So in that sense, I see Gary's article not so much as a clarion call for
>action to a reluctant and primitive industry, but an early announcement of
>a shift that is already underway.

Well put.

gem

company www.cigital.com
podcast www.cigital.com/~gem
blog www.cigital.com/justiceleague
book www.swsec.com


_______________________________________________
Secure Coding mailing list (SC-L) SC-L@securecoding.org
List information, subscriptions, etc - http://krvw.com/mailman/listinfo/sc-l
List charter available at - http://www.securecoding.org/list/charter.php
SC-L is hosted and moderated by KRvW Associates, LLC (http://www.KRvW.com)
as a free, non-commercial service to the software security community.
_______________________________________________