On Tue, 2 Feb 2010, Wall, Kevin wrote:

To study something scientifically goes _beyond_ simply gathering
observable and measurable evidence. Not only does data need to be
collected, but it also needs to be tested against a hypothesis that
offers a tentative *explanation* of the observed phenomena; i.e., the
hypothesis should offer some predictive value. Furthermore, the steps
of the experiment must be _repeatable_, not just by those currently
involved in the attempted scientific endeavor, but by *anyone* who
would care to repeat the experiment. If the steps are not repeatable,
then any predictive value of the study is lost.
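To make Kevin's point concrete, here's a toy sketch (entirely invented data and numbers, in Python) of the collect/hypothesize/predict/repeat cycle. The fixed seed is the "repeatable" part: anyone rerunning it gets the same experiment.

    from statistics import mean
    import random

    # Repeatability: a fixed seed means anyone can rerun this "experiment"
    # and get exactly the same numbers.
    random.seed(42)

    def sample(mu):
        # Invented data: defects per KLOC for a batch of 30 modules.
        return [random.gauss(mu, 0.5) for _ in range(30)]

    # Step 1: gather observable, measurable evidence.
    reviewed, unreviewed = sample(2.0), sample(3.5)

    # Step 2: a hypothesis that *explains* and *predicts*, not just
    # describes: "code review lowers defect density, so new reviewed
    # modules will average at least 1 defect/KLOC fewer than new
    # unreviewed ones."
    predicted_gap = 1.0

    # Step 3: test the prediction against freshly collected data.
    observed_gap = mean(sample(3.5)) - mean(sample(2.0))
    print(f"predicted >= {predicted_gap}, observed = {observed_gap:.2f}")
    print("hypothesis survives" if observed_gap >= predicted_gap else "rejected")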

I believe that cross-industry efforts like BSIMM, ESAPI, top-n lists, SAMATE, etc. are largely at the beginning of the data collection phase. It shouldn't be much of a surprise that many companies participate in two or more of these efforts (although it's simultaneously disconcerting; that's probably what happens in brand-new areas).

Ultimately, I would love to see some linkage between the collected data ("evidence") and some larger goal ("higher security," whatever THAT means in quantitative terms), but if that linkage is out there, I don't see it, or it exists only in tiny pieces... and it may be a few years before we get to that point. CVE data and trends have been used in recent years - or should I say abused or misused, given inherent bias problems that I'm too lazy to talk about at the moment.

In CWE, one aspect of our research is to tie attacks to weaknesses, weaknesses to mitigations, etc., so that there is a better understanding of all the inter-related pieces. So when you look at the CERT C coding standard and its ties back to CWE, you can see which rules directly reduce or affect which weaknesses, and which ones don't. (Or you *could*, if you wanted to look at it closely enough.)
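As a rough illustration of what those ties can look like as data - a hypothetical sketch, not CWE's actual schema, and the rule/CWE pairings are illustrative rather than the authoritative published mappings:

    # A simplified, hypothetical sketch of rule-to-weakness linkage; CWE's
    # real schema is much richer. The pairings below are illustrative,
    # not the authoritative published mappings.
    CERT_C_TO_CWE = {
        "STR31-C": ["CWE-120"],  # undersized string storage -> classic buffer overflow
        "INT32-C": ["CWE-190"],  # signed overflow -> integer overflow or wraparound
        "FIO30-C": ["CWE-134"],  # user input in format string -> format string flaw
    }

    def rules_mitigating(cwe_id: str) -> list[str]:
        """Invert the mapping: which coding rules bear on a given weakness?"""
        return [rule for rule, cwes in CERT_C_TO_CWE.items() if cwe_id in cwes]

    print(rules_mitigating("CWE-120"))  # -> ['STR31-C']

Once the mapping exists in machine-readable form, the inverse question - "which rules bear on this weakness?" - falls out for free.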

The 2010 OWASP Top 10 RC1 is more data-driven than previous versions; the same is true of the 2010 Top 25 (whose release has been delayed to Feb 16, btw). Unlike last year's Top 25 effort, this time I received several sources of raw prevalence data, but unfortunately the data wasn't in a sufficiently consumable form to combine.
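To give a flavor of why "not consumable enough to combine" is a real obstacle, here's a made-up sketch - the sources, field names, and scoring are all invented - of three contributors reporting prevalence in incompatible shapes:

    # Hypothetical sketch: three contributors report weakness prevalence
    # in incompatible shapes, so they must be normalized before any merge.
    source_a = {"CWE-79": 120, "CWE-89": 85}        # raw finding counts
    source_b = {"CWE-79": 0.31, "CWE-89": 0.22}     # fraction of assessed apps
    source_c = [("XSS", "very common"), ("SQLi", "common")]  # names + buckets

    NAME_TO_CWE = {"XSS": "CWE-79", "SQLi": "CWE-89"}
    BUCKET_TO_SCORE = {"very common": 1.0, "common": 0.66, "uncommon": 0.33}

    def normalize(source) -> dict[str, float]:
        """Map each source onto a common 0..1 'prevalence score' (crudely)."""
        if isinstance(source, list):  # bucketed, name-keyed data
            return {NAME_TO_CWE[n]: BUCKET_TO_SCORE[b] for n, b in source}
        top = max(source.values())
        return {cwe: v / top for cwe, v in source.items()}  # rescale to 0..1

    merged = {}
    for src in (source_a, source_b, source_c):
        for cwe, score in normalize(src).items():
            merged.setdefault(cwe, []).append(score)

    print({cwe: sum(s) / len(s) for cwe, s in merged.items()})

Even this toy version forces arbitrary decisions (the rescaling, the bucket scores) that can dominate the merged ranking, which is exactly where the controversy comes from.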

In tool analysis efforts such as SAMATE, we are still wrestling with what a "false positive" really means, not to mention the challenges of analyzing mountains of raw data, of using tools that were intended for developers in a third-party consulting context, and of reconciling the multitude of perspectives on how weaknesses are described. (For example, what do you do if there's a chain from weakness X to Y, and tool 1 reports X while tool 2 reports Y?)
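A hypothetical sketch of that last problem - the chain relation, report format, and findings are all invented - deciding whether two tools' findings denote the same underlying flaw once chains are accounted for:

    # Hypothetical sketch of chain-aware matching between two tools' reports.
    # CHAINS records "X can directly lead to Y" (e.g., an integer overflow
    # leading to a buffer overflow); the relation and findings are invented.
    CHAINS = {("CWE-190", "CWE-120")}  # X -> Y

    def same_flaw(a, b):
        """Findings match if they name the same CWE at the same location,
        or name the two ends of a known chain at the same location."""
        if a["file"] != b["file"] or a["line"] != b["line"]:
            return False
        x, y = a["cwe"], b["cwe"]
        return x == y or (x, y) in CHAINS or (y, x) in CHAINS

    tool1 = {"cwe": "CWE-190", "file": "parse.c", "line": 88}  # reports X
    tool2 = {"cwe": "CWE-120", "file": "parse.c", "line": 88}  # reports Y

    print(same_flaw(tool1, tool2))  # True: two views of one chained flaw

Whether you then credit both tools with a true positive, or just one, is exactly the kind of judgment that makes "false positive" slippery.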

In fact, I am willing to bet that different members of my Application Security team, who have all worked together for about 8 years, would answer a significant number of the BSIMM Begin survey questions quite differently.

Even surveys using much lower-level, detailed questions - such as which weaknesses on a "nominee list" of 41 are the most important and prevalent - have drawn distinct responses from multiple people within the same organization. (I'll touch on this a little more when the 2010 Top 25 is released.) Arguably, many of these differences in opinion come down to variations in context and experience, but unless and until we can model "context" in a way that makes our results somewhat shareable, we can't get beyond the data collection phase.
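One entirely hypothetical way to picture "modeling context": tag each survey vote with the respondent's environment, so disagreements can be partitioned instead of averaged away (the field names and votes are invented):

    from collections import defaultdict

    # Hypothetical sketch: votes on weakness importance, tagged with the
    # respondent's context so results can be partitioned, not blended.
    votes = [
        {"cwe": "CWE-79",  "importance": 9, "context": "web"},
        {"cwe": "CWE-79",  "importance": 3, "context": "embedded"},
        {"cwe": "CWE-120", "importance": 2, "context": "web"},
        {"cwe": "CWE-120", "importance": 9, "context": "embedded"},
    ]

    by_context = defaultdict(list)
    for v in votes:
        by_context[(v["context"], v["cwe"])].append(v["importance"])

    for (ctx, cwe), scores in sorted(by_context.items()):
        print(f"{ctx:8s} {cwe}: mean importance {sum(scores)/len(scores):.1f}")

Averaged across contexts, both weaknesses look "medium" and the signal disappears; partitioned, the disagreement turns out to be information.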

I for one am pretty satisfied with the rate at which things are progressing, and I'm delighted to see that we're finally getting some raw data, as good (or as bad) as it may be. The data collection process, source data, metrics, and conclusions associated with the 2010 Top 25 will probably be controversial, but at least there's some data to argue about. So in that sense, I see Gary's article not so much as a clarion call to action for a reluctant and primitive industry, but as an early announcement of a shift that is already underway.

- Steve
_______________________________________________
Secure Coding mailing list (SC-L) SC-L@securecoding.org
List information, subscriptions, etc - http://krvw.com/mailman/listinfo/sc-l
List charter available at - http://www.securecoding.org/list/charter.php
SC-L is hosted and moderated by KRvW Associates, LLC (http://www.KRvW.com)
as a free, non-commercial service to the software security community.
_______________________________________________
