---------- Forwarded message ----------
From: "Steven M. Christey" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED], [email protected],
[email protected]
Date: Thu, 5 Jan 2006 02:12:32 -0500 (EST)
Subject: Open Letter on the Interpretation of "Vulnerability Statistics"

Open Letter on the Interpretation of "Vulnerability Statistics"
---------------------------------------------------------------
Author: Steve Christey, CVE Editor
Date: January 4, 2006


All,

As the new year begins, there will be many temptations to generate,
comment, or report on vulnerability statistics based on totals from
2005.  The original reports will likely come from publicly available
Refined Vulnerability Information (RVI) sources - that is,
vulnerability databases (including CVE/NVD), notification services,
and periodic summary producers.

RVI sources collect unstructured vulnerability information from Raw
Sources.  Then, they refine, correlate, and redistribute the
information to others.  Raw sources include mailing lists like
Bugtraq, Vulnwatch, and Full-Disclosure, web sites like PacketStorm
and Securiteam, blogs, conferences, newsgroups, direct emails, etc.

In my opinion, RVI sources are still a year or two away from being
able to produce reliable, repeatable, and COMPARABLE statistics.  In
general, consumers should treat current statistics as suggestive, not
conclusive.

Vulnerability statistics are difficult to interpret due to several
factors:

 - VARIATIONS IN EDITORIAL POLICY.  An RVI source's editorial policy
  dictates HOW MANY vulnerabilities are reported, and WHICH
  vulnerabilities are reported.  RVIs have widely varying policies.
  You can't even compare an RVI against itself, unless you can be
  sure that its editorial policy has not changed within the relevant
  data set.  The editorial policies of RVIs seem to take a few years
  before they stabilize, and there is evidence that they can change
  periodically.

 - FRACTURED VULNERABILITY INFORMATION.  Each RVI source collects its
  information from its own list of raw sources - web sites, mailing
  lists, blogs, etc.  RVIs can also use other RVIs as sources.
  Apparently for competitive reasons, some RVIs might not identify
  the raw source that was used for a vulnerability item, which is one
  aspect of what I refer to as the provenance problem.  Long gone are
  the days when a couple mailing lists or newsgroups were the raw
  source for 90% of widely available vulnerability information.
  Based on what I have seen, the provenance problem is only going to
  get worse.

 - LACK OF COMPLETE CROSS-REFERENCING BETWEEN RVI SOURCES.  No RVI has
  an exhaustive set of cross-references, so no RVI can be sure that
  it is 100% comprehensive, even with respect to its own editorial
  policy.  Some RVIs compete with each other directly, so they don't
  cross-reference each other.  Some sources could theoretically
  support all public cross-references - most notably OSVDB and CVE -
  but they do not, due to resource limitations or other priorities.

 - UNMEASURABLE RESEARCH COMMUNITY BIAS.  Vulnerability researchers
  vary widely in skill sets, thoroughness, preference for certain
  vulnerability types or product classes, and so on.  This
  collectively produces a bias that is not currently measurable
  against the number of latent vulnerabilities that actually exist.
  Example: web browser vulnerabilities were once thought to belong to
  Internet Explorer only, until people actually started researching
  other browsers; many elite researchers concentrate on a small
  number of operating systems or product classes; basic SQL injection
  and XSS are very easy to find manually; etc.

 - UNMEASURABLE DISCLOSURE BIAS.  Vendors and researchers vary widely
  in their disclosure models, which creates an unmeasurable bias.
  For example, one vendor might hire an independent auditor and patch
  all reported vulnerabilities without publicly announcing any of
  them, or a different vendor might publish advisories even for very
  low-risk issues.  One researcher might disclose without
  coordinating with the vendor at all, whereas another researcher
  might never disclose an issue until a patch is provided, even if
  the vendor takes an inordinate amount of time to respond.

Note that many large-scale comparisons, such as "Linux vs. Windows,"
can not be verified due to unmeasurable bias, and/or editorial policy
of the core RVI that was used to conduct the comparison.


EDITORIAL POLICY VARIATIONS
---------------------------

This is just a sample of variations in editorial policy.  There are
legitimate reasons for each variation, usually due to audience needs
or availability of analytical resources.

COMPLETENESS (what is included):

 1) SEVERITY.  Some RVIs do not include very low-risk items such as a
    bug that causes path disclosure in an error message in certain
    non-operational configurations.  Secunia and SecurityFocus do not
    do this, although they might note this when other issues are
    identified.  Others include low-risk issues, such as CVE, ISS
    X-Force, US-CERT Security Bulletins, and OSVDB.

 2) VERACITY.  Some RVIs will only publish vulnerabilities when they
    are confident that the original, raw report is legitimate - or if
    they're verified it themselves.  Others will publish reports when
    they are first detected from the raw sources.  Still others will
    only publish reports when they are included in other RVIs, which
    makes them subject to the editorial policies of those RVIs unless
    care is taken.  For example, US-CERT's Vulnerability Notes have a
    high veracity requirement before they are published; OSVDB and
    CVE have a lower requirement for veracity, although they have
    correction mechanisms in place if veracity is questioned, and CVE
    has a two-stage approach (candidates and entries).

 3) PRODUCT SPACE.  Some RVIs might omit certain products that have
    very limited distribution, are in the beta development stage, or
    are not applicable to the intended audience.  For example,
    version 0.0.1 of a low-distribution package might be omitted, or
    if the RVI is intended for a business audience, video game
    vulnerabilities might be excluded.  On the other hand, some
    "beta" products have extremely wide distribution.

 4) OTHER VARIATIONS.  Other variations exist but have not been
    studied or categorized at this time.  One example, though, is
    historical completeness.  Most RVIs do not cover vulnerabilities
    before the RVI was first launched, whereas others - such as CVE
    and OSVDB - can include issues that are older than the RVI
    itself.  As another example: a few years ago, Neohapsis made an
    editorial decision to omit most PHP application vulnerabilities
    from their summaries, if they were obscure products, or if the
    vulnerability was not exploitable in a typical operational
    configuration.

ABSTRACTION (how vulnerabilities are "counted"):

 5) VULNERABILITY TYPE.  Some RVIs distinguish between types of
    vulnerabilities (e.g. buffer overflow, format string, symlink,
    XSS, SQL injection).  CVE, OSVDB, ISS X-Force, and US-CERT
    Vulnerability Notes perform this distinction; Secunia, FrSIRT,
    and US-CERT Cyber Security Bulletins do not.  Bugtraq IDs vary.
    As vulnerability classification becomes more detailed, there is
    more room for variation (e.g. integer overflows and off-by-ones
    might be separated from "classic" overflows).

 6) REPLICATION.  Some RVIs will produce multiple records for the
    same core vulnerability, even based on the RVI's own definition.
    Usually this is done when the same vulnerability affects multiple
    vendors, or if important information is released at a later date.
    Secunia and US-CERT Security Bulletins use replication; so might
    vendor advisories (for each supported distribution).  OSVDB,
    Bugtraq ID, CVE, US-CERT Vulnerability Notes, and ISS X-Force do
    not - or, they use different replication than others.
    Replication's impact on statistics is not well understood.

 7) OTHER VARIATIONS.  Other abstraction variations exist but have
    not been studied or categorized at this time.  As one example, if
    an SQL injection vulnerability affects multiple executables in
    the same product, OSVDB will create one record for each affected
    program, whereas CVE will combine them.

TIMELINESS:

 8) RVIs differ in how quickly they must release vulnerability
    information.  While this used to vary significantly in the past,
    these days most public RVIs have very short timelines, from the
    hour of release to within a few days.  Vulnerability information
    can be volatile in the early stages, so an RVI's requirements for
    timeliness directly affects its veracity and completeness.

REALITY:

 9) All RVIs deal with limited resources or time, which significantly
    affects completeness, especially with respect to veracity, or
    timeliness (which is strongly associated with the ability to
    achieve completeness).  Abstraction might also be affected,
    although usually to a lesser degree, except in the case of
    large-scale disclosures.


Conclusion
----------

In my opinion:

You should not interpret any RVI's statistics without considering its
editorial policy.  For example, the US-CERT Cyber Security Bulletin
Summary for 2005 uses statistics that include replication.  (As a side
note, a causal glance at the bulletin's contents makes it clear that
it cannot be used to compare Windows to Linux as operating systems.)

In addition, you should not compare statistics from different RVIs
until (a) the RVIs are clear about their editorial policy and (b) the
differences in editorial policy can be normalized.  Example: based on
my PRELIMINARY investigations of a few hours' work, OSVDB would have
about 50% more records than CVE, even though it has the same
underlying number of vulnerabilities and the same completeness policy
for recent issues.

Third, for the sake of more knowledgeable analysis, RVIs should
consider developing and publishing their own editorial policies.
(Note that based on CVE's experience, this can be difficult to do.)
Consumers should be aware that some RVIs might not be open about their
raw sources, veracity analysis, and/or completeness.

Finally: while RVIs are not yet ready to provide usable, conclusive
statistics, there is a solid chance that they will be able to do so in
the near future.  Then, the only problem will be whether the
statistics are properly interpreted.  But that is beyond the scope of
this letter.


Steve Christey
CVE Editor

P.S.  This post was written for the purpose of timely technical
exchange.  Members of the press are politely requested to consult me
before directly attributing quotes from this article, especially with
respect to stated opinion



On 1/7/06, Basem Narmok <[EMAIL PROTECTED]> wrote:
> On 1/6/06, Basem Narmok <[EMAIL PROTECTED]> wrote:
> > >On 1/6/06, Mohammad Halawah <[EMAIL PROTECTED]> wrote:
> > > On Friday 06 January 2006 11:56, Fadi Kahhaleh wrote:
> > > > Guyz. WTF!!!
> > > >
> > > >
> > > >
> > > > "
> > > >
> > > > Check it out
> > > >
> > > Note:
> > > just as reminder (I guess you know it) please start a new thread for a
> > > new subject. if you use "reply" for an old thread, the thread ID will
> > > be used and we can't see logical relations between messages.
> >
> > Sorry I will remember this next time, gmail does the job for me, thank you 
> > ;)
> >
> > Have fun
> > Basem
> >
> I miss understood! I thought Mohammed means "Bottom/Top Posting".
>
> _______________________________________________
> General mailing list
> [email protected]
> http://mail.jolug.org/mailman/listinfo/general_jolug.org
>


--
abulyomon

www.KiLLTHeUPLiNK.com

_______________________________________________
General mailing list
[email protected]
http://mail.jolug.org/mailman/listinfo/general_jolug.org

Reply via email to