Pascal Meunier wrote:
> Do you think it is possible to enumerate all the ways
> all vulnerabilities can be created? Is the set of all
> possible exploitable programming mistakes bounded?

By "bounded" I take you to mean "finite." In particular with reference
to your taxonomy below. By "enumerate" I take you to mean list out in
a finite way. Please note, these are not the standard mathematical
meanings for these terms. Though, they may be standard for CS folks.

If I interpreted you correctly, then the answer is, "no," as Crispin
indicated.

However, let's take "enumerate" to mean "list out, one by one" and allow
ourselves to consider infinite enumerations as acceptable. In this case,
the answer becomes, "yes."

This proof is abbreviated, but should be recognizable as a pretty
standard argument by those familiar with computable functions and/or
recursive function theory.

   Thm. The set of exploits for a program is enumerable.

   Pf.

   Let P(x) be a program computing the n-ary, partially computable
   function F(x). Let an "exploit" be a natural number input, y, such
   that at some time, t, during the computation performed by P(y) the
   fixed memory address, Z, contains the number k.**

   Then, there exists a computable function G(x,t) such that:

        - G(x, t) = 1 if and only if P(x) gives value k to address Z at
        some time less than or equal to t.

        - G(x, t) = 0 otherwise.

   The values of x for which G(x,t) = 1 is effectively enumerable (in
   the infinite sense) because it is the domain of a computable function.

                                                        Q.E.D.

You can look up the relevent theory behind this proof in [Davis].

So, where does this leave us? Well, what we don't have is a computable
predicate, "Exploit(p,y)", that always tells us if y is an exploit for
the program p. That's what Crispin was saying about Turing. This predicate
is equivalently hard to "Halt(p,y)", which is not computable.****

However, we can enumerate all the inputs that eventually result in the
computer's state satisfying the (Z == k) condition. I suspect this is
probably all you really need for a given program, as a practical matter.
Since, for example, most attackers probably will not wait for hours and
hours while an exploit develops.*****

I think the real issue here is complexity, not computability. It takes a
long time to come up with the exploits. Maybe the time it takes is too
long for the amount of real economic value gained by the knowledge of
what's in that set. That seems to be part of Crispin's objection (more or
less).


> I would think that what makes it possible to talk about design patterns and
> attack patterns is that they reflect intentional actions towards "desirable"
> (for the perpetrator) goals, and the set of desirable goals is bounded at
> any given time (assuming infinite time then perhaps it is not bounded).

I think this is a very reasonable working assumption. It seems
consistent with my experience that given any actual system at any actual
point in time there are only finitely many "desirable" objectives in
play. There are many more theoretical objectives, though, so how you
choose to pare down the list could determine whether you end up with a
useful scheme, or not.


> All we can hope is to come reasonably close and produce something useful,
> but not theoretically strong and closed.

I think that there's lots of work going on in proof theory and Semantics
that makes me hopeful we'll eventually get tools that are both useful
and strong. Model Checking is one approach and it seems to have alot of
promise. It's relatively fast, e.g., and unlike deductive approaches it
doesn't require a mathematician to drive it. See [Clarke] for details.
[Clarke] is very interesting, I think. He explicitly argues that model
checking beats other formal methods at dealing with the "state space
explosion" problem.

Those with a more practical mind-set are probably laughing that beating
the other formal methods isn't really saying much because they are all
pretty awful. ;-)

> Is it enough to look for violations of some
> invariants (rules) without knowing how they happened?

In the static checking sense, I don't see how this could be done.


> Any thoughts on this?  Any references to relevant theories of failures and
> errors, or to explorations of this or similar ideas, would be welcome.

There are academics active in this field of research. Here's a few
links:

    http://cm.bell-labs.com/cm/cs/what/spin2005/

    
http://www.google.com/search?q=international+SPIN+workshop&start=0&start=0&ie=utf-8&oe=utf-8&client=firefox-a&rls=org.mozilla:en-US:official


ciao,

 -nash

Notes:

** This definition of "exploit" is chosen more or less arbitrarily. It
seems reasonable to me. It might not be. I would conjecture that any
definition of "exploit" would be equivalent to this issue, though.

**** Halt(x,y) is not computable, but it is enumerable. That is, I can
list out, one by one, all the inputs y on which program x eventually halts.
I just don't get them in any kind of useful order. And, there is no
program to list out the inputs on which program x never halts.


****** It would be extremely interesting to know how many exploits could
be expected after a reasonable period of execution time. It seems that
as execution time went up we'd be less likely to have an exploit just
"show up". My intuition could be completely wrong, though.

References:

[Davis] "Computability, Complexity, and Languages", 2nd Ed.,
Martin Davis, Ron Sigal, Elaine Weyuker, Elsevier Science, 1983, 1994.
ISBN: 0-12-206382-1

[Clarke] "Model Checking", E.M. Clarke, MIT PRess, 1999.

-- 

An ideal world is left as an exercise for the reader.

        - Paul Graham


Reply via email to