Pascal Meunier wrote:
> Do you think it is possible to enumerate all the ways
> all vulnerabilities can be created? Is the set of all
> possible exploitable programming mistakes bounded?
By "bounded" I take you to mean "finite," in particular with reference to your taxonomy below, and by "enumerate" I take you to mean "list out in a finite way." Note that these are not the standard mathematical meanings for these terms, though they may be standard for CS folks. If I interpreted you correctly, then the answer is "no," as Crispin indicated.

However, let's take "enumerate" to mean "list out, one by one" and allow ourselves to consider infinite enumerations as acceptable. In this case, the answer becomes "yes." This proof is abbreviated, but it should be recognizable as a pretty standard argument to those familiar with computable functions and/or recursive function theory.

Thm. The set of exploits for a program is enumerable.

Pf. Let P(x) be a program computing the partially computable function F(x). Let an "exploit" be a natural number input, y, such that at some time, t, during the computation performed by P(y), the fixed memory address, Z, contains the number k.** Then there exists a computable function G(x,t) such that:

- G(x,t) = 1 if and only if P(x) gives value k to address Z at some time less than or equal to t.
- G(x,t) = 0 otherwise.

The set of values of x for which G(x,t) = 1 for some t is effectively enumerable (in the infinite sense) because it is the domain of a computable function. Q.E.D.

You can look up the relevant theory behind this proof in [Davis].

So, where does this leave us? Well, what we don't have is a computable predicate, "Exploit(p,y)", that always tells us whether y is an exploit for the program p. That's what Crispin was saying about Turing: this predicate is equivalent in difficulty to "Halt(p,y)", which is not computable.**** However, we can enumerate all the inputs that eventually result in the computer's state satisfying the (Z == k) condition. I suspect this is probably all you really need for a given program, as a practical matter.
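To make the construction concrete, here is a sketch in Python. The "machine," its step rule, and the particular (Z, k) pair are all invented stand-ins for a real program P; the point is the dovetailing over (x, t) pairs, which eventually emits every exploit without ever having to decide that a given input is *not* an exploit.

```python
from itertools import count, islice

Z, K = 3, 3  # fixed address and value defining an "exploit" (arbitrary demo choice)

def step(state):
    """One step of a toy 'machine'.  state = (pc, memory); the contrived
    update rule stands in for single-stepping the real program P."""
    pc, mem = state
    mem = dict(mem)
    mem[pc % 8] = mem.get(pc % 8, 0) * 31 + pc
    return (pc + 1, mem)

def G(x, t):
    """G(x, t) = 1 iff running P on input x puts K in address Z
    within t steps; 0 otherwise.  Always halts, so G is computable."""
    state = (x, {})
    for _ in range(t):
        state = step(state)
        if state[1].get(Z) == K:
            return 1
    return 0

def enumerate_exploits():
    """Dovetail over (x, t) pairs along diagonals x + t = n.  Every
    exploit is eventually yielded; non-exploits are simply never ruled
    out -- this is semi-decidability, not decidability."""
    seen = set()
    for n in count():
        for x in range(n + 1):
            if x not in seen and G(x, n - x) == 1:
                seen.add(x)
                yield x

first_four = list(islice(enumerate_exploits(), 4))
```

Asking the enumerator for a fifth exploit of this toy machine would loop forever, which is exactly the Halt-style behavior described above: the list is enumerable, but there is no point at which you can declare it complete.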
After all, most attackers probably will not wait for hours and hours while an exploit develops.******

I think the real issue here is complexity, not computability. It takes a long time to come up with the exploits. Maybe the time it takes is too long for the amount of real economic value gained by knowing what's in that set. That seems to be part of Crispin's objection (more or less).

> I would think that what makes it possible to talk about design patterns and
> attack patterns is that they reflect intentional actions towards "desirable"
> (for the perpetrator) goals, and the set of desirable goals is bounded at
> any given time (assuming infinite time then perhaps it is not bounded).

I think this is a very reasonable working assumption. It seems consistent with my experience that, given any actual system at any actual point in time, there are only finitely many "desirable" objectives in play. There are many more theoretical objectives, though, so how you choose to pare down the list could determine whether you end up with a useful scheme or not.

> All we can hope is to come reasonably close and produce something useful,
> but not theoretically strong and closed.

There's a lot of work going on in proof theory and semantics that makes me hopeful we'll eventually get tools that are both useful and strong. Model checking is one approach, and it seems to have a lot of promise. It's relatively fast, for example, and unlike deductive approaches it doesn't require a mathematician to drive it. See [Clarke] for details.

[Clarke] is very interesting, I think. He explicitly argues that model checking beats other formal methods at dealing with the "state space explosion" problem. Those with a more practical mind-set are probably laughing that beating the other formal methods isn't really saying much, because they are all pretty awful. ;-)

> Is it enough to look for violations of some
> invariants (rules) without knowing how they happened?
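Incidentally, looking for invariant violations is essentially what a model checker does. Here is a toy sketch of the explicit-state flavor of the idea (the two-process model and every name in it are invented for the demo): breadth-first search over reachable states, reporting a trace to any state that breaks the invariant.

```python
from collections import deque

def successors(state):
    """Interleavings of two processes that each cycle N -> T -> C -> N
    (idle/trying/critical) with no lock at all -- deliberately broken,
    so mutual exclusion fails."""
    nxt = {"N": "T", "T": "C", "C": "N"}
    out = []
    for i in range(2):
        s = list(state)
        s[i] = nxt[s[i]]
        out.append(tuple(s))
    return out

def check_invariant(init, inv):
    """Explicit-state reachability: BFS the state space and return a
    shortest counterexample path to a state violating the invariant,
    or None if the invariant holds on every reachable state."""
    seen = {init}
    queue = deque([(init, (init,))])
    while queue:
        state, path = queue.popleft()
        if not inv(state):
            return list(path)  # counterexample trace
        for s in successors(state):
            if s not in seen:
                seen.add(s)
                queue.append((s, path + (s,)))
    return None

# Invariant: both processes are never in the critical section at once.
trace = check_invariant(("N", "N"), lambda s: s != ("C", "C"))
```

Note that the counterexample trace is precisely the "how it happened" part. Real tools like SPIN or SMV fight the state-space explosion with partial-order reductions and symbolic representations, but the skeleton is the same.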
In the static checking sense, I don't see how this could be done.

> Any thoughts on this? Any references to relevant theories of failures and
> errors, or to explorations of this or similar ideas, would be welcome.

There are academics active in this field of research. Here are a few links:

http://cm.bell-labs.com/cm/cs/what/spin2005/
http://www.google.com/search?q=international+SPIN+workshop&start=0&start=0&ie=utf-8&oe=utf-8&client=firefox-a&rls=org.mozilla:en-US:official

ciao,

-nash

Notes:

** This definition of "exploit" is chosen more or less arbitrarily. It seems reasonable to me; it might not be. I would conjecture that any reasonable definition of "exploit" would be equivalent to this one on this point, though.

**** Halt(x,y) is not computable, but it is enumerable. That is, I can list out, one by one, all the inputs y on which program x eventually halts; I just don't get them in any kind of useful order. And there is no program to list out the inputs on which program x never halts.

****** It would be extremely interesting to know how many exploits could be expected after a reasonable period of execution time. It seems that as execution time went up we'd be less likely to have an exploit just "show up". My intuition could be completely wrong, though.

References:

[Davis] "Computability, Complexity, and Languages", 2nd Ed., Martin Davis, Ron Sigal, Elaine Weyuker, Elsevier Science, 1983, 1994. ISBN: 0-12-206382-1

[Clarke] "Model Checking", E.M. Clarke, MIT Press, 1999.

--
An ideal world is left as an exercise for the reader.
	- Paul Graham