I was recently looking at some code to do regular expression matching,
when it occurred to me that one can produce fairly small regular
expressions that require huge amounts of space and time.  There's
nothing in the slightest bit illegal about such regexps - it's just
inherent in regular expressions that such things exist.
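
A concrete sketch of the time half of this, using Python's backtracking
re engine (the pattern and sizes here are arbitrary - each extra
character roughly doubles the running time):

    import re
    import time

    # Nested quantifiers plus an input that almost matches force a
    # backtracking engine to try exponentially many ways of splitting
    # the input among the repeated groups.
    pattern = re.compile(r'(a+)+$')

    for n in range(18, 27, 2):
        subject = 'a' * n + 'b'    # the trailing 'b' guarantees a non-match
        start = time.perf_counter()
        pattern.match(subject)     # fails, but only after ~2**n attempts
        print(f'n={n:2d}  {time.perf_counter() - start:.3f}s')

The space half shows up in DFA-based matchers instead: there are small
patterns whose DFAs need exponentially many states.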

Or consider file compression formats.  Someone out there has a hand-
constructed zip file that corresponds to a file with more bytes than
there are particles in the universe.  Again, perfectly legal as it
stands.
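
One way to defend - sketched here in Python, with an arbitrary cap - is
to distrust the declared sizes entirely and decompress in chunks against
a hard output budget:

    import zipfile

    MAX_OUTPUT = 100 * 1024 * 1024          # arbitrary budget: 100 MB

    def extract_bounded(path):
        """Decompress an archive, refusing to expand past MAX_OUTPUT bytes.

        The size fields inside the archive are not trusted; we count the
        bytes actually produced and stop once the budget is exceeded.
        """
        total = 0
        contents = {}
        with zipfile.ZipFile(path) as zf:
            for info in zf.infolist():
                chunks = []
                with zf.open(info) as member:
                    while True:
                        chunk = member.read(64 * 1024)
                        if not chunk:
                            break
                        total += len(chunk)
                        if total > MAX_OUTPUT:
                            raise ValueError('archive expands past the budget')
                        chunks.append(chunk)
                contents[info.filename] = b''.join(chunks)
        return contents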

Back in the old days, when users ran programs in their own processes and
operating systems actually bothered to have a model of resource usage
that they enforced, you could at least ensure that the user could only
hurt himself if handed such an object.  These days, OSes tend to ignore
resource issues - memory and time are, for most legitimate purposes,
"too cheap to meter" - and in any case this has long moved outside of
their visibility:  Clients are attaching to multi-threaded servers, and
all the OS sees is the aggregate demand.
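
The per-process knobs are still there if you ask for them - a Python
sketch, POSIX-only, with an arbitrary figure - but in a shared server
process the "user" being limited is every client at once:

    import resource

    # Cap this process's total address space at 512 MB.  Allocations
    # beyond that fail instead of dragging the whole machine down.
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (512 * 1024 * 1024, hard))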

Allocating huge amounts of memory in almost any multi-threaded app is
likely to cause problems.  Yes, the thread asking for the memory will
die - but unless the code is written very defensively, it stands a
good chance of bringing down other threads, or the whole application,
along with it:  Memory is a global resource.

We recently hardened a network protocol against this kind of problem.
You could transfer arbitrary-sized strings over the link.  A string
was sent as a 4-byte length in bytes, followed by the actual data.
A request for 4 GB would fail quickly, breaking the connection.  But
a request for 2 GB might well succeed, starving the rest of the
application.  Worse, the API supports groups of requests - e.g.,
arguments to a function.  Even though the individual requests might
look reasonable, the sum of them could crash the application.  This
makes the hardened code more complex:  You can't just limit the
size of an individual request; you have to limit the total amount
of memory allocated across multiple requests.  Also, because in
general you don't know what the total will be ahead of time, you
end up having to be conservative, so that a request that comes
right up against the limit doesn't cause the application problems.
(This, of course, could cause the application *other* problems.)
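
Concretely, the shape of the hardened receive path comes out something
like this Python sketch - the caps, names, and grouping here are
invented for illustration, not lifted from the real code:

    import struct

    MAX_STRING = 1 * 1024 * 1024   # arbitrary per-string cap: 1 MB
    MAX_GROUP = 8 * 1024 * 1024    # arbitrary budget for a whole group

    def read_exact(sock, n):
        """Read exactly n bytes from a socket, or raise on early EOF."""
        buf = b''
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError('peer closed mid-message')
            buf += chunk
        return buf

    def read_string(sock, budget):
        """Read one 4-byte-length-prefixed string, charging it to a budget.

        The length is checked before anything is allocated, against both
        the per-string cap and whatever is left of the group budget.
        """
        (length,) = struct.unpack('>I', read_exact(sock, 4))
        if length > MAX_STRING:
            raise ValueError('string exceeds per-string cap')
        if length > budget:
            raise ValueError('request group exceeds its memory budget')
        return read_exact(sock, length), budget - length

    def read_group(sock, count):
        """Read one group of strings, e.g. the arguments to a call."""
        budget = MAX_GROUP
        strings = []
        for _ in range(count):
            s, budget = read_string(sock, budget)
            strings.append(s)
        return strings

The awkward part is the one just described: MAX_GROUP has to be chosen
without knowing how large a legitimate group can get.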

Is anyone aware of any efforts to control these kinds of
vulnerabilities?  It's something that cries out for automation:
Getting it right by hand is way too hard.  Traditional techniques -
strong typing, unavoidable checking of array bounds and such - may be
required for a more sophisticated approach, but they don't in and of
themselves help:  One can exhaust resources with entirely "legal"
requests.

In addition, the kinds of resources that you can exhaust this way are
broader than you'd first guess.  Memory is obvious; overrunning a thread
stack is perhaps less so.  (That will *usually* only affect the thread
in question, but not always.)  How about file descriptors?  File space?
Available transmission capacity for a variety of kinds of connections?
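
For some of these the OS will still help if asked.  Another Python
sketch, POSIX-only, numbers arbitrary:

    import resource
    import threading

    # File descriptors: cap how many this process may hold open.
    _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (1024, hard))

    # File space: cap the largest file this process may create.
    _, hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    resource.setrlimit(resource.RLIMIT_FSIZE, (1 << 30, hard))

    # Thread stacks: give threads created after this call a fixed-size
    # stack, so stack growth hits a known bound rather than whatever
    # the platform default happens to be.
    threading.stack_size(2 * 1024 * 1024)

But none of that covers resources the OS doesn't meter per process,
like the transmission capacity of a shared link.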

                                                        -- Jerry
