David Crocker wrote: Apart from the obvious solution of choosing another language, there are at least two ways to avoid these problems in C++: 1. Ban arrays (to quote Marshall Cline's C++ FAQ Lite, arrays are evil!). Use ... 2. If you really must have naked arrays, ban the use of indexing and arithmetic on naked pointers to arrays (i.e. if p is a pointer, then p[x], p+x, p-x, ++p If you want safer C and you want the compiler to enforce it, and you don't mind having to re-write your code some, then use one of the safer C dialects (CCured http://manju.cs.berkeley.edu/ccured/ and Cyclone http://www.research.att.com/projects/cyclone/). These tools provide a nice mid-point in the amount of work you have to do to reach various levels of security in C/C++: * low security, low effort o do nothing o code carefully o apply defensive compilers, e.g. StackGuard o apply code auditors, e.g. RATS, Flawfinder o port code to safer C dialects like CCured and Cyclone o re-write code in type safe languages like Java and C# o apply further code security techniques, e.g. formal theorem provers WRT a formal spec * high security, high effort Crispin -- Crispin Cowan, Ph.D. http://immunix.com/~crispin/ CTO, Immunix http://immunix.com
ljknews wrote: And there are ways of using Assembly Language to avoid pitfalls that it provides. There are ways of using horse-drawn carriages to avoid the major reason (think street cleaning) why the automobile was embraced in urban areas during the early part of the 20th century. What there are _not_ are reasons for new development to cling to languages which make flawed constructs easy for the individual programmer to misuse. There is often one very good reason for clinging to such languages: because there is no other suitable language available. As has already been pointed out, there is usually very little alternative to C/C++ when it comes to kernel mode code. I would love to see a safer alternative to C/C++ for writing operating system kernels, device drivers and real-time embedded applications (yes, I know Ada can be used for the latter, buts its package structure and associated with-ing problem can be a real pain in large systems). The newer programming languages such as Java, C# and Eiffel are all designed for applications and make heavy use of garbage collection (which is a boon to application writers, but a no-no for kernel code and many embedded apps). What I don't understand is why the language standardisation committees don't remove some of the unsafe dross when they bring out new language revisions. A classic example is the failure to define the type of a string literal in C as const char*. Just imagine if we had a new version of C++ with all the unsafe stuff removed. Backwards compatibility would not be a problem in practice, because the compiler suppliers would provide switches to make their new compilers accept obsolete constructs. As for non-real-time application-level code, I gave up writing them in C++ by hand long ago. [Actually I prefer to write specifications, verify them, and let a code generator produce correct C++ from them; but that is another story.] David Crocker Consultancy contracting for dependable software development www.eschertech.com
On Wed, Jun 09, 2004 at 03:34:52PM +0100, David Crocker wrote: Apart from the obvious solution of choosing another language, there are at least two ways to avoid these problems in C++: 1. Ban arrays (to quote Marshall Cline's C++ FAQ Lite, arrays are evil!). Use classes from the STL, or another template library instead. Arrays should be used only in the template library, where their use can be controlled. 2. If you really must have naked arrays, ban the use of indexing and arithmetic on naked pointers to arrays (i.e. if p is a pointer, then p[x], p+x, p-x, ++p and --p are all banned). Instead, refer to arrays using instances of a template class ArrayX that encapsulates both the pointer (an X*) and the limit (an unsigned int). Unfortunately, I don't think this advice will work for many projects. First, Many programs must make system calls that only use arrays. Some of those calls are unsafe. Second, There is a lot of legacy code written with the error-prone array indexing that you condemn. While the code must be maintained, changing it introduces risks of new bugs that lead to instability, and many people aren't willing to take that risk. So I think your advice to ban arrays could only be applied to new code, and new projects. Either that, or the conversion must be made gradually, and must be timed at the right stage of a maintenance cycle. - Jared
der Mouse (Maus surely?) wrote [snip] Well, actually, but for the world's addiction to sloppy coding. It's entirely possible to avoid buffer overflows in C; it just requires a little care in coding. C's major failing in this regard - and I don't actually consider it all that major - is that it doesn't provide any tools to help. It assumes that you the programmer know what you're doing, and the mismatch between that and the common reality is where the problem actually comes from. I dislike this commonly-used argument that essentially says you should only employ above average people who don't make mistakes. It is flawed on lots of levels. 1. On average ability over our industry is average! 2. Even brilliant, infallible programmers like me make mishtukes shummtimes. 3. Even if above average, non-sloppy programmers can avoid mistakes, the effort they spend doing so is a distraction from their real job of solving the problem the program is intended for. 4. The levels of mental abstraction needed to solve an application domain problem and to worry about operator precedence and buffer overflow are completely different; there is good evidence that humans don't work well at more than one abstraction level at a time. All that a better language will bring you in this regard is that it will (a) push the sloppiness into places the compiler can't check and (b) change the ways things break when confronted with input beyond the design underlying their code. This sounds like the Syrius Cybernetics defence (from the Hitch Hiker's Guide to the Galaxy); essentially you seem to be saying it is OK if all the deep and complex flaws in a product are completely obscured by all the shallow and obvious ones. You can't assume that the sloppy programmer in C /only/ introduces shallow errors. In practice, well designed languages can do much more than you claim. They can completely eliminate whole classes of error that currently exercise our attention, make sloppiness very hard to conceal and make it much easier to find any subtle errors that remain. Peter ** This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. The IT Department at Praxis Critical Systems can be contacted at [EMAIL PROTECTED] This footnote also confirms that this email message has been swept by MIMEsweeper for the presence of computer viruses. www.mimesweeper.com ** This e-mail has been scanned for all viruses by Star Internet. The service is powered by MessageLabs. For more information on a proactive anti-virus service working around the clock, around the globe, visit: http://www.star.net.uk
At 9:11 AM -0400 6/9/04, Gary McGraw wrote: Language makes a huge difference, eapecially in the realm of bugs. So not using C and C++ is smart. Use Java or C# instead. Or Ada, or PL/I, or Pascal, or Eiffel, etc. There are _lots_ of choices out there.
Sloppy coding can be done in any language, but C and C++ have 3 features that aggravate the problem: 1. The array=pointer idiom. Given a parameter which is an array, you can't ask at run-time how big the array is - you have to do extra work and pass the size in an additional parameter (whereas most languages pass the size of an array automatically as part of the array). So doing buffer overflow checks requires more than just inserting the check - you have to make sure that an extra array size parameter is passed all the way down the call stack. [C and C++ cannot be considered strongly typed, in part because of the lack of distinction between pointers and arrays]. 2. By permitting pointer arithmetic, C and C++ encourage you to pass a pointer into the middle of a buffer, rather than passing the [start of the] buffer and an index into it, which makes bounds checking even more tedious to do (you have to pass a pointer to one-past-the-end-of-the-array as well, and even then this is less useful than having an index and a limit). 3. The lack of automatic bounds checking. Of course, run-time bounds checking is by no means a complete solution, but it does at least prevent a buffer overflow being used as a security attack, turning it into a denial-of-service attack instead. Apart from the obvious solution of choosing another language, there are at least two ways to avoid these problems in C++: 1. Ban arrays (to quote Marshall Cline's C++ FAQ Lite, arrays are evil!). Use classes from the STL, or another template library instead. Arrays should be used only in the template library, where their use can be controlled. 2. If you really must have naked arrays, ban the use of indexing and arithmetic on naked pointers to arrays (i.e. if p is a pointer, then p[x], p+x, p-x, ++p and --p are all banned). Instead, refer to arrays using instances of a template class ArrayX that encapsulates both the pointer (an X*) and the limit (an unsigned int). Such an object needs only 2 words of storage (compared to 1 word for a naked pointer), so it can assigned and passed by value. You can provide an operator to return the value of the limit, and an indexing operator (with optional bounds checking). If you really must, you can even implement pointer arithmetic operators for the class which update the limit at the same time as updating the pointer. David Crocker Consultancy contracting for dependable software development www.eschertech.com
[EMAIL PROTECTED] wrote on Wednesday, June 09, 2004 7:58 AM: Although I am in favor of languages that help prevent such nasties as input buffer overruns, this is an excellent point. A sloppy programmer will write sloppy code. Reminds me of an old saying that I heard years ago while studying mechanical engineering: a determined programmer can write a FORTRAN program in ANY language. :-) (Well, notwithstanding FORTRAN's built-in ability of handling complex numbers, but I digress...) Going back over some of my old FORTRAN code, I find that I was writing object-oriented code in FORTRAN. Going over other people's C++ code, I can see that they're trying to make it work like FORTRAN, or QuickBASIC, or something like that. I did some work recently on .NET Security, trying to come up with some examples that would demonstrate how you'd screw it up in code. It's certainly difficult to come up with bad examples that aren't needlessly bone-headed, but when you look at other people's code, you realise that an awful lot of programmers are bone-headed. Buffer overflows can happen in any language, no matter what those languages do to prevent them. Okay, that's a bold statement. I'd better back it up. If you have a string-handling library of any kind, someone's going to come up with a program design that builds a twenty character string for a person's name, putting first name in the first ten characters, and last name in the last ten characters. Eric Smith changes his first name to Navratilova, and he's suddenly listed by the program as Navratilovamith amith - buffer overflow. Sure, it doesn't overflow into the stack, but it overflows into important data. And if you want to go further into insanity, you can manufacture a case where character 11 being lower case causes unwanted code to be executed (no default condition in a 'case' statement, no good error handling, etc). IMHO, the bottom line is that there's no excuse for sloppiness and a strong language can only do so much to prevent the programmer from his/her own sloppiness. The first defence against unsecure coding is to hire and educate your developers in such a way as to exclude the unsecure coding practices. It's not the only defence - but it's the first you're going to need, because if you don't have that, you've got programmers who will flout security prevention measures _because_ they don't understand how to do it properly, or why they're being strong-armed in a particular direction. And on the topic of hiring better programmers, I'm now in my third week as [EMAIL PROTECTED] [But my personal address remains this one] Alun.
At 1:10 PM -0400 6/8/04, Jose Nazario wrote: thought some of you may find this editorial from the May 04 ACM Queue worth a read. ACM Queue is an interesting magazine and has a website at acmqueue.org. Buffer Overrun Madness ACM Queue vol. 2, no. 3 - May 2004 by Rodney Bates, Wichita State University Why do good programmers follow bad practices? In January 2003, the Slammer worm was reported to be the fastest spreading ever. Slammer gets access by exploiting a buffer overrun. If you peruse CERT (Computer Emergency Readiness Team) advisories or security upgrade releases, you will see that the majority of computer security holes are buffer overruns. These would be minor irritations but for the world's addiction to the weakly typed programming languages C and its derivative C++. And yet this mailing list, supposedly devoted to secure coding, seem polarized around the notion of shoring up those languages.