There has been some demotivating discussions and statements on the mailing list about input validation, and I really don't understand the attitude. It might be based on misunderstandings and miscommunication. I have tried to write a (long!) summary of what I think is the topic of discussion, and I would appreciate if other more senior developers would comment on it and clarify the direction we want to take in FreeRDP.
Trade-off I assume we in principle can agree that security and stability is more important than features and performance. Security issues can prevent all use of the product, while missing features or bad performance only prevents some use of the product. We must obviously also keep in mind that missing features or bad performance can be so bad that they make the product unusable for some purposes. But features should never compromise security, and optimization should never compromise correctness. It must be accepted that fixes for security and correctness might cause a reasonable reduction of features and performance. Many program properties holds as invariants and can perhaps be proven, but for program input the only options are to either check it, or to be naive and just assume that it is correct and perhaps consider the potential consequences. Each check has a cost, both in code size and CPU consumption. Obviously, if the cost/benefit ratio is too high (because the check is expensive or the benefit is low) it shouldn't be done in production code but might be useful for debugging. Some assertions can be proven at compile time, and then there is no benefit from checking them at run-time and they could just be disabled. But what if the assertions states assumptions that can't be proven? The code has neither been tested nor designed to work without these assumptions. Is it fair to give users code that has unintended reachable code paths where anything can happen - especially security-wise? I don't think so. If the cost is reasonable the checks should be left in. Assert or not? How should such validating checks be implemented and what should we call them? C assert()s has traditionally been implemented as macros that often are disabled in release builds and completely disappears. When enabled an assert() expands to an if statement optimized for false that checks the negated boolean expression. If the assertion fails it will print location information and the text of the asserted expression, and then it will abort program execution immediately. I have seen several "modern Linux" projects that not necessarily disable assertions in release builds. I have seen several assertions from my desktop OS, and they have mde it possible to find or file relevant bug reports. If it wasn't for the assertions the result most likely would have been unusable reports of random crashes and segmentation faults and bugs that wouldn't be fixed. FreeRDP do have a number of assert() calls in asn1/ - do anyone do anything to disable them? We also have some exit() calls in libfreerdp. (Some "code quality" tools correctly complains about using that in a library.) These exit() calls also shows some kind "do whatever you want, but don't ever come back" assertion pattern. What I did in FreeRDP was to add some ASSERT macros in libfreerdp/debug.h that wrapped the standard assert() from assert.h. The macros makes it easy to find all uses, change them or change the implementation. assert() was a rough approximation of what I had in mind, so I based the macro implementations on that and named the macros after that. A consequence of that is that the user experience on assertion violations is that the application crashed. In a way they are right, but the user was protected from a potentially worse crash, and the crash provides some useful information to the developers. This scheme works, but we could do something better. Another consequence of the naming is that it might indicate that these are expensive checks and that they should be disabled in release builds. Neither of these connotations were intended. I intended (and used) it for general handling of cheap validating checks that I think we always should do. Perhaps I should rename ASSERT to ENSURE or VERIFY or GUARANTEE or VALIDATE or something else ? Perhaps the implementation should be changed to something that won't go away when exceptions are disabled? Thread safety A problem with all fatal exception handling in a complex C library like libfreerdp is how to report fatal errors back to the main application and allow the main application to continue running without interrupting other threads and without loosing any resources. The only good solution I know is to let _all_ functions take a parameter with a call-back for error reporting and make sure error codes are returned and checked all the time. All functions must make sure they release all resources on errors - often by using error labels and goto. That is really not a good solution and hardly feasible, but I don't think there are any other solutions. Do we want to do that? The alternative is to only handle the most common exceptions gracefully (such as ordinary logout and redirect) and accept that other exceptions can bring the main application in a state where it can't continue. That is what we have now. Input stream bounds checking I am guilty of introducing bounds checking in our low level input/output stream processing macros in libfreerdp/stream.h. I did not consider it controversial or risky at all. I carefully considered the potential performance impact. Tests showed no noticeable difference. Later benchmarks confirmed that the cost even in the most extremely pessimistic case would be less than 1% of the CPU. After an undiagnosed complaint I immediately disabled it so it now only is enabled when configured --with-debug or ENABLE_STREAM_CHECK is defined. The checks (instantiated inline almost 1000 times) did however increase the size of the executable with 40k. Half of that is error messages. That increase can in principle have a big impact on small devices. The size overhead could be reduced a lot at the cost of slightly less detailed error messages. I have not seen any evidence that these checks caused a noticeable overhead. No bisect that points at a specific change. No clear indication what difference it makes if ENABLE_STREAM_CHECK is toggled. These boundary checks are a simple and very reliable first level of input validation and protection from buffer overflows. We could and should do something that is a lot smarter and efficient, but this is all we have now. FWIW: Considering how simple it is for a rogue server to crash or attack a client without these checks I think it is irresponsible to ship freerdp without these checks. ( http://sourceforge.net/mailarchive/message.php?msg_id=26528018 ) Over-specified inconsistency The RDP protocol specifies over-specification lots of places, especially in nested structures where each nesting level has its own length field. Obviously these lengths should add up and match. What should we do if they don't match? Detected inconsistencies shows that either someone (an attacking or buggy server) sent us incorrect data that by definition can't be interpreted correctly, or that we have a bug in our implementation and can't reasonably expect that if we do our best and continue then nobody will notice. Ignoring inconsistencies and just try to continue using one of the inconsistent values might work in some cases, but obviously that is not a good solution. It is better to inform the user that bad things happened and that if the user thinks it is a bug in freerdp then the developers should be notified and add a bug-fix or workaround. I added some of that kind of checks - especially in libfreerdp/rdp.c for checking that we really processed exactly the right amount of data from the input streams. That has helped catching several bugs by pointing out early that a problem occurred, long time before we would have stumbled upon the consequences. It is annoying when these showstoppers pops up in cases where it would have worked if we just had used one or the other value, but I don't think that counts as an argument for just choosing one of them by default. Code review? I have described why I don't think the changes I made are controversial at all. I did not know that some developers had that strong feelings on this topic - and I still don't understand why. When I made the changes it was anarchy with no tradition for review. Fortunately that has improved a bit since then, but we don't review everything. Even today I doubt that I would have considered my changes in this area so interesting or controversial that I would have annoyed you with a review request. If someone thinks the guidelines should be further formalized, clarified or updated then please do so. What now? If someone will undo something then I can't and won't stop that. If fellow developers can explain and agree which changes they would like to see then I might be able to do that. Please share your opinion. Let's learn from the past and then move on. /Mads ------------------------------------------------------------------------------ Gaining the trust of online customers is vital for the success of any company that requires sensitive data to be transmitted over the Web. Learn how to best implement a security strategy that keeps consumers' information secure and instills the confidence they need to proceed with transactions. http://p.sf.net/sfu/oracle-sfdevnl _______________________________________________ Freerdp-devel mailing list Freerdp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/freerdp-devel