[Freerdp-devel] Input validation discussion

Mads Kiilerich Tue, 11 Jan 2011 06:11:38 -0800

There has been some demotivating discussions and statements on the 
mailing list about input validation, and I really don't understand the 
attitude. It might be based on misunderstandings and miscommunication. I 
have tried to write a (long!) summary of what I think is the topic of 
discussion, and I would appreciate if other more senior developers would 
comment on it and clarify the direction we want to take in FreeRDP.



Trade-off


I assume we in principle can agree that security and stability is more 
important than features and performance. Security issues can prevent all 
use of the product, while missing features or bad performance only 
prevents some use of the product. We must obviously also keep in mind 
that missing features or bad performance can be so bad that they make 
the product unusable for some purposes. But features should never 
compromise security, and optimization should never compromise 
correctness. It must be accepted that fixes for security and correctness 
might cause a reasonable reduction of features and performance.

Many program properties holds as invariants and can perhaps be proven, 
but for program input the only options are to either check it, or to be 
naive and just assume that it is correct and perhaps consider the 
potential consequences.

Each check has a cost, both in code size and CPU consumption. Obviously, 
if the cost/benefit ratio is too high (because the check is expensive or 
the benefit is low) it shouldn't be done in production code but might be 
useful for debugging.

Some assertions can be proven at compile time, and then there is no 
benefit from checking them at run-time and they could just be disabled.

But what if the assertions states assumptions that can't be proven? The 
code has neither been tested nor designed to work without these 
assumptions. Is it fair to give users code that has unintended reachable 
code paths where anything can happen - especially security-wise? I don't 
think so. If the cost is reasonable the checks should be left in.


Assert or not?


How should such validating checks be implemented and what should we call 
them?

C assert()s has traditionally been implemented as macros that often are 
disabled in release builds and completely disappears. When enabled an 
assert() expands to an if statement optimized for false that checks the 
negated boolean expression. If the assertion fails it will print 
location information and the text of the asserted expression, and then 
it will abort program execution immediately.

I have seen several "modern Linux" projects that not necessarily disable 
assertions in release builds. I have seen several assertions from my 
desktop OS, and they have mde it possible to find or file relevant bug 
reports. If it wasn't for the assertions the result most likely would 
have been unusable reports of random crashes and segmentation faults and 
bugs that wouldn't be fixed.

FreeRDP do have a number of assert() calls in asn1/ - do anyone do 
anything to disable them?

We also have some exit() calls in libfreerdp. (Some "code quality" tools 
correctly complains about using that in a library.) These exit() calls 
also shows some kind "do whatever you want, but don't ever come back" 
assertion pattern.

What I did in FreeRDP was to add some ASSERT macros in 
libfreerdp/debug.h that wrapped the standard assert() from assert.h. The 
macros makes it easy to find all uses, change them or change the 
implementation. assert() was a rough approximation of what I had in 
mind, so I based the macro implementations on that and named the macros 
after that.

A consequence of that is that the user experience on assertion 
violations is that the application crashed. In a way they are right, but 
the user was protected from a potentially worse crash, and the crash 
provides some useful information to the developers. This scheme works, 
but we could do something better.

Another consequence of the naming is that it might indicate that these 
are expensive checks and that they should be disabled in release builds. 
Neither of these connotations were intended. I intended (and used) it 
for general handling of cheap validating checks that I think we always 
should do.

Perhaps I should rename ASSERT to ENSURE or VERIFY or GUARANTEE or 
VALIDATE or something else ? Perhaps the implementation should be 
changed to something that won't go away when exceptions are disabled?


Thread safety


A problem with all fatal exception handling in a complex C library like 
libfreerdp is how to report fatal errors back to the main application 
and allow the main application to continue running without interrupting 
other threads and without loosing any resources.

The only good solution I know is to let _all_ functions take a parameter 
with a call-back for error reporting and make sure error codes are 
returned and checked all the time. All functions must make sure they 
release all resources on errors - often by using error labels and goto.

That is really not a good solution and hardly feasible, but I don't 
think there are any other solutions. Do we want to do that?

The alternative is to only handle the most common exceptions gracefully 
(such as ordinary logout and redirect) and accept that other exceptions 
can bring the main application in a state where it can't continue. That 
is what we have now.


Input stream bounds checking


I am guilty of introducing bounds checking in our low level input/output 
stream processing macros in libfreerdp/stream.h.

I did not consider it controversial or risky at all.

I carefully considered the potential performance impact. Tests showed no 
noticeable difference. Later benchmarks confirmed that the cost even in 
the most extremely pessimistic case would be less than 1% of the CPU. 
After an undiagnosed complaint I immediately disabled it so it now only 
is enabled when configured --with-debug or ENABLE_STREAM_CHECK is defined.

The checks (instantiated inline almost 1000 times) did however increase 
the size of the executable with 40k. Half of that is error messages. 
That increase can in principle have a big impact on small devices. The 
size overhead could be reduced a lot at the cost of slightly less 
detailed error messages.

I have not seen any evidence that these checks caused a noticeable 
overhead. No bisect that points at a specific change. No clear 
indication what difference it makes if ENABLE_STREAM_CHECK is toggled.

These boundary checks are a simple and very reliable first level of 
input validation and protection from buffer overflows. We could and 
should do something that is a lot smarter and efficient, but this is all 
we have now. FWIW: Considering how simple it is for a rogue server to 
crash or attack a client without these checks I think it is 
irresponsible to ship freerdp without these checks.

( http://sourceforge.net/mailarchive/message.php?msg_id=26528018 )


Over-specified inconsistency


The RDP protocol specifies over-specification lots of places, especially 
in nested structures where each nesting level has its own length field. 
Obviously these lengths should add up and match. What should we do if 
they don't match?

Detected inconsistencies shows that either someone (an attacking or 
buggy server) sent us incorrect data that by definition can't be 
interpreted correctly, or that we have a bug in our implementation and 
can't reasonably expect that if we do our best and continue then nobody 
will notice.

Ignoring inconsistencies and just try to continue using one of the 
inconsistent values might work in some cases, but obviously that is not 
a good solution. It is better to inform the user that bad things 
happened and that if the user thinks it is a bug in freerdp then the 
developers should be notified and add a bug-fix or workaround.

I added some of that kind of checks - especially in libfreerdp/rdp.c for 
checking that we really processed exactly the right amount of data from 
the input streams. That has helped catching several bugs by pointing out 
early that a problem occurred, long time before we would have stumbled 
upon the consequences.

It is annoying when these showstoppers pops up in cases where it would 
have worked if we just had used one or the other value, but I don't 
think that counts as an argument for just choosing one of them by default.


Code review?


I have described why I don't think the changes I made are controversial 
at all. I did not know that some developers had that strong feelings on 
this topic - and I still don't understand why.

When I made the changes it was anarchy with no tradition for review. 
Fortunately that has improved a bit since then, but we don't review 
everything. Even today I doubt that I would have considered my changes 
in this area so interesting or controversial that I would have annoyed 
you with a review request.

If someone thinks the guidelines should be further formalized, clarified 
or updated then please do so.


What now?


If someone will undo something then I can't and won't stop that.

If fellow developers can explain and agree which changes they would like 
to see then I might be able to do that.


Please share your opinion. Let's learn from the past and then move on.

/Mads


------------------------------------------------------------------------------
Gaining the trust of online customers is vital for the success of any company
that requires sensitive data to be transmitted over the Web.   Learn how to 
best implement a security strategy that keeps consumers' information secure 
and instills the confidence they need to proceed with transactions.
http://p.sf.net/sfu/oracle-sfdevnl 
_______________________________________________
Freerdp-devel mailing list
Freerdp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freerdp-devel

[Freerdp-devel] Input validation discussion

Reply via email to