Oops.  Dont you hate it when you send an email while you are in the middle
of typing it?

I guess I better continue on..  I'll repeat the last little bit I was
saying...

Which is where we get to the first major bug.
evhttp_decode_uri assumes that you are decoding an entire query string which
includes a '?', but we can see here that it is being used to decode only a
portion of the query string, just a value part.

if (c == '?') {
>     in_query = 1;
> } else if (c == '+' && in_query) {
>     c = ' ';
>

In this code, if there is a '+' in the value, it will not be decoded to a
space like it is supposed to be.

Then there is another bug on the next line:

> } else if (c == '%' && isxdigit((unsigned char)uri[i+1]) &&
> isxdigit((unsigned char)uri[i+2])) {
>   char tmp[] = { uri[i+1], uri[i+2], '\0' };
>

Where we are not checking to make sure that 'i' is not already at the end of
the supplied string.   If using valgrind or other bounds checking tools,
this would
generate an exception if someone was to send a query string that happened to
end with a '%'.  This is rather minor, as it would be a little hard to
exploit this.

So now we get to the third bug... and the one that actually made me stop and
try to figure out what was going on.  I didn't find the previous mentioned
bugs until I started really examining the code, but this one stopped my
development in its tracks.

After evhttp_parse_query has seperated the key and value (and decoded the
value), it then tries to add them to a headers structure... which was
designed to handle the http headers, and not really the supplied query
parameters.   A peice of code in evhttp_add_header actually does this:
if (strchr(value, '\r') != NULL || strchr(value, '\n') != NULL ||
    strchr(key, '\r') != NULL || strchr(key, '\n') != NULL) {
        /* drop illegal headers */
        event_debug(("%s: dropping illegal header\n", __func__));
        return (-1);
}

Which actually causes a multi-line parameter (that was encoded by the
browser, but decoded before going into this function) to be dropped, and
ignored.  This means that this code would not be able to process any
multi-line <textarea> elements in forms that are sent by the browser, or
encoded multi-line entries sent via ajax, which is what I was actually
trying to do.

I know that work is being done on a 2.0 branch, so maybe this is different
there, dont know... but just wanted to let people know that there are some
decoding issues in the current version.

In summary, we shouldn't be decoding entire query strings before parsing out
the key/value pairs.  We should be decoding both key and value, and we
should not be excluding values that have CR or LF characters in them.

Apologies for the rather long email(s).



On Wed, Mar 11, 2009 at 10:57 PM, Clint Webb <webb.cl...@gmail.com> wrote:

> Hello all,
>
> memcached uses libevent, and I've been working recently on a web-based
> debugging console as part of it.  Since memcached already used libevent for
> socket notifications, I figured that evhttp would make it easy to add a
> web-based interface to it.   Which it did.
>
> In doing this though, I found a few serious bugs in evhttp_parse_query()
> that make me wonder if anyone is actually using libevent/http for anything
> serious.
>
> Of course, I may be completely miss-understanding how these functions are
> supposed to be used, but a lack of documentation found anywhere has caused
> me to look at the code to figure out how to use it.
>
> The first bit that piqued my interest was in the comment before the
> evhttp_parse_query code.
>
> /*
>>  * Helper function to parse out arguments in a query.
>>  * The arguments are separated by key and value.
>>  * URI should already be decoded.
>>  */
>>
>
> This seems to be markedly wrong.    You should not decode the entire URI
> query string in one go.  You need to parse out the key/value pairs and then
> decode the key and value bits.
>
> For example.... if you have a query with parameters like this:
> ?username=frank&password=secret
>
> The existing code works alright, but what if the persons password was
> se&cret?
>
> Anyway, assuming the developer did or didn't already decode the query
> string, we get to the evhttp_parse_query function, which will first find the
> '?', and then basically split the string using '&' as a delimeter.  Each
> key/value pair is split on the '=' so that you get seperate key and value.
>
> So if we supply a query of '?username=frank&password=secret', you get the
> the following pairs:
>
>> username=frank
>> password=secret
>>
>
> But what if frank used 'se&cret' as a password?  The browser would actually
> send '?username=frank&password=se#38cret', but since we are supposed to
> decode the entire query string before we parse out the params, then we end
> up actually processing '?username=frank&password=se&cret'.   Since we
> seperate the pairs by splitting on string on '&', we end up with:
> username=frank
> password=se
> cret
>
> Which is entirely not what we wanted.  But anyway, that is not really a bug
> in the event code, but an incorrect comment.   The only real example of code
> on how to use this function followed the instructions though, and did decode
> the entire query string before parsing it.
>
> Inside the function, we see that before the key and value is added to the
> 'headers' structure, we see that the value is then decoded, which is correct
> (although I would say the key should be decoded then too).
>
>> value = evhttp_decode_uri(value);
>>
>
> Which is where we get to the first major bug.
> evhttp_decode_uri assumes that you are decoding an entire query string
> which includes a '?', but we can see here that it is being used to decode
> only a portion of the query string, just a value part.
>
> if (c == '?') {
>     in_query = 1;
> } else if (c == '+' && in_query) {
> c = ' ';
>
>
>
> --
> "Be excellent to each other"
>



-- 
"Be excellent to each other"
_______________________________________________
Libevent-users mailing list
Libevent-users@monkey.org
http://monkeymail.org/mailman/listinfo/libevent-users

Reply via email to