2008/7/19 Alex Cichowski <[EMAIL PROTECTED]>:
> OK, I will investigate the LimitRequestBody problem.
>
> Graham Dumpleton wrote:
>>
>> What probably needs to be investigated is whether the default can't
>> just be REQUEST_CHUNKED_DECHUNK all the time and code behave just as
>> it did before. If this can be done, then all that needs to be solved
>> is your ability to do non blocking reads.
>>
>
> I suppose the only problem with making it the default is exposing security
> holes in people's apps that were previously protected against by
> CHUNKED_ERROR, since it's unlikely anyone will rely on getting a chunked
> failure in normal operation. The only thing I can think of is something
> depending on the content length (request.clength), but I would guess it's
> already possible for attackers to pass in requests without content lengths
> by using "Connection: close" or HTTP/1.0. Don't have any ideas of what else
> to investigate.

Can you explain more how it would be a security problem?

For mod_wsgi at least, a WSGI application is required to use the
Content-Length because it isn't meant to read more than that amount of
content. Thus, in WSGI you cannot call read() with no arguments to
read all available input. Thus in conforming WSGI applications, if
something sent content as chunked, they would pick it up and reject it
with an error about no content length.
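
To make that concrete, here is a minimal sketch of what a conforming application might look like. The 411 response and the `call` driver are my own illustrative choices, not something the WSGI specification or mod_wsgi mandates; the point is only that the application never reads more than CONTENT_LENGTH bytes.

```python
import io


def application(environ, start_response):
    # A conforming WSGI application reads at most CONTENT_LENGTH bytes.
    try:
        length = int(environ.get("CONTENT_LENGTH") or "")
        if length < 0:
            raise ValueError("negative length")
    except ValueError:
        # No usable Content-Length (e.g. a chunked request): reject it.
        # Choosing 411 here is an assumption made for illustration.
        start_response("411 Length Required", [("Content-Type", "text/plain")])
        return [b"Content-Length required\n"]

    body = environ["wsgi.input"].read(length)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"read %d bytes\n" % len(body)]


def call(environ):
    """Hypothetical driver for trying the application without a server."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call({"wsgi.input": io.BytesIO(b"hello")}))
    print(call({"CONTENT_LENGTH": "5", "wsgi.input": io.BytesIO(b"hello world")}))
```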

So, only applications specifically written to be non-conforming WSGI
applications would do a read() with no arguments and be able to read
all input of a chunked request. This assumes of course that the client
has sent all request content at once and is not expecting a response
before sending more. In this latter case, you will need to use the
setblocking() feature I mentioned.

>> For non blocking reads, I rather don't like the idea of a special read
>> function. What I have instead thought about for this in the past for
>> mod_wsgi, even though support for chunked and non blocking is outside
>> of WSGI specification, is to have a setblocking() function just like
>> sockets do. Technically one may even be able to implement the more
>> general settimeout() since timeouts can be specified for Apache
>> connections if you do it right.
>>
>
> I thought someone already submitted a patch because read(len) used to return
> < len bytes in the past, and it was considered to be breaking the file-like
> object API.

Never been the case with mod_python that I know of. In early pre-1.0
builds of mod_wsgi I was playing with that at one point for read()
with no arguments, since that was outside of the WSGI specification
anyway, but it was made to conform when 1.0 came out.

> And if you restrict it to returning either len bytes or a "would block"
> exception, it would not solve the original problem I was intending to solve
> - having to process and respond to < len bytes from the client before the
> client will send any more.

Yes it would. When blocking is turned off, the intention would be that
read() with or without arguments would always return only what it
found. The returning of a flag to indicate it would block would only
occur if no data at all is available. Thus, disabling blocking would
make it behave like a non-blocking socket.

> Perhaps setblocking()/settimeout() is orthogonal to partial read support. In
> some cases you might like to read() in non-blocking mode to retrieve blocks
> of known size (or get an exception), and in some cases you might like to
> read_partial() in non-blocking mode too.

Intention is that non-blocking mode would always be partial reads,
just like a non-blocking socket. If you want a known block size you
loop, and if a subsequent call after a partial read would block, then
you would deal with that.
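
A rough sketch of such a loop follows. Nothing here is an existing mod_python or mod_wsgi API: `FakeNonBlockingInput` is a hypothetical stand-in for a request input in non-blocking mode, and it assumes "would block" is signalled by returning None (one of the options discussed further down) with an empty string meaning end of input.

```python
class FakeNonBlockingInput:
    """Hypothetical stand-in for a request input in non-blocking mode."""

    def __init__(self, events):
        # events: chunks of bytes as they "arrive"; None marks a moment
        # at which no data is available yet.
        self._events = list(events)

    def read(self, size):
        if not self._events:
            return b""              # end of input
        item = self._events[0]
        if item is None:
            self._events.pop(0)
            return None             # would block
        data, rest = item[:size], item[size:]
        if rest:
            self._events[0] = rest  # keep the unread tail for next time
        else:
            self._events.pop(0)
        return data                 # partial read: only what was found


def read_block(stream, size, on_would_block):
    """Loop over partial reads until `size` bytes or end of input."""
    parts, remaining = [], size
    while remaining > 0:
        data = stream.read(remaining)
        if data is None:
            on_would_block()        # e.g. wait, or respond to what we have
            continue
        if data == b"":
            break                   # client finished sending
        parts.append(data)
        remaining -= len(data)
    return b"".join(parts)
```

With events `[b"ab", None, b"cdef"]` and a block size of 4, the loop gets two bytes, is told once that the read would block, and then completes the block from the next chunk.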

> I think partial read support is useful even if you don't have full
> non-blocking support, and non-blocking could be significantly more work to
> add.

For mod_wsgi at least, don't believe at this point that non blocking
support is that hard. Just that users of mod_wsgi don't like it when I
start doing stuff which is outside of WSGI itself even though it would
make a difference to their application.

>> The only thing is what exception you use to indicate socket would
>> block and/or timeout. Do you reuse socket exception and use it in same
>> way as sockets do for indicating this, or use a new exception.
>
> I would suggest a new exception, as it looks like mod_python code is
> currently completely isolated from any use of sockets or the Python socket
> module by Apache.

Rather than an exception, the other option is to return None instead.
Thus, returning empty string means end of input, and returning None
means would block.

For mod_wsgi at least, not having an exception may be preferable as
WSGI doesn't currently define exception types for when there are input
errors and so no standardisation. Partly this is because there is no
standard 'wsgi' module which could hold the exception types. So,
returning None would make it easier.
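
For comparison, Python's own `io` module uses the same convention for raw streams in non-blocking mode: read() returns None when no data is available and an empty bytes object at end of file. The sketch below only demonstrates that standard-library behaviour on a pipe; it says nothing about how mod_python or mod_wsgi would implement it, and it assumes a Unix-like platform where `os.set_blocking()` works on pipes.

```python
import os


def demo_nonblocking_pipe():
    r, w = os.pipe()
    os.set_blocking(r, False)            # put the read end in non-blocking mode
    reader = open(r, "rb", buffering=0)  # raw, unbuffered file object

    results = []
    results.append(reader.read(16))      # nothing written yet: would block
    os.write(w, b"abc")
    results.append(reader.read(16))      # partial read: only what is there
    os.close(w)
    results.append(reader.read(16))      # writer closed: end of input
    reader.close()
    return results


if __name__ == "__main__":
    print(demo_nonblocking_pipe())
```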

Graham
