[modwsgi] Re: (9)Bad file descriptor: core_output_filter: writing data to the network

Graham Dumpleton Wed, 18 Feb 2009 02:23:48 -0800

FWIW, I did post on Apache dev list about this. There was a bit of
useless back and forth in the discussion, but from that has now come a
post that identifies there is a potential problem in Apache, even if
to the extent of documenting how the APIs needs to be used and/or
giving a gaurantee that a flush actually does flush.


Graham

2009/2/13 Graham Dumpleton <[email protected]>:
> 2009/2/9 Graham Dumpleton <[email protected]>:
>> 2009/2/9 gert <[email protected]>:
>>>
>>> when i comment out wsgi.file it works
>>>
>>> from os import path
>>>        f=open(path.join(path.dirname(__file__),'../bin/
>>> picture.png'),'rb')
>>>        # if 'wsgi.file_wrapper' in environ: return environ
>>> ['wsgi.file_wrapper'](f, 8192)
>>>        # else:
>>>        return iter(lambda: f.read(8192), '')
>>
>> Use the test script from genro's post on February 4th in discussion:
>>
>>  http://groups.google.com/group/modwsgi/browse_frm/thread/69d3076689a40aa9
>>
>> In particular:
>>
>> def application(environ, start_response):
>>    size=environ.get('QUERY_STRING')
>>    if size=='': size='255'
>>    size=int(size)
>>    file=open('tmp/wsgitest','wb')
>>    for i in range(size):
>>        file.write('*')
>>    file.close()
>>    file=open('tmp/wsgitest','rb')
>>    file_wrapper = environ.get('wsgi.file_wrapper', None)
>>    start_response('200 OK',[])
>>    return file_wrapper(file, 4096*16)
>>
>> See if you find similar results where whether it works depends on file size.
>>
>> It is interesting that genro sees a difference at 256, as that is a
>> key value in Apache for which below it doesn't try and use sendfile.
>>
>>  #define AP_MIN_SENDFILE_BYTES           (256)
>>
>> Another interesting constant in Apache is:
>>
>>  #define AP_MIN_BYTES_TO_WRITE  8000
>>
>> Which is close to the other end of the range where they see a problem.
>>
>> I haven't myself yet had a chance since getting back home yesterday to
>> try that test script.
>
> Okay, think I have worked it out. Quite serious problem and not sure
> why people haven't been complaining about it left right and centre.
>
> First off, Apache has this bit of code:
>
>            /* It doesn't make any sense to use sendfile for a file bucket
>             * that represents 10 bytes.
>             */
>            else if (APR_BUCKET_IS_FILE(e)
>                     && (e->length >= AP_MIN_SENDFILE_BYTES)) {
>
> What this does is say that if length is less than 256 bytes, then
> don't bother using sendfile as waste of time, so it just sends it out
> without trying to optimise it.
>
> Next relevant thing it has is:
>
>        if (nbytes + flen < AP_MIN_BYTES_TO_WRITE
>            && !AP_BUCKET_IS_EOC(last_e)
>            && ((!fd && !more && !APR_BUCKET_IS_FLUSH(last_e))
>                || (APR_BUCKET_IS_EOS(last_e)
>                    && c->keepalive == AP_CONN_KEEPALIVE))) {
>
> This is part of some optimisation. If the amount of data is less than
> 8000 bytes, minus nbytes whatever that works out as, then if next
> bucket is EOS, it decides that it will not actually send the data
> straight away but hold it over in case there is more data later that
> can be sent at the same time.
>
> The problem with this is that mod_wsgi is assuming that all data will
> be sent straight away. It relies on this as the Python fle object
> would then be destroyed. Because the data hasn't been sent, Apache
> still holds a reference to the file descriptor, yet Python file object
> destroyed and file descriptor closed. End result is that when at end
> of request Apache finally tries to send the data, the file descriptor
> is no longer valid and the error occurs.
>
> It would seem therefore that correct thing to do is to send a flush
> instead of an eos.
>
> So have:
>
>   b = apr_bucket_flush_create(r->connection->bucket_alloc);
>   APR_BRIGADE_INSERT_TAIL(bb, b);
>
> instead of:
>
>   b = apr_bucket_eos_create(r->connection->bucket_alloc);
>   APR_BRIGADE_INSERT_TAIL(bb, b);
>
> My only concern about this is that documentation says about flush:
>
> /**
>  * Create a flush  bucket.  This indicates that filters should flush their
>  * data.  There is no guarantee that they will flush it, but this is the
>  * best we can do.
>  * @param list The freelist from which this bucket should be allocated
>  * @return The new bucket, or NULL if allocation failed
>  */
> APU_DECLARE(apr_bucket *) apr_bucket_flush_create(apr_bucket_alloc_t *list);
>
> The bit that obviously isn't good is 'There is no guarantee that they
> will flush it'.
>
> Annoyingly, for eos documentation says:
>
> **
>  * Create an End of Stream bucket.  This indicates that there is no more data
>  * coming from down the filter stack.  All filters should flush at this point.
>  * @param list The freelist from which this bucket should be allocated
>  * @return The new bucket, or NULL if allocation failed
>  */
> APU_DECLARE(apr_bucket *) apr_bucket_eos_create(apr_bucket_alloc_t *list);
>
> Thus, it says that flush should occur, which was what I was relying
> upon, but as shown above it doesn't as stupid output filter attempts
> to optimise it.
>
> I need to figure how you can gaurantee that flush has actually occurred. :-(
>
> I'll think about it and may be post on one of the httpd developer lists.
>
> Graham
>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"modwsgi" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/modwsgi?hl=en
-~----------~----~----~----~------~----~------~--~---

[modwsgi] Re: (9)Bad file descriptor: core_output_filter: writing data to the network

Reply via email to