FWIW, I did post on Apache dev list about this. There was a bit of useless back and forth in the discussion, but from that has now come a post that identifies there is a potential problem in Apache, even if to the extent of documenting how the APIs needs to be used and/or giving a gaurantee that a flush actually does flush.
Graham 2009/2/13 Graham Dumpleton <[email protected]>: > 2009/2/9 Graham Dumpleton <[email protected]>: >> 2009/2/9 gert <[email protected]>: >>> >>> when i comment out wsgi.file it works >>> >>> from os import path >>> f=open(path.join(path.dirname(__file__),'../bin/ >>> picture.png'),'rb') >>> # if 'wsgi.file_wrapper' in environ: return environ >>> ['wsgi.file_wrapper'](f, 8192) >>> # else: >>> return iter(lambda: f.read(8192), '') >> >> Use the test script from genro's post on February 4th in discussion: >> >> http://groups.google.com/group/modwsgi/browse_frm/thread/69d3076689a40aa9 >> >> In particular: >> >> def application(environ, start_response): >> size=environ.get('QUERY_STRING') >> if size=='': size='255' >> size=int(size) >> file=open('tmp/wsgitest','wb') >> for i in range(size): >> file.write('*') >> file.close() >> file=open('tmp/wsgitest','rb') >> file_wrapper = environ.get('wsgi.file_wrapper', None) >> start_response('200 OK',[]) >> return file_wrapper(file, 4096*16) >> >> See if you find similar results where whether it works depends on file size. >> >> It is interesting that genro sees a difference at 256, as that is a >> key value in Apache for which below it doesn't try and use sendfile. >> >> #define AP_MIN_SENDFILE_BYTES (256) >> >> Another interesting constant in Apache is: >> >> #define AP_MIN_BYTES_TO_WRITE 8000 >> >> Which is close to the other end of the range where they see a problem. >> >> I haven't myself yet had a chance since getting back home yesterday to >> try that test script. > > Okay, think I have worked it out. Quite serious problem and not sure > why people haven't been complaining about it left right and centre. > > First off, Apache has this bit of code: > > /* It doesn't make any sense to use sendfile for a file bucket > * that represents 10 bytes. > */ > else if (APR_BUCKET_IS_FILE(e) > && (e->length >= AP_MIN_SENDFILE_BYTES)) { > > What this does is say that if length is less than 256 bytes, then > don't bother using sendfile as waste of time, so it just sends it out > without trying to optimise it. > > Next relevant thing it has is: > > if (nbytes + flen < AP_MIN_BYTES_TO_WRITE > && !AP_BUCKET_IS_EOC(last_e) > && ((!fd && !more && !APR_BUCKET_IS_FLUSH(last_e)) > || (APR_BUCKET_IS_EOS(last_e) > && c->keepalive == AP_CONN_KEEPALIVE))) { > > This is part of some optimisation. If the amount of data is less than > 8000 bytes, minus nbytes whatever that works out as, then if next > bucket is EOS, it decides that it will not actually send the data > straight away but hold it over in case there is more data later that > can be sent at the same time. > > The problem with this is that mod_wsgi is assuming that all data will > be sent straight away. It relies on this as the Python fle object > would then be destroyed. Because the data hasn't been sent, Apache > still holds a reference to the file descriptor, yet Python file object > destroyed and file descriptor closed. End result is that when at end > of request Apache finally tries to send the data, the file descriptor > is no longer valid and the error occurs. > > It would seem therefore that correct thing to do is to send a flush > instead of an eos. > > So have: > > b = apr_bucket_flush_create(r->connection->bucket_alloc); > APR_BRIGADE_INSERT_TAIL(bb, b); > > instead of: > > b = apr_bucket_eos_create(r->connection->bucket_alloc); > APR_BRIGADE_INSERT_TAIL(bb, b); > > My only concern about this is that documentation says about flush: > > /** > * Create a flush bucket. This indicates that filters should flush their > * data. There is no guarantee that they will flush it, but this is the > * best we can do. > * @param list The freelist from which this bucket should be allocated > * @return The new bucket, or NULL if allocation failed > */ > APU_DECLARE(apr_bucket *) apr_bucket_flush_create(apr_bucket_alloc_t *list); > > The bit that obviously isn't good is 'There is no guarantee that they > will flush it'. > > Annoyingly, for eos documentation says: > > ** > * Create an End of Stream bucket. This indicates that there is no more data > * coming from down the filter stack. All filters should flush at this point. > * @param list The freelist from which this bucket should be allocated > * @return The new bucket, or NULL if allocation failed > */ > APU_DECLARE(apr_bucket *) apr_bucket_eos_create(apr_bucket_alloc_t *list); > > Thus, it says that flush should occur, which was what I was relying > upon, but as shown above it doesn't as stupid output filter attempts > to optimise it. > > I need to figure how you can gaurantee that flush has actually occurred. :-( > > I'll think about it and may be post on one of the httpd developer lists. > > Graham > --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "modwsgi" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/modwsgi?hl=en -~----------~----~----~----~------~----~------~--~---
