I forgot one other thing I tried: the file wrapper. The file wrapper does NOT experience the disconnects.
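For later readers, a minimal sketch of what serving through the optional `wsgi.file_wrapper` extension looks like. The path and block size are illustrative, and the fallback generator is only needed for servers that do not provide the extension (mod_wsgi does):

```python
import os

BLOCK_SIZE = 8192

def application(environ, start_response):
    # Illustrative path; stands in for the file the app really serves.
    path = '/tmp/payload.bin'
    size = os.path.getsize(path)
    f = open(path, 'rb')

    start_response('200 OK', [
        ('Content-Type', 'application/octet-stream'),
        ('Content-Length', str(size)),
    ])

    # Use the server-provided wrapper when available; under mod_wsgi with
    # WSGIEnableSendfile On this can hand the transfer to the kernel.
    file_wrapper = environ.get('wsgi.file_wrapper')
    if file_wrapper is not None:
        return file_wrapper(f, BLOCK_SIZE)

    # Fallback for servers that do not offer the extension.
    def blocks():
        try:
            while True:
                data = f.read(BLOCK_SIZE)
                if not data:
                    break
                yield data
        finally:
            f.close()
    return blocks()
```

Because the kernel (rather than the application) paces the transfer in the sendfile case, this path sidesteps the blocked-`write` problem discussed below.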
On Wednesday, February 7, 2024 at 2:54:47 PM UTC-6 Greg Popp wrote:

> I'm still struggling with disconnects with my slow readers. Here is all
> that I have experimented with:
>
> I downloaded the latest version of the mod_wsgi source (5.0.0) and built
> it on my CentOS 7 system. This all seemed to work well and I am now
> running that version.
>
> I modified my app to return an iterator and stopped calling "write"
> directly. No real change in behavior.
>
> To eliminate network devices causing problems, I started experimenting
> with using localhost. Paradoxically, the localhost connections seem to be
> timing out *sooner* than remote ones, and seem to time out 100% of the
> time!
>
> The timeout seems to occur at around 17 minutes. I have the Apache config
> parameter "Timeout" set to 43200 (12 hours). I know that is insane, but it
> should make a send that is blocked by a slow reader sit there for 12
> hours. Alas, it does not. My slow readers are still timing out.
>
> The error log has this:
>
> [Wed Feb 07 19:17:26.807222 2024] [wsgi:info] [pid 19013] [client
> 127.0.0.1:38570] mod_wsgi (pid=19013, process='', application='
> xcrutils.exegy-appliance.net|/xcr'): Reloading WSGI script
> '/var/web/sites/request_handler_wsgi.py'.
> [Wed Feb 07 19:34:03.493988 2024] [wsgi:debug] [pid 19013]
> src/server/mod_wsgi.c(2443): [client 127.0.0.1:38570] mod_wsgi
> (pid=19013): Failed to write response data: Connection timed out.
>
> My Python WSGI application outputs this:
>
> Apache/mod_wsgi failed to write response data: Connection timed out
>
> That error string is what is in the trapped exception itself.
>
> Looking at mod_wsgi, this call:
>
> rv = ap_pass_brigade(r->output_filters, self->bb);
>
> is resulting in rv being not equal to APR_SUCCESS, and
> exception_when_aborted is false.
>
> Could there be some kind of timeout implemented in the bucket brigade
> code?
>
> On Friday, January 12, 2024 at 8:35:00 AM UTC-6 Greg Popp wrote:
>
>> Thank you again!
>> The data IS in a file, but it requires an application to extract the
>> requested salient pieces. I will look at the file wrapper extension.
>>
>> After more testing, I now think that I can fix my problem just by using
>> your second suggestion of increasing the Timeout configuration variable
>> in Apache. That is an easy fix and so far seems to be working well.
>>
>> On Thursday, January 11, 2024 at 4:22:38 PM UTC-6 Graham Dumpleton wrote:
>>
>>> If using the file_wrapper feature, make sure you also add:
>>>
>>> WSGIEnableSendfile On
>>>
>>> to the mod_wsgi configuration, as it is not on by default:
>>>
>>> https://modwsgi.readthedocs.io/en/master/release-notes/version-4.1.0.html#features-changed
>>>
>>> The file_wrapper mechanism would still have worked, but to use the
>>> kernel sendfile feature you have to also have the directive enabled.
>>>
>>> Can't remember if you also need to add:
>>>
>>> EnableSendfile On
>>>
>>> to enable it in Apache itself. I don't think so.
>>>
>>> Graham
>>>
>>> On 12 Jan 2024, at 9:14 am, Graham Dumpleton <graham.d...@gmail.com> wrote:
>>>
>>> Also not sure whether it will help or not, but if the data you are
>>> sending is stored in a file and not generated on demand, then you might
>>> consider using the WSGI file_wrapper extension instead.
>>>
>>> https://modwsgi.readthedocs.io/en/master/user-guides/file-wrapper-extension.html
>>>
>>> I don't know how this will behave when the buffer fills up, since when
>>> working properly it is all handled in the OS kernel and not in Apache.
>>>
>>> Along similar lines, if the data is stored as a file, you might try
>>> mod_sendfile. It would also use the kernel sendfile mechanism, but the
>>> way it interacts may show different behaviour in your situation.
>>>
>>> Graham
>>>
>>> On 12 Jan 2024, at 1:38 am, Greg Popp <pop...@gmail.com> wrote:
>>>
>>> Thank you very much! This is most helpful, though I don't think any of
>>> them will actually solve my issues, for many of the reasons you
>>> mentioned.
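Putting the two directives above together, the relevant fragment of the Apache configuration would look something like this (per the discussion above, it is left open whether `EnableSendfile` is actually required for mod_wsgi):

```apache
# Let mod_wsgi use the kernel sendfile path for wsgi.file_wrapper
# responses; off by default since mod_wsgi 4.1.0.
WSGIEnableSendfile On

# Apache's own sendfile switch; possibly not needed for mod_wsgi,
# but this is where it would go.
EnableSendfile On
```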
>>> I was thinking that perhaps the mod_wsgi interface had access to the
>>> file descriptor for the network socket used by Apache and could call
>>> "select" to see if it had enough buffer space for the requested write.
>>> If it didn't, it could (optionally) sleep some configurable duration
>>> and try again some configurable number of times. I understand, though,
>>> that for most applications this would not be necessary.
>>>
>>> Yesterday, I tried implementing that same behavior in my WSGI app. I
>>> don't set SendBufferSize and so use the system default. I grab the
>>> system TCP send buffer value by running the 'sysctl' command. Then I
>>> keep track of the total bytes sent. If that value exceeds the system
>>> TCP send queue value, I run the 'ss' command from within my WSGI app to
>>> grab the Send-Q value for this connection (fortunately WSGI gives us
>>> the source IP and source port, and I can filter the 'ss' output using
>>> that). If the Send-Q value is too high to accommodate another write, I
>>> sleep a second and try again until I get enough space. It's kind of a
>>> Rube Goldberg solution, but so far it seems to be working!
>>>
>>> Thank you for taking the time to answer my questions! I very much
>>> appreciate the assistance!
>>>
>>> On Wednesday, January 10, 2024 at 3:38:51 PM UTC-6 Graham Dumpleton wrote:
>>>
>>>> So what you are encountering is a limitation in the socket buffer size
>>>> enforced by the operating system, in combination with Apache httpd
>>>> applying a socket timeout.
>>>>
>>>> In other words, what happens is that the HTTP client isn't reading
>>>> data and so the operating-system-level socket buffer fills up. At that
>>>> point the Apache httpd write of the response data blocks, eventually
>>>> timing out and causing the initial error you see.
>>>> In that situation Apache httpd will close down the connection, which
>>>> results in you seeing the second error when still trying to write out
>>>> more data anyway.
>>>>
>>>> You may be able to adjust some Apache configuration settings to try
>>>> and solve this, but it would affect all requests in the context in
>>>> which you apply the configuration (depending on whether it is done in
>>>> server, VirtualHost, Directory or Location context). So it is not
>>>> something you could selectively do on a per-client basis.
>>>>
>>>> The first Apache directive to look at is SendBufferSize.
>>>>
>>>> https://httpd.apache.org/docs/2.4/mod/mpm_common.html#sendbuffersize
>>>>
>>>> If this is not set it should default to 0, which means it uses the
>>>> operating system default.
>>>>
>>>> So you might be able to fiddle with this by setting it larger than the
>>>> operating system default (although there is still some upper bound,
>>>> set by the operating system, that you can go to).
>>>>
>>>> The next Apache directive to look at is Timeout.
>>>>
>>>> https://httpd.apache.org/docs/2.4/mod/core.html#timeout
>>>>
>>>> This usually defaults to 60 seconds, but some Linux distributions may
>>>> override this in the Apache configuration they ship.
>>>>
>>>> In very old Apache versions this actually defaulted to 300 seconds,
>>>> but it was lowered at some point.
>>>>
>>>> If playing with these, do be careful, since they can cause increased
>>>> memory usage or other undesirable effects depending on the traffic
>>>> profile your server gets.
>>>>
>>>> One other thing you may be able to use is mod_ratelimit.
>>>>
>>>> https://httpd.apache.org/docs/2.4/mod/mod_ratelimit.html
>>>>
>>>> I have never actually used this and am not exactly sure how it works,
>>>> so this is a bit of a guess, but you may be able to use it to slow
>>>> down how quickly your application outputs the data.
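For concreteness, the directives mentioned above might be combined like this in the Apache configuration. The values are illustrative, not tuned recommendations, and the mod_ratelimit environment variable names are taken from the Apache documentation rather than from this thread:

```apache
# Per-connection kernel send buffer, in bytes (0 = OS default).
# The kernel may still clamp this to net.core.wmem_max.
SendBufferSize 4194304

# Seconds a blocked network read or write may stall before Apache
# gives up on the connection.
Timeout 600

# mod_ratelimit: throttle responses for one location. rate-limit is
# in KiB/s; rate-initial-burst (httpd >= 2.4.34) is in KiB.
<Location "/xcr">
    SetOutputFilter RATE_LIMIT
    SetEnv rate-limit 1024
    SetEnv rate-initial-burst 10240
</Location>
```

Scoping the rate limit to a Location, as here, at least narrows the "affects all requests in the context" caveat to the one path that serves the large responses.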
>>>> I am assuming here that this module will introduce waits into your
>>>> application, by blocking your writes for a bit, to keep the flow of
>>>> data being written under the rate limit. This would have the effect of
>>>> not stuffing so much data into the response pipeline, so that things
>>>> work better with slower clients. Obviously, using it would penalise
>>>> faster clients, but you might find an acceptable balance by setting a
>>>> higher rate limit for the initial burst of data and then using a lower
>>>> rate after that.
>>>>
>>>> Graham
>>>>
>>>> On 11 Jan 2024, at 7:09 am, Greg Popp <pop...@gmail.com> wrote:
>>>>
>>>> embedded
>>>>
>>>> On Wednesday, January 10, 2024 at 1:32:52 PM UTC-6 Graham Dumpleton wrote:
>>>>
>>>>> Are you using mod_wsgi embedded mode or daemon mode?
>>>>>
>>>>> Graham
>>>>>
>>>>> On 11 Jan 2024, at 2:44 am, Greg Popp <pop...@gmail.com> wrote:
>>>>>
>>>>> Hello!
>>>>>
>>>>> My version of mod_wsgi is running on a CentOS 7 system and is at
>>>>> version 3.4 (I know - very old) with Python 2.7.
>>>>>
>>>>> I have been using mod_wsgi for a Python application that runs a
>>>>> command-line program and marshals the output of that program back to
>>>>> an HTTP client. The data being sent is binary and can be tens of gigs
>>>>> in size.
>>>>>
>>>>> This app is "unconventional", in that it calls 'write' directly
>>>>> instead of returning an iterable. The problem I have had recently is
>>>>> that some clients are slow to read the data and the TCP buffer gets
>>>>> filled up. When this happens, the next call to write on a full buffer
>>>>> causes a "failed to write data" exception (which I trap), but if I
>>>>> try again to send the data I get "client connection closed".
>>>>>
>>>>> Is there some config setting or methodology I can use to alleviate
>>>>> this issue?
>>>>> In other words, some way to back off and wait for the buffer to
>>>>> drain sufficiently to resume sending the data? OR - is there some way
>>>>> to get the current size (fullness) of the TCP write buffer on the
>>>>> connected socket? (Something like what you see in the "Send-Q" column
>>>>> of the 'ss' command-line utility.) If I could tell how full it is and
>>>>> what the max size is, I could implement a sleep/retry cycle of some
>>>>> kind.
>>>>>
>>>>> I have looked - even in the source code - but haven't been able to
>>>>> figure out if there is a way to achieve this. Thanks in advance for
>>>>> your attention.
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "modwsgi" group.
>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>> send an email to modwsgi+u...@googlegroups.com.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/modwsgi/3d97c06f-38ff-4345-af2f-eb86c2ef204cn%40googlegroups.com.
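For later readers: the sleep/retry workaround Greg describes earlier in the thread (polling 'ss' for the Send-Q value of the client's connection) can be sketched roughly as below. Function names are illustrative, and the parsing assumes the default `ss -tn` column layout on Linux. (Linux also exposes the same number via the `SIOCOUTQ` ioctl, but the WSGI layer does not hand the application the underlying socket, which is why shelling out to 'ss' is the workaround at all.)

```python
import subprocess
import time

def send_q_for(ss_output, peer_addr):
    """Pull the Send-Q value (bytes queued in the kernel, not yet
    acknowledged) for one connection out of 'ss -tn' output.

    Default columns: State  Recv-Q  Send-Q  Local:Port  Peer:Port.
    peer_addr is the client's "ip:port", which WSGI exposes via
    REMOTE_ADDR and REMOTE_PORT."""
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[4] == peer_addr:
            return int(fields[2])
    return None  # connection not found (perhaps already closed)

def wait_for_drain(peer_addr, high_water, tries=60, delay=1.0):
    """Sleep/retry until Send-Q for the connection drops below
    high_water bytes. Returns False if it never drains in time."""
    for _ in range(tries):
        out = subprocess.check_output(['ss', '-tn']).decode()
        q = send_q_for(out, peer_addr)
        if q is None or q < high_water:
            return True
        time.sleep(delay)
    return False
```

The application would call something like `wait_for_drain('%s:%s' % (environ['REMOTE_ADDR'], environ['REMOTE_PORT']), high_water)` before each large write. It is, as Greg says, a Rube Goldberg solution, but it keeps the write from blocking long enough to hit the timeout described above.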