> On 3 Mar 2021, at 10:46 pm, Ezra Peisach <[email protected]> wrote:
>
> Thank you for the followup.
>
> With your suggestion - two process groups, application-group on
> WSGIScriptAlias - I am seeing file descriptor open on what appears to be the
> parent and process, and two subprocesses.
>

Add the "display-name" option to WSGIDaemonProcess so you can distinguish which processes are the mod_wsgi daemon processes. See:

https://modwsgi.readthedocs.io/en/master/configuration-directives/WSGIDaemonProcess.html

When using Apache/mod_wsgi, no WSGI process gets forked. The only process that forks is the Apache parent process, and it doesn't have the WSGI application code loaded and no requests are handled in it.

So I am not sure whether lsof is confusing things and showing a notional separate process ID for each thread in the process, which under the covers Linux used to do, although I am not sure if it still does. Or whether your WSGI application code is doing something that causes forked processes to occur. If the latter, and the fork happens at a point before the file descriptor is cleaned up, and the forked process then execs something else, the open file will still be marked against that forked subprocess.

> I have a sneaky suspicion that webob is causing a resource leak. Without
> application-group specified - a similar pattern.
>
> Reducing file upload to under 10k - reduces to a single leak per process.
>
> I will take an independent server - and reduce everything down to as minimal
> a test case as possible - and see if webob or my code is doing something odd
> with the Request. My reading of the code is that it should fall out of scope
> and cleanup. Similar python class arrangement suggests that cleanups should
> be happening - but I need more testing.
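
For the minimal test case, something along the following lines may help to see what is left open without relying on how lsof presents things. This is only a sketch. It assumes Linux with /proc mounted, and the helper names thread_ids and deleted_open_files are made up for illustration; they are not part of mod_wsgi or webob. The first lists the thread IDs of a process, which can be compared against the extra ID column in the lsof output. The second lists open descriptors whose file has been unlinked.

```python
import os


def thread_ids(pid="self"):
    """Return the sorted thread (task) IDs of a process. Linux only."""
    return sorted(int(name) for name in os.listdir(f"/proc/{pid}/task"))


def deleted_open_files(pid="self"):
    """Return {fd: target} for open descriptors whose file has been unlinked.

    Linux only: reads the symlinks under /proc/<pid>/fd, where the kernel
    appends " (deleted)" to the target of an unlinked file.
    """
    fd_dir = f"/proc/{pid}/fd"
    found = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.endswith(" (deleted)"):
            found[int(name)] = target
    return found
```

Calling deleted_open_files() from inside the WSGI application after a request, or deleted_open_files(38004) as a user permitted to read that process, should list the same /tmp entries that lsof reports. If the extra IDs from lsof also appear in thread_ids(38004), they are threads of the one process rather than forked processes.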
>
> from lsof:
>
> httpd 38004       xdev 11u REG 253,0 695699 151322661 /tmp/#151322661 (deleted)
> httpd 38004       xdev 12u REG 253,0 694951 151322663 /tmp/#151322663 (deleted)
> httpd 38004 38254 xdev 11u REG 253,0 695699 151322661 /tmp/#151322661 (deleted)
> httpd 38004 38254 xdev 12u REG 253,0 694951 151322663 /tmp/#151322663 (deleted)
> httpd 38004 38255 xdev 11u REG 253,0 695699 151322661 /tmp/#151322661 (deleted)
> httpd 38004 38255 xdev 12u REG 253,0 694951 151322663 /tmp/#151322663 (deleted)
> httpd 38004 38256 xdev 11u REG 253,0 695699 151322661 /tmp/#151322661 (deleted)
> httpd 38004 38256 xdev 12u REG 253,0 694951 151322663 /tmp/#151322663 (deleted)
>
> On 3/3/21 6:14 AM, Graham Dumpleton wrote:
>>
>>> On 3 Mar 2021, at 10:11 pm, Ezra Peisach <[email protected]> wrote:
>>>
>>> Thank you for your response.
>>>
>>> If both applications use numpy and lxml, is it safe to use the same global
>>> WSGIApplicationGroup, but use separate process groups for each application?
>>> The applications are related, but do not interact with each other, except
>>> through database and filesystem.
>>>
>> That was the example I already provided. Eg.
>>
>> # Add this outside of VirtualHost to ensure only daemon mode used.
>>
>> WSGIRestrictEmbedded On
>>
>> # Two daemon process groups.
>>
>> WSGIDaemonProcess wsgi_app_ssl_1 processes=5 threads=1 python-path="/path_to_venv..."
>> WSGIDaemonProcess wsgi_app_ssl_2 processes=5 threads=1 python-path="/path_to_venv..."
>>
>> # Force first into one daemon process group.
>>
>> WSGIScriptAlias /service/review_v2 /path/doServiceRequest_review.wsgi process-group=wsgi_app_ssl_1 application-group=%{GLOBAL}
>>
>> # And second into other daemon process group.
>>
>> WSGIScriptAlias /service/status_update_tasks_v2 /path/doServiceRequest_ctl_v2.wsgi process-group=wsgi_app_ssl_2 application-group=%{GLOBAL}
>>
>> I am using the application-group and process-group options on WSGIScriptAlias,
>> instead of WSGIProcessGroup/WSGIApplicationGroup, as the options are more
>> precise and do the same thing. Using both options at the same time also has the
>> side effect of preloading the WSGI script on process start, rather than on the
>> first request, which can be beneficial in some cases.
>>
>>> I will try this.
>>>
>>> Independently, for webob, I believe if a file upload request is larger than
>>> 10 KB, it buffers to a temporary file, but never closes it at the end, relying
>>> on pythonic cleanup when class scope is exited. That I can report
>>> independently.
>>>
>>> On 3/2/21 8:59 PM, Graham Dumpleton wrote:
>>>>
>>>>> On 3 Mar 2021, at 12:39 pm, Ezra Peisach <[email protected]> wrote:
>>>>>
>>>>> Ok, this will be complicated.
>>>>> Recently moving to Python3,
>>>>> Running mod_wsgi from apache,
>>>>>
>>>>> We needed to add:
>>>>>
>>>>> WSGIApplicationGroup %{GLOBAL}
>>>>>
>>>>> due to third party code (numpy, lxml). This means that requests are
>>>>> served by primary python process
>>>>
>>>> No, that isn't what it means. Setting the application group forces which
>>>> sub interpreter context within each process is used. In this case it sets
>>>> it to the main or first interpreter context, which behaves like command
>>>> line Python. There will still be a copy of this application (interpreter
>>>> context) in all 10 of the processes in the daemon process group.
>>>>
>>>>> WSGIDaemonProcess wsgi_app_ssl processes=10 threads=1 python-path="/path_to_venv..."
>>>>>
>>>>> WSGIProcessGroup wsgi_app_ssl
>>>>>
>>>>> We then have some scripts:
>>>>>
>>>>> WSGIScriptAlias /service/review_v2 /path/doServiceRequest_review.wsgi
>>>>>
>>>>> WSGIScriptAlias /service/status_update_tasks_v2 /path/doServiceRequest_ctl_v2.wsgi
>>>>>
>>>>> .....
>>>>
>>>> This is where you may now have a problem, as setting the application group
>>>> globally means both those WSGI applications now run in the same sub
>>>> interpreter context of each process. If those WSGI applications are not
>>>> compatible when run together, e.g., both try to use the same global data
>>>> object of an imported module for different things, then you can get problems.
>>>>
>>>>> Application handling is a standard
>>>>>
>>>>> from webob import Request, Response
>>>>>
>>>>> def __call__(self, environment, responseApplication):
>>>>>     myRequest = Request(environment)
>>>>>     .....
>>>>>
>>>>> After a single request is processed with a file that is being uploaded
>>>>> in a FieldStorage
>>>>>
>>>>> The issue is that after the request, the file descriptor is still open,
>>>>> but deleted (using lsof).
>>>>>
>>>>> This file appears to be open in every process of httpd (same filename).
>>>>
>>>> That would only be the case if there had been multiple requests against
>>>> the WSGI application.
>>>>
>>>> As mentioned above, there is still a copy of the WSGI application in each
>>>> process, and thus as each process handles a request, then that process
>>>> would also end up opening the file.
>>>>
>>>>> a) Is the apache configuration correct in this case?
>>>>
>>>> It is okay, but with concern over whether your multiple WSGI applications
>>>> can now run together in the same sub interpreter context.
>>>>
>>>> If both WSGI applications use numpy, you would have to use multiple daemon
>>>> process groups and keep them separate.
>>>>
>>>> # Add this outside of VirtualHost to ensure only daemon mode used.
>>>>
>>>> WSGIRestrictEmbedded On
>>>>
>>>> # Two daemon process groups.
>>>>
>>>> WSGIDaemonProcess wsgi_app_ssl_1 processes=5 threads=1 python-path="/path_to_venv..."
>>>> WSGIDaemonProcess wsgi_app_ssl_2 processes=5 threads=1 python-path="/path_to_venv..."
>>>>
>>>> # Force first into one daemon process group.
>>>>
>>>> WSGIScriptAlias /service/review_v2 /path/doServiceRequest_review.wsgi process-group=wsgi_app_ssl_1 application-group=%{GLOBAL}
>>>>
>>>> # And second into other daemon process group.
>>>>
>>>> WSGIScriptAlias /service/status_update_tasks_v2 /path/doServiceRequest_ctl_v2.wsgi process-group=wsgi_app_ssl_2 application-group=%{GLOBAL}
>>>>
>>>> If one doesn't use numpy, then you can restrict which one has to run in
>>>> the main interpreter context.
>>>>
>>>> # Add this outside of VirtualHost to ensure only daemon mode used.
>>>>
>>>> WSGIRestrictEmbedded On
>>>>
>>>> # Single daemon process group.
>>>>
>>>> WSGIDaemonProcess wsgi_app_ssl processes=10 threads=1 python-path="/path_to_venv..."
>>>>
>>>> # Force one using numpy into main interpreter context.
>>>>
>>>> WSGIScriptAlias /service/review_v2 /path/doServiceRequest_review.wsgi process-group=wsgi_app_ssl application-group=%{GLOBAL}
>>>>
>>>> # For second, application group not specified, meaning it will run in a
>>>> # named sub interpreter where the name is based on host and URL mount point.
>>>>
>>>> WSGIScriptAlias /service/status_update_tasks_v2 /path/doServiceRequest_ctl_v2.wsgi process-group=wsgi_app_ssl
>>>>
>>>> Note I am using options to WSGIScriptAlias to set process group and
>>>> application group instead of the separate directives.
>>>>
>>>>> b) Am I missing something here - i.e. is WebOB at fault here?
>>>>>
>>>>> WebOB uses cgi - which has a cleanup __del__ which is supposed to close
>>>>> the file - but I have not debugged down that far....
>>>>
>>>> Relying on __del__ to cleanup file descriptors can be bad because if
>>>> something holds the object in memory, it may only be cleaned up later when
>>>> garbage collector kicks in.
>>>>
>>>> Anyway, hope that helps explain things.
>>>>
>>>> Graham
>>>>
>>>> --
>>>> You received this message because you are subscribed to a topic in the
>>>> Google Groups "modwsgi" group.
>>>> To unsubscribe from this topic, visit
>>>> https://groups.google.com/d/topic/modwsgi/rvOgQsj-kN0/unsubscribe.
>>>> To unsubscribe from this group and all its topics, send an email to
>>>> [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/modwsgi/9D3B14DB-861B-4B3E-AD3F-91E855A9F8D6%40gmail.com.

--
You received this message because you are subscribed to the Google Groups "modwsgi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/modwsgi/6AB3624A-CC43-4BE4-8C91-1857CD722AD2%40gmail.com.
