These are great points. Thanks Graham!!!

I did run some experiments and I do have a database lock in place (a 
SELECT ... FOR UPDATE in MySQL seems to act as a lock through the PyMySQL 
connector), so as you note, requests could pile up. However, my main WSGI 
Python program (the one with the api.add_resource type statements) 
apparently does not invoke the subprocess until the get-unique-key query 
against MySQL succeeds. So I don't think I will pile up a mass of 
subprocesses, at least.

I just noticed my code does not presently check the return status properly 
after the subprocess completes. So yes, I would need to be doing that. Good 
point, check exit status.
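
A minimal sketch of that check, assuming the child is still launched with 
subprocess.run as in my original snippet. The helper name run_main_program 
and the 60 second timeout are hypothetical, and the stand-in child below 
only echoes stdin so the sketch runs without MainPgm.py. Since 
subprocess.run waits for the child, the child is also reaped rather than 
left behind as a zombie.

```python
import subprocess
import sys


def run_main_program(args, payload=b"", timeout=60):
    """Run a child process, wait for it, and fail loudly on a bad exit status.

    `args` is the argument list, e.g. ["python3", "MainPgm.py"] (hypothetical).
    """
    result = subprocess.run(
        args,
        input=payload,            # bytes written to the child's stdin
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,          # raises subprocess.TimeoutExpired if exceeded
    )
    if result.returncode != 0:
        raise RuntimeError(
            "child exited with status %d: %s"
            % (result.returncode, result.stderr.decode(errors="replace"))
        )
    return result.stdout


if __name__ == "__main__":
    # Stand-in child that echoes stdin, so the sketch runs without MainPgm.py.
    child = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
    print(run_main_program(child, payload=b"hello"))
```

Capturing stderr as well means the error raised in the WSGI process can 
carry whatever the child printed before it failed.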

Could you please recommend some reading for how to properly configure a 
queuing system?
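
In the meantime, here is a minimal standard-library sketch of the "point of 
control" idea. It is not a replacement for a real task queue such as 
Celery. The names MAX_CONCURRENT and submit_job are hypothetical, and the 
stand-in child only upper-cases its stdin. Jobs beyond the limit wait in 
the executor's queue instead of each starting a subprocess right away.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical limit: at most this many child processes run at the same time.
MAX_CONCURRENT = 2

# Extra submissions wait in the executor's internal queue until a worker is free.
_executor = ThreadPoolExecutor(max_workers=MAX_CONCURRENT)


def _run(args, payload):
    # subprocess.run waits for the child, so it is reaped when it exits;
    # check=True raises CalledProcessError on a non-zero exit status.
    done = subprocess.run(args, input=payload, stdout=subprocess.PIPE, check=True)
    return done.stdout


def submit_job(args, payload=b""):
    """Queue a job and return a Future; call .result() to wait for its output."""
    return _executor.submit(_run, args, payload)


if __name__ == "__main__":
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    futures = [submit_job(child, b"job %d" % i) for i in range(5)]
    print([f.result() for f in futures])
```

Note that the limit here is per Python process. With several mod_wsgi 
processes each one would have its own pool, which is one reason a separate 
queuing backend gives better control.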

Mike

On Monday, August 8, 2022 at 10:15:44 PM UTC-7 Graham Dumpleton wrote:

> Just be mindful of what will happen if a database operation takes a long 
> time and holds some sort of lock. More requests may come into the web 
> application, and if every one of these creates a sub process but then 
> gets stuck waiting for the first, you could spike memory usage for 
> the system as a whole.
>
> This is the benefit of using a task queuing system as it can queue up 
> requests and give you a point of control for how many can run concurrently.
>
> Also ensure that you are waiting on the sub processes if necessary and 
> getting back any exit status. If you don't do this they can become zombie 
> processes, which although dead, still can consume memory in kernel process 
> table. So not being mindful of that and letting the number of zombie 
> processes grow indefinitely is not a good idea.
>
> Anyway, just look out for issues like that.
>
> Graham
>
> On 9 Aug 2022, at 3:09 pm, [email protected] <[email protected]> wrote:
>
>
> In Python... It's just reading from a database a little, minor updates, 
> then some read-only models for AI, no network I/O.  When I ran experiments 
> it fired up and used the pipes fine, no problems I could see, and I ran two 
> calls concurrently.
>
> Thanks Graham!
> On Monday, August 8, 2022 at 8:11:17 PM UTC-7 Graham Dumpleton wrote:
>
>> Using subprocess module alone may work okay, it really depends on what it 
>> is doing. For simple stuff it is probably okay, but the danger is where the 
>> sub process being run has strange requirements around signals, because of 
>> what it inherits from the Apache parent process by way of the signal mask. 
>> This, for example, causes certain Java applications to not work properly 
>> when executed via the subprocess module out of a mod_wsgi process. Something 
>> about Java garbage collection (from memory) requires setting its own signal 
>> handlers, but the signals are blocked, so the handlers never execute and 
>> Java gets stuck.
>>
>> So you would really just need to try and see. For more complicated stuff, 
>> you would be better off delegating stuff to a backend task management 
>> system such as Celery.
>>
>> Graham
>>
>> On 9 Aug 2022, at 1:04 pm, [email protected] <[email protected]> wrote:
>>
>> Hi, 
>>
>> I'm trying to speed up my python program using multiprocessing since some 
>> of it can be concurrent.
>>
>> I am using Rocky Linux, Apache, mod_wsgi. I've been using this setup for 
>> years and no problem, but no multiprocessing...
>>
>> What I have been doing all along is to invoke my program from the main 
>> wsgi-flask script as such:
>>
>> result = subprocess.run(["python3", "MainPgm.py"],
>>                         stdin=subprocess.PIPE,
>>                         stdout=subprocess.PIPE)
>> stdout_data = result.stdout
>>
>> So I'm using the subprocess. 
>>
>> My question is: is it safe to add multiprocessing inside my "MainPgm"?
>> My tests today sure worked fine. I see that this is frowned upon, but I 
>> also noticed:
>>
>> "If you really want to pursue this, then suggest you move this code
>> outside of the WSGI script file and put it in a standard module on the
>> Python module search path you have set up for application."
>>
>> ^^ which seems to indicate it might work.
>>
>> Thanks.
>>
>> On Monday, May 2, 2011 at 4:55:38 PM UTC-7 Graham Dumpleton wrote:
>>
>>> Using the multiprocessing module within mod_wsgi is a really bad idea.
>>> This is because it is an embedded system where Apache and mod_wsgi
>>> manage processes. Once you start using multiprocessing module which
>>> tries to do its own process management, then it could potentially
>>> interfere with the operation of Apache/mod_wsgi in unexpected ways.
>>>
>>> For example, taking your example and changing it not to be dependent
>>> on web.py I get:
>>>
>>> import multiprocessing
>>> import os
>>>
>>> def x(y):
>>>     print os.getpid(), 'x', y
>>>     return y
>>>
>>> def application(environ, start_response):
>>>     status = '200 OK'
>>>     output = 'Hello World!'
>>>
>>>     response_headers = [('Content-type', 'text/plain'),
>>>                         ('Content-Length', str(len(output)))]
>>>     start_response(status, response_headers)
>>>
>>>     print 'create pool'
>>>     pool = multiprocessing.Pool(processes=1)
>>>     print 'map call'
>>>     result = pool.map(x, [1])
>>>     print os.getpid(), 'doit', result
>>>
>>>     return [output]
>>>
>>> If I fire off a request to this it appears to work correctly,
>>> returning me hello world string and log the appropriate messages.
>>>
>>> [Tue May 03 09:40:36 2011] [info] [client 127.0.0.1] mod_wsgi
>>> (pid=32752, process='hello-1',
>>> application='hello-1.example.com|/mptest.wsgi'): Loading WSGI script
>>> '/Library/WebServer/Sites/hello-1/htdocs/mptest.wsgi'.
>>> [Tue May 03 09:40:36 2011] [error] create pool
>>> [Tue May 03 09:40:36 2011] [error] map call
>>> [Tue May 03 09:40:36 2011] [error] 32753 x 1
>>> [Tue May 03 09:40:36 2011] [error] 32752 doit [1]
>>>
>>> However, the process then appears to receive a signal from somewhere
>>> causing it to shutdown:
>>>
>>> [Tue May 03 09:40:36 2011] [info] mod_wsgi (pid=32752): Shutdown
>>> requested 'hello-1'.
>>> [Tue May 03 09:40:41 2011] [info] mod_wsgi (pid=32752): Aborting
>>> process 'hello-1'.
>>>
>>> The multiprocessing module does issue signals, so it may be the source 
>>> of this.
>>>
>>> One thought was that this may be occurring when the pool is destroyed
>>> at the end of the function call, so I moved the creation of pool to
>>> module scope.
>>>
>>> import multiprocessing
>>> import os
>>>
>>> print 'create pool'
>>> pool = multiprocessing.Pool(processes=1)
>>>
>>> def x(y):
>>>     print os.getpid(), 'x', y
>>>     return y
>>>
>>> def application(environ, start_response):
>>>     status = '200 OK'
>>>     output = 'Hello World!'
>>>
>>>     response_headers = [('Content-type', 'text/plain'),
>>>                         ('Content-Length', str(len(output)))]
>>>     start_response(status, response_headers)
>>>
>>>     print 'map call'
>>>     result = pool.map(x, [1])
>>>     print os.getpid(), 'doit', result
>>>
>>>     return [output]
>>>
>>> This though will not even run:
>>>
>>> [Tue May 03 09:47:31 2011] [info] [client 127.0.0.1] mod_wsgi
>>> (pid=32893, process='hello-1',
>>> application='hello-1.example.com|/mptest.wsgi'): Loading WSGI script
>>> '/Library/WebServer/Sites/hello-1/htdocs/mptest.wsgi'.
>>> [Tue May 03 09:47:31 2011] [error] create pool
>>> [Tue May 03 09:47:31 2011] [error] map call
>>> [Tue May 03 09:47:31 2011] [error] Process PoolWorker-1:
>>> [Tue May 03 09:47:31 2011] [error] Traceback (most recent call last):
>>> [Tue May 03 09:47:31 2011] [error] File
>>>
>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/process.py",
>>> line 231, in _bootstrap
>>> [Tue May 03 09:47:31 2011] [error] self.run()
>>> [Tue May 03 09:47:31 2011] [error] File
>>>
>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/process.py",
>>> line 88, in run
>>> [Tue May 03 09:47:31 2011] [error] self._target(*self._args, 
>>> **self._kwargs)
>>> [Tue May 03 09:47:31 2011] [error] File
>>>
>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/pool.py",
>>> line 57, in worker
>>> [Tue May 03 09:47:31 2011] [error] task = get()
>>> [Tue May 03 09:47:31 2011] [error] File
>>>
>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/queues.py",
>>> line 339, in get
>>> [Tue May 03 09:47:31 2011] [error] return recv()
>>> [Tue May 03 09:47:31 2011] [error] AttributeError: 'module' object has
>>> no attribute 'x'
>>>
>>> The browser also then hangs at that point.
>>>
>>> Part of the issue here may be that WSGI script files are not really
>>> standard Python modules in that the basename of the WSGI script file
>>> doesn't match a module in sys.modules. If the multiprocessing module
>>> tries to do magic stuff with imports to find original code to execute
>>> in sub process it isn't going to work.
>>>
>>> Specifically, may be related to:
>>>
>>> http://code.google.com/p/modwsgi/wiki/IssuesWithPickleModule
>>>
>>> If I attempt to move x() into being a nested function as:
>>>
>>> import multiprocessing
>>> import os
>>>
>>> print 'create pool'
>>> pool = multiprocessing.Pool(processes=1)
>>>
>>> def application(environ, start_response):
>>>     status = '200 OK'
>>>     output = 'Hello World!'
>>>
>>>     response_headers = [('Content-type', 'text/plain'),
>>>                         ('Content-Length', str(len(output)))]
>>>     start_response(status, response_headers)
>>>
>>>     def x(y):
>>>         print os.getpid(), 'x', y
>>>         return y
>>>
>>>     print 'map call'
>>>     result = pool.map(x, [1])
>>>     print os.getpid(), 'doit', result
>>>
>>>     return [output]
>>>
>>> Then one does get pickle errors, albeit for a different reason:
>>>
>>> [Tue May 03 09:52:59 2011] [info] [client 127.0.0.1] mod_wsgi
>>> (pid=33010, process='hello-1',
>>> application='hello-1.example.com|/mptest.wsgi'): Loading WSGI script
>>> '/Library/WebServer/Sites/hello-1/htdocs/mptest.wsgi'.
>>> [Tue May 03 09:52:59 2011] [error] create pool
>>> [Tue May 03 09:52:59 2011] [error] map call
>>> [Tue May 03 09:52:59 2011] [error] Exception in thread Thread-1:
>>> [Tue May 03 09:52:59 2011] [error] Traceback (most recent call last):
>>> [Tue May 03 09:52:59 2011] [error] File
>>>
>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py",
>>> line 522, in __bootstrap_inner
>>> [Tue May 03 09:52:59 2011] [error] self.run()
>>> [Tue May 03 09:52:59 2011] [error] File
>>>
>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py",
>>> line 477, in run
>>> [Tue May 03 09:52:59 2011] [error] self.__target(*self.__args,
>>> **self.__kwargs)
>>> [Tue May 03 09:52:59 2011] [error] File
>>>
>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/pool.py",
>>> line 225, in _handle_tasks
>>> [Tue May 03 09:52:59 2011] [error] put(task)
>>> [Tue May 03 09:52:59 2011] [error] PicklingError: Can't pickle <type
>>> 'function'>: attribute lookup __builtin__.function failed
>>>
>>> So, it is doing pickling in some form, which isn't going to work for
>>> stuff in WSGI script file.
>>>
>>> If you really want to pursue this, then suggest you move this code
>>> outside of the WSGI script file and put it in a standard module on the
>>> Python module search path you have set up for application.
>>>
>>> Overall though, I would recommend against using multiprocessing module
>>> from inside of mod_wsgi.
>>>
>>> Graham
>>>
>>> On 2 May 2011 23:37, Ed Summers <[email protected]> wrote:
>>> > Hi all,
>>> >
>>> > I asked this over on web-sig [1] earlier today, but am asking here
>>> > since it looks to be only mod_wsgi related...
>>> >
>>> > I've been trying to use multiprocessing [2] w/ mod_wsgi and have
>>> > noticed what appears to be deadlocking behavior with both django and
>>> > web.py.  I created a minimal example with web.py to demonstrate [3].
>>> >
>>> > If you have mod_wsgi and web.py available, and put something like
>>> > this in your apache config:
>>> >
>>> >    WSGIScriptAlias /multiprocessing /home/ed/wsgi_multiprocessing.py
>>> >    AddType text/html .py
>>> >
>>> > then visit:
>>> >
>>> >    http://localhost/
>>> >
>>> > and compare with:
>>> >
>>> >    http://localhost/?multiprocessing=1
>>> >
>>> > you should see the second URL hang.
>>> >
>>> > Going forward I'm most likely going to move this functionality to an
>>> > asynchronous queue (celery, etc) but I was wondering if
>>> > multiprocessing + mod_wsgi was generally known to be something to
>>> > avoid, or if it was even forbidden somehow.
>>> >
>>> > Any assistance you can provide would be welcome.
>>> >
>>> > //Ed
>>> >
>>> >
>>> > [1] http://mail.python.org/pipermail/web-sig/2011-May/005065.html
>>> > [2] http://docs.python.org/library/multiprocessing.html
>>> > [3] https://gist.github.com/951570
>>> >
>>>
>>> > --
>>> > You received this message because you are subscribed to the Google 
>>> Groups "modwsgi" group.
>>> > To post to this group, send email to [email protected].
>>> > To unsubscribe from this group, send email to modwsgi+u...@
>>> googlegroups.com.
>>> > For more options, visit this group at 
>>> http://groups.google.com/group/modwsgi?hl=en.
>>> >
>>> >
>>>

