[web2py] Re: creating background process with multiprocessing spawns new instance of web2py

Graham Dumpleton Fri, 21 May 2010 03:07:00 -0700


On May 21, 7:00 pm, Yarko Tymciurak <[email protected]>
wrote:
> On May 21, 3:33 am, Magnitus <[email protected]> wrote:
>
> > But if you create "tasks" without doing it at the OS level, doesn't
> > that mean that you won't really be able to take full advantage of
> > multi-processor hardware (since the OS handles the hardware and if the
> > OS doesn't know about it, it won't be able to do the required
> > optimizations with the hardware)?
>
> With the GIL, a single python process does not utilize multiple
> processors for python code, so web2py is bound to one core. The only
> effect of multi-core is that the o/s itself can "leave" a core to the
> main python task, e.g. it can grab an alternate core for itself. Other
> than that, you're running on one core regardless, unless you fire up
> multiple instances of the python interpreter, in which case you are
> really only going to communicate through services anyway.
>
> See some of the discussion at http://bugs.python.org/issue7946 and
> http://stackoverflow.com/questions/990102/python-global-interpreter-l...
>
> ... and so forth...
>
>
>
> > Maybe I've done C/C++ for too long and am trying to micro-manage too
> > much, but a solution I like to use for the problem of creating and
> > tearing down process threads is just to pre-create a limited number of
> > them (optimised for the number of CPUs you have) and recycle them to
> > do various tasks as needed.
>
> Well - since you don't have that with python, you run the risk of I/O
> blocking .... which is why really lightweight
> tasklets are so desirable (CCP Games runs
> http://en.wikipedia.org/wiki/Eve_Online
> with many tens of thousands of simultaneous users, if I recall
> correctly, and maintains stackless for this purpose).
>
>
>
> > Of course, that works best when you give your threads/processes longer
> > tasks to perform in parallel (else, the extra cost of managing it will
> > probably outweigh the benefits of running it in parallel).
>
> There is much to cover in this - and I suppose reason to be happy that
> python traditionally hasn't run multi-core.
> See, for example, the discussions at:
> http://stackoverflow.com/questions/203912/does-python-support-multipr...
>
> and http://docs.python.org/library/multiprocessing.html
>
> Lots to read! ;-)

Also read:

  http://blog.dscpl.com.au/2007/07/web-hosting-landscape-and-modwsgi.html
  http://blog.dscpl.com.au/2007/09/parallel-python-discussion-and-modwsgi.html

BTW, your prior descriptions about how web2py works under mod_wsgi
aren't overly accurate. You said:

"""
In a hosting environment, you have apache/wsgi (for example) running a
wsgi-thred that is web2py - that (main and the stuff in gluon) is your
long-running process (er, thread).   To restart web2py, with wsgi, you
would do what is normal (touch a file) to cause apache to re-start
that wsgi thread.

Within web2py, you have a number of threads:  db connection pools, and
application threads;   again, these respond to requests, and are
spawned off by web2py (not you)
"""

When run under Apache/mod_wsgi there is not a thread that is dedicated
to web2py and web2py doesn't have its own threads to respond to
requests.

In each Apache or mod_wsgi daemon process, depending on whether you
are using embedded mode or daemon mode, there is a pool of threads.
These are C threads, not Python threads and the thread pool is managed
by Apache or mod_wsgi as appropriate.

How a connection is accepted depends on the Apache MPM or mod_wsgi mode
being used, but ultimately one of the threads in the thread pool
processes the request, all still in C code. For embedded mode the
request may not even be for the WSGI application but for a static
file or another dynamic application such as PHP. In daemon mode, or if
the target of the request was the WSGI application, only then does the
Python GIL get acquired and the thread tries to call into the WSGI
application as an external thread calling into the embedded Python
interpreter.

At this point the WSGI application may not even have been loaded, so
the first request to find that has to load the WSGI script file, which
may in turn load web2py. In this case web2py doesn't do anything
special. That is, it doesn't go creating its own thread pool, and it
must return immediately once it is loaded and initialised.
Once it returns, the thread calls into the WSGI application entry
point and web2py handles the request. Any response is then passed
back through Apache, with the GIL being released at each point where
this occurs. When the request is complete, the GIL is fully
released and the thread becomes inactive again pending a further request.

If other requests occur at the same time, they could also call into
web2py. The only choke point is the initial loading of the WSGI script,
as obviously you only want to allow one thread to do that.

So, web2py doesn't have its own request threads and all calls come in
from external threads managed by Apache or mod_wsgi.

Graham

> - Yarko
>
>
>
>
>
> > On May 20, 2:12 pm, Yarko Tymciurak <[email protected]>
> > wrote:
>
> > > On May 19, 6:18 pm, Yarko Tymciurak <[email protected]>
> > > wrote:
>
> > > > On May 19, 5:41 pm, amoygard <[email protected]> wrote:
>
> > > ....
>
> > > > So - in general, you do not start subprocesses - with the exception of
> > > > cron.   See http://www.web2py.com/book/default/section/4/17
>
> > > I might better have said you do not _want_ to be starting subprocesses,
> > > because of the cost (compute time, memory, etc.) if you generally did
> > > this.   This (the inefficiency of spawning subprocesses) is why
> > > stackless was created - and (among other things) used in a very
> > > busy online game.  A lot of thought went into avoiding the costs of
> > > spawning subprocesses.
>
> > > If you haven't heard of it, stackless is an implementation of python
> > > that does not use the traditional "C" stack for local variables,
> > > etc.   Among other things, it has added "tasklets" to the language, so
> > > you can create and schedule tasks - without the overhead of doing so
> > > in your operating system.   There is a lot of discussion of benefit,
> > > efficiency.   Although there might be some discussion questioning the
> > > approach, other alternative approaches, one thing is clear:  the
> > > motivation to stay away from creating threads / subprocesses, and the
> > > costs involved.  It might be interesting to read up on it.
>
> > > - Yarko
>
> > > > - Yarko
