Re: pylons architecture decision question

Mike Orr Mon, 13 Apr 2009 01:31:02 -0700

On Mon, Apr 13, 2009 at 12:14 AM, Iain Duncan <[email protected]> wrote:
>
> On Sun, 2009-04-12 at 22:22 -0700, Mike Orr wrote:
>>
>> On Sun, Apr 12, 2009 at 9:18 PM, Iain Duncan <[email protected]> wrote:
>> >
>> > I've been working through the Pylons book and new docs ( fantastic
>> > improvement btw! ) as well as Ian's webob tutorials. I'm pretty clear
>> > now on how Pylons calls pylons controllers and wsgi apps, and understand
>> > the difference between calling an instantiated app and the way pylons
>> > re-instantiates a controller on each request.  However I haven't seen
>> > anything describing *why* pylons does it this way. I'd like to learn
>> > more, so is there an explanation as to why that's a good idea anywhere?
>>
>> It's just a design variation.  Instantiating a controller for each
>> request allows you to put request-local state data in 'self'.  (As if
>> 'c' and 'request' weren't enough.)  It also guards against
>> accidentally leaking state from one request to another.  I believe
>> that has been the experience with previous frameworks, that
>> instantiating the controller for each request is a good thing.  Pylons
>> has plenty of places to store multi-request data anyway: app_globals,
>> cache, and module globals.  And the controller *class* is a module
>> global so it also remains in memory.
>>
>> As for why external WSGI applications are long-lived, that's because
>> it's a complete application that manages its own requests.  It may
>> have initialization state or database connections.  It may be as big as
>> Plone.  We don't know what it is, but we know that some WSGI
>> applications like to be long-lived.
>
> Thanks Mike. I think in TG and in the webob examples the standard is
> that controllers get instantiated once on startup and __call__'ed on
> each request, right?
>
> I'm curious, is the amount of extra python execution necessary trivial,
> or is this a valid way to trade execution time for ram use and vice
> versa? It seems to me that in a large app with many many controllers
> executing to fulfil each request ( say many components being amalgamated
> or something ) that one pattern would require a lot more code to stay
> loaded in memory and the other more to execute for each request. But
> maybe this is one of those instances where in reality the difference
> makes no difference?


An instance is basically an indirect dict, so the overhead is just a
little more than creating a dict.

===
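import sys
import time

class C(object):
    pass

# print(...) with one string and range() run on both Python 2 and 3.
print("Starting");  sys.stdout.flush()

COUNT = 2000
lis = [None] * COUNT    # preallocate the list that holds the references
for i in range(COUNT):
    c = C()
    lis[i] = c          # keep a reference so every instance stays alive

print("Finishing");  sys.stdout.flush()

# Pause long enough to check the process's memory use from another shell.
time.sleep(60)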
===

Running this, it takes essentially no time to create 2000 objects.
The sleep is there to pause the program long enough to check how much
memory it's using.  On my Linux system it's 2.6 MB resident and 4.3 MB
total, or < 1 megabyte more than a trivial 'python -c "import time;
time.sleep(60)"'.
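
To repeat the measurement without watching the process from another
shell, here is a minimal sketch using the standard timeit and
tracemalloc modules.  It assumes Python 3 (tracemalloc did not exist
when the script above was written), and tracemalloc counts only
Python-level allocations, so its figure will not match the resident
size reported by ps or top.  The names Controller and measure_creation
are made up for the illustration.

```python
import timeit
import tracemalloc


class Controller(object):
    """Stand-in for a controller class; hypothetical, not a Pylons class."""
    pass


def measure_creation(count=2000):
    """Return (seconds_for_instances, seconds_for_dicts, peak_bytes)."""
    # Time `count` instance creations against `count` empty-dict creations.
    t_instances = timeit.timeit(Controller, number=count)
    t_dicts = timeit.timeit(dict, number=count)

    # Measure Python-level memory while holding `count` live instances.
    tracemalloc.start()
    objs = [Controller() for _ in range(count)]
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    assert len(objs) == count  # the list stays alive until after measuring
    return t_instances, t_dicts, peak


if __name__ == "__main__":
    t_inst, t_dict, peak = measure_creation()
    print("instances: %.6f s, dicts: %.6f s" % (t_inst, t_dict))
    print("peak traced memory: %.1f KB" % (peak / 1024.0))
```

The absolute numbers depend on the interpreter version and the machine,
so treat them as a sanity check rather than a benchmark.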

But 2000 requests a second is highly unrealistic because your database
queries and filesystem accesses will bog down long before then.  And
the requests wouldn't all be in memory simultaneously anyway because
the earlier ones would have finished by the time the later ones begin.

It's like the issue of why Python doesn't have separate float and
double types.  It uses double throughout because the overhead of a
Python object dwarfs the minuscule difference between floats and
doubles.  (And for applications that create a lot of them, there's the
array module and NumPy.)
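
To put rough numbers on the object-overhead point, this small sketch
compares the size of one Python float object with the per-item size of
the array module's raw C storage.  It assumes CPython; the exact byte
counts vary by platform and version, so none are stated here.

```python
import sys
from array import array

COUNT = 2000

# A Python float object wraps a C double plus the object header.
per_object = sys.getsizeof(1.0)

# array stores raw C values: 'f' is a C float, 'd' is a C double.
single = array("f", [0.0] * COUNT)
double = array("d", [0.0] * COUNT)

print("one Python float object: %d bytes" % per_object)
print("array('f') item: %d bytes" % single.itemsize)
print("array('d') item: %d bytes" % double.itemsize)
```

The gap between the two array item sizes is small next to the size of
a single float object, which is why a second floating-point type would
save little for ordinary Python objects.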

The first rule of optimizing is to see whether you need to do it at
all.  The second is to tame the biggest bottleneck, and then the
second-biggest bottleneck, and so forth.  Changing PylonsApp to cache
controller instances and reuse them would be far down the list of
priorities, though it's probably not that difficult if you want to
try.  But oh, they're not thread safe, because users may put request
state into them, and I think Pylons does too.  So you'd need a
threadlocal cache....  Pretty soon you're managing instances as if
they were database connections, which is overkill.

-- 
Mike Orr <[email protected]>
