As long as you do not use the DAL to connect to couchdb, you do not need to 
worry about web2py closing the connections.

You still have two problems:
1) thread safety: you may need to guard the shared objects with a mutex lock
2) threads and processes will be killed by the web server at will, and 
therefore you have no guarantee they will persist across requests.
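For point 1, a minimal sketch of what mutex-locking the shared objects could look like (plain Python; `ThreadSafePool` and the factory callable are illustrative names, not web2py API):

```python
import threading

class ThreadSafePool:
    """Get-or-create pool whose dict access is guarded by a lock."""
    def __init__(self, factory):
        self.factory = factory        # callable that builds a missing object
        self.lock = threading.Lock()  # serializes access to the dict
        self.items = {}

    def get(self, key):
        with self.lock:               # only one thread mutates the pool at a time
            if key not in self.items:
                self.items[key] = self.factory(key)
            return self.items[key]
```

Holding the lock while calling the factory also prevents two concurrent requests from building the same object twice, at the cost of serializing creation.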

If you have a fixed number of connections and the number is not too high, you 
can create a background process that communicates only with localhost via - 
for example - xmlrpc. Then your web app would basically act as a proxy for 
that background process.
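A minimal sketch of such a localhost proxy process, using the standard-library XML-RPC server (the port and the `get_call_details` payload are illustrative; the real version would call your couchdb libraries):

```python
from xmlrpc.server import SimpleXMLRPCServer

def get_call_details(subscriber_id, cdr_doc_id):
    # Placeholder: the real process would look up its long-lived
    # Subscriber object here and query couchdb through it.
    return {'subscriber': subscriber_id, 'doc': cdr_doc_id}

def make_server(port=8123):
    # Bind to 127.0.0.1 only, so the proxy is unreachable from outside.
    server = SimpleXMLRPCServer(('127.0.0.1', port), logRequests=False)
    server.register_function(get_call_details)
    return server

if __name__ == '__main__':
    make_server().serve_forever()

# The web2py action would then talk to it with something like:
#   from xmlrpc.client import ServerProxy
#   rpc = ServerProxy('http://127.0.0.1:8123')
#   details = rpc.get_call_details(session.my_session_id, doc_id)
```

Because the background process is single and long-lived, it can hold the couchdb connections for as long as it likes, independently of how the web server recycles its workers.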

I cannot help more, since there are many details involved. Moreover, couchdb 
talks over REST, therefore I am not sure I understand the meaning of a 
persistent connection. Perhaps web2py's cache.ram may be sufficient.

Massimo

On Monday, 9 July 2012 09:45:36 UTC-5, Daniel Gonzalez wrote:
>
> I can make subscriber_pool threadsafe, can't I?
>
> The point is that I would like some objects to persist between requests, 
> to avoid the cost of re-creating them each time. This is nothing that can 
> be managed by the DAL. You could think of it as an expensive-to-set-up 
> object that I need in order to obtain the results expected by the JSONRPC 
> requests hitting web2py.
>
> To give you more detail: I have a pool of subscribers, belonging to 
> different "organizations". They are requesting data related to their 
> organization, via JSONRPC requests to web2py. The data is not controlled by 
> web2py, but is in external couchdb instances. I already have libraries to 
> access and manipulate this data, so I do not want to create new models for 
> it. *But* I need to connect to those couchdb instances, and create my 
> library objects which know how to process this external data. Those are the 
> objects that I want to be persistent, because:
>
>    1. A user can send requests very fast (1 s period)
>    2. Several users can belong to the same organization. Thus, they can 
>    reuse the same subscriber object.
>    
> For how long must these objects be in the pool? This is something that I 
> have not yet decided myself. Probably for as long as they are being needed, 
> which means that I will implement a timeout (let's say 30s). If they are 
> used within that timeframe, they get to stay alive. If they timeout, they 
> get destroyed and will be recreated in the next request, thus incurring 
> a penalty. This is a trade-off between speed and cache size (or memory 
> leak, if you want to see it that way).
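The timeout scheme described above can be sketched as a small get-or-create pool with TTL eviction (standalone Python; `factory` stands in for the Subscriber construction and is not web2py API):

```python
import time

class ExpiringPool:
    """Get-or-create pool that drops entries unused for longer than ttl."""
    def __init__(self, factory, ttl=30.0):
        self.factory = factory   # builds a missing object (here: a Subscriber)
        self.ttl = ttl           # seconds an idle entry stays alive
        self.items = {}          # key -> (object, last_used_timestamp)

    def get(self, key):
        self._evict()
        entry = self.items.get(key)
        obj = entry[0] if entry else self.factory(key)
        self.items[key] = (obj, time.time())   # every hit refreshes the clock
        return obj

    def _evict(self):
        now = time.time()
        for key, (obj, last_used) in list(self.items.items()):
            if now - last_used > self.ttl:
                del self.items[key]            # recreated on the next request
```

Evicting on every `get` keeps the cache bounded by actual usage without needing a separate reaper thread, which fits the "keep it simple" goal.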
>
> I could use a message queue (beanstalkd, celery?) to communicate with 
> these "workers", but in my first implementation I would like to keep it as 
> simple as possible. I am directly using the libraries from within web2py. 
> So far, this is working fine, but I must confess that I have not yet 
> performed concurrent access tests.
>
> On Monday, July 9, 2012 4:03:36 PM UTC+2, Massimo Di Pierro wrote:
>>
>> I am not sure what class Subscriber does but, if it uses the DAL, then 
>> your code is problematic.
>> The DAL is "smart" in the sense that it keeps track of all connections 
>> opened in a given thread and closes them all (or pools them) when the 
>> thread ends. By using a module, you store references to those connections 
>> (which may be closed or not, depending on the thread) in a persistent 
>> object, subscriber_pool, thus making your code not thread safe.
>>
>> Can you please explain in more detail what you are trying to accomplish? 
>> How long should the connections be cached? What is their scope? I am sure 
>> there is a better way. :-)
>>
>>
>> On Monday, 9 July 2012 02:32:28 UTC-5, Daniel Gonzalez wrote:
>>>
>>> Hi,
>>>
>>> I am using the following pattern to use my libraries with web2py: some 
>>> of my utilities are creating connections to databases where I have data 
>>> that I need to serve with web2py. This data is not modeled with web2py 
>>> models. In order to avoid re-creating these connections with every single 
>>> request (I have a site which is performing requests with a 1s period), I 
>>> have discovered that the importing of modules is not cleaning the module 
>>> global variables. So now what I am doing is creating a cache for the 
>>> objects that I want to be persistent across requests. This is my code, in 
>>> file subscribers.py:
>>>
>>> class SubscriberPoolCls:
>>>     def __init__(self):
>>>         self.subscriber_pool = {}
>>>
>>>     def get_subscriber(self, subscriber_id, myorg):
>>>         log.info('get_subscriber > Requested subscriber_id=%s myorg=%d'
>>>                  % (subscriber_id, myorg))
>>>         if subscriber_id not in self.subscriber_pool:
>>>             self.subscriber_pool[subscriber_id] = Subscriber(myorg, subscriber_id)
>>>         return self.subscriber_pool[subscriber_id]
>>>
>>>     def unsubscribe_all(self):
>>>         for subscriber_id in self.subscriber_pool:
>>>             self.subscriber_pool[subscriber_id].unsubscribe()
>>>
>>>
>>> _subscriber_pool = None
>>>
>>>
>>> def SubscriberPool():
>>>     global _subscriber_pool
>>>     if _subscriber_pool is None:
>>>         _subscriber_pool = SubscriberPoolCls()
>>>     return _subscriber_pool
>>>
>>> And then in a request I do the following (default.py):
>>>
>>> from subscribers import SubscriberPool
>>>
>>> subscriber_pool = SubscriberPool()
>>>
>>> ...
>>>
>>> def init_session():
>>>     if not session.my_session_id:
>>>         session.my_session_id = get_uuid()
>>>
>>> ...
>>>
>>> @auth.requires_login()
>>> @service.jsonrpc
>>> def get_call_details(pars_json):
>>>     init_session()
>>>     myorg = session.auth.user.org_id
>>>     pars = simplejson.loads(pars_json)
>>>     subscriber = subscriber_pool.get_subscriber(session.my_session_id, myorg)
>>>     activity_cdr = subscriber.get_call_details(pars['cdr_doc_id'])
>>>     response = {
>>>         'cdr_details': activity_cdr,
>>>     }
>>>     return simplejson.dumps(response)
>>>
>>> By doing this I can create a subscriber object associated with the session 
>>> and the organization, and I get to reuse this object in subsequent requests.
>>>
>>> Now I have the following questions:
>>>
>>>    1. Why do the imported modules keep their global variables? 
>>>    default.py does not, as far as I can tell; I would say it is re-executed 
>>>    with each request.
>>>    2. I have the problem that my object cache 
>>>    (SubscriberPoolCls.subscriber_pool) can grow indefinitely. I do not know 
>>>    how or when to delete entries from this cache.
>>>    3. Do you think this pattern is dangerous? Do you have an 
>>>    alternative?
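Regarding question 1, this is plain Python rather than anything web2py-specific: a module is executed once and then cached in sys.modules, so its globals live as long as the process does. A standalone illustration (the module name here is made up):

```python
import sys
import types

# Build a throwaway module to mimic subscribers.py being imported once.
mod = types.ModuleType('subscribers_demo')
mod.pool = {}                       # module-level global, like _subscriber_pool
sys.modules['subscribers_demo'] = mod

import subscribers_demo             # served from the sys.modules cache
subscribers_demo.pool['a'] = 1

import subscribers_demo as again    # still the very same module object
print(again.pool)                   # -> {'a': 1}
```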
>>>
>>> Thanks,
>>>
>>> Daniel
>>>
>>
