I can make subscriber_pool thread-safe, can't I?

The point is that I would like some objects to be persistent 
between requests, to avoid the cost of re-creating them each time. This is 
not something that can be managed by the DAL. Think of it as an 
expensive-to-set-up object that I need in order to produce the results 
expected by the JSONRPC requests hitting web2py.

To give you more detail: I have a pool of subscribers, belonging to 
different "organizations". They are requesting data related to their 
organization, via JSONRPC requests to web2py. The data is not controlled by 
web2py, but is in external couchdb instances. I already have libraries to 
access and manipulate this data, so I do not want to create new models for 
it. *But* I need to connect to those couchdb instances and create my 
library objects, which know how to process this external data. Those are the 
objects that I want to be persistent, because:

   1. A user can send requests very fast (1 s period)
   2. Several users can belong to the same organization. Thus, they can 
   reuse the same subscriber object.
   
For how long must these objects be in the pool? This is something I have 
not yet decided. Probably for as long as they are needed, which means that 
I will implement a timeout (let's say 30s). If they are used within that 
timeframe, they get to stay alive. If they time out, they get destroyed and 
will be recreated on the next request, thus incurring a penalty. This is a 
trade-off between speed and cache size (or memory leak, if you want to see 
it that way).
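A minimal sketch of such a pool (assuming a hypothetical `factory` callable that builds the expensive library object, and the 30s timeout described above), guarded by a lock so that concurrent web2py threads can share it:

```python
import threading
import time

class ExpiringSubscriberPool:
    """Keeps subscriber objects alive while they are being used;
    entries idle for longer than `timeout` seconds are evicted."""

    def __init__(self, factory, timeout=30.0):
        self.factory = factory     # callable: (org, subscriber_id) -> object
        self.timeout = timeout
        self._lock = threading.Lock()
        self._pool = {}            # subscriber_id -> (object, last_used)

    def get(self, subscriber_id, org):
        now = time.time()
        with self._lock:
            self._evict(now)
            entry = self._pool.get(subscriber_id)
            if entry is None:
                entry = (self.factory(org, subscriber_id), now)
            else:
                entry = (entry[0], now)  # refresh the last-used timestamp
            self._pool[subscriber_id] = entry
            return entry[0]

    def _evict(self, now):
        # Called with the lock held: drop entries idle past the timeout.
        stale = [s for s, (_, t) in self._pool.items() if now - t > self.timeout]
        for sid in stale:
            del self._pool[sid]
```

Here eviction happens lazily on each `get`; a background thread or cron-style cleanup would also work, but this keeps everything in a single class.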

I could use a message queue (beanstalkd, celery?) to communicate with these 
"workers", but in my first implementation I would like to keep it as simple 
as possible: I am using the libraries directly from within web2py. So far, 
this is working fine, but I must confess that I have not yet performed 
concurrent access tests.
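For those concurrent access tests, a quick smoke test (a sketch; it assumes the pool exposes a `get_subscriber(subscriber_id, myorg)` method like the one in `subscribers.py`) is to hammer the pool from several threads and verify that all of them receive the same object for the same id:

```python
import threading

def concurrent_smoke_test(pool, n_threads=10, n_calls=100):
    """Ask for the same subscriber from many threads at once.
    With a thread-safe pool every call must return the same object."""
    results = []  # list.append is atomic in CPython, safe to share here

    def worker():
        for _ in range(n_calls):
            results.append(pool.get_subscriber("sub-1", 1))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Exactly one distinct object should ever have been handed out.
    assert len(set(id(obj) for obj in results)) == 1
```

A passing run does not prove thread safety (races are timing-dependent), but a failing run proves its absence, which is useful before putting the pool behind real traffic.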

On Monday, July 9, 2012 4:03:36 PM UTC+2, Massimo Di Pierro wrote:
>
> I am not sure what class Subscriber does, but if it uses the DAL, then your 
> code is problematic.
> The DAL is "smart" in the sense that it keeps track of all connections opened 
> in a certain thread and closes them all (or pools them) when the thread ends. 
> By using a module, you store references to those connections (which may be 
> closed or not, depending on the thread) in a persistent object, 
> subscriber_pool, thus making your code not thread safe.
>
> Can you please explain in more detail what you are trying to accomplish? 
> How long should the connections be cached? What is their scope? I am sure 
> there is a better way. :-)
>
>
> On Monday, 9 July 2012 02:32:28 UTC-5, Daniel Gonzalez wrote:
>>
>> Hi,
>>
>> I am using the following pattern to use my libraries with web2py: some of 
>> my utilities create connections to databases holding data that I need to 
>> serve with web2py. This data is not modeled with web2py models. In order 
>> to avoid re-creating these connections on every single request (I have a 
>> site which performs requests with a 1s period), I have discovered that 
>> imported modules keep their global variables between requests. So now I am 
>> creating a cache for the objects that I want to be persistent across 
>> requests. This is my code, in file subscribers.py:
>>
>> class SubscriberPoolCls:
>>     def __init__(self):
>>         self.subscriber_pool = {}
>>
>>     def get_subscriber(self, subscriber_id, myorg):
>>         log.info('get_subscriber > Requested subscriber_id=%s myorg=%d' %
>>                  (subscriber_id, myorg))
>>         if subscriber_id not in self.subscriber_pool:
>>             self.subscriber_pool[subscriber_id] = Subscriber(myorg, subscriber_id)
>>         return self.subscriber_pool[subscriber_id]
>>
>>     def unsubscribe_all(self):
>>         for subscriber_id in self.subscriber_pool:
>>             self.subscriber_pool[subscriber_id].unsubscribe()
>>
>> _subscriber_pool = None
>>
>> def SubscriberPool():
>>     global _subscriber_pool
>>     if not _subscriber_pool:
>>         _subscriber_pool = SubscriberPoolCls()
>>     return _subscriber_pool
>>
>> And then in a request I do the following (default.py):
>>
>> from subscribers import SubscriberPool
>>
>> subscriber_pool = SubscriberPool()
>>
>> ...
>>
>> def init_session():
>>     if not session.my_session_id:
>>         session.my_session_id = get_uuid()
>>
>> ...
>>
>> @auth.requires_login()
>> @service.jsonrpc
>> def get_call_details(pars_json):
>>     init_session()
>>     myorg = session.auth.user.org_id
>>     pars = simplejson.loads(pars_json)
>>     subscriber = subscriber_pool.get_subscriber(session.my_session_id, myorg)
>>     activity_cdr = subscriber.get_call_details(pars['cdr_doc_id'])
>>     response = {
>>         'cdr_details': activity_cdr,
>>     }
>>     return simplejson.dumps(response)
>>
>> By doing this I can create a subscriber object associated to the session 
>> and the organization, and I get to reuse this object in subsequent requests.
>>
>> Now I have the following questions:
>>
>>    1. Why do imported modules keep their global variables? default.py 
>>    does not, as far as I can tell; I would say it is re-parsed with each 
>>    request.
>>    2. I have the problem that my object cache 
>>    (SubscriberPoolCls.subscriber_pool) can grow indefinitely. I do not know 
>>    how or when to delete entries from this cache.
>>    3. Do you think this pattern is dangerous? Do you have an alternative?
>>
>> Thanks,
>>
>> Daniel
>>
>
