In my opinion, using a DB is easier, more logical and more transparent, 
and it makes your data more accessible. 

The function that processes the data on arrival can check whether all 
pieces of data are present, so there is no need to poll the DB.  If they 
are, then after saving the last piece it can migrate them to your main DB 
and delete them from the transitional DB.  If required, it could also scan 
the DB and expunge any old, incomplete records afterwards.  In my 
description I broke the logic down into steps, but really the one function 
receiving the data can do the whole thing.  

This code should not be too difficult to write, and there are literally 
hundreds of examples of using DBs from which you can learn.  This community 
is more experienced in using DBs than caches to store data, so your 
questions will be more easily answered.

I recommend that you write down the logic of what you would like the 
function to do and start writing some code.  Come back here when you get 
stuck.

Regards,  D


On Tuesday, 8 May 2012 01:16:45 UTC+1, cyan wrote:
>
>  
>
>> Databases are specially designed for keeping persistent data - so there 
>> is your answer!   :)
>>
>> I suggest:
>>
>>    1. Write the data initially to a transitional Sqlite DB on disk.
>>    2. Once all the data pieces have arrived,  migrate the completed data 
>>    to your main DB and delete all the transitional records.
>>    3. It is much safer than that which you have proposed.  Web2py can 
>>    easily handle the two DBs.
>>    4. If you need any housekeeping, set up a scheduled job to purge all 
>>    the old, incomplete stuff once in a while.
>>    
>> Regards, David
>>
>
> Thanks David. This approach ensures better data consistency, but I have 
> two concerns:
>
> 1. If we store everything in the database, how should we track whether 
> all the pieces of data have arrived? For example, if I am expecting 10 
> pieces of data from 10 users, do I then need to constantly poll the db to 
> check this? Is there any other tracking/trigger mechanism available?
>
> 2. It seems to me that the following database operations are needed: 1. 
> write to the transitional db; 2. check for data arrival; 3. once all data 
> have arrived, read from the transitional db so we can process them; 4. 
> write the results to the main db; 5. delete all data from the transitional 
> db. Added together, all these db operations may be substantial, not to 
> mention that this process may need to be repeated many times. In this 
> respect, the cached solution seems to shine performance-wise.
>
> Would love to hear your thoughts on the above. Thanks! 
>
