In my opinion, using a DB is easier, more logical and more transparent, and it makes your data more accessible.
The function that processes the data on arrival can check whether all pieces of data are present. If so, then after saving the last piece it can go on to migrate them to your main DB and delete them from the transitional DB. If required, it could also scan the DB afterwards and expunge any old incomplete records.

In my description I broke the logic down into steps, but really the one function receiving the data can do the whole thing. This code should not be too difficult to write, and there are hundreds of examples of using DBs from which you can learn. This community is more experienced in using DBs to store data than in using the cache, so your questions will be more easily answered.

I recommend that you write down the logic of what you would like the function to do and start writing some code. Come back here when you get stuck.

Regards,
D

On Tuesday, 8 May 2012 01:16:45 UTC+1, cyan wrote:
>
>> Databases are specially designed for keeping persistent data - so there
>> is your answer! :)
>>
>> I suggest:
>>
>> 1. Write the data initially to a transitional Sqlite DB on disk.
>> 2. Once all the data pieces have arrived, migrate the completed data
>>    to your main DB and delete all the transitional records.
>> 3. It is much safer than that which you have proposed. Web2py can
>>    easily handle the two DBs.
>> 4. If you need any housekeeping, set up a scheduled job to purge all
>>    the old, incomplete stuff once in a while.
>>
>> Regards, David
>
> Thanks David. This approach ensures better data consistency, but I have
> two concerns:
>
> 1. If we store everything in database, how should we track whether all the
> pieces of data have arrived. For example, if I am expecting 10 pieces of
> data from 10 users, do I then need to constantly poll the db to check this?
> Is there any other tracking/trigger mechanism available?
>
> 2. It seems to me that the following database operations are needed: 1.
> write to the transitional db; 2. check for data arrival; 3. once all data
> arrived, read from transitional db so we can process them; 4. write the
> results to the main db; 5. delete all data from the transitional db. Added
> together, all these db operations may be substantial, not to mention that
> this process may need to be repeated for many number of times. In this
> aspect, the cached solution seems to shine performance wise.
>
> Would love to hear your thoughts on the above. Thanks!

