On Sun, Jan 06, 2013 at 01:33:55AM +0700, Henrik Sarvell wrote: > In which actual situations do you need to do dbSync -> commit upd as > opposed to just commit upd? It seems to me the RMDBS equivalent is begin -> > commit?
The sequence (dbSync) -> (modify objects) -> (commit 'upd) is all and only about two things: Avoiding race conditions, and keeping the cached objects in all involved processes consistent. If only one single process writes to the database (this is usually the case when a new database is created and populated with initial data, but before the main event loop is started), you don't need (dbSync). You just call (commit) after creating or modifying objects. As soon as several processes operate on the DB, you should call (dbSync) before modifying anything, and (commit 'upd) when you are done. It is theoretically possible to just go ahead, modify some objects, and call (commit) to write the changes to the DB. But then you must be aware that any other process having one of the involved objects already in memory (this happens when the program accessed the object's value or property list) will continue keeping the old state of that object. It will use an outdated version, possibly giving wrong results, and -- even worse -- will write this old state to the DB when it modifies that object at a later time. So if, for example, a process modifies an object's address, then another process modifies that object's telephone number, the second process will overwrite the changed address with the old version. A change to an object's property will in many cases cause changes in index trees (B-tree nodes are implemented as DB objects too), and in other objects (e.g. in bi-directional '+Joint' relations). For that reason, unsynchronized changes will almost surely result in a completely messed-up database. You can avoid the synchronization only if you are absolutely sure that no other process has read (and cached) the objects you are about to modify. Then you can simply go ahead, create and modify the objects, and call (commit). It is important here not to forget here that while a newly created object itself is safe (no other process can have it already), the creation of objects usually causes the modification of other objects (tree nodes etc.). > If you do db, put> and commit upd calls so that they happen roughly at the > same time you should be as safe as you are when you do a "update table set > bla bla where userid = 5" in an RMDBS? "roughly at the same time" is probably not enough. > Let's take what I actually do at work as an example, pretend that I would > like to try to move the whole casino database from MySQL/InnoDB to PL, the > main thing then would be the following SQL query for example: "update users > set balance = balance + 10 where user_id = 50". > > There can be 100 such calls per second and upwards 5 of them per second for > the same user. Doing a dbSync here doesn't makes sense, there are no > relations, just a number that needs increasing or decreasing, updating user > 5's balance should not have to wait for an update of 10's balance to finish > and so on. > > It seems to me that this situation is solved in PL by simply doing (commit > upd)s without any dbSyncs, if there are several updates coming in at > virtually the same time they will still be synced through the commit upds. Yes, but the caching issue is not resolved. If two processes increment the object's counter, the second one will still hold the old un-incremented value, because it didn't wait to receive the changes broadcasted by (commit 'upd). This waiting is handled by the 'sync' call in (dbSync). You could call (rollback) before the increment, thus forcing the object (_all_ objects, to be precise) to be reloaded, but you may still have a race condition where another process increments the object before you do your 'commit'. ♪♫ Alex -- UNSUBSCRIBE: mailto:email@example.com?subject=Unsubscribe