Re: ACID tests

2013-01-06 Thread Alexander Burger
On Sun, Jan 06, 2013 at 01:33:55AM +0700, Henrik Sarvell wrote:
 In which actual situations do you need to do dbSync -> commit upd, as
 opposed to just commit upd? It seems to me the RDBMS equivalent is
 begin -> commit?

The sequence

   (dbSync) -> (modify objects) -> (commit 'upd)

is all and only about two things: Avoiding race conditions, and keeping
the cached objects in all involved processes consistent.


If only one single process writes to the database (this is usually the
case when a new database is created and populated with initial data, but
before the main event loop is started), you don't need (dbSync). You
just call (commit) after creating or modifying objects.


As soon as several processes operate on the DB, you should call (dbSync)
before modifying anything, and (commit 'upd) when you are done.
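
As a minimal sketch (with a hypothetical object Obj, already fetched via
'db', and the 'put>' entity message; any number of modifications can go
between the two calls):

```picolisp
(dbSync)                     # wait for the DB lock, and bring the local
                             # cache up to date with other processes
(put> Obj 'tel "12345678")   # modify any number of objects/properties
(commit 'upd)                # write the changes, release the lock, and
                             # broadcast the changes to sibling processes
```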

It is theoretically possible to just go ahead, modify some objects, and
call (commit) to write the changes to the DB. But then you must be aware
that any other process having one of the involved objects already in
memory (this happens when the program accessed the object's value or
property list) will continue keeping the old state of that object. It
will use an outdated version, possibly giving wrong results, and -- even
worse -- will write this old state to the DB when it modifies that
object at a later time.

So if, for example, a process modifies an object's address, then another
process modifies that object's telephone number, the second process will
overwrite the changed address with the old version.

A change to an object's property will in many cases cause changes in
index trees (B-tree nodes are implemented as DB objects too), and in
other objects (e.g. in bi-directional '+Joint' relations). For that
reason, unsynchronized changes will almost surely result in a completely
messed-up database.


You can avoid the synchronization only if you are absolutely sure that
no other process has read (and cached) the objects you are about to
modify. Then you can simply go ahead, create and modify the objects, and
call (commit). It is important not to forget here that while a
newly created object itself is safe (no other process can have it
already), the creation of objects usually causes the modification of
other objects (tree nodes etc.).
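
A sketch of that single-writer case (the entity and property names here are
hypothetical; 'db:' maps a class to its database file number):

```picolisp
# One process populating a fresh DB: no (dbSync) needed
(for I 10
   (new (db: +User) '(+User)      # create a DB object of class +User
      'uname (pack "user" I)
      'balance 0 ) )
(commit)                          # one plain commit at the end
```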



 If you do db, put and commit upd calls so that they happen roughly at the
 same time, you should be as safe as you are when you do an "update table set
 bla bla where userid = 5" in an RDBMS?

"Roughly at the same time" is probably not enough.


 Let's take what I actually do at work as an example. Pretend that I would
 like to try to move the whole casino database from MySQL/InnoDB to PL; the
 main thing then would be the following SQL query: "update users
 set balance = balance + 10 where user_id = 50".
 
 There can be 100 such calls per second, and upwards of 5 per second for
 the same user. Doing a dbSync here doesn't make sense; there are no
 relations, just a number that needs increasing or decreasing. Updating user
 5's balance should not have to wait for an update of user 10's balance to
 finish, and so on.
 
 It seems to me that this situation is solved in PL by simply doing (commit
 upd)s without any dbSyncs; if there are several updates coming in at
 virtually the same time, they will still be synced through the commit upds.

Yes, but the caching issue is not resolved. If two processes increment
the object's counter, the second one will still hold the old
un-incremented value, because it didn't wait to receive the changes
broadcasted by (commit 'upd). This waiting is handled by the 'sync' call
in (dbSync).

You could call (rollback) before the increment, thus forcing the object
(_all_ objects, to be precise) to be reloaded, but you may still have a
race condition where another process increments the object before you do
your 'commit'.
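
The 'rollback' variant, sketched (using the '+User' entity from the test
code later in this thread):

```picolisp
# Unsynchronized increment: refresh the cache first, then modify
(rollback)                           # discard ALL cached objects, so the
                                     # next access re-reads them from disk
(let U (db 'uname '+User "henrik")
   (put> U 'balance (+ 10 (; U balance))) )
(commit)                             # race window: another process may
                                     # have changed 'balance in between
```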

♪♫ Alex
-- 
UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe


Re: ACID tests

2013-01-06 Thread Henrik Sarvell
 Yes, but the caching issue is not resolved. If two processes increment
 the object's counter, the second one will still hold the old
 un-incremented value, because it didn't wait to receive the changes
 broadcasted by (commit 'upd). This waiting is handled by the 'sync' call
 in (dbSync).


I'm not sure I follow you there. When I do my testing with the
following code:

(load "lib/http.l")

(de decBalance ()
   (let U (db 'uname '+User "henrik")
      (dec U 'balance 10) )
   (commit 'upd)
   (println "Decreased balance by 10.") )

(de incRepeat ()
   (let U (db 'uname '+User "henrik")
      (put! U 'balance 100)
      (println
         (make
            (do 10
               (inc U 'balance 10)
               (commit 'upd)
               (link (; U balance))
               (wait 2000) ) ) ) ) )

(class +User +Entity)
(rel uname (+Key +String))
(rel balance (+Number))

(dbs
   (4 +User)
   (4 (+User uname balance)) )

(pool "/opt/picolisp/projects/acid-test/" *Dbs)

# (setq U (request '(+User) 'uname "henrik" 'balance 100))
# (commit)
# (mapc show (collect 'uname '+User))

(de go ()
   (server 8080) )

I get (110 120 120 130 140 150 160 170 180 190) as the result of
http://localhost:8080/!incRepeat when I also open
http://localhost:8080/!decBalance in another tab while incRepeat runs.

Seems to me that incRepeat gets the decrease alright?



Re: ACID tests

2013-01-06 Thread Henrik Sarvell
OK so how would you deal with a situation where you have the need to
quickly increment people's balances like I mentioned previously but at the
same time you have another process that has to update a lot of objects by
fetching information from many others?

This second process will take roughly one minute to complete from start to
finish and will not update +User in any way.

If I have understood things correctly, simply doing dbSync -> work -> commit
in the second process won't work here, because it will block the balance
updates.

Another option would be to do it in a loop and use put!, which will only
initiate the sync at the time of each update, and thus should not block the
balance updates for too long.

The question then is: how much overhead does this cause for the balance
updates, in your experience? If it is significant, is it possible to
somehow solve the issue of these two processes causing collateral damage
to each other, so to speak?


On Sun, Jan 6, 2013 at 7:19 PM, Alexander Burger a...@software-lab.de wrote:

 On Sun, Jan 06, 2013 at 05:43:06PM +0700, Henrik Sarvell wrote:
   Yes, but the caching issue is not resolved. If two processes increment
  ...
  I'm not sure I follow you with that, when I do my testing with the
  following code:
  ...
   (do 10
      (inc U 'balance 10)
      (commit 'upd)
      (link (; U balance))
      (wait 2000) ) ) ) ) )
  ...
  I get (110 120 120 130 140 150 160 170 180 190) as the result of
  ...
  Seems to me that incRepeat gets the decrease alright?

 This works because 'wait' is an idle loop, which also synchronizes in
 the background.

 Try a busy loop like (do 1000) instead. Then the above code will
 happily increment the local balance, no matter how much other processes
 decrement the value in the meantime.


 'sync' uses the internal mechanisms of 'wait', but in addition to that
 also addresses the issue of race conditions, which are difficult to
 reproduce with your setup. 'sync' guarantees that notifications about
 changes done by one process are sent _atomically_ to all other
 processes.

 ♪♫ Alex



Re: ACID tests

2013-01-06 Thread Alexander Burger
On Sun, Jan 06, 2013 at 08:00:51PM +0700, Henrik Sarvell wrote:
 OK so how would you deal with a situation where you have the need to
 quickly increment people's balances like I mentioned previously but at the
 same time you have another process that has to update a lot of objects by
 fetching information from many others?
 
 This second process will take roughly one minute to complete from start to
 finish and will not update +User in any way.

I would do it the normal, safe way, i.e. with 'inc!'. Note that the
way you proposed doesn't have much less overhead, I think, because it
still uses the 'upd' argument to 'commit', which triggers communication
with other processes, and 'commit' itself, which does a low-level
locking of the DB.
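
In other words, 'inc!' is essentially the synchronized read-modify-write
from the beginning of this thread packed into one call. Sketched here with
'put>' (the actual implementation in "lib/db.l" may differ in detail):

```picolisp
# Call site:
(inc! U 'balance 10)

# ... is in essence:
(dbSync)                                  # lock and sync first
(put> U 'balance (+ 10 (; U balance)))    # increment the *current* value
(commit 'upd)                             # write, unlock, and notify
```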


 If I have understood things correctly, simply doing dbSync -> work -> commit
 in the second process won't work here because it will block the balance
 updates.

Yes, but you can control this, depending on how many updates are done in
the 'work' between dbSync and commit.

We've discussed this on IRC, so for other readers, here is what I usually
do in such cases:

   (dbSync)
   (while (..)
      (... do one update step ...)
      (at (0 . 1000) (commit 'upd) (dbSync)) )
   (commit 'upd)

The value of 1000 is an example; I would try something between 100 and
10000.

With that, after every 1000th update step other processes get a chance
to grab the lock in the (dbSync) after the 'commit'.
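
('at' keeps a counter in the CAR of its pair argument and runs its body each
time the counter reaches the CDR, resetting it to zero.) A quick
illustration in the REPL (using 7 iterations and a threshold of 3):

```picolisp
: (do 7 (at (0 . 3) (println 'Checkpoint)))
Checkpoint
Checkpoint
```

In the loop above this means: every 1000th step, commit the work so far and
immediately re-enter (dbSync), giving waiting processes a chance to run.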


 Another option would be to do it in a loop and use put! which will only
 initiate the sync at the time of each update which should not block the
 balance updates for too long.

Right. This would be optimal in terms of giving freedom to other
processes, but it does only a single change per 'put!', and thus
the whole update might take too long.

The above sync at every 1000th step allows for a good compromise.


 The question then is how much overhead does this cause when it comes to the
 balance updates in your experience? If significant is it possible to

I would not call this "overhead". It is just that the quick operation of
incrementing the balance may have to wait too long if the second process
does too many changes in a single transaction.

So the problem is not the 'inc!'. It just sits and waits until it can
do its job, and is then done quite quickly. It is the large update
'work' which may hold the DB for too long at a time.


 somehow solve the issue of these two processes creating collateral damage
 to each other so to speak?

If you can isolate the balance (not like in your last example, where two
processes incremented and decremented the balance at the same time), and
make absolutely sure that only one process caches the object at a given
time, you could take the risk and do the incrementing/decrementing
without synchronization, with just (commit).

One way might be to have a single process take care of that,
communicating values to/from other processes with 'tell', so that no one
else needs to access these objects. But that's more complicated than
the straight-forward way.

♪♫ Alex


Re: picoLisp on FreeBSD 9.1 x64

2013-01-06 Thread Mansur Mamkin
Hi!
That was my inattention. I've found the solution :)
One should use -rdynamic for LD_MAIN too.
Without it, e.g. dlopen("lib/ht") fails with an "Undefined symbol 'Nil'" error.
So, now all tests passed with OK.
Alex, I will send you my files soon. As I understand, it's not possible to
send them as attachments to the mailing list.
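
In Makefile terms (variable names as Mansur uses them above; the values
shown here are only this thread's conclusion, not a tested FreeBSD
configuration):

```make
# Link the main binary with -rdynamic, so that symbols like 'Nil'
# are visible to shared objects loaded later via dlopen()
LD_MAIN = -rdynamic
LD_SHARED = -shared -rdynamic
```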
Best regards, 
Mansur

Saturday, 5 January 2013, 18:06 +0100, from Alexander Burger
a...@software-lab.de:
 Hi Mansur,
 
  I tried different options, also without stripping, but with no success.
  Perhaps someone can help to write a small assembly program with a shared
  library to test the dlopen/dlsym calls?
 
 What did you use for the 'LD-SHARED' linker options? The Makefile of
 'pil32' uses "-shared -export-dynamic" for FreeBSD (in addition to
 "-m32", which obviously doesn't make sense here). Perhaps this works
 for 64 bits too?
 
 
 An assembly program is probably not so helpful. Does FreeBSD have
 the 'ltrace' utility? If so, you could do
 
    $ ltrace bin/picolisp 2>xxx
: (ht:Prin 123)
 
 and then look into 'xxx' for something like
 
    dlopen("lib/ht", 257) = 0x011700f0
    dlsym(0x011700f0, "Prin") = 0x7fe3f3f9f222
 
 
 I use 'gdb' in such cases to single-step through the program, but that's
 not what I would really recommend ;-)
 
 ♪♫ Alex