On Wed, May 30, 2012 at 12:28:50PM +0700, Henrik Sarvell wrote:
> Use new and chunk it up:
>
>    (dbSync)
>    (for A As
>       (at (0 . 1000) (commit 'upd) (prune) (dbSync))
>       (new (db: +Article) '(+Article) key1 value1 key2 value2 ...) )
>    (commit 'upd)
>
> With new! you are locking and writing every row, so it should only be
> used in cases where you know you are only inserting one (or maybe very
> few).
>
> Above we create them in memory and write 1000 of them at a time.
>
> If you have 12 million you should probably use an even higher number
> than 1000.
Yes, I usually use 10000. Larger values do not seem to bring any further
improvement, and use too much memory.

You can slightly simplify and speed up the above if you do not need to
synchronize with other processes during that import (i.e. if other
processes can wait until the import is done). Then you can omit the
calls to 'commit' with 'upd' (which signals "done" to other processes)
and the (dbSync) calls in the loop. And, at the very end, you could call
(prune T) to reset to normal behavior, though not doing so has no bad
effect. With that, we would have

   (dbSync)
   (for ...
      (new (db: +Article) '(+Article) 'key1 <value1> 'key2 <value2> ...)
      (at (0 . 10000) (commit) (prune)) )
   (commit 'upd)
   (prune T)

In practice, I would also omit the 'prune' calls, and only insert them
if I find that the import process uses too much memory (monitor with
'top' or 'ps'). This speeds up small imports (which includes 12 million
objects). That would simplify the import to

   (dbSync)
   (for ...
      (new (db: +Article) '(+Article) 'key1 <value1> 'key2 <value2> ...)
      (at (0 . 10000) (commit)) )
   (commit 'upd)

Cheers,
- Alex

--
UNSUBSCRIBE: mailto:email@example.com?subject=Unsubscribe
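[Editor's note] To make the pattern above concrete, here is a hypothetical sketch of such a bulk import. The +Article class with properties 'nm and 'txt, the pool path "db/", and the input file "articles.tsv" are all assumptions for illustration, not from the original thread; the loop itself follows Alex's second (no-prune) variant.

```picolisp
# Hypothetical bulk-import sketch. Assumes a database pool "db/" and a
# tab-separated file "articles.tsv" with two columns (name, text).
(class +Article +Entity)         # assumed entity class for this example
(rel nm (+String))               # article name
(rel txt (+String))              # article text

(pool "db/")                     # open (or create) the database
(dbSync)                         # take the DB lock once for the whole import
(in "articles.tsv"
   (while (line)                 # (line) returns the line as a char list
      (let L (mapcar pack (split @ "^I"))   # split at TAB, pack each field
         (new (db: +Article) '(+Article)
            'nm (car L)
            'txt (cadr L) ) )
      (at (0 . 10000) (commit)) ) )   # flush every 10000 new objects
(commit 'upd)                    # final commit, notify sister processes
```

Because there is no (dbSync) inside the loop, this variant is only safe when no other process needs to access the database until the import finishes, as discussed above.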