On Wed, May 30, 2012 at 12:28:50PM +0700, Henrik Sarvell wrote:
> Use new and chunk it up:
>
>    (dbSync)
>    (for A As
>       (at (0 . 1000) (commit 'upd) (prune) (dbSync))
>       (new (db: +Article) '(+Article) key1 value1 key2 value2 ...) )
>    (commit 'upd)
>
> With new! you are locking and writing every row, so it should only be
> used in cases where you know you are only inserting one (or maybe very
> few).
>
> Above we create them in memory and write 1000 of them at a time.
>
> If you have 12 million you should probably use an even higher number
> than 1000.
Yes, I usually use 10000. Larger values do not seem to bring any further
improvement, and use too much memory.

You can slightly simplify and speed up the above if you do not need to
synchronize with other processes during that import (i.e. if other
processes can wait until the import is done). Then you can omit the
calls to 'commit' with 'upd' (which signals "done" to other processes)
and the (dbSync) calls in the loop. And, at the very end, you could call
(prune T) to reset to normal behavior, though not doing so has no bad
effect. With that, we would have

   (dbSync)
   (for ...
      (new (db: +Article) '(+Article) 'key1 <value1> 'key2 <value2> ...)
      (at (0 . 10000) (commit) (prune)) )
   (commit 'upd)
   (prune T)

In practice, I would also omit the 'prune' calls, and only insert them
if I find that the import process uses too much memory (monitor with
'top' or 'ps'). This speeds up small imports (which includes 12 million
objects). That would simplify the import to

   (dbSync)
   (for ...
      (new (db: +Article) '(+Article) 'key1 <value1> 'key2 <value2> ...)
      (at (0 . 10000) (commit)) )
   (commit 'upd)

Cheers,
- Alex

--
UNSUBSCRIBE: mailto:email@example.com?subject=Unsubscribe
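[Editor's note] To make the pattern above concrete, here is a hypothetical sketch of such a bulk import. The +Article class with properties 'nm and 'txt, the pool path "db/", and the input file "articles.tsv" are all assumptions for illustration, not from the original thread; the loop itself follows Alex's second (no-prune) variant.

```picolisp
# Hypothetical bulk-import sketch. Assumes a database pool "db/" and a
# tab-separated file "articles.tsv" with two columns (name, text).
(class +Article +Entity)         # assumed entity class for this example
(rel nm (+String))               # article name
(rel txt (+String))              # article text

(pool "db/")                     # open (or create) the database
(dbSync)                         # take the DB lock once for the whole import
(in "articles.tsv"
   (while (line)                 # (line) returns the line as a char list
      (let L (mapcar pack (split @ "^I"))   # split at TAB, pack each field
         (new (db: +Article) '(+Article)
            'nm (car L)
            'txt (cadr L) ) )
      (at (0 . 10000) (commit)) ) )   # flush every 10000 new objects
(commit 'upd)                    # final commit, notify sister processes
```

Because there is no (dbSync) inside the loop, this variant is only safe when no other process needs to access the database until the import finishes, as discussed above.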