Hi,
Trying to keep this a little alive :)
I ran an SA-independent benchmark: I simply imported a .dump (which
outputs BEGIN TRANSACTION, CREATE TABLE, INSERT ..., COMMIT) of a single
table with 5 columns and ca. 17,000 rows:
with BEGIN TRANSACTION/COMMIT:
PRAGMA synchronous=OFF;
real 0m1.455s
user 0m1.400s
sys 0m0.052s
PRAGMA synchronous=NORMAL;
real 0m1.523s
user 0m1.372s
sys 0m0.072s
PRAGMA synchronous=FULL;
real 0m1.537s
user 0m1.400s
sys 0m0.052s
without BEGIN TRANSACTION/COMMIT:
PRAGMA synchronous=OFF;
real 0m10.113s
user 0m2.692s
sys 0m7.220s
PRAGMA synchronous=NORMAL;
real 10m38.229s
user 0m4.788s
sys 0m13.353s
PRAGMA synchronous=FULL;
real 14m3.243s
user 0m4.920s
sys 0m14.193s
So, if you run multiple INSERTs (and probably UPDATEs), you should wrap
them in a single transaction (which you should do for integrity anyway).
That matches exactly what I saw earlier when I converted my bayes DB
to SQLite.
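To illustrate the effect, here is a minimal Python sketch using the
stdlib sqlite3 module (the table and row count are made up, not the
exact benchmark setup): opening the connection in autocommit mode means
every INSERT is its own transaction unless we issue BEGIN ourselves.

```python
import sqlite3
import time

def populate(db_path, rows, single_transaction):
    # isolation_level=None puts the connection in autocommit mode, so
    # without an explicit BEGIN each INSERT is its own transaction
    # (and, with synchronous=FULL, its own fsync).
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute("PRAGMA synchronous=FULL")
    conn.execute("CREATE TABLE t (a, b, c, d, e)")  # 5 columns, as in the benchmark
    start = time.time()
    if single_transaction:
        conn.execute("BEGIN")
    for i in range(rows):
        conn.execute("INSERT INTO t VALUES (?, ?, ?, ?, ?)", (i, i, i, i, i))
    if single_transaction:
        conn.execute("COMMIT")
    elapsed = time.time() - start
    conn.close()
    return elapsed
```

Calling populate(path, 17000, True) vs. populate(path, 17000, False)
should reproduce the same order-of-magnitude gap as the timings above.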
1: In phases 2 and 5 there's an enormous number of calls to _db_connect.
IIRC, "connecting" to a SQLite database can be potentially time-consuming,
so using more persistent database connections *might* give a SQLite
bayes-store better performance. Actually, a more persistent connection
makes sense for the other SQL modules as well.
Sure, but for SQLite the effect is probably not that big. And I think
this is just what http://wiki.apache.org/spamassassin/DBIPlugin does.
If I get the time, I'll try to write my own SQLite module (and probably
ask the SQLite people first; their mailing list is usually helpful). I
doubt it will outperform SDBM, but you never know.