Michael Parker wrote:
On a stable system with working backup routines running SQLite with 'PRAGMA SYNCHRONOUS=OFF' for bayes makes a lot of sense.
It has been awhile, but I believe you just need to do this at create time, so you'd only need a proper .sql file that did it.
I think that might have been true for the pragma "default_sunchronous" (or something like that) in SQLite 2.*. In SQLite 3.' there is no persistant setting for this so the command must be given for each "connection" to the database.
That said, that doesn't mean that I wouldn't welcome a contribution from someone who went off and did the work, so feel free to create the module and do the testing.
I created a small module for SQLlite that simply inherits almost everything from Mail::SpamAssassin::SQL. It seems to work and I've done some benchmarks with it. I did notice some two things though: 1: In phase 2 and 5 there's an enormous amount of calls to _db_connect. IIRC "connecting" to a SQLite databse can be potentially time consuming, so using more persistant database connections *might* give a SQLite bayes-store better performance. Actually, a more persistant connection makes sense for other SQL modules as well. I might try the benchmarjs again with an override so that the untie_db method doesn't really disconnect from the databse in the SQLite module. 2: In phase 5 I see a number of "warn: closing dbh with active statement handles" in "output.txt". While such warning *can* indicate a potential memory leak, I have no idea wether it is any problem or not in this case.
Submit a bug with the code and results attached and
I'll do some more testing, before doing that. Here's the benchmarks so far: Total times: SDBM: 44:07 DB_File: 49:43 SQLite: 2:26_03 Detailed Times: Phase SDBM DB_File SQLite 1.a 305,95 375,05 1719,04 1.b 234,67 308,42 759,78 2.- 934,95 939,6 923,21 ? 1,19 1,18 1,23 3.- 11,41 24,38 28,84 4.a 213,47 235,41 2982,64 4.b 110,09 122,56 737,73 5.a 484,73 578,32 1139,67 5.b 349,21 396,2 470,01 ? 1,44 1,58 1,24 Total 2647,13 2982,7 8763,4 Obviously SQLite is nowehere near SDBM or DB_File when just using the standard SQL module this way. Notable though is that SQLite actually performed best in phase 2 though, and that's one of the phases where I saw a big number of calls to _connect_db.
discovered it just was not worth it, you were better off sticking with Berkeley DBD or the MUCH faster SDBM.
I believe you're right. I don't have any normal SQL server on the machine I'm testing this on. Otherwise it would make sense comparing the bechmarks for SQLite with MySQL and PostGreSQL. I was a bit surprised that the differenmce between SDBM and DB_File wasn't bigger though. I had the impression that the difference would be bigger. It is possible that some parts of the becnhmark isn't working right for me, I guess. Or Berkley DB has become faster than it was. Or it just fits well together with FreeBSDs file system.
Improvements to the benchmark are also more than welcome.
I added helper and tests files for the SQLite module, and will send them later. Regards /Jonas -- Jonas Eckerman, FSDB & Fruktträdet http://whatever.frukt.org/ http://www.fsdb.org/ http://www.frukt.org/
