>All of my machines are 64-bit, and run 64-bit Linux kernels. Python3
>is a 64-bit executable according to 'file'.
>I re-ran the whole thing (slightly modified to also do a test on
>pmain) on my big dual-Xeon (32 hyperthreads, 256GB! RAM) from
>System76. In spite of having about half the CPU clock speed,
>and being single-thread, it ran about 5 times faster.
>I take this to mean that it really helps if the entire database
>fits in the kernel memory buffers. Kinda like putting it
>on an SSD, only better.
There are multiple levels of caching: the block I/O cache, the device cache,
the OS filesystem cache, and the SQLite3 page cache. Even if *all* of the
database fits in the OS filesystem cache, you will still see quite a bit of
slowdown if the SQLite3 page cache is not big enough to hold the "working set"
of pages needed to traverse the B-Tree structures (indexes and tables). By
default the SQLite3 page cache is a paltry 20000 pages, which, at a 4K page
size, is probably big enough for a small database of a gig or two to have
decent performance. This cache can "thrash" as well.
Thrashing against the OS filesystem cache is significantly faster than
thrashing down to actual physical I/O, but is still significantly slower than
not thrashing at all, or thrashing within the I/O rate of the underlying
physical device.
If you read the hit rate and it is small (I define small as less than 99% --
you may be willing to tolerate more thrashing) then your page cache (PRAGMA
cache_size) is too small:

def cache_hit_rate(db):
    # APSW's Connection.status returns (current, highwater);
    # the count we want is "current".
    hits = db.status(apsw.SQLITE_DBSTATUS_CACHE_HIT)[0]
    misses = db.status(apsw.SQLITE_DBSTATUS_CACHE_MISS)[0]
    return hits * 100.0 / max(hits + misses, 1)
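One rough way to size the cache is to compare it against the database itself:
PRAGMA page_count times PRAGMA page_size gives the total size in bytes. A
sketch with the standard-library sqlite3 module (the pragma names are the same
under APSW; the table and row count here are made up for illustration):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t(x)")
db.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10000)])

(page_size,) = db.execute("PRAGMA page_size").fetchone()
(page_count,) = db.execute("PRAGMA page_count").fetchone()
db_bytes = page_size * page_count

# If db_bytes fits comfortably in the page cache, the B-Tree
# traversals described above never have to thrash at all.
```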
There is also, of course, a point at which it is more efficient to do an I/O
(I/O time) than to search the page cache (CPU time). I don't know where that
point is, but it certainly exists. Ideally you want to find the size at which
SQLite3's page cache keeps the I/O rate to the next lower level as low as
possible, and no lower. That number is hard to guess, and it costs big bucks
to performance-tune computers to balance workloads so that every component of
the system stays just ever so slightly under 100% utilization 100% of the time
(that is, operating at peak speed).
>I also take it to mean two other things:
>1. I should have been more careful to make sure the system was
>in the same state on every trial. It makes a difference if
>part of the DB is already in the buffers.
>2. Most of the slowdown was buffer thrashing, caused by an
>access pattern that had to read some parts of the database
>I no longer think this is any kind of a problem in SQLite.
The fact that there's a Highway to Hell but only a Stairway to Heaven says a
lot about anticipated traffic volume.
sqlite-users mailing list