As Trevor cautioned, you run the risk of seriously impacting the other
applications running on the system, and the system itself. As you fill
up the Windows cache manager with your file, you will push out other
relevant data, including data that the OS itself may be using. Applications
will run slower until that data is reloaded, and new applications will
start slower. The cache manager drops cached data as it loads your file.
Assuming the cache manager is not already using memory at its allowed
limit, it will ask the memory manager for more pages, which causes the
memory manager to (on an LRU basis) drop pages that are backed by a file
mapping, and to write pages out to the paging file where they are not
backed by a mapping. These writes, if they happen, are relatively
expensive and will now be mixed in with your reads. If the OS is not
under stress and has sufficient resources, you probably won't see any
of this.
Given that you can't write your own VFS, there is not much to suggest,
but one question to ask: is the sum of the time for the pre-read you
perform and the subsequent database operation(s) smaller than doing the
database operation(s) without the pre-read? I see that in the 3.5.x
source Dr. Hipp gives the file system the proper random-access hint
when opening databases. This is just a hint to the cache manager, which
is not obligated to honor it, but it will effectively shut down most
read-ahead and large block reads, which are exactly what you are getting
when you sequentially pre-read.
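The timing question above can be sketched as follows (a minimal Python sketch; the file, its contents, and the trivial query are made-up stand-ins for your real database and operations, and a fair measurement needs a genuinely cold OS cache before each variant, e.g. after a reboot):

```python
import os
import sqlite3
import tempfile
import time

# Hypothetical stand-in: build a small database so the sketch is
# self-contained; in practice db_path is your real database file.
db_path = os.path.join(tempfile.mkdtemp(), "example.db")
con = sqlite3.connect(db_path)
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload BLOB)")
con.executemany("INSERT INTO t (payload) VALUES (?)",
                [(os.urandom(1024),) for _ in range(1000)])
con.commit()
con.close()

def timed(fn):
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

def pre_read():
    # Sequential pre-read: pull the whole file through the OS cache.
    with open(db_path, "rb") as f:
        while f.read(1 << 20):
            pass

def db_ops():
    # Stand-in for your real database operation(s).
    c = sqlite3.connect(db_path)
    c.execute("SELECT count(*), sum(length(payload)) FROM t").fetchone()
    c.close()

# Run back to back, the second variant benefits from the first one's
# caching, so compare these numbers only across cold-cache runs.
t_plain = timed(db_ops)
t_with_preread = timed(pre_read) + timed(db_ops)
print(f"ops only: {t_plain:.4f}s  pre-read + ops: {t_with_preread:.4f}s")
```

If the pre-read plus the operations is not measurably smaller than the operations alone, the pre-read is only moving cost around, not removing it.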
One more thing: did raising the limit on the number of pages SQLite can
cache internally have any effect?
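For reference, that internal limit is controlled with PRAGMA cache_size; a minimal sketch (the value 20000 is an arbitrary example, not a recommendation):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# SQLite's internal page cache limit, in pages (default 2000 in the
# 3.5.x era; newer versions also accept negative values meaning KiB).
default_cache = con.execute("PRAGMA cache_size").fetchone()[0]

# Raise it so more hot pages stay in SQLite's own cache instead of
# being re-read through the OS on every access.
con.execute("PRAGMA cache_size = 20000")
new_cache = con.execute("PRAGMA cache_size").fetchone()[0]

print(default_cache, new_cache)
con.close()
```

Note the pragma applies per connection, so it has to be issued each time the database is opened.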
Trevor Talbot wrote:
On 11/8/07, Julien Renggli <[EMAIL PROTECTED]> wrote:
As I said, the "nasty" trick works and we'll live with it (writing our
own FileSystem is not an option). We would just like to understand
better what's happening, to know why and when the "first run" is so much
slower. And should we read the whole file beforehand, or only parts of
it?
Does it depend on how much RAM is installed, ...? If you have any hints
they are welcome. I guess I should ask NTFS experts as well.
What you've discovered is pretty accurate: when the delay is due to
disk I/O, pre-reading the file will load it into the OS's disk cache,
so subsequent accesses are faster. It will depend on available RAM,
not only what is physically installed but also memory pressure from
other running applications, other disk I/O in progress, various OS
settings, etc. There isn't any way to accurately predict it.
Reading the entire file when the OS is unwilling to cache all of it
will simply result in only part of the file being cached. It may also
cause other data to be pushed into the pagefile, slowing down
applications when they later access their own stale data. In the
worst case, it could slow everything down for a short time.
Reading the entire file from start to finish is pretty much the only
effective way to pull it into cache. An ifstream is not the most
efficient way to do that, since it does its own buffering, but that's
not important as far as the disk caching effects are concerned.
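A minimal sketch of such a sequential pre-read in Python (the file path and size are made-up stand-ins; large unbuffered reads avoid the extra user-space copy that ifstream-style buffering adds, though as noted above that overhead doesn't change the caching effect):

```python
import os
import tempfile

# Hypothetical stand-in for the database file.
path = os.path.join(tempfile.mkdtemp(), "example.db")
with open(path, "wb") as f:
    f.write(os.urandom(4 * 1024 * 1024))

CHUNK = 1 << 20  # 1 MiB reads keep the syscall count low and the I/O sequential
total = 0
with open(path, "rb", buffering=0) as f:  # unbuffered: no extra user-space copy
    while True:
        block = f.read(CHUNK)
        if not block:
            break
        total += len(block)

print(total, os.path.getsize(path))
```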
VACUUM removes internal fragmentation and writes rows in order, which
helps make disk I/O more sequential. You may be able to one-up it by
inserting data in the order you intend to access it (but create any
indexes afterward). Beyond that, I'm not aware of anything that would
help.
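The load-in-access-order, index-afterward, then VACUUM sequence can be sketched like this (table and index names are hypothetical):

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "ordered.db")
con = sqlite3.connect(db_path, isolation_level=None)  # autocommit

# Hypothetical table: insert rows in the order you intend to access
# them, so the pages end up laid out roughly in access order.
con.execute("CREATE TABLE items (key TEXT, value INTEGER)")
rows = [("k%05d" % i, i) for i in range(10000)]
con.executemany("INSERT INTO items VALUES (?, ?)", rows)

# Create indexes only after the bulk load: the build is faster and the
# table pages stay contiguous.
con.execute("CREATE INDEX idx_items_key ON items(key)")

# VACUUM rewrites the file, removing internal fragmentation so later
# scans are more sequential.
con.execute("VACUUM")

count = con.execute("SELECT count(*) FROM items").fetchone()[0]
print(count)
con.close()
```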
-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------