After watching Molly Brown's Channel9 videos on the cache manager, I'm convinced the behavior for SQLite should be to not give the filesystem any hints about caching and let the cache manager itself figure it out. The exception being Windows CE, where we can confirm that when this flag is not set, the device will use compression in memory and degrade performance.
If that's the general consensus, I'll open a ticket. -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mark Spiegel Sent: Thursday, September 18, 2008 7:56 AM To: [EMAIL PROTECTED]; General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the cache manager (CC) in Windows and the underlying file system(s). With respect to the cache manager, it is going to affect whether or not there is read ahead, how much read ahead will be used, and how long data will remain in the cache (or another way, how quickly it will be dropped). It has been some time since I've talked to the Queen of Cache Manger about this, but as I recall CC will try to figure out what you are doing if you don't give it a hint. If you do give it a hint, then it is going to run with that hint no matter what the cost. Note that CC or the file system are perfectly within their right to ignore your hints. CC generally does honor them. NTFS, well that's another matter. It has been MY experience (YMMV) that database and temp file reads are fairly random. Database files also have the "nice" property that read and writes are often sector (page) aligned. Journal files should be opened for sequential scan and are generally not sector (page) aligned. Setting SQLite aside for a moment, for very large files that are only going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show huge performance gains. However, if most or all of a file is going to be touched, even in random order, then it doesn't get you much and can hurt you. Most SQLite data bases _probably_ fall into that second case. If you have enough memory and a small enough file such that the cache manager can hold the entire file, you are golden. That's why some people see such great SQLite performance by just sequentially reading their DB files before running their SQLite application. The elephants in the room with that previous paragraph is 1) the amount of RAM in the system and 2) the other applications running. Windows will try to share its resources among all the applications running as best it can. I have not seen any "bugs" in SQLite in this area. It gives a reasonable hint for the general case. To be fair however, I should note that I have my own VFS. It does unbuffered I/O for database files and sequential, cached I/O for journal files. If you think you can get better performance with different flags, create your own VFS, starting with the Windows VFS and make the changes. You can get as sophisticated with your hints as you want. You can write your own caching system if you've ingested way too much caffeine. (Did I mention that the VFS stuff is great!) I would not as a general rule advise people (customers) to change the way their Windows system caches globally for the benefit of one of your applications. Eventually, that is going to bite you with some support calls. Jay A. Kreibich wrote: > On Wed, Sep 17, 2008 at 06:00:45PM -0700, Roger Binns scratched on the wall: > > >> The second is that SQLite when opening a file under Windows explicitly >> tells Windows that the file will be used for random access even though >> that is not the case. Windows uses this hint to override its builtin >> heuristics which can cause bug #1. >> > > >> Bug #2 is that SQLite is lying to the operating system and could result >> in performance degradation if the operating system actually pays >> attention to the hint. >> > > SQLite is not "lying." After poking around a bit to refresh my > understanding of SQLite's file structure, I think it is safe to say > that SQLite will almost never do a sequential file read, even if > you're doing a sequential table scan. > > sequential table scan != sequential file access > > There are some specific situations when you might get bursts of sequential > reads, but only for very specific page layouts with very specific > types of queries. In short, not the common case. Furthermore, even > those patterns can get broken up and shuffled around depending on the > state of SQLite's page cache-- especially if it is bumped up a few > dozen megs. So simply running different types of queries can change > the access patterns (this is true of the OS's file system cache as > well, of course). > > It might be worth instrumenting a few systems and having a look, but > in general, if you had to label SQLite's access pattern, I think > "random" would be the most appropriate label. > > > > I also contend that if the Windows file cache becomes some kind of > bumbling idiot if you actually try to define an access pattern, then > something is wrong. There is a very good reason why the POSIX > functions for doing this kind of thing are called "*advise()". You > might seed the heuristic statistics in a specific direction, but they > should never be totally over-ridden. That quickly leads to stupid > behaviors, like grabbing all the RAM on the system and not letting go. > > > > Of course, we could argue philosophy for a long time. In the here > and now to work around MS's inconsistencies, it looks like the best > bet is turn it on with CE and off on Vista, because it appears to > have two totally different meanings. > > -j, > > _______________________________________________ sqlite-users mailing list email@example.com http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users _______________________________________________ sqlite-users mailing list firstname.lastname@example.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users