After watching Molly Brown's Channel9 videos on the cache manager, I'm
convinced the behavior for SQLite should be to not give the filesystem any
hints about caching and let the cache manager itself figure it out.  The
exception being Windows CE, where we can confirm that when this flag is not
set, the device will use compression in memory and degrade performance.

If that's the general consensus, I'll open a ticket.

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mark Spiegel
Sent: Thursday, September 18, 2008 7:56 AM
To: [EMAIL PROTECTED]; General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the 
cache manager (CC) in Windows and the underlying file system(s).  With 
respect to the cache manager, it is going to affect whether or not there 
is read ahead, how much read ahead will be used, and how long data will 
remain in the cache (or another way, how quickly it will be dropped).  
It has been some time since I've talked to the Queen of Cache Manger 
about this, but as I recall CC will try to figure out what you are doing 
if you don't give it a hint.  If you do give it a hint, then it is going 
to run with that hint no matter what the cost.  Note that CC or the file 
system are perfectly within their right to ignore your hints.  CC 
generally does honor them.  NTFS, well that's another matter.

It has been MY experience (YMMV) that database and temp file reads are 
fairly random.  Database files also have the "nice" property that read 
and writes are often sector (page) aligned.  Journal files should be 
opened for sequential scan and are generally not sector (page) aligned.  
Setting SQLite aside for a moment, for very large files that are only 
going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show 
huge performance gains.  However, if most or all of a file is going to 
be touched, even in random order, then it doesn't get you much and can 
hurt you.  Most SQLite data bases _probably_ fall into that second 
case.  If you have enough memory and a small enough file such that the 
cache manager can hold the entire file, you are golden.  That's why some 
people see such great SQLite performance by just sequentially reading 
their DB files before running their SQLite application.

The elephants in the room with that previous paragraph is 1) the amount 
of RAM in the system and 2) the other applications running.  Windows 
will try to share its resources among all the applications running as 
best it can.

I have not seen any "bugs" in SQLite in this area.  It gives a 
reasonable hint for the general case.  To be fair however, I should note 
that I have my own VFS.  It does unbuffered I/O for database files and 
sequential, cached I/O for journal files.  If you think you can get 
better performance with different flags, create your own VFS, starting 
with the Windows VFS and make the changes.  You can get as sophisticated 
with your hints as you want.  You can write your own caching system if 
you've ingested way too much caffeine.  (Did I mention that the VFS 
stuff is great!)

I would not as a general rule advise people (customers) to change the 
way their Windows system caches globally for the benefit of one of your 
applications.  Eventually, that is going to bite you with some support 
calls.

Jay A. Kreibich wrote:
> On Wed, Sep 17, 2008 at 06:00:45PM -0700, Roger Binns scratched on the
wall:
>
>   
>> The second is that SQLite when opening a file under Windows explicitly
>> tells Windows that the file will be used for random access even though
>> that is not the case.  Windows uses this hint to override its builtin
>> heuristics which can cause bug #1.
>>     
>
>   
>> Bug #2 is that SQLite is lying to the operating system and could result
>> in performance degradation if the operating system actually pays
>> attention to the hint.
>>     
>
>   SQLite is not "lying."  After poking around a bit to refresh my
>   understanding of SQLite's file structure, I think it is safe to say
>   that SQLite will almost never do a sequential file read, even if
>   you're doing a sequential table scan.
>
>     sequential table scan != sequential file access
>   
>   There are some specific situations when you might get bursts of
sequential
>   reads, but only for very specific page layouts with very specific
>   types of queries.  In short, not the common case.  Furthermore, even
>   those patterns can get broken up and shuffled around depending on the
>   state of SQLite's page cache-- especially if it is bumped up a few
>   dozen megs.  So simply running different types of queries can change
>   the access patterns (this is true of the OS's file system cache as
>   well, of course).
>
>   It might be worth instrumenting a few systems and having a look, but
>   in general, if you had to label SQLite's access pattern, I think
>   "random" would be the most appropriate label.
>
>
>
>   I also contend that if the Windows file cache becomes some kind of
>   bumbling idiot if you actually try to define an access pattern, then
>   something is wrong.  There is a very good reason why the POSIX
>   functions for doing this kind of thing are called "*advise()".  You
>   might seed the heuristic statistics in a specific direction, but they
>   should never be totally over-ridden.  That quickly leads to stupid
>   behaviors, like grabbing all the RAM on the system and not letting go.
>
>
>
>   Of course, we could argue philosophy for a long time.  In the here
>   and now to work around MS's inconsistencies, it looks like the best
>   bet is turn it on with CE and off on Vista, because it appears to
>   have two totally different meanings.
>
>    -j, 
>
>   
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to