Re: [sqlite] Vista frustrations

2008-09-18 Thread Mark Spiegel
The SQLite part was an analogy.  That must have been beyond you.  You 
can have the last word.  You're beyond my help.

Fred Williams wrote:
> I never said a word aboout SQLite.  You ass U Me too much I suspect.
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Mark Spiegel
> Sent: Thursday, September 18, 2008 11:25 AM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Vista frustrations
>
>
> I'm sorry, I have to take issue with that statement.  The design of the
> file system/cache manager is not "pitiful".  It strives to provide good
> performance in the entire application space, not just your little corner
> of it.  It is doing the best it can with the "hint" you've given it.  If
> another (or no) hint provides better performance in your application,
> who's fault is that?  Do you realize that without the cache manager,
> fast I/O would not be possible?  Run on a debug system where only IRP
> based I/O is possible any you will be singing another tune in a hurry.
> Why do you think these hints are even available?  It is to help you
> optimize your application.
>
> The SQLite memory subsystem doesn't work well on my platform  I don't
> run around calling SQLite "pitiful".  I recognize that the authors'
> implementation(s) is probably a good performance compromise in the
> generic case.  If it is a big enough problem (which it is for me), I
> write my own version to optimize my performance.  While better, the
> integer encoding is not as good as it could be for me.  Does that mean
> the SQLite is pitiful?
>
> I should also note that as of the last time I talked to her, Molly is no
> longer handling the cache manager.  I believe she has moved back into
> the kernel group after a brief departure, but is working on something
> else.  I haven't seen the talks that Robert refers to, but suspect they
> are close to the versions I have seen in person.  I would bet they are
> still very useful and relevant.
>
> Fred Williams wrote:
>   
>> Is a sad day when an application program is forced to compensate for
>> 
> pitiful
>   
>> OS design and performance :-(
>>
>> -Original Message-
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] Behalf Of Robert Simpson
>> Sent: Thursday, September 18, 2008 10:31 AM
>> To: 'General Discussion of SQLite Database'
>> Subject: Re: [sqlite] Vista frustrations
>>
>>
>> After watching Molly Brown's Channel9 videos on the cache manager, I'm
>> convinced the behavior for SQLite should be to not give the filesystem any
>> hints about caching and let the cache manager itself figure it out.  The
>> exception being Windows CE, where we can confirm that when this flag is
>> 
> not
>   
>> set, the device will use compression in memory and degrade performance.
>>
>> If that's the general consensus, I'll open a ticket.
>>
>> -Original Message-
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] On Behalf Of Mark Spiegel
>> Sent: Thursday, September 18, 2008 7:56 AM
>> To: [EMAIL PROTECTED]; General Discussion of SQLite Database
>> Subject: Re: [sqlite] Vista frustrations
>>
>> FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the
>> cache manager (CC) in Windows and the underlying file system(s).  With
>> respect to the cache manager, it is going to affect whether or not there
>> is read ahead, how much read ahead will be used, and how long data will
>> remain in the cache (or another way, how quickly it will be dropped).
>> It has been some time since I've talked to the Queen of Cache Manger
>> about this, but as I recall CC will try to figure out what you are doing
>> if you don't give it a hint.  If you do give it a hint, then it is going
>> to run with that hint no matter what the cost.  Note that CC or the file
>> system are perfectly within their right to ignore your hints.  CC
>> generally does honor them.  NTFS, well that's another matter.
>>
>> It has been MY experience (YMMV) that database and temp file reads are
>> fairly random.  Database files also have the "nice" property that read
>> and writes are often sector (page) aligned.  Journal files should be
>> opened for sequential scan and are generally not sector (page) aligned.
>> Setting SQLite aside for a moment, for very large files that are only
>> going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show
>> huge performance gains.  However, if most or all of a file is going to
>> be touched, even in random order, then it doesn't get you much and can
>&

Re: [sqlite] Vista frustrations

2008-09-18 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Jay A. Kreibich wrote:
>   SQLite is not "lying."  After poking around a bit to refresh my
>   understanding of SQLite's file structure, I think it is safe to say
>   that SQLite will almost never do a sequential file read, even if
>   you're doing a sequential table scan.
> 
> sequential table scan != sequential file access

That statement is true in general but in this specific case we are
talking about a multi-gigabyte database.  It would be fair to say that
large chunks of the database would have been created in a transaction
and that there wouldn't have been a lot of row updates/deletes/inserts
throughout it causing something similar to fragmentation.

I made the vfs track sequential vs random read accesses and then created
a 3 column table with each row being a random int, a string between 0
and 1024 chars long and another random int.  A 2,000,000 row database
occupied 1.4GB.  Page/cache size etc were all left at their defaults.
Testing was done on 64 bit Linux.

   create table foo(x,y,z)

Full scan query:  select max(z) from foo

  seq 1,298,231 random 93,599  - 93% sequential

Create index:  create index foo1 on foo(z)

  seq 244,920 random 3,261,252  - 7% sequential

Indexed query: select max(z) from foo where z>9900

  seq 0 random 7- 0% sequential


In this particular case SQLite is very much lying when a full table scan
was done as the vast majority of access is indeed sequential.  Of course
several factors will affect this such as typical size of each row,
schema, how much prior "thrashing" about of contents the database has
had etc.  I also didn't count nearly sequential access, for example if
the operating system does reads in 4kb chunks and one read was for the
first kb and the next read was for the third kb.

An example of how the numbers can be skewed for the full scan:

%Sequential   StringSize PageSize

  8 1-2KB  1KB
 99 0-1KB  4KB
 99.9   0-1KB 32KB
 97 0-.5KB 1KB

It is fair to say that SQLite does not know in advance what the access
pattern is going to be, especially without analysing the database
contents, rows per page and queries that will be executed.  Queries that
predominantly use indices (a desirable trait) are mostly random access.
 Full table scans where the row fits within the page are mostly sequential.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0xBomOOfHg372QQRAnXXAJ0VGwzzrOwkWF0q4Za2QpuTKnQD+QCfblin
DeHukDoCs809uawnlt5GYLc=
=l25n
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-18 Thread Fred Williams
I never said a word aboout SQLite.  You ass U Me too much I suspect.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Mark Spiegel
Sent: Thursday, September 18, 2008 11:25 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations


I'm sorry, I have to take issue with that statement.  The design of the
file system/cache manager is not "pitiful".  It strives to provide good
performance in the entire application space, not just your little corner
of it.  It is doing the best it can with the "hint" you've given it.  If
another (or no) hint provides better performance in your application,
who's fault is that?  Do you realize that without the cache manager,
fast I/O would not be possible?  Run on a debug system where only IRP
based I/O is possible any you will be singing another tune in a hurry.
Why do you think these hints are even available?  It is to help you
optimize your application.

The SQLite memory subsystem doesn't work well on my platform  I don't
run around calling SQLite "pitiful".  I recognize that the authors'
implementation(s) is probably a good performance compromise in the
generic case.  If it is a big enough problem (which it is for me), I
write my own version to optimize my performance.  While better, the
integer encoding is not as good as it could be for me.  Does that mean
the SQLite is pitiful?

I should also note that as of the last time I talked to her, Molly is no
longer handling the cache manager.  I believe she has moved back into
the kernel group after a brief departure, but is working on something
else.  I haven't seen the talks that Robert refers to, but suspect they
are close to the versions I have seen in person.  I would bet they are
still very useful and relevant.

Fred Williams wrote:
> Is a sad day when an application program is forced to compensate for
pitiful
> OS design and performance :-(
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Robert Simpson
> Sent: Thursday, September 18, 2008 10:31 AM
> To: 'General Discussion of SQLite Database'
> Subject: Re: [sqlite] Vista frustrations
>
>
> After watching Molly Brown's Channel9 videos on the cache manager, I'm
> convinced the behavior for SQLite should be to not give the filesystem any
> hints about caching and let the cache manager itself figure it out.  The
> exception being Windows CE, where we can confirm that when this flag is
not
> set, the device will use compression in memory and degrade performance.
>
> If that's the general consensus, I'll open a ticket.
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Mark Spiegel
> Sent: Thursday, September 18, 2008 7:56 AM
> To: [EMAIL PROTECTED]; General Discussion of SQLite Database
> Subject: Re: [sqlite] Vista frustrations
>
> FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the
> cache manager (CC) in Windows and the underlying file system(s).  With
> respect to the cache manager, it is going to affect whether or not there
> is read ahead, how much read ahead will be used, and how long data will
> remain in the cache (or another way, how quickly it will be dropped).
> It has been some time since I've talked to the Queen of Cache Manger
> about this, but as I recall CC will try to figure out what you are doing
> if you don't give it a hint.  If you do give it a hint, then it is going
> to run with that hint no matter what the cost.  Note that CC or the file
> system are perfectly within their right to ignore your hints.  CC
> generally does honor them.  NTFS, well that's another matter.
>
> It has been MY experience (YMMV) that database and temp file reads are
> fairly random.  Database files also have the "nice" property that read
> and writes are often sector (page) aligned.  Journal files should be
> opened for sequential scan and are generally not sector (page) aligned.
> Setting SQLite aside for a moment, for very large files that are only
> going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show
> huge performance gains.  However, if most or all of a file is going to
> be touched, even in random order, then it doesn't get you much and can
> hurt you.  Most SQLite data bases _probably_ fall into that second
> case.  If you have enough memory and a small enough file such that the
> cache manager can hold the entire file, you are golden.  That's why some
> people see such great SQLite performance by just sequentially reading
> their DB files before running their SQLite application.
>
> The elephants in the room with that previous paragraph is 1) the amount
> of RAM in the system and 2) the other applications running.  Windows
> will try to share its resources among all the 

Re: [sqlite] Vista frustrations

2008-09-18 Thread Mark Spiegel
I'm sorry, I have to take issue with that statement.  The design of the 
file system/cache manager is not "pitiful".  It strives to provide good 
performance in the entire application space, not just your little corner 
of it.  It is doing the best it can with the "hint" you've given it.  If 
another (or no) hint provides better performance in your application, 
who's fault is that?  Do you realize that without the cache manager, 
fast I/O would not be possible?  Run on a debug system where only IRP 
based I/O is possible any you will be singing another tune in a hurry.  
Why do you think these hints are even available?  It is to help you 
optimize your application.

The SQLite memory subsystem doesn't work well on my platform  I don't 
run around calling SQLite "pitiful".  I recognize that the authors' 
implementation(s) is probably a good performance compromise in the 
generic case.  If it is a big enough problem (which it is for me), I 
write my own version to optimize my performance.  While better, the 
integer encoding is not as good as it could be for me.  Does that mean 
the SQLite is pitiful?

I should also note that as of the last time I talked to her, Molly is no 
longer handling the cache manager.  I believe she has moved back into 
the kernel group after a brief departure, but is working on something 
else.  I haven't seen the talks that Robert refers to, but suspect they 
are close to the versions I have seen in person.  I would bet they are 
still very useful and relevant.

Fred Williams wrote:
> Is a sad day when an application program is forced to compensate for pitiful
> OS design and performance :-(
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Robert Simpson
> Sent: Thursday, September 18, 2008 10:31 AM
> To: 'General Discussion of SQLite Database'
> Subject: Re: [sqlite] Vista frustrations
>
>
> After watching Molly Brown's Channel9 videos on the cache manager, I'm
> convinced the behavior for SQLite should be to not give the filesystem any
> hints about caching and let the cache manager itself figure it out.  The
> exception being Windows CE, where we can confirm that when this flag is not
> set, the device will use compression in memory and degrade performance.
>
> If that's the general consensus, I'll open a ticket.
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Mark Spiegel
> Sent: Thursday, September 18, 2008 7:56 AM
> To: [EMAIL PROTECTED]; General Discussion of SQLite Database
> Subject: Re: [sqlite] Vista frustrations
>
> FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the
> cache manager (CC) in Windows and the underlying file system(s).  With
> respect to the cache manager, it is going to affect whether or not there
> is read ahead, how much read ahead will be used, and how long data will
> remain in the cache (or another way, how quickly it will be dropped).
> It has been some time since I've talked to the Queen of Cache Manger
> about this, but as I recall CC will try to figure out what you are doing
> if you don't give it a hint.  If you do give it a hint, then it is going
> to run with that hint no matter what the cost.  Note that CC or the file
> system are perfectly within their right to ignore your hints.  CC
> generally does honor them.  NTFS, well that's another matter.
>
> It has been MY experience (YMMV) that database and temp file reads are
> fairly random.  Database files also have the "nice" property that read
> and writes are often sector (page) aligned.  Journal files should be
> opened for sequential scan and are generally not sector (page) aligned.
> Setting SQLite aside for a moment, for very large files that are only
> going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show
> huge performance gains.  However, if most or all of a file is going to
> be touched, even in random order, then it doesn't get you much and can
> hurt you.  Most SQLite data bases _probably_ fall into that second
> case.  If you have enough memory and a small enough file such that the
> cache manager can hold the entire file, you are golden.  That's why some
> people see such great SQLite performance by just sequentially reading
> their DB files before running their SQLite application.
>
> The elephants in the room with that previous paragraph is 1) the amount
> of RAM in the system and 2) the other applications running.  Windows
> will try to share its resources among all the applications running as
> best it can.
>
> I have not seen any "bugs" in SQLite in this area.  It gives a
> reasonable hint for the general case.  To be fair however, I should note
> that I have my own VFS.  It does unbuffered I/O for database files and

Re: [sqlite] Vista frustrations

2008-09-18 Thread Robert Simpson
Designing operating systems is hard work, and I don't envy the teams that do
it.  I've found my Vista experience to be rather enjoyable, especially post
SP1.  I don't think there's an OS out today that isn't annoying in one
fashion or another.  Having an application or library implement a workaround
for an OS design flaw is not unique to Windows by any stretch.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Fred Williams
Sent: Thursday, September 18, 2008 8:54 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

Is a sad day when an application program is forced to compensate for pitiful
OS design and performance :-(

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Robert Simpson
Sent: Thursday, September 18, 2008 10:31 AM
To: 'General Discussion of SQLite Database'
Subject: Re: [sqlite] Vista frustrations


After watching Molly Brown's Channel9 videos on the cache manager, I'm
convinced the behavior for SQLite should be to not give the filesystem any
hints about caching and let the cache manager itself figure it out.  The
exception being Windows CE, where we can confirm that when this flag is not
set, the device will use compression in memory and degrade performance.

If that's the general consensus, I'll open a ticket.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mark Spiegel
Sent: Thursday, September 18, 2008 7:56 AM
To: [EMAIL PROTECTED]; General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the
cache manager (CC) in Windows and the underlying file system(s).  With
respect to the cache manager, it is going to affect whether or not there
is read ahead, how much read ahead will be used, and how long data will
remain in the cache (or another way, how quickly it will be dropped).
It has been some time since I've talked to the Queen of Cache Manger
about this, but as I recall CC will try to figure out what you are doing
if you don't give it a hint.  If you do give it a hint, then it is going
to run with that hint no matter what the cost.  Note that CC or the file
system are perfectly within their right to ignore your hints.  CC
generally does honor them.  NTFS, well that's another matter.

It has been MY experience (YMMV) that database and temp file reads are
fairly random.  Database files also have the "nice" property that read
and writes are often sector (page) aligned.  Journal files should be
opened for sequential scan and are generally not sector (page) aligned.
Setting SQLite aside for a moment, for very large files that are only
going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show
huge performance gains.  However, if most or all of a file is going to
be touched, even in random order, then it doesn't get you much and can
hurt you.  Most SQLite data bases _probably_ fall into that second
case.  If you have enough memory and a small enough file such that the
cache manager can hold the entire file, you are golden.  That's why some
people see such great SQLite performance by just sequentially reading
their DB files before running their SQLite application.

The elephants in the room with that previous paragraph is 1) the amount
of RAM in the system and 2) the other applications running.  Windows
will try to share its resources among all the applications running as
best it can.

I have not seen any "bugs" in SQLite in this area.  It gives a
reasonable hint for the general case.  To be fair however, I should note
that I have my own VFS.  It does unbuffered I/O for database files and
sequential, cached I/O for journal files.  If you think you can get
better performance with different flags, create your own VFS, starting
with the Windows VFS and make the changes.  You can get as sophisticated
with your hints as you want.  You can write your own caching system if
you've ingested way too much caffeine.  (Did I mention that the VFS
stuff is great!)

I would not as a general rule advise people (customers) to change the
way their Windows system caches globally for the benefit of one of your
applications.  Eventually, that is going to bite you with some support
calls.

Jay A. Kreibich wrote:
> On Wed, Sep 17, 2008 at 06:00:45PM -0700, Roger Binns scratched on the
wall:
>
>
>> The second is that SQLite when opening a file under Windows explicitly
>> tells Windows that the file will be used for random access even though
>> that is not the case.  Windows uses this hint to override its builtin
>> heuristics which can cause bug #1.
>>
>
>
>> Bug #2 is that SQLite is lying to the operating system and could result
>> in performance degradation if the operating system actually pays
>> attention to the hint.
>>
>
>   SQLite is not "lying."  After 

Re: [sqlite] Vista frustrations

2008-09-18 Thread Fred Williams
Is a sad day when an application program is forced to compensate for pitiful
OS design and performance :-(

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Robert Simpson
Sent: Thursday, September 18, 2008 10:31 AM
To: 'General Discussion of SQLite Database'
Subject: Re: [sqlite] Vista frustrations


After watching Molly Brown's Channel9 videos on the cache manager, I'm
convinced the behavior for SQLite should be to not give the filesystem any
hints about caching and let the cache manager itself figure it out.  The
exception being Windows CE, where we can confirm that when this flag is not
set, the device will use compression in memory and degrade performance.

If that's the general consensus, I'll open a ticket.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mark Spiegel
Sent: Thursday, September 18, 2008 7:56 AM
To: [EMAIL PROTECTED]; General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the
cache manager (CC) in Windows and the underlying file system(s).  With
respect to the cache manager, it is going to affect whether or not there
is read ahead, how much read ahead will be used, and how long data will
remain in the cache (or another way, how quickly it will be dropped).
It has been some time since I've talked to the Queen of Cache Manger
about this, but as I recall CC will try to figure out what you are doing
if you don't give it a hint.  If you do give it a hint, then it is going
to run with that hint no matter what the cost.  Note that CC or the file
system are perfectly within their right to ignore your hints.  CC
generally does honor them.  NTFS, well that's another matter.

It has been MY experience (YMMV) that database and temp file reads are
fairly random.  Database files also have the "nice" property that read
and writes are often sector (page) aligned.  Journal files should be
opened for sequential scan and are generally not sector (page) aligned.
Setting SQLite aside for a moment, for very large files that are only
going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show
huge performance gains.  However, if most or all of a file is going to
be touched, even in random order, then it doesn't get you much and can
hurt you.  Most SQLite data bases _probably_ fall into that second
case.  If you have enough memory and a small enough file such that the
cache manager can hold the entire file, you are golden.  That's why some
people see such great SQLite performance by just sequentially reading
their DB files before running their SQLite application.

The elephants in the room with that previous paragraph is 1) the amount
of RAM in the system and 2) the other applications running.  Windows
will try to share its resources among all the applications running as
best it can.

I have not seen any "bugs" in SQLite in this area.  It gives a
reasonable hint for the general case.  To be fair however, I should note
that I have my own VFS.  It does unbuffered I/O for database files and
sequential, cached I/O for journal files.  If you think you can get
better performance with different flags, create your own VFS, starting
with the Windows VFS and make the changes.  You can get as sophisticated
with your hints as you want.  You can write your own caching system if
you've ingested way too much caffeine.  (Did I mention that the VFS
stuff is great!)

I would not as a general rule advise people (customers) to change the
way their Windows system caches globally for the benefit of one of your
applications.  Eventually, that is going to bite you with some support
calls.

Jay A. Kreibich wrote:
> On Wed, Sep 17, 2008 at 06:00:45PM -0700, Roger Binns scratched on the
wall:
>
>
>> The second is that SQLite when opening a file under Windows explicitly
>> tells Windows that the file will be used for random access even though
>> that is not the case.  Windows uses this hint to override its builtin
>> heuristics which can cause bug #1.
>>
>
>
>> Bug #2 is that SQLite is lying to the operating system and could result
>> in performance degradation if the operating system actually pays
>> attention to the hint.
>>
>
>   SQLite is not "lying."  After poking around a bit to refresh my
>   understanding of SQLite's file structure, I think it is safe to say
>   that SQLite will almost never do a sequential file read, even if
>   you're doing a sequential table scan.
>
> sequential table scan != sequential file access
>
>   There are some specific situations when you might get bursts of
sequential
>   reads, but only for very specific page layouts with very specific
>   types of queries.  In short, not the common case.  Furthermore, even
>   those patterns can get broken up and shuffled around depending on the
&

Re: [sqlite] Vista frustrations

2008-09-18 Thread Mark Spiegel
FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the 
cache manager (CC) in Windows and the underlying file system(s).  With 
respect to the cache manager, it is going to affect whether or not there 
is read ahead, how much read ahead will be used, and how long data will 
remain in the cache (or another way, how quickly it will be dropped).  
It has been some time since I've talked to the Queen of Cache Manger 
about this, but as I recall CC will try to figure out what you are doing 
if you don't give it a hint.  If you do give it a hint, then it is going 
to run with that hint no matter what the cost.  Note that CC or the file 
system are perfectly within their right to ignore your hints.  CC 
generally does honor them.  NTFS, well that's another matter.

It has been MY experience (YMMV) that database and temp file reads are 
fairly random.  Database files also have the "nice" property that read 
and writes are often sector (page) aligned.  Journal files should be 
opened for sequential scan and are generally not sector (page) aligned.  
Setting SQLite aside for a moment, for very large files that are only 
going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show 
huge performance gains.  However, if most or all of a file is going to 
be touched, even in random order, then it doesn't get you much and can 
hurt you.  Most SQLite data bases _probably_ fall into that second 
case.  If you have enough memory and a small enough file such that the 
cache manager can hold the entire file, you are golden.  That's why some 
people see such great SQLite performance by just sequentially reading 
their DB files before running their SQLite application.

The elephants in the room with that previous paragraph is 1) the amount 
of RAM in the system and 2) the other applications running.  Windows 
will try to share its resources among all the applications running as 
best it can.

I have not seen any "bugs" in SQLite in this area.  It gives a 
reasonable hint for the general case.  To be fair however, I should note 
that I have my own VFS.  It does unbuffered I/O for database files and 
sequential, cached I/O for journal files.  If you think you can get 
better performance with different flags, create your own VFS, starting 
with the Windows VFS and make the changes.  You can get as sophisticated 
with your hints as you want.  You can write your own caching system if 
you've ingested way too much caffeine.  (Did I mention that the VFS 
stuff is great!)

I would not as a general rule advise people (customers) to change the 
way their Windows system caches globally for the benefit of one of your 
applications.  Eventually, that is going to bite you with some support 
calls.

Jay A. Kreibich wrote:
> On Wed, Sep 17, 2008 at 06:00:45PM -0700, Roger Binns scratched on the wall:
>
>   
>> The second is that SQLite when opening a file under Windows explicitly
>> tells Windows that the file will be used for random access even though
>> that is not the case.  Windows uses this hint to override its builtin
>> heuristics which can cause bug #1.
>> 
>
>   
>> Bug #2 is that SQLite is lying to the operating system and could result
>> in performance degradation if the operating system actually pays
>> attention to the hint.
>> 
>
>   SQLite is not "lying."  After poking around a bit to refresh my
>   understanding of SQLite's file structure, I think it is safe to say
>   that SQLite will almost never do a sequential file read, even if
>   you're doing a sequential table scan.
>
> sequential table scan != sequential file access
>   
>   There are some specific situations when you might get bursts of sequential
>   reads, but only for very specific page layouts with very specific
>   types of queries.  In short, not the common case.  Furthermore, even
>   those patterns can get broken up and shuffled around depending on the
>   state of SQLite's page cache-- especially if it is bumped up a few
>   dozen megs.  So simply running different types of queries can change
>   the access patterns (this is true of the OS's file system cache as
>   well, of course).
>
>   It might be worth instrumenting a few systems and having a look, but
>   in general, if you had to label SQLite's access pattern, I think
>   "random" would be the most appropriate label.
>
>
>
>   I also contend that if the Windows file cache becomes some kind of
>   bumbling idiot if you actually try to define an access pattern, then
>   something is wrong.  There is a very good reason why the POSIX
>   functions for doing this kind of thing are called "*advise()".  You
>   might seed the heuristic statistics in a specific direction, but they
>   should never be totally over-ridden.  That quickly leads to stupid
>   behaviors, like grabbing all the RAM on the system and not letting go.
>
>
>
>   Of course, we could argue philosophy for a long time.  In the here
>   and now to work around MS's inconsistencies, it looks like the best
>   

Re: [sqlite] Vista frustrations

2008-09-17 Thread Robert Simpson
A full table scan of 16 million rows and 55 columns on Vista with
FILE_FLAG_RANDOM_ACCESS finished in 600 seconds (cold) and 499 seconds
(warm) and turned Vista into a brick. The same test with the flag removed
ran in 119 seconds (cold) and 99 seconds (warm), and caused no adverse
problems with the OS.

Not even Sql Server could do a full table scan of the same data that fast!
Same test on Sql Server 2005 on the same machine ran in a minimum of 180
seconds, and averaged 220 seconds!  Oh and Sql Server's memory usage shot to
3gb as well.  Fun stuff.

I ran the same tests with SQLite on an XP laptop.  I got 618 seconds with
FILE_FLAG_RANDOM_ACCESS, and 599 seconds without the flag.  Not so dramatic
of a difference, but still in the right direction.

So I'd agree that the hint should only be used for CE, and no other
platforms.  Roger quoted the CE compression blurb from this MSDN page:
http://msdn.microsoft.com/en-us/library/ms923949.aspx

Robert


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns
Sent: Wednesday, September 17, 2008 4:24 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Robert Simpson wrote:
> Ideally, at least on non-CE platforms, I'd like see SQLite not give the OS
> any hints about caching.  However, I'm not sure what kind of performance
hit
> (if any) that would have on Windows.  It's already been proven that
> providing the hint on WinCE is beneficial.

I agree.  Open a ticket and include measurements in it.

With Windows CE, setting the flag prevents compression:

  Using the FILE_FLAG_RANDOM_ACCESS flag in the RAM file
  system, which places files in the object store, will
  prevent a file from being compressed. If performance is
  an issue, this could be the correct solution. Read and
  write operations to a compressed file are slower than
  read and write operations to an uncompressed file.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0ZGemOOfHg372QQRAtHSAJ9fSAXY3ekwrK4g9/eTrcYYQInW6QCgvSkB
e00SnZbtTmQQ1G8ZxEVTK1M=
=Y3uW
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Jay A. Kreibich
On Wed, Sep 17, 2008 at 06:00:45PM -0700, Roger Binns scratched on the wall:

> The second is that SQLite when opening a file under Windows explicitly
> tells Windows that the file will be used for random access even though
> that is not the case.  Windows uses this hint to override its builtin
> heuristics which can cause bug #1.

> Bug #2 is that SQLite is lying to the operating system and could result
> in performance degradation if the operating system actually pays
> attention to the hint.

  SQLite is not "lying."  After poking around a bit to refresh my
  understanding of SQLite's file structure, I think it is safe to say
  that SQLite will almost never do a sequential file read, even if
  you're doing a sequential table scan.

sequential table scan != sequential file access
  
  There are some specific situations when you might get bursts of sequential
  reads, but only for very specific page layouts with very specific
  types of queries.  In short, not the common case.  Furthermore, even
  those patterns can get broken up and shuffled around depending on the
  state of SQLite's page cache-- especially if it is bumped up a few
  dozen megs.  So simply running different types of queries can change
  the access patterns (this is true of the OS's file system cache as
  well, of course).

  It might be worth instrumenting a few systems and having a look, but
  in general, if you had to label SQLite's access pattern, I think
  "random" would be the most appropriate label.



  I also contend that if the Windows file cache becomes some kind of
  bumbling idiot if you actually try to define an access pattern, then
  something is wrong.  There is a very good reason why the POSIX
  functions for doing this kind of thing are called "*advise()".  You
  might seed the heuristic statistics in a specific direction, but they
  should never be totally over-ridden.  That quickly leads to stupid
  behaviors, like grabbing all the RAM on the system and not letting go.



  Of course, we could argue philosophy for a long time.  In the here
  and now to work around MS's inconsistencies, it looks like the best
  bet is turn it on with CE and off on Vista, because it appears to
  have two totally different meanings.

   -j, 

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs.  We have
 a protractor."   "I'll go home and see if I can scrounge up a ruler
 and a piece of string."  --from Anathem by Neal Stephenson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Brown, Daniel wrote:
> Has anyone tried to replicate this bug on WindowsXP too?

There are two separate bugs here.  One is that when Vista is told a file
is used for random access and then there is a lot of file access, it may
use a little too much memory for cache instead of applications which
causes those application to be sluggish sometimes to the point of
unusability.

The second is that SQLite when opening a file under Windows explicitly
tells Windows that the file will be used for random access even though
that is not the case.  Windows uses this hint to override its builtin
heuristics which can cause bug #1.

Bug #1 will not happen under default Windows XP configurations as XP
uses 10 MB of memory for file caching no matter how much RAM you have.
See "Zero" on
http://www.sqlite.org/cvstrac/wiki/wiki?p=PerformanceTuningWindows

Bug #2 is that SQLite is lying to the operating system and could result
in performance degradation if the operating system actually pays
attention to the hint.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0ag9mOOfHg372QQRAvCuAJ4uVYifeD7PsEVuyN2Mt1VFdqfHjwCgv3tj
/P62UEnqepRAfhV2QczfBKs=
=9ZV0
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Brown, Daniel
Has anyone tried to replicate this bug on WindowsXP too?

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns
Sent: Wednesday, September 17, 2008 4:24 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Robert Simpson wrote:
> Ideally, at least on non-CE platforms, I'd like see SQLite not give
the OS
> any hints about caching.  However, I'm not sure what kind of
performance hit
> (if any) that would have on Windows.  It's already been proven that
> providing the hint on WinCE is beneficial.

I agree.  Open a ticket and include measurements in it.

With Windows CE, setting the flag prevents compression:

  Using the FILE_FLAG_RANDOM_ACCESS flag in the RAM file
  system, which places files in the object store, will
  prevent a file from being compressed. If performance is
  an issue, this could be the correct solution. Read and
  write operations to a compressed file are slower than
  read and write operations to an uncompressed file.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0ZGemOOfHg372QQRAtHSAJ9fSAXY3ekwrK4g9/eTrcYYQInW6QCgvSkB
e00SnZbtTmQQ1G8ZxEVTK1M=
=Y3uW
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Robert Simpson wrote:
> Ideally, at least on non-CE platforms, I'd like see SQLite not give the OS
> any hints about caching.  However, I'm not sure what kind of performance hit
> (if any) that would have on Windows.  It's already been proven that
> providing the hint on WinCE is beneficial.

I agree.  Open a ticket and include measurements in it.

With Windows CE, setting the flag prevents compression:

  Using the FILE_FLAG_RANDOM_ACCESS flag in the RAM file
  system, which places files in the object store, will
  prevent a file from being compressed. If performance is
  an issue, this could be the correct solution. Read and
  write operations to a compressed file are slower than
  read and write operations to an uncompressed file.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0ZGemOOfHg372QQRAtHSAJ9fSAXY3ekwrK4g9/eTrcYYQInW6QCgvSkB
e00SnZbtTmQQ1G8ZxEVTK1M=
=Y3uW
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Fred Williams wrote:
> Have you ever actually used a version of Windows?

Windows 1.0 (once), Windows 2/286, Windows 3, 3.1, 3.11, 95, 98, ME,
2000, XP, 2003 and Vista.

> ANY OS that attempts to read in a xGigibyte file into real memory to the
> detriment of the entire system load is not working correctly.  Call it a bug
> or a feature it still sucks.

If the other tasks aren't doing anything then it slows down the reading
task by not using using as much memory as is reasonable.  For example
not using half the RAM for the file caching just in case a background
task may came to life in the future will slow down the file task which
then increases the probability that a background task will be needed.
Getting this stuff right is not easy.  The kernel cannot predict the
future and instead has to use heuristics which involve tradeoffs of
throughput against latency.  There will always be cases where the
heuristics get it wrong, but the goal is to get it right the vast
majority of the time.

Linux has exactly the same issue.  Ultimately a kernel variable named
swappiness is used with a default value, but is administrator adjustable
because even they couldn't work a single correct value for every situation.

  http://lwn.net/Articles/83588/

Everyone recognises the degenerate cases are bad, but even then it is
questionable.  Do you want the file task to take 1 minute and then have
all your other tasks take 10 seconds to become fully responsive, or do
you want your file task to take 5 minutes while your other tasks are
mostly responsive?

If you have some algorithm, parameters and heuristics that work under
all workloads and configurations then I am sure the operating systems
community would love to hear from you.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0Y/lmOOfHg372QQRAlMbAKCB3acOdcnIw2JRIGrrOh//jw/0rgCdG2AQ
SvyDyo3K3fsxtxhYgsAGVeU=
=X+SR
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Jeffrey Becker
I am running vista my personal pc.  I'll try to take a look into how
not having this flag affects performance when I get home.

On Wed, Sep 17, 2008 at 5:29 PM, Robert Simpson <[EMAIL PROTECTED]> wrote:
> Ideally, at least on non-CE platforms, I'd like see SQLite not give the OS
> any hints about caching.  However, I'm not sure what kind of performance hit
> (if any) that would have on Windows.  It's already been proven that
> providing the hint on WinCE is beneficial.
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns
> Sent: Wednesday, September 17, 2008 2:02 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Vista frustrations
>
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Robert Simpson wrote:
>> The purpose of a cache is to improve performance and responsiveness.  Any
>> cache that uses all physical memory, forces all other apps to the paging
>> file
>
> All current operating systems do this, using heuristics to determine how
> much of each running application to leave in memory while using "spare"
> memory for filesystem cache and dynamically changing allocation based on
> demand.  Sometimes they mess up to the detriment of idle applications.
> However that is not the bit I am talking about as it is not controlled
> by flags when opening files and instead is part of the core operating
> system code.  (You often see this issue when copying files larger than RAM).
>
> What is under discussion is how the operating system is using the cache
> that it does decide to allocate for SQLite.  If you tell it that a file
> is sequential access then that means that read ahead is good and that
> data already read can be discarded.  If you tell it that a file is
> random access then read ahead is bad (it has the disk occupied when the
> next random request comes in) and already read data should be kept.
> SQLite does not know and should be leaving it up to the operating
> system.  Your tests prove that when that is done on Vista, performance
> of SQLite is better and other applications are less adversely affected.
>
>> The real frustration is that this seems to be a rather obvious bug in
> Vista,
>> and definitely not SQLite's responsibility.
>
> This is conflating two issues.  One is the tradeoff between RAM usage
> for cache vs idle applications.  You can argue that is a bug, or more
> accurately there are circumstances under which the tradeoff picks wrong
> values, and is very hard to get right.  Example underlying details are
> at http://blogs.technet.com/markrussinovich/archive/2008/02/04/2826167.aspx
>
> The second issue is the performance differences when random access,
> sequential and no flags are given while opening the file and the
> resulting performance.  In this case a file is accessed mostly
> sequentially but the random access flag is given.  Performance was worse
> than letting the operating system use its own heuristics.  That is
> hardly surprising or a bug.
>
> Roger
>
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.6 (GNU/Linux)
>
> iD8DBQFI0XBYmOOfHg372QQRAmKfAJ4pWVfZ8LS2ET+Y55FGT3Am9wCfZg4e
> McaZrcLUXQJhU7i1Gw5+cRk=
> =CQg5
> -END PGP SIGNATURE-
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Robert Simpson
Ideally, at least on non-CE platforms, I'd like see SQLite not give the OS
any hints about caching.  However, I'm not sure what kind of performance hit
(if any) that would have on Windows.  It's already been proven that
providing the hint on WinCE is beneficial.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns
Sent: Wednesday, September 17, 2008 2:02 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Robert Simpson wrote:
> The purpose of a cache is to improve performance and responsiveness.  Any
> cache that uses all physical memory, forces all other apps to the paging
> file 

All current operating systems do this, using heuristics to determine how
much of each running application to leave in memory while using "spare"
memory for filesystem cache and dynamically changing allocation based on
demand.  Sometimes they mess up to the detriment of idle applications.
However that is not the bit I am talking about as it is not controlled
by flags when opening files and instead is part of the core operating
system code.  (You often see this issue when copying files larger than RAM).

What is under discussion is how the operating system is using the cache
that it does decide to allocate for SQLite.  If you tell it that a file
is sequential access then that means that read ahead is good and that
data already read can be discarded.  If you tell it that a file is
random access then read ahead is bad (it has the disk occupied when the
next random request comes in) and already read data should be kept.
SQLite does not know and should be leaving it up to the operating
system.  Your tests prove that when that is done on Vista, performance
of SQLite is better and other applications are less adversely affected.

> The real frustration is that this seems to be a rather obvious bug in
Vista,
> and definitely not SQLite's responsibility. 

This is conflating two issues.  One is the tradeoff between RAM usage
for cache vs idle applications.  You can argue that is a bug, or more
accurately there are circumstances under which the tradeoff picks wrong
values, and is very hard to get right.  Example underlying details are
at http://blogs.technet.com/markrussinovich/archive/2008/02/04/2826167.aspx

The second issue is the performance differences when random access,
sequential and no flags are given while opening the file and the
resulting performance.  In this case a file is accessed mostly
sequentially but the random access flag is given.  Performance was worse
than letting the operating system use its own heuristics.  That is
hardly surprising or a bug.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0XBYmOOfHg372QQRAmKfAJ4pWVfZ8LS2ET+Y55FGT3Am9wCfZg4e
McaZrcLUXQJhU7i1Gw5+cRk=
=CQg5
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Fred Williams
Have you ever actually used a version of Windows?

ANY OS that attempts to read in a xGigibyte file into real memory to the
detriment of the entire system load is not working correctly.  Call it a bug
or a feature it still sucks.

I expect nothing less from Microsoft with each new version.  I recommend you
wait for, it looks like, maybe, Service Pack 6 before you get serious about
Vista.

Fred

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Roger Binns
Sent: Wednesday, September 17, 2008 4:02 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Robert Simpson wrote:
> The purpose of a cache is to improve performance and responsiveness.  Any
> cache that uses all physical memory, forces all other apps to the paging
> file

All current operating systems do this, using heuristics to determine how
much of each running application to leave in memory while using "spare"
memory for filesystem cache and dynamically changing allocation based on
demand.  Sometimes they mess up to the detriment of idle applications.
However that is not the bit I am talking about as it is not controlled
by flags when opening files and instead is part of the core operating
system code.  (You often see this issue when copying files larger than RAM).

What is under discussion is how the operating system is using the cache
that it does decide to allocate for SQLite.  If you tell it that a file
is sequential access then that means that read ahead is good and that
data already read can be discarded.  If you tell it that a file is
random access then read ahead is bad (it has the disk occupied when the
next random request comes in) and already read data should be kept.
SQLite does not know and should be leaving it up to the operating
system.  Your tests prove that when that is done on Vista, performance
of SQLite is better and other applications are less adversely affected.

> The real frustration is that this seems to be a rather obvious bug in
Vista,
> and definitely not SQLite's responsibility.

This is conflating two issues.  One is the tradeoff between RAM usage
for cache vs idle applications.  You can argue that is a bug, or more
accurately there are circumstances under which the tradeoff picks wrong
values, and is very hard to get right.  Example underlying details are
at http://blogs.technet.com/markrussinovich/archive/2008/02/04/2826167.aspx

The second issue is the performance differences when random access,
sequential and no flags are given while opening the file and the
resulting performance.  In this case a file is accessed mostly
sequentially but the random access flag is given.  Performance was worse
than letting the operating system use its own heuristics.  That is
hardly surprising or a bug.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0XBYmOOfHg372QQRAmKfAJ4pWVfZ8LS2ET+Y55FGT3Am9wCfZg4e
McaZrcLUXQJhU7i1Gw5+cRk=
=CQg5
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Robert Simpson wrote:
> The purpose of a cache is to improve performance and responsiveness.  Any
> cache that uses all physical memory, forces all other apps to the paging
> file 

All current operating systems do this, using heuristics to determine how
much of each running application to leave in memory while using "spare"
memory for filesystem cache and dynamically changing allocation based on
demand.  Sometimes they mess up to the detriment of idle applications.
However that is not the bit I am talking about as it is not controlled
by flags when opening files and instead is part of the core operating
system code.  (You often see this issue when copying files larger than RAM).

What is under discussion is how the operating system is using the cache
that it does decide to allocate for SQLite.  If you tell it that a file
is sequential access then that means that read ahead is good and that
data already read can be discarded.  If you tell it that a file is
random access then read ahead is bad (it has the disk occupied when the
next random request comes in) and already read data should be kept.
SQLite does not know and should be leaving it up to the operating
system.  Your tests prove that when that is done on Vista, performance
of SQLite is better and other applications are less adversely affected.

> The real frustration is that this seems to be a rather obvious bug in Vista,
> and definitely not SQLite's responsibility. 

This is conflating two issues.  One is the tradeoff between RAM usage
for cache vs idle applications.  You can argue that is a bug, or more
accurately there are circumstances under which the tradeoff picks wrong
values, and is very hard to get right.  Example underlying details are
at http://blogs.technet.com/markrussinovich/archive/2008/02/04/2826167.aspx

The second issue is the performance differences when random access,
sequential and no flags are given while opening the file and the
resulting performance.  In this case a file is accessed mostly
sequentially but the random access flag is given.  Performance was worse
than letting the operating system use its own heuristics.  That is
hardly surprising or a bug.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0XBYmOOfHg372QQRAmKfAJ4pWVfZ8LS2ET+Y55FGT3Am9wCfZg4e
McaZrcLUXQJhU7i1Gw5+cRk=
=CQg5
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Virgilio Alexandre Fornazin
Sure. I just tell to do this test to check if the bug is related to this
component, since it debuted on Vista.
 
 
Virgilio Alexandre Fornazin
High performance and realtime systems development

Rua Brigadeiro Vicente Faria Lima, 268
Bela VistaLeme-SPCEP 13611-485
Phone: +55 19 3571-5573
Cell: +55 19 8111-4053
+55 11 8357 1491
Mail: [EMAIL PROTECTED]
Web: http://www.fornazinconsultoria.com.br



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey Becker
Sent: quarta-feira, 17 de setembro de 2008 13:41
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

SQLite in general and the .Net provider in particular are most often
shipped as components of other applications.  I dont think having
developers tell their end users to disable superfetch is a viable
solution.  As much as I hate to propose this maybe a runtime check is
in order to see what the OS version is and not use the flag where it's
known to be problematic.

On Wed, Sep 17, 2008 at 12:14 PM, Virgilio Alexandre Fornazin
<[EMAIL PROTECTED]> wrote:
> Could not this bug be related with Vista feature called 'Superfetch' ?
> It tries to keep in memory the most accessed files for user, avoiding
> disk for read access.
>
> If you disable (or stop) this service, the problem remains or not ?
>
>
>
>
> Virgilio Alexandre Fornazin
> High performance and realtime systems development
>
> Rua Brigadeiro Vicente Faria Lima, 268
> Bela VistaLeme-SPCEP 13611-485
> Phone: +55 19 3571-5573
> Cell: +55 19 8111-4053
> +55 11 8357 1491
> Mail: [EMAIL PROTECTED]
> Web: http://www.fornazinconsultoria.com.br
>
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Jay A. Kreibich
> Sent: quarta-feira, 17 de setembro de 2008 13:01
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Vista frustrations
>
> On Wed, Sep 17, 2008 at 01:17:51AM -0700, Roger Binns scratched on the
wall:
>> Robert Simpson wrote:
>> > To me this seems like an obvious bug in Vista,
>>
>> Actually I'd argue that it is behaving as designed.
>
>  You could argue it is behaving as designed, but I'd still argue it is
>  behaving poorly.
>
>  Further, if a system component who's sole purpose is to increase
>  performance-- as all cache systems are-- has the overall effect of
>  decreasing performance, not only of the process it is trying to speed
>  up, but of the whole system, it is pretty easy to argue that's a serious
>  functional bug.
>
>
>  Given the speed of most storage systems, filesystem cache management
>  is an important component of overall system performance.  However, if
>  the cache system is grabbing so much physical memory (and, apparently,
>  refusing to let go of it) that processes are forced to aggressively
>  page and the net result is a massive performance loss, then something
>  isn't right.
>
>  As with so many things, cache management (and, indeed, the whole
>  concept of caches) tends to be a huge web of compromises.  It is
>  extremely difficult, if not impossible, to cover all cases.  But
>  these things are not exactly new, and it should be easy enough to
>  never get in a situation where things are actually made worse--
>  especially that they're not made worse for the whole system.
>
>> Generally
>> filesystem code will try to detect what is going on under the hood.  In
>> particular if it looks like you are doing sequential access(*) then they
>> will start doing read ahead, whereas read ahead is a waste for random
>> access.
>
>  Not to get into a whole argument about cache strategies, but this
>  often not true.  If we assume free memory isn't a big concern,
>  when a process opens a file for random-access we can either
>  read-ahead the whole thing or we can read blocks here and there until
>  (if the process touches the majority of the file) we have the whole
>  thing in memory.  Both systems, in the end, will result in the same
>  memory usage.
>
>  However, if I'm going to be doing random access on a file of moderate
>  or smaller size, it is much cheaper for the OS to just suck the whole
>  thing into memory via one bulk read operation than it is to grab it
>  piecemeal.
>
>  The whole trick is defining "moderate" both in terms of first-return
>  read times (time to return the block the process actually asked for,
>  which might not be the first block pulled off disk) vs how likely the
>  process is to touch the majority of file blocks (something that is
>  somewhat less likely as the file gets bigger).
>
>  As the file gets larger, there is also the real-world issue of how
>  much RAM the system has, and how much of it is ac

Re: [sqlite] Vista frustrations

2008-09-17 Thread Jeffrey Becker
SQLite in general and the .Net provider in particular are most often
shipped as components of other applications.  I dont think having
developers tell their end users to disable superfetch is a viable
solution.  As much as I hate to propose this maybe a runtime check is
in order to see what the OS version is and not use the flag where it's
known to be problematic.

On Wed, Sep 17, 2008 at 12:14 PM, Virgilio Alexandre Fornazin
<[EMAIL PROTECTED]> wrote:
> Could not this bug be related with Vista feature called 'Superfetch' ?
> It tries to keep in memory the most accessed files for user, avoiding
> disk for read access.
>
> If you disable (or stop) this service, the problem remains or not ?
>
>
>
>
> Virgilio Alexandre Fornazin
> High performance and realtime systems development
>
> Rua Brigadeiro Vicente Faria Lima, 268
> Bela VistaLeme-SPCEP 13611-485
> Phone: +55 19 3571-5573
> Cell: +55 19 8111-4053
> +55 11 8357 1491
> Mail: [EMAIL PROTECTED]
> Web: http://www.fornazinconsultoria.com.br
>
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Jay A. Kreibich
> Sent: quarta-feira, 17 de setembro de 2008 13:01
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Vista frustrations
>
> On Wed, Sep 17, 2008 at 01:17:51AM -0700, Roger Binns scratched on the wall:
>> Robert Simpson wrote:
>> > To me this seems like an obvious bug in Vista,
>>
>> Actually I'd argue that it is behaving as designed.
>
>  You could argue it is behaving as designed, but I'd still argue it is
>  behaving poorly.
>
>  Further, if a system component who's sole purpose is to increase
>  performance-- as all cache systems are-- has the overall effect of
>  decreasing performance, not only of the process it is trying to speed
>  up, but of the whole system, it is pretty easy to argue that's a serious
>  functional bug.
>
>
>  Given the speed of most storage systems, filesystem cache management
>  is an important component of overall system performance.  However, if
>  the cache system is grabbing so much physical memory (and, apparently,
>  refusing to let go of it) that processes are forced to aggressively
>  page and the net result is a massive performance loss, then something
>  isn't right.
>
>  As with so many things, cache management (and, indeed, the whole
>  concept of caches) tends to be a huge web of compromises.  It is
>  extremely difficult, if not impossible, to cover all cases.  But
>  these things are not exactly new, and it should be easy enough to
>  never get in a situation where things are actually made worse--
>  especially that they're not made worse for the whole system.
>
>> Generally
>> filesystem code will try to detect what is going on under the hood.  In
>> particular if it looks like you are doing sequential access(*) then they
>> will start doing read ahead, whereas read ahead is a waste for random
>> access.
>
>  Not to get into a whole argument about cache strategies, but this
>  often not true.  If we assume free memory isn't a big concern,
>  when a process opens a file for random-access we can either
>  read-ahead the whole thing or we can read blocks here and there until
>  (if the process touches the majority of the file) we have the whole
>  thing in memory.  Both systems, in the end, will result in the same
>  memory usage.
>
>  However, if I'm going to be doing random access on a file of moderate
>  or smaller size, it is much cheaper for the OS to just suck the whole
>  thing into memory via one bulk read operation than it is to grab it
>  piecemeal.
>
>  The whole trick is defining "moderate" both in terms of first-return
>  read times (time to return the block the process actually asked for,
>  which might not be the first block pulled off disk) vs how likely the
>  process is to touch the majority of file blocks (something that is
>  somewhat less likely as the file gets bigger).
>
>  As the file gets larger, there is also the real-world issue of how
>  much RAM the system has, and how much of it is actually in-use with
>  process and OS pages.  This is true of both sequential AND random
>  access, although memory usage is generally easier to control in
>  sequential patterns.
>
>  This is where Vista appears to be breaking down and making very poor
>  decisions.  It seems to be giving cache pages more priority than
>  process and OS system pages, and generally that should never happen.
>  If we're correctly understanding what is going on, Vista might very
>  well be paging out SQLite's internal page cache to fit a few extra file
>  blocks in RAM.  How much sense does that make?
>
>> By us

Re: [sqlite] Vista frustrations

2008-09-17 Thread Virgilio Alexandre Fornazin
Note that Windows Server 2008 use the same 'core' as Windows Vista.
If you´re detecting and redirecting by using GetVersion() or other
approach you might test for Server 2008 too.
 
 
Virgilio Alexandre Fornazin
High performance and realtime systems development

Rua Brigadeiro Vicente Faria Lima, 268
Bela VistaLeme-SPCEP 13611-485
Phone: +55 19 3571-5573
Cell: +55 19 8111-4053
+55 11 8357 1491
Mail: [EMAIL PROTECTED]
Web: http://www.fornazinconsultoria.com.br



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Robert Simpson
Sent: quarta-feira, 17 de setembro de 2008 13:30
To: 'General Discussion of SQLite Database'
Subject: Re: [sqlite] Vista frustrations

I've run the tests with superfetch and prefetch disabled and enabled.
Results are consistent with or without these running.  The only thing that
has any affect is the FILE_FLAG_RANDOM_ACCESS flag.  And only on Vista.

For now I'm thinking of overriding the default Windows VFS and redirecting
the open function just for Vista so it doesn't use that flag.

Robert



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Virgilio Alexandre
Fornazin
Sent: Wednesday, September 17, 2008 9:14 AM
To: 'General Discussion of SQLite Database'
Subject: Re: [sqlite] Vista frustrations

Could not this bug be related with Vista feature called 'Superfetch' ?
It tries to keep in memory the most accessed files for user, avoiding
disk for read access.

If you disable (or stop) this service, the problem remains or not ?


 
 
Virgilio Alexandre Fornazin
High performance and realtime systems development

Rua Brigadeiro Vicente Faria Lima, 268
Bela VistaLeme-SPCEP 13611-485
Phone: +55 19 3571-5573
Cell: +55 19 8111-4053
+55 11 8357 1491
Mail: [EMAIL PROTECTED]
Web: http://www.fornazinconsultoria.com.br


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jay A. Kreibich
Sent: quarta-feira, 17 de setembro de 2008 13:01
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

On Wed, Sep 17, 2008 at 01:17:51AM -0700, Roger Binns scratched on the wall:
> Robert Simpson wrote:
> > To me this seems like an obvious bug in Vista,
> 
> Actually I'd argue that it is behaving as designed.

  You could argue it is behaving as designed, but I'd still argue it is
  behaving poorly.

  Further, if a system component who's sole purpose is to increase
  performance-- as all cache systems are-- has the overall effect of
  decreasing performance, not only of the process it is trying to speed
  up, but of the whole system, it is pretty easy to argue that's a serious
  functional bug.


  Given the speed of most storage systems, filesystem cache management
  is an important component of overall system performance.  However, if
  the cache system is grabbing so much physical memory (and, apparently,
  refusing to let go of it) that processes are forced to aggressively
  page and the net result is a massive performance loss, then something
  isn't right.

  As with so many things, cache management (and, indeed, the whole
  concept of caches) tends to be a huge web of compromises.  It is
  extremely difficult, if not impossible, to cover all cases.  But
  these things are not exactly new, and it should be easy enough to
  never get in a situation where things are actually made worse--
  especially that they're not made worse for the whole system.

> Generally
> filesystem code will try to detect what is going on under the hood.  In
> particular if it looks like you are doing sequential access(*) then they
> will start doing read ahead, whereas read ahead is a waste for random
> access. 

  Not to get into a whole argument about cache strategies, but this
  often not true.  If we assume free memory isn't a big concern,
  when a process opens a file for random-access we can either
  read-ahead the whole thing or we can read blocks here and there until
  (if the process touches the majority of the file) we have the whole
  thing in memory.  Both systems, in the end, will result in the same
  memory usage.

  However, if I'm going to be doing random access on a file of moderate
  or smaller size, it is much cheaper for the OS to just suck the whole
  thing into memory via one bulk read operation than it is to grab it
  piecemeal.

  The whole trick is defining "moderate" both in terms of first-return
  read times (time to return the block the process actually asked for,
  which might not be the first block pulled off disk) vs how likely the
  process is to touch the majority of file blocks (something that is
  somewhat less likely as the file gets bigger).  

  As the file gets larger, there is also the real-world issue of how
  much RAM the system has, and how much of it is actually in-use with
  process and OS pages.  This is true of both sequential AND random
  access, although memory usage is gen

Re: [sqlite] Vista frustrations

2008-09-17 Thread Robert Simpson
I've run the tests with superfetch and prefetch disabled and enabled.
Results are consistent with or without these running.  The only thing that
has any affect is the FILE_FLAG_RANDOM_ACCESS flag.  And only on Vista.

For now I'm thinking of overriding the default Windows VFS and redirecting
the open function just for Vista so it doesn't use that flag.

Robert



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Virgilio Alexandre
Fornazin
Sent: Wednesday, September 17, 2008 9:14 AM
To: 'General Discussion of SQLite Database'
Subject: Re: [sqlite] Vista frustrations

Could not this bug be related with Vista feature called 'Superfetch' ?
It tries to keep in memory the most accessed files for user, avoiding
disk for read access.

If you disable (or stop) this service, the problem remains or not ?


 
 
Virgilio Alexandre Fornazin
High performance and realtime systems development

Rua Brigadeiro Vicente Faria Lima, 268
Bela VistaLeme-SPCEP 13611-485
Phone: +55 19 3571-5573
Cell: +55 19 8111-4053
+55 11 8357 1491
Mail: [EMAIL PROTECTED]
Web: http://www.fornazinconsultoria.com.br


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jay A. Kreibich
Sent: quarta-feira, 17 de setembro de 2008 13:01
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

On Wed, Sep 17, 2008 at 01:17:51AM -0700, Roger Binns scratched on the wall:
> Robert Simpson wrote:
> > To me this seems like an obvious bug in Vista,
> 
> Actually I'd argue that it is behaving as designed.

  You could argue it is behaving as designed, but I'd still argue it is
  behaving poorly.

  Further, if a system component who's sole purpose is to increase
  performance-- as all cache systems are-- has the overall effect of
  decreasing performance, not only of the process it is trying to speed
  up, but of the whole system, it is pretty easy to argue that's a serious
  functional bug.


  Given the speed of most storage systems, filesystem cache management
  is an important component of overall system performance.  However, if
  the cache system is grabbing so much physical memory (and, apparently,
  refusing to let go of it) that processes are forced to aggressively
  page and the net result is a massive performance loss, then something
  isn't right.

  As with so many things, cache management (and, indeed, the whole
  concept of caches) tends to be a huge web of compromises.  It is
  extremely difficult, if not impossible, to cover all cases.  But
  these things are not exactly new, and it should be easy enough to
  never get in a situation where things are actually made worse--
  especially that they're not made worse for the whole system.

> Generally
> filesystem code will try to detect what is going on under the hood.  In
> particular if it looks like you are doing sequential access(*) then they
> will start doing read ahead, whereas read ahead is a waste for random
> access. 

  Not to get into a whole argument about cache strategies, but this
  often not true.  If we assume free memory isn't a big concern,
  when a process opens a file for random-access we can either
  read-ahead the whole thing or we can read blocks here and there until
  (if the process touches the majority of the file) we have the whole
  thing in memory.  Both systems, in the end, will result in the same
  memory usage.

  However, if I'm going to be doing random access on a file of moderate
  or smaller size, it is much cheaper for the OS to just suck the whole
  thing into memory via one bulk read operation than it is to grab it
  piecemeal.

  The whole trick is defining "moderate" both in terms of first-return
  read times (time to return the block the process actually asked for,
  which might not be the first block pulled off disk) vs how likely the
  process is to touch the majority of file blocks (something that is
  somewhat less likely as the file gets bigger).  

  As the file gets larger, there is also the real-world issue of how
  much RAM the system has, and how much of it is actually in-use with
  process and OS pages.  This is true of both sequential AND random
  access, although memory usage is generally easier to control in
  sequential patterns.

  This is where Vista appears to be breaking down and making very poor
  decisions.  It seems to be giving cache pages more priority than
  process and OS system pages, and generally that should never happen.
  If we're correctly understanding what is going on, Vista might very
  well be paging out SQLite's internal page cache to fit a few extra file
  blocks in RAM.  How much sense does that make?

> By using the sequential or random flags you are explicitly
> telling the filesystem to ignore its heuristics and do as you say only.

  Even if that's true (most APIs present the flags as "hints" not
  absolute truths), the worst an incorr

Re: [sqlite] Vista frustrations

2008-09-17 Thread Virgilio Alexandre Fornazin
Could not this bug be related with Vista feature called 'Superfetch' ?
It tries to keep in memory the most accessed files for user, avoiding
disk for read access.

If you disable (or stop) this service, the problem remains or not ?


 
 
Virgilio Alexandre Fornazin
High performance and realtime systems development

Rua Brigadeiro Vicente Faria Lima, 268
Bela VistaLeme-SPCEP 13611-485
Phone: +55 19 3571-5573
Cell: +55 19 8111-4053
+55 11 8357 1491
Mail: [EMAIL PROTECTED]
Web: http://www.fornazinconsultoria.com.br


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jay A. Kreibich
Sent: quarta-feira, 17 de setembro de 2008 13:01
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

On Wed, Sep 17, 2008 at 01:17:51AM -0700, Roger Binns scratched on the wall:
> Robert Simpson wrote:
> > To me this seems like an obvious bug in Vista,
> 
> Actually I'd argue that it is behaving as designed.

  You could argue it is behaving as designed, but I'd still argue it is
  behaving poorly.

  Further, if a system component who's sole purpose is to increase
  performance-- as all cache systems are-- has the overall effect of
  decreasing performance, not only of the process it is trying to speed
  up, but of the whole system, it is pretty easy to argue that's a serious
  functional bug.


  Given the speed of most storage systems, filesystem cache management
  is an important component of overall system performance.  However, if
  the cache system is grabbing so much physical memory (and, apparently,
  refusing to let go of it) that processes are forced to aggressively
  page and the net result is a massive performance loss, then something
  isn't right.

  As with so many things, cache management (and, indeed, the whole
  concept of caches) tends to be a huge web of compromises.  It is
  extremely difficult, if not impossible, to cover all cases.  But
  these things are not exactly new, and it should be easy enough to
  never get in a situation where things are actually made worse--
  especially that they're not made worse for the whole system.

> Generally
> filesystem code will try to detect what is going on under the hood.  In
> particular if it looks like you are doing sequential access(*) then they
> will start doing read ahead, whereas read ahead is a waste for random
> access. 

  Not to get into a whole argument about cache strategies, but this
  often not true.  If we assume free memory isn't a big concern,
  when a process opens a file for random-access we can either
  read-ahead the whole thing or we can read blocks here and there until
  (if the process touches the majority of the file) we have the whole
  thing in memory.  Both systems, in the end, will result in the same
  memory usage.

  However, if I'm going to be doing random access on a file of moderate
  or smaller size, it is much cheaper for the OS to just suck the whole
  thing into memory via one bulk read operation than it is to grab it
  piecemeal.

  The whole trick is defining "moderate" both in terms of first-return
  read times (time to return the block the process actually asked for,
  which might not be the first block pulled off disk) vs how likely the
  process is to touch the majority of file blocks (something that is
  somewhat less likely as the file gets bigger).  

  As the file gets larger, there is also the real-world issue of how
  much RAM the system has, and how much of it is actually in-use with
  process and OS pages.  This is true of both sequential AND random
  access, although memory usage is generally easier to control in
  sequential patterns.

  This is where Vista appears to be breaking down and making very poor
  decisions.  It seems to be giving cache pages more priority than
  process and OS system pages, and generally that should never happen.
  If we're correctly understanding what is going on, Vista might very
  well be paging out SQLite's internal page cache to fit a few extra file
  blocks in RAM.  How much sense does that make?

> By using the sequential or random flags you are explicitly
> telling the filesystem to ignore its heuristics and do as you say only.

  Even if that's true (most APIs present the flags as "hints" not
  absolute truths), the worst an incorrect flag should do is hurt the
  file access performance of the process that provided the hint.  Even
  then, the lower end should be the same performance one would expect
  if there was no cache (e.g. constant misses).
 
  A poor or incorrect flag is no excuse to be overly aggressive with
  holding pages in RAM and killing the whole system.  Even if a flag
  alters the read-ahead policy or cache replacement strategy, a flag
  should never override the decisions the cache system has to face when
  the system starts to run thin on free physical RAM-- especially on a
  file that is larger than than the RAM footprint 

Re: [sqlite] Vista frustrations

2008-09-17 Thread Jay A. Kreibich
On Wed, Sep 17, 2008 at 01:17:51AM -0700, Roger Binns scratched on the wall:
> Robert Simpson wrote:
> > To me this seems like an obvious bug in Vista,
> 
> Actually I'd argue that it is behaving as designed.

  You could argue it is behaving as designed, but I'd still argue it is
  behaving poorly.

  Further, if a system component who's sole purpose is to increase
  performance-- as all cache systems are-- has the overall effect of
  decreasing performance, not only of the process it is trying to speed
  up, but of the whole system, it is pretty easy to argue that's a serious
  functional bug.


  Given the speed of most storage systems, filesystem cache management
  is an important component of overall system performance.  However, if
  the cache system is grabbing so much physical memory (and, apparently,
  refusing to let go of it) that processes are forced to aggressively
  page and the net result is a massive performance loss, then something
  isn't right.

  As with so many things, cache management (and, indeed, the whole
  concept of caches) tends to be a huge web of compromises.  It is
  extremely difficult, if not impossible, to cover all cases.  But
  these things are not exactly new, and it should be easy enough to
  never get in a situation where things are actually made worse--
  especially that they're not made worse for the whole system.

> Generally
> filesystem code will try to detect what is going on under the hood.  In
> particular if it looks like you are doing sequential access(*) then they
> will start doing read ahead, whereas read ahead is a waste for random
> access. 

  Not to get into a whole argument about cache strategies, but this
  often not true.  If we assume free memory isn't a big concern,
  when a process opens a file for random-access we can either
  read-ahead the whole thing or we can read blocks here and there until
  (if the process touches the majority of the file) we have the whole
  thing in memory.  Both systems, in the end, will result in the same
  memory usage.

  However, if I'm going to be doing random access on a file of moderate
  or smaller size, it is much cheaper for the OS to just suck the whole
  thing into memory via one bulk read operation than it is to grab it
  piecemeal.

  The whole trick is defining "moderate" both in terms of first-return
  read times (time to return the block the process actually asked for,
  which might not be the first block pulled off disk) vs how likely the
  process is to touch the majority of file blocks (something that is
  somewhat less likely as the file gets bigger).  

  As the file gets larger, there is also the real-world issue of how
  much RAM the system has, and how much of it is actually in-use with
  process and OS pages.  This is true of both sequential AND random
  access, although memory usage is generally easier to control in
  sequential patterns.

  This is where Vista appears to be breaking down and making very poor
  decisions.  It seems to be giving cache pages more priority than
  process and OS system pages, and generally that should never happen.
  If we're correctly understanding what is going on, Vista might very
  well be paging out SQLite's internal page cache to fit a few extra file
  blocks in RAM.  How much sense does that make?

> By using the sequential or random flags you are explicitly
> telling the filesystem to ignore its heuristics and do as you say only.

  Even if that's true (most APIs present the flags as "hints" not
  absolute truths), the worst an incorrect flag should do is hurt the
  file access performance of the process that provided the hint.  Even
  then, the lower end should be the same performance one would expect
  if there was no cache (e.g. constant misses).
 
  A poor or incorrect flag is no excuse to be overly aggressive with
  holding pages in RAM and killing the whole system.  Even if a flag
  alters the read-ahead policy or cache replacement strategy, a flag
  should never override the decisions the cache system has to face when
  the system starts to run thin on free physical RAM-- especially on a
  file that is larger than than the RAM footprint of most machines.

  And in this case, I'm not even sure the flag is incorrect
  
> Since SQLite cannot tell in advance whether access is almost entirely
> random or almost entirely sequential, it makes far more sense to let the
> operating system use its builtin heuristics and optimise accordingly.

  Since a full table scan is unlikely to result in a sequential file
  read for anything but the most pristine database (e.g. one that is
  bulk loaded and never touched), I'm not sure that's true.  I would
  guess the file access patterns are almost always random, making the
  hint correct.

  On the other hand, given that SQLite does it's own cache management
  and tries to make few assumptions about the underlying system, I can
  also see leaving the OS to do what it wants.  The hint may be
  correct, but it 

Re: [sqlite] Vista frustrations

2008-09-17 Thread Fred Williams
Possibly the reason a large number of us are still running Win 2000 :-)

It seems to be the least Windows like Windoze ever released...

When follow on support degrades to an untenable level, I'll either switch to
Linux with a Windoze emulator or maybe run whatever the future ruler of the
Universe (Google) tells me to :-)

By then, I figure to be carrying around a holographic 42" 3-D display,
surround sound cell phone anyhow.  BTW, that will have the same form factor
as the "Star trek Next Generation", Communicator.

Fred

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Robert Simpson
Sent: Wednesday, September 17, 2008 9:54 AM
To: 'General Discussion of SQLite Database'
Subject: Re: [sqlite] Vista frustrations


The real frustration is that this seems to be a rather obvious bug in Vista,
and definitely not SQLite's responsibility.  IMO setting the flag is the
"right thing to do" -- but at the same time, I don't expect any favors from
Microsoft in fixing this any time soon.  Meanwhile all those poor Vista
people need SQLite to work well with their OS.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns
Sent: Wednesday, September 17, 2008 1:18 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Robert Simpson wrote:
> To me this seems like an obvious bug in Vista,

Actually I'd argue that it is behaving as designed.  Generally
filesystem code will try to detect what is going on under the hood.  In
particular if it looks like you are doing sequential access(*) then they
will start doing read ahead, whereas read ahead is a waste for random
access.  By using the sequential or random flags you are explicitly
telling the filesystem to ignore its heuristics and do as you say only.

Since SQLite cannot tell in advance whether access is almost entirely
random or almost entirely sequential, it makes far more sense to let the
operating system use its builtin heuristics and optimise accordingly.

(*) Sun's ZFS can even detect strided sequential access - ie reading X
amount of data every Y kilobytes.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0L0vmOOfHg372QQRAk+7AKCEloS0d+xB+M2C/Bap38ilZZ8tVACfciC4
vMfbYMNVV9k6CNR7hpSQo6A=
=AXGU
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Robert Simpson
It's been reproduced on Vista x64 and x86 OS's.  I don't have a pre-SP1
installation to verify if it happens against pre-patched versions.

Getting a database should be easy enough ...

CREATE TABLE FOO(a, b, c, d, e, f);

Then do this about 16 million times or so ... whatever it takes to pump the
db up to the 4-5gb mark:

INSERT INTO FOO VALUES(random(), random(), random(), randomblob(16),
randomblob(16), randomblob(32));




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Sherief N. Farouk
Sent: Wednesday, September 17, 2008 12:38 AM
To: 'General Discussion of SQLite Database'
Subject: Re: [sqlite] Vista frustrations


> I ran some tests using the command-line sqlite3.exe, and observed that
> Windows Vista (SP1) is actually trying to cache the entire 5gb file
> into
> memory during the table scan!  The system slows to a complete crawl and
> becomes unresponsive.  The sqlite3.exe's memory remains very minimal,
> but
> Vista itself eats every last scrap of physical memory, forcing all
> other
> apps to the paging file trying to cache the contents of the database.

[Sherief N. Farouk] 

Any chance you could make a small C program to reproduce this case (via
procedurally creating a similarly-sized database)? Out of curiosity, was
that on x86 or x64?

- Sherief

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Robert Simpson
The real frustration is that this seems to be a rather obvious bug in Vista,
and definitely not SQLite's responsibility.  IMO setting the flag is the
"right thing to do" -- but at the same time, I don't expect any favors from
Microsoft in fixing this any time soon.  Meanwhile all those poor Vista
people need SQLite to work well with their OS.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns
Sent: Wednesday, September 17, 2008 1:18 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Robert Simpson wrote:
> To me this seems like an obvious bug in Vista,

Actually I'd argue that it is behaving as designed.  Generally
filesystem code will try to detect what is going on under the hood.  In
particular if it looks like you are doing sequential access(*) then they
will start doing read ahead, whereas read ahead is a waste for random
access.  By using the sequential or random flags you are explicitly
telling the filesystem to ignore its heuristics and do as you say only.

Since SQLite cannot tell in advance whether access is almost entirely
random or almost entirely sequential, it makes far more sense to let the
operating system use its builtin heuristics and optimise accordingly.

(*) Sun's ZFS can even detect strided sequential access - ie reading X
amount of data every Y kilobytes.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0L0vmOOfHg372QQRAk+7AKCEloS0d+xB+M2C/Bap38ilZZ8tVACfciC4
vMfbYMNVV9k6CNR7hpSQo6A=
=AXGU
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Robert Simpson
The purpose of a cache is to improve performance and responsiveness.  Any
cache that uses all physical memory, forces all other apps to the paging
file and turns the operating system into a brick is definitely not working
as designed.


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns
Sent: Wednesday, September 17, 2008 1:18 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Vista frustrations

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Robert Simpson wrote:
> To me this seems like an obvious bug in Vista,

Actually I'd argue that it is behaving as designed.  Generally
filesystem code will try to detect what is going on under the hood.  In
particular if it looks like you are doing sequential access(*) then they
will start doing read ahead, whereas read ahead is a waste for random
access.  By using the sequential or random flags you are explicitly
telling the filesystem to ignore its heuristics and do as you say only.

Since SQLite cannot tell in advance whether access is almost entirely
random or almost entirely sequential, it makes far more sense to let the
operating system use its builtin heuristics and optimise accordingly.

(*) Sun's ZFS can even detect strided sequential access - ie reading X
amount of data every Y kilobytes.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0L0vmOOfHg372QQRAk+7AKCEloS0d+xB+M2C/Bap38ilZZ8tVACfciC4
vMfbYMNVV9k6CNR7hpSQo6A=
=AXGU
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Filip Navara
"(*) Sun's ZFS can even detect strided sequential access - ie reading X
amount of data every Y kilobytes."

... and so can the NT cache manager since the very first Windows NT
release ;-) It's good to see that people are finally adapting these
features 15 years later.

F.

On Wed, Sep 17, 2008 at 10:17 AM, Roger Binns <[EMAIL PROTECTED]> wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Robert Simpson wrote:
>> To me this seems like an obvious bug in Vista,
>
> Actually I'd argue that it is behaving as designed.  Generally
> filesystem code will try to detect what is going on under the hood.  In
> particular if it looks like you are doing sequential access(*) then they
> will start doing read ahead, whereas read ahead is a waste for random
> access.  By using the sequential or random flags you are explicitly
> telling the filesystem to ignore its heuristics and do as you say only.
>
> Since SQLite cannot tell in advance whether access is almost entirely
> random or almost entirely sequential, it makes far more sense to let the
> operating system use its builtin heuristics and optimise accordingly.
>
> (*) Sun's ZFS can even detect strided sequential access - ie reading X
> amount of data every Y kilobytes.
>
> Roger
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.6 (GNU/Linux)
>
> iD8DBQFI0L0vmOOfHg372QQRAk+7AKCEloS0d+xB+M2C/Bap38ilZZ8tVACfciC4
> vMfbYMNVV9k6CNR7hpSQo6A=
> =AXGU
> -END PGP SIGNATURE-
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Robert Simpson wrote:
> To me this seems like an obvious bug in Vista,

Actually I'd argue that it is behaving as designed.  Generally
filesystem code will try to detect what is going on under the hood.  In
particular if it looks like you are doing sequential access(*) then they
will start doing read ahead, whereas read ahead is a waste for random
access.  By using the sequential or random flags you are explicitly
telling the filesystem to ignore its heuristics and do as you say only.

Since SQLite cannot tell in advance whether access is almost entirely
random or almost entirely sequential, it makes far more sense to let the
operating system use its builtin heuristics and optimise accordingly.

(*) Sun's ZFS can even detect strided sequential access - ie reading X
amount of data every Y kilobytes.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFI0L0vmOOfHg372QQRAk+7AKCEloS0d+xB+M2C/Bap38ilZZ8tVACfciC4
vMfbYMNVV9k6CNR7hpSQo6A=
=AXGU
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Vista frustrations

2008-09-17 Thread Sherief N. Farouk

> I ran some tests using the command-line sqlite3.exe, and observed that
> Windows Vista (SP1) is actually trying to cache the entire 5gb file
> into
> memory during the table scan!  The system slows to a complete crawl and
> becomes unresponsive.  The sqlite3.exe's memory remains very minimal,
> but
> Vista itself eats every last scrap of physical memory, forcing all
> other
> apps to the paging file trying to cache the contents of the database.

[Sherief N. Farouk] 

Any chance you could make a small C program to reproduce this case (via
procedurally creating a similarly-sized database)? Out of curiosity, was
that on x86 or x64?

- Sherief

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Vista frustrations

2008-09-16 Thread Robert Simpson
I recently had a user of the ADO.NET provider report a problem on Vista.
His database is 5gb, and he's doing a full table scan of 16 million rows.
Yea.

 

I ran some tests using the command-line sqlite3.exe, and observed that
Windows Vista (SP1) is actually trying to cache the entire 5gb file into
memory during the table scan!  The system slows to a complete crawl and
becomes unresponsive.  The sqlite3.exe's memory remains very minimal, but
Vista itself eats every last scrap of physical memory, forcing all other
apps to the paging file trying to cache the contents of the database.

 

It took about 500 seconds to scan the entire table, which isn't all that bad
given the circumstances.  Unfortunately Vista's cache remained high and
continued to make the entire system unresponsive until I quit out of the
command-line.

 

After some research, I commented out the FILE_FLAG_RANDOM_ACCESS flag in
os_win.c and re-ran the test.

 

It completed in 99 seconds.  System remained highly responsive, and Vista
never blew out the memory trying to cache the entire file.

 

Now, I know Windows CE benefits greatly from having this flag hint - but has
anyone tested normal desktop performance with or without it?  To me this
seems like an obvious bug in Vista, but the chances of getting Microsoft to
fix it are slim to none.

 

Robert

 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users