Re: [sqlite] Vista frustrations
The SQLite part was an analogy. That must have been beyond you. You can have the last word. You're beyond my help. Fred Williams wrote: > I never said a word aboout SQLite. You ass U Me too much I suspect. > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Mark Spiegel > Sent: Thursday, September 18, 2008 11:25 AM > To: General Discussion of SQLite Database > Subject: Re: [sqlite] Vista frustrations > > > I'm sorry, I have to take issue with that statement. The design of the > file system/cache manager is not "pitiful". It strives to provide good > performance in the entire application space, not just your little corner > of it. It is doing the best it can with the "hint" you've given it. If > another (or no) hint provides better performance in your application, > who's fault is that? Do you realize that without the cache manager, > fast I/O would not be possible? Run on a debug system where only IRP > based I/O is possible any you will be singing another tune in a hurry. > Why do you think these hints are even available? It is to help you > optimize your application. > > The SQLite memory subsystem doesn't work well on my platform I don't > run around calling SQLite "pitiful". I recognize that the authors' > implementation(s) is probably a good performance compromise in the > generic case. If it is a big enough problem (which it is for me), I > write my own version to optimize my performance. While better, the > integer encoding is not as good as it could be for me. Does that mean > the SQLite is pitiful? > > I should also note that as of the last time I talked to her, Molly is no > longer handling the cache manager. I believe she has moved back into > the kernel group after a brief departure, but is working on something > else. I haven't seen the talks that Robert refers to, but suspect they > are close to the versions I have seen in person. I would bet they are > still very useful and relevant. > > Fred Williams wrote: > >> Is a sad day when an application program is forced to compensate for >> > pitiful > >> OS design and performance :-( >> >> -Original Message- >> From: [EMAIL PROTECTED] >> [mailto:[EMAIL PROTECTED] Behalf Of Robert Simpson >> Sent: Thursday, September 18, 2008 10:31 AM >> To: 'General Discussion of SQLite Database' >> Subject: Re: [sqlite] Vista frustrations >> >> >> After watching Molly Brown's Channel9 videos on the cache manager, I'm >> convinced the behavior for SQLite should be to not give the filesystem any >> hints about caching and let the cache manager itself figure it out. The >> exception being Windows CE, where we can confirm that when this flag is >> > not > >> set, the device will use compression in memory and degrade performance. >> >> If that's the general consensus, I'll open a ticket. >> >> -Original Message- >> From: [EMAIL PROTECTED] >> [mailto:[EMAIL PROTECTED] On Behalf Of Mark Spiegel >> Sent: Thursday, September 18, 2008 7:56 AM >> To: [EMAIL PROTECTED]; General Discussion of SQLite Database >> Subject: Re: [sqlite] Vista frustrations >> >> FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the >> cache manager (CC) in Windows and the underlying file system(s). With >> respect to the cache manager, it is going to affect whether or not there >> is read ahead, how much read ahead will be used, and how long data will >> remain in the cache (or another way, how quickly it will be dropped). >> It has been some time since I've talked to the Queen of Cache Manger >> about this, but as I recall CC will try to figure out what you are doing >> if you don't give it a hint. If you do give it a hint, then it is going >> to run with that hint no matter what the cost. Note that CC or the file >> system are perfectly within their right to ignore your hints. CC >> generally does honor them. NTFS, well that's another matter. >> >> It has been MY experience (YMMV) that database and temp file reads are >> fairly random. Database files also have the "nice" property that read >> and writes are often sector (page) aligned. Journal files should be >> opened for sequential scan and are generally not sector (page) aligned. >> Setting SQLite aside for a moment, for very large files that are only >> going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show >> huge performance gains. However, if most or all of a file is going to >> be touched, even in random order, then it doesn't get you much and can >&
Re: [sqlite] Vista frustrations
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Jay A. Kreibich wrote: > SQLite is not "lying." After poking around a bit to refresh my > understanding of SQLite's file structure, I think it is safe to say > that SQLite will almost never do a sequential file read, even if > you're doing a sequential table scan. > > sequential table scan != sequential file access That statement is true in general but in this specific case we are talking about a multi-gigabyte database. It would be fair to say that large chunks of the database would have been created in a transaction and that there wouldn't have been a lot of row updates/deletes/inserts throughout it causing something similar to fragmentation. I made the vfs track sequential vs random read accesses and then created a 3 column table with each row being a random int, a string between 0 and 1024 chars long and another random int. A 2,000,000 row database occupied 1.4GB. Page/cache size etc were all left at their defaults. Testing was done on 64 bit Linux. create table foo(x,y,z) Full scan query: select max(z) from foo seq 1,298,231 random 93,599 - 93% sequential Create index: create index foo1 on foo(z) seq 244,920 random 3,261,252 - 7% sequential Indexed query: select max(z) from foo where z>9900 seq 0 random 7- 0% sequential In this particular case SQLite is very much lying when a full table scan was done as the vast majority of access is indeed sequential. Of course several factors will affect this such as typical size of each row, schema, how much prior "thrashing" about of contents the database has had etc. I also didn't count nearly sequential access, for example if the operating system does reads in 4kb chunks and one read was for the first kb and the next read was for the third kb. An example of how the numbers can be skewed for the full scan: %Sequential StringSize PageSize 8 1-2KB 1KB 99 0-1KB 4KB 99.9 0-1KB 32KB 97 0-.5KB 1KB It is fair to say that SQLite does not know in advance what the access pattern is going to be, especially without analysing the database contents, rows per page and queries that will be executed. Queries that predominantly use indices (a desirable trait) are mostly random access. Full table scans where the row fits within the page are mostly sequential. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0xBomOOfHg372QQRAnXXAJ0VGwzzrOwkWF0q4Za2QpuTKnQD+QCfblin DeHukDoCs809uawnlt5GYLc= =l25n -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
I never said a word aboout SQLite. You ass U Me too much I suspect. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Mark Spiegel Sent: Thursday, September 18, 2008 11:25 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations I'm sorry, I have to take issue with that statement. The design of the file system/cache manager is not "pitiful". It strives to provide good performance in the entire application space, not just your little corner of it. It is doing the best it can with the "hint" you've given it. If another (or no) hint provides better performance in your application, who's fault is that? Do you realize that without the cache manager, fast I/O would not be possible? Run on a debug system where only IRP based I/O is possible any you will be singing another tune in a hurry. Why do you think these hints are even available? It is to help you optimize your application. The SQLite memory subsystem doesn't work well on my platform I don't run around calling SQLite "pitiful". I recognize that the authors' implementation(s) is probably a good performance compromise in the generic case. If it is a big enough problem (which it is for me), I write my own version to optimize my performance. While better, the integer encoding is not as good as it could be for me. Does that mean the SQLite is pitiful? I should also note that as of the last time I talked to her, Molly is no longer handling the cache manager. I believe she has moved back into the kernel group after a brief departure, but is working on something else. I haven't seen the talks that Robert refers to, but suspect they are close to the versions I have seen in person. I would bet they are still very useful and relevant. Fred Williams wrote: > Is a sad day when an application program is forced to compensate for pitiful > OS design and performance :-( > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Robert Simpson > Sent: Thursday, September 18, 2008 10:31 AM > To: 'General Discussion of SQLite Database' > Subject: Re: [sqlite] Vista frustrations > > > After watching Molly Brown's Channel9 videos on the cache manager, I'm > convinced the behavior for SQLite should be to not give the filesystem any > hints about caching and let the cache manager itself figure it out. The > exception being Windows CE, where we can confirm that when this flag is not > set, the device will use compression in memory and degrade performance. > > If that's the general consensus, I'll open a ticket. > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Mark Spiegel > Sent: Thursday, September 18, 2008 7:56 AM > To: [EMAIL PROTECTED]; General Discussion of SQLite Database > Subject: Re: [sqlite] Vista frustrations > > FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the > cache manager (CC) in Windows and the underlying file system(s). With > respect to the cache manager, it is going to affect whether or not there > is read ahead, how much read ahead will be used, and how long data will > remain in the cache (or another way, how quickly it will be dropped). > It has been some time since I've talked to the Queen of Cache Manger > about this, but as I recall CC will try to figure out what you are doing > if you don't give it a hint. If you do give it a hint, then it is going > to run with that hint no matter what the cost. Note that CC or the file > system are perfectly within their right to ignore your hints. CC > generally does honor them. NTFS, well that's another matter. > > It has been MY experience (YMMV) that database and temp file reads are > fairly random. Database files also have the "nice" property that read > and writes are often sector (page) aligned. Journal files should be > opened for sequential scan and are generally not sector (page) aligned. > Setting SQLite aside for a moment, for very large files that are only > going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show > huge performance gains. However, if most or all of a file is going to > be touched, even in random order, then it doesn't get you much and can > hurt you. Most SQLite data bases _probably_ fall into that second > case. If you have enough memory and a small enough file such that the > cache manager can hold the entire file, you are golden. That's why some > people see such great SQLite performance by just sequentially reading > their DB files before running their SQLite application. > > The elephants in the room with that previous paragraph is 1) the amount > of RAM in the system and 2) the other applications running. Windows > will try to share its resources among all the
Re: [sqlite] Vista frustrations
I'm sorry, I have to take issue with that statement. The design of the file system/cache manager is not "pitiful". It strives to provide good performance in the entire application space, not just your little corner of it. It is doing the best it can with the "hint" you've given it. If another (or no) hint provides better performance in your application, who's fault is that? Do you realize that without the cache manager, fast I/O would not be possible? Run on a debug system where only IRP based I/O is possible any you will be singing another tune in a hurry. Why do you think these hints are even available? It is to help you optimize your application. The SQLite memory subsystem doesn't work well on my platform I don't run around calling SQLite "pitiful". I recognize that the authors' implementation(s) is probably a good performance compromise in the generic case. If it is a big enough problem (which it is for me), I write my own version to optimize my performance. While better, the integer encoding is not as good as it could be for me. Does that mean the SQLite is pitiful? I should also note that as of the last time I talked to her, Molly is no longer handling the cache manager. I believe she has moved back into the kernel group after a brief departure, but is working on something else. I haven't seen the talks that Robert refers to, but suspect they are close to the versions I have seen in person. I would bet they are still very useful and relevant. Fred Williams wrote: > Is a sad day when an application program is forced to compensate for pitiful > OS design and performance :-( > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Robert Simpson > Sent: Thursday, September 18, 2008 10:31 AM > To: 'General Discussion of SQLite Database' > Subject: Re: [sqlite] Vista frustrations > > > After watching Molly Brown's Channel9 videos on the cache manager, I'm > convinced the behavior for SQLite should be to not give the filesystem any > hints about caching and let the cache manager itself figure it out. The > exception being Windows CE, where we can confirm that when this flag is not > set, the device will use compression in memory and degrade performance. > > If that's the general consensus, I'll open a ticket. > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Mark Spiegel > Sent: Thursday, September 18, 2008 7:56 AM > To: [EMAIL PROTECTED]; General Discussion of SQLite Database > Subject: Re: [sqlite] Vista frustrations > > FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the > cache manager (CC) in Windows and the underlying file system(s). With > respect to the cache manager, it is going to affect whether or not there > is read ahead, how much read ahead will be used, and how long data will > remain in the cache (or another way, how quickly it will be dropped). > It has been some time since I've talked to the Queen of Cache Manger > about this, but as I recall CC will try to figure out what you are doing > if you don't give it a hint. If you do give it a hint, then it is going > to run with that hint no matter what the cost. Note that CC or the file > system are perfectly within their right to ignore your hints. CC > generally does honor them. NTFS, well that's another matter. > > It has been MY experience (YMMV) that database and temp file reads are > fairly random. Database files also have the "nice" property that read > and writes are often sector (page) aligned. Journal files should be > opened for sequential scan and are generally not sector (page) aligned. > Setting SQLite aside for a moment, for very large files that are only > going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show > huge performance gains. However, if most or all of a file is going to > be touched, even in random order, then it doesn't get you much and can > hurt you. Most SQLite data bases _probably_ fall into that second > case. If you have enough memory and a small enough file such that the > cache manager can hold the entire file, you are golden. That's why some > people see such great SQLite performance by just sequentially reading > their DB files before running their SQLite application. > > The elephants in the room with that previous paragraph is 1) the amount > of RAM in the system and 2) the other applications running. Windows > will try to share its resources among all the applications running as > best it can. > > I have not seen any "bugs" in SQLite in this area. It gives a > reasonable hint for the general case. To be fair however, I should note > that I have my own VFS. It does unbuffered I/O for database files and
Re: [sqlite] Vista frustrations
Designing operating systems is hard work, and I don't envy the teams that do it. I've found my Vista experience to be rather enjoyable, especially post SP1. I don't think there's an OS out today that isn't annoying in one fashion or another. Having an application or library implement a workaround for an OS design flaw is not unique to Windows by any stretch. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Fred Williams Sent: Thursday, September 18, 2008 8:54 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations Is a sad day when an application program is forced to compensate for pitiful OS design and performance :-( -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Robert Simpson Sent: Thursday, September 18, 2008 10:31 AM To: 'General Discussion of SQLite Database' Subject: Re: [sqlite] Vista frustrations After watching Molly Brown's Channel9 videos on the cache manager, I'm convinced the behavior for SQLite should be to not give the filesystem any hints about caching and let the cache manager itself figure it out. The exception being Windows CE, where we can confirm that when this flag is not set, the device will use compression in memory and degrade performance. If that's the general consensus, I'll open a ticket. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mark Spiegel Sent: Thursday, September 18, 2008 7:56 AM To: [EMAIL PROTECTED]; General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the cache manager (CC) in Windows and the underlying file system(s). With respect to the cache manager, it is going to affect whether or not there is read ahead, how much read ahead will be used, and how long data will remain in the cache (or another way, how quickly it will be dropped). It has been some time since I've talked to the Queen of Cache Manger about this, but as I recall CC will try to figure out what you are doing if you don't give it a hint. If you do give it a hint, then it is going to run with that hint no matter what the cost. Note that CC or the file system are perfectly within their right to ignore your hints. CC generally does honor them. NTFS, well that's another matter. It has been MY experience (YMMV) that database and temp file reads are fairly random. Database files also have the "nice" property that read and writes are often sector (page) aligned. Journal files should be opened for sequential scan and are generally not sector (page) aligned. Setting SQLite aside for a moment, for very large files that are only going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show huge performance gains. However, if most or all of a file is going to be touched, even in random order, then it doesn't get you much and can hurt you. Most SQLite data bases _probably_ fall into that second case. If you have enough memory and a small enough file such that the cache manager can hold the entire file, you are golden. That's why some people see such great SQLite performance by just sequentially reading their DB files before running their SQLite application. The elephants in the room with that previous paragraph is 1) the amount of RAM in the system and 2) the other applications running. Windows will try to share its resources among all the applications running as best it can. I have not seen any "bugs" in SQLite in this area. It gives a reasonable hint for the general case. To be fair however, I should note that I have my own VFS. It does unbuffered I/O for database files and sequential, cached I/O for journal files. If you think you can get better performance with different flags, create your own VFS, starting with the Windows VFS and make the changes. You can get as sophisticated with your hints as you want. You can write your own caching system if you've ingested way too much caffeine. (Did I mention that the VFS stuff is great!) I would not as a general rule advise people (customers) to change the way their Windows system caches globally for the benefit of one of your applications. Eventually, that is going to bite you with some support calls. Jay A. Kreibich wrote: > On Wed, Sep 17, 2008 at 06:00:45PM -0700, Roger Binns scratched on the wall: > > >> The second is that SQLite when opening a file under Windows explicitly >> tells Windows that the file will be used for random access even though >> that is not the case. Windows uses this hint to override its builtin >> heuristics which can cause bug #1. >> > > >> Bug #2 is that SQLite is lying to the operating system and could result >> in performance degradation if the operating system actually pays >> attention to the hint. >> > > SQLite is not "lying." After
Re: [sqlite] Vista frustrations
Is a sad day when an application program is forced to compensate for pitiful OS design and performance :-( -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Robert Simpson Sent: Thursday, September 18, 2008 10:31 AM To: 'General Discussion of SQLite Database' Subject: Re: [sqlite] Vista frustrations After watching Molly Brown's Channel9 videos on the cache manager, I'm convinced the behavior for SQLite should be to not give the filesystem any hints about caching and let the cache manager itself figure it out. The exception being Windows CE, where we can confirm that when this flag is not set, the device will use compression in memory and degrade performance. If that's the general consensus, I'll open a ticket. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mark Spiegel Sent: Thursday, September 18, 2008 7:56 AM To: [EMAIL PROTECTED]; General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the cache manager (CC) in Windows and the underlying file system(s). With respect to the cache manager, it is going to affect whether or not there is read ahead, how much read ahead will be used, and how long data will remain in the cache (or another way, how quickly it will be dropped). It has been some time since I've talked to the Queen of Cache Manger about this, but as I recall CC will try to figure out what you are doing if you don't give it a hint. If you do give it a hint, then it is going to run with that hint no matter what the cost. Note that CC or the file system are perfectly within their right to ignore your hints. CC generally does honor them. NTFS, well that's another matter. It has been MY experience (YMMV) that database and temp file reads are fairly random. Database files also have the "nice" property that read and writes are often sector (page) aligned. Journal files should be opened for sequential scan and are generally not sector (page) aligned. Setting SQLite aside for a moment, for very large files that are only going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show huge performance gains. However, if most or all of a file is going to be touched, even in random order, then it doesn't get you much and can hurt you. Most SQLite data bases _probably_ fall into that second case. If you have enough memory and a small enough file such that the cache manager can hold the entire file, you are golden. That's why some people see such great SQLite performance by just sequentially reading their DB files before running their SQLite application. The elephants in the room with that previous paragraph is 1) the amount of RAM in the system and 2) the other applications running. Windows will try to share its resources among all the applications running as best it can. I have not seen any "bugs" in SQLite in this area. It gives a reasonable hint for the general case. To be fair however, I should note that I have my own VFS. It does unbuffered I/O for database files and sequential, cached I/O for journal files. If you think you can get better performance with different flags, create your own VFS, starting with the Windows VFS and make the changes. You can get as sophisticated with your hints as you want. You can write your own caching system if you've ingested way too much caffeine. (Did I mention that the VFS stuff is great!) I would not as a general rule advise people (customers) to change the way their Windows system caches globally for the benefit of one of your applications. Eventually, that is going to bite you with some support calls. Jay A. Kreibich wrote: > On Wed, Sep 17, 2008 at 06:00:45PM -0700, Roger Binns scratched on the wall: > > >> The second is that SQLite when opening a file under Windows explicitly >> tells Windows that the file will be used for random access even though >> that is not the case. Windows uses this hint to override its builtin >> heuristics which can cause bug #1. >> > > >> Bug #2 is that SQLite is lying to the operating system and could result >> in performance degradation if the operating system actually pays >> attention to the hint. >> > > SQLite is not "lying." After poking around a bit to refresh my > understanding of SQLite's file structure, I think it is safe to say > that SQLite will almost never do a sequential file read, even if > you're doing a sequential table scan. > > sequential table scan != sequential file access > > There are some specific situations when you might get bursts of sequential > reads, but only for very specific page layouts with very specific > types of queries. In short, not the common case. Furthermore, even > those patterns can get broken up and shuffled around depending on the &
Re: [sqlite] Vista frustrations
FILE_FLAG_RANDOM_ACCESS and FILE_FLAG_SEQUENTIAL_SCAN are hints to the cache manager (CC) in Windows and the underlying file system(s). With respect to the cache manager, it is going to affect whether or not there is read ahead, how much read ahead will be used, and how long data will remain in the cache (or another way, how quickly it will be dropped). It has been some time since I've talked to the Queen of Cache Manger about this, but as I recall CC will try to figure out what you are doing if you don't give it a hint. If you do give it a hint, then it is going to run with that hint no matter what the cost. Note that CC or the file system are perfectly within their right to ignore your hints. CC generally does honor them. NTFS, well that's another matter. It has been MY experience (YMMV) that database and temp file reads are fairly random. Database files also have the "nice" property that read and writes are often sector (page) aligned. Journal files should be opened for sequential scan and are generally not sector (page) aligned. Setting SQLite aside for a moment, for very large files that are only going to be touched in a few places FILE_FLAG_RANDOM_ACCESS can show huge performance gains. However, if most or all of a file is going to be touched, even in random order, then it doesn't get you much and can hurt you. Most SQLite data bases _probably_ fall into that second case. If you have enough memory and a small enough file such that the cache manager can hold the entire file, you are golden. That's why some people see such great SQLite performance by just sequentially reading their DB files before running their SQLite application. The elephants in the room with that previous paragraph is 1) the amount of RAM in the system and 2) the other applications running. Windows will try to share its resources among all the applications running as best it can. I have not seen any "bugs" in SQLite in this area. It gives a reasonable hint for the general case. To be fair however, I should note that I have my own VFS. It does unbuffered I/O for database files and sequential, cached I/O for journal files. If you think you can get better performance with different flags, create your own VFS, starting with the Windows VFS and make the changes. You can get as sophisticated with your hints as you want. You can write your own caching system if you've ingested way too much caffeine. (Did I mention that the VFS stuff is great!) I would not as a general rule advise people (customers) to change the way their Windows system caches globally for the benefit of one of your applications. Eventually, that is going to bite you with some support calls. Jay A. Kreibich wrote: > On Wed, Sep 17, 2008 at 06:00:45PM -0700, Roger Binns scratched on the wall: > > >> The second is that SQLite when opening a file under Windows explicitly >> tells Windows that the file will be used for random access even though >> that is not the case. Windows uses this hint to override its builtin >> heuristics which can cause bug #1. >> > > >> Bug #2 is that SQLite is lying to the operating system and could result >> in performance degradation if the operating system actually pays >> attention to the hint. >> > > SQLite is not "lying." After poking around a bit to refresh my > understanding of SQLite's file structure, I think it is safe to say > that SQLite will almost never do a sequential file read, even if > you're doing a sequential table scan. > > sequential table scan != sequential file access > > There are some specific situations when you might get bursts of sequential > reads, but only for very specific page layouts with very specific > types of queries. In short, not the common case. Furthermore, even > those patterns can get broken up and shuffled around depending on the > state of SQLite's page cache-- especially if it is bumped up a few > dozen megs. So simply running different types of queries can change > the access patterns (this is true of the OS's file system cache as > well, of course). > > It might be worth instrumenting a few systems and having a look, but > in general, if you had to label SQLite's access pattern, I think > "random" would be the most appropriate label. > > > > I also contend that if the Windows file cache becomes some kind of > bumbling idiot if you actually try to define an access pattern, then > something is wrong. There is a very good reason why the POSIX > functions for doing this kind of thing are called "*advise()". You > might seed the heuristic statistics in a specific direction, but they > should never be totally over-ridden. That quickly leads to stupid > behaviors, like grabbing all the RAM on the system and not letting go. > > > > Of course, we could argue philosophy for a long time. In the here > and now to work around MS's inconsistencies, it looks like the best >
Re: [sqlite] Vista frustrations
A full table scan of 16 million rows and 55 columns on Vista with FILE_FLAG_RANDOM_ACCESS finished in 600 seconds (cold) and 499 seconds (warm) and turned Vista into a brick. The same test with the flag removed ran in 119 seconds (cold) and 99 seconds (warm), and caused no adverse problems with the OS. Not even Sql Server could do a full table scan of the same data that fast! Same test on Sql Server 2005 on the same machine ran in a minimum of 180 seconds, and averaged 220 seconds! Oh and Sql Server's memory usage shot to 3gb as well. Fun stuff. I ran the same tests with SQLite on an XP laptop. I got 618 seconds with FILE_FLAG_RANDOM_ACCESS, and 599 seconds without the flag. Not so dramatic of a difference, but still in the right direction. So I'd agree that the hint should only be used for CE, and no other platforms. Roger quoted the CE compression blurb from this MSDN page: http://msdn.microsoft.com/en-us/library/ms923949.aspx Robert -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns Sent: Wednesday, September 17, 2008 4:24 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Robert Simpson wrote: > Ideally, at least on non-CE platforms, I'd like see SQLite not give the OS > any hints about caching. However, I'm not sure what kind of performance hit > (if any) that would have on Windows. It's already been proven that > providing the hint on WinCE is beneficial. I agree. Open a ticket and include measurements in it. With Windows CE, setting the flag prevents compression: Using the FILE_FLAG_RANDOM_ACCESS flag in the RAM file system, which places files in the object store, will prevent a file from being compressed. If performance is an issue, this could be the correct solution. Read and write operations to a compressed file are slower than read and write operations to an uncompressed file. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0ZGemOOfHg372QQRAtHSAJ9fSAXY3ekwrK4g9/eTrcYYQInW6QCgvSkB e00SnZbtTmQQ1G8ZxEVTK1M= =Y3uW -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
On Wed, Sep 17, 2008 at 06:00:45PM -0700, Roger Binns scratched on the wall: > The second is that SQLite when opening a file under Windows explicitly > tells Windows that the file will be used for random access even though > that is not the case. Windows uses this hint to override its builtin > heuristics which can cause bug #1. > Bug #2 is that SQLite is lying to the operating system and could result > in performance degradation if the operating system actually pays > attention to the hint. SQLite is not "lying." After poking around a bit to refresh my understanding of SQLite's file structure, I think it is safe to say that SQLite will almost never do a sequential file read, even if you're doing a sequential table scan. sequential table scan != sequential file access There are some specific situations when you might get bursts of sequential reads, but only for very specific page layouts with very specific types of queries. In short, not the common case. Furthermore, even those patterns can get broken up and shuffled around depending on the state of SQLite's page cache-- especially if it is bumped up a few dozen megs. So simply running different types of queries can change the access patterns (this is true of the OS's file system cache as well, of course). It might be worth instrumenting a few systems and having a look, but in general, if you had to label SQLite's access pattern, I think "random" would be the most appropriate label. I also contend that if the Windows file cache becomes some kind of bumbling idiot if you actually try to define an access pattern, then something is wrong. There is a very good reason why the POSIX functions for doing this kind of thing are called "*advise()". You might seed the heuristic statistics in a specific direction, but they should never be totally over-ridden. That quickly leads to stupid behaviors, like grabbing all the RAM on the system and not letting go. Of course, we could argue philosophy for a long time. In the here and now to work around MS's inconsistencies, it looks like the best bet is turn it on with CE and off on Vista, because it appears to have two totally different meanings. -j, -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Our opponent is an alien starship packed with atomic bombs. We have a protractor." "I'll go home and see if I can scrounge up a ruler and a piece of string." --from Anathem by Neal Stephenson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Brown, Daniel wrote: > Has anyone tried to replicate this bug on WindowsXP too? There are two separate bugs here. One is that when Vista is told a file is used for random access and then there is a lot of file access, it may use a little too much memory for cache instead of applications which causes those application to be sluggish sometimes to the point of unusability. The second is that SQLite when opening a file under Windows explicitly tells Windows that the file will be used for random access even though that is not the case. Windows uses this hint to override its builtin heuristics which can cause bug #1. Bug #1 will not happen under default Windows XP configurations as XP uses 10 MB of memory for file caching no matter how much RAM you have. See "Zero" on http://www.sqlite.org/cvstrac/wiki/wiki?p=PerformanceTuningWindows Bug #2 is that SQLite is lying to the operating system and could result in performance degradation if the operating system actually pays attention to the hint. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0ag9mOOfHg372QQRAvCuAJ4uVYifeD7PsEVuyN2Mt1VFdqfHjwCgv3tj /P62UEnqepRAfhV2QczfBKs= =9ZV0 -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
Has anyone tried to replicate this bug on WindowsXP too? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns Sent: Wednesday, September 17, 2008 4:24 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Robert Simpson wrote: > Ideally, at least on non-CE platforms, I'd like see SQLite not give the OS > any hints about caching. However, I'm not sure what kind of performance hit > (if any) that would have on Windows. It's already been proven that > providing the hint on WinCE is beneficial. I agree. Open a ticket and include measurements in it. With Windows CE, setting the flag prevents compression: Using the FILE_FLAG_RANDOM_ACCESS flag in the RAM file system, which places files in the object store, will prevent a file from being compressed. If performance is an issue, this could be the correct solution. Read and write operations to a compressed file are slower than read and write operations to an uncompressed file. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0ZGemOOfHg372QQRAtHSAJ9fSAXY3ekwrK4g9/eTrcYYQInW6QCgvSkB e00SnZbtTmQQ1G8ZxEVTK1M= =Y3uW -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Robert Simpson wrote: > Ideally, at least on non-CE platforms, I'd like see SQLite not give the OS > any hints about caching. However, I'm not sure what kind of performance hit > (if any) that would have on Windows. It's already been proven that > providing the hint on WinCE is beneficial. I agree. Open a ticket and include measurements in it. With Windows CE, setting the flag prevents compression: Using the FILE_FLAG_RANDOM_ACCESS flag in the RAM file system, which places files in the object store, will prevent a file from being compressed. If performance is an issue, this could be the correct solution. Read and write operations to a compressed file are slower than read and write operations to an uncompressed file. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0ZGemOOfHg372QQRAtHSAJ9fSAXY3ekwrK4g9/eTrcYYQInW6QCgvSkB e00SnZbtTmQQ1G8ZxEVTK1M= =Y3uW -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Fred Williams wrote: > Have you ever actually used a version of Windows? Windows 1.0 (once), Windows 2/286, Windows 3, 3.1, 3.11, 95, 98, ME, 2000, XP, 2003 and Vista. > ANY OS that attempts to read in a xGigibyte file into real memory to the > detriment of the entire system load is not working correctly. Call it a bug > or a feature it still sucks. If the other tasks aren't doing anything then it slows down the reading task by not using using as much memory as is reasonable. For example not using half the RAM for the file caching just in case a background task may came to life in the future will slow down the file task which then increases the probability that a background task will be needed. Getting this stuff right is not easy. The kernel cannot predict the future and instead has to use heuristics which involve tradeoffs of throughput against latency. There will always be cases where the heuristics get it wrong, but the goal is to get it right the vast majority of the time. Linux has exactly the same issue. Ultimately a kernel variable named swappiness is used with a default value, but is administrator adjustable because even they couldn't work a single correct value for every situation. http://lwn.net/Articles/83588/ Everyone recognises the degenerate cases are bad, but even then it is questionable. Do you want the file task to take 1 minute and then have all your other tasks take 10 seconds to become fully responsive, or do you want your file task to take 5 minutes while your other tasks are mostly responsive? If you have some algorithm, parameters and heuristics that work under all workloads and configurations then I am sure the operating systems community would love to hear from you. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0Y/lmOOfHg372QQRAlMbAKCB3acOdcnIw2JRIGrrOh//jw/0rgCdG2AQ SvyDyo3K3fsxtxhYgsAGVeU= =X+SR -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
I am running vista my personal pc. I'll try to take a look into how not having this flag affects performance when I get home. On Wed, Sep 17, 2008 at 5:29 PM, Robert Simpson <[EMAIL PROTECTED]> wrote: > Ideally, at least on non-CE platforms, I'd like see SQLite not give the OS > any hints about caching. However, I'm not sure what kind of performance hit > (if any) that would have on Windows. It's already been proven that > providing the hint on WinCE is beneficial. > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns > Sent: Wednesday, September 17, 2008 2:02 PM > To: General Discussion of SQLite Database > Subject: Re: [sqlite] Vista frustrations > > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > Robert Simpson wrote: >> The purpose of a cache is to improve performance and responsiveness. Any >> cache that uses all physical memory, forces all other apps to the paging >> file > > All current operating systems do this, using heuristics to determine how > much of each running application to leave in memory while using "spare" > memory for filesystem cache and dynamically changing allocation based on > demand. Sometimes they mess up to the detriment of idle applications. > However that is not the bit I am talking about as it is not controlled > by flags when opening files and instead is part of the core operating > system code. (You often see this issue when copying files larger than RAM). > > What is under discussion is how the operating system is using the cache > that it does decide to allocate for SQLite. If you tell it that a file > is sequential access then that means that read ahead is good and that > data already read can be discarded. If you tell it that a file is > random access then read ahead is bad (it has the disk occupied when the > next random request comes in) and already read data should be kept. > SQLite does not know and should be leaving it up to the operating > system. Your tests prove that when that is done on Vista, performance > of SQLite is better and other applications are less adversely affected. > >> The real frustration is that this seems to be a rather obvious bug in > Vista, >> and definitely not SQLite's responsibility. > > This is conflating two issues. One is the tradeoff between RAM usage > for cache vs idle applications. You can argue that is a bug, or more > accurately there are circumstances under which the tradeoff picks wrong > values, and is very hard to get right. Example underlying details are > at http://blogs.technet.com/markrussinovich/archive/2008/02/04/2826167.aspx > > The second issue is the performance differences when random access, > sequential and no flags are given while opening the file and the > resulting performance. In this case a file is accessed mostly > sequentially but the random access flag is given. Performance was worse > than letting the operating system use its own heuristics. That is > hardly surprising or a bug. > > Roger > > -BEGIN PGP SIGNATURE- > Version: GnuPG v1.4.6 (GNU/Linux) > > iD8DBQFI0XBYmOOfHg372QQRAmKfAJ4pWVfZ8LS2ET+Y55FGT3Am9wCfZg4e > McaZrcLUXQJhU7i1Gw5+cRk= > =CQg5 > -END PGP SIGNATURE- > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
Ideally, at least on non-CE platforms, I'd like see SQLite not give the OS any hints about caching. However, I'm not sure what kind of performance hit (if any) that would have on Windows. It's already been proven that providing the hint on WinCE is beneficial. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns Sent: Wednesday, September 17, 2008 2:02 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Robert Simpson wrote: > The purpose of a cache is to improve performance and responsiveness. Any > cache that uses all physical memory, forces all other apps to the paging > file All current operating systems do this, using heuristics to determine how much of each running application to leave in memory while using "spare" memory for filesystem cache and dynamically changing allocation based on demand. Sometimes they mess up to the detriment of idle applications. However that is not the bit I am talking about as it is not controlled by flags when opening files and instead is part of the core operating system code. (You often see this issue when copying files larger than RAM). What is under discussion is how the operating system is using the cache that it does decide to allocate for SQLite. If you tell it that a file is sequential access then that means that read ahead is good and that data already read can be discarded. If you tell it that a file is random access then read ahead is bad (it has the disk occupied when the next random request comes in) and already read data should be kept. SQLite does not know and should be leaving it up to the operating system. Your tests prove that when that is done on Vista, performance of SQLite is better and other applications are less adversely affected. > The real frustration is that this seems to be a rather obvious bug in Vista, > and definitely not SQLite's responsibility. This is conflating two issues. One is the tradeoff between RAM usage for cache vs idle applications. You can argue that is a bug, or more accurately there are circumstances under which the tradeoff picks wrong values, and is very hard to get right. Example underlying details are at http://blogs.technet.com/markrussinovich/archive/2008/02/04/2826167.aspx The second issue is the performance differences when random access, sequential and no flags are given while opening the file and the resulting performance. In this case a file is accessed mostly sequentially but the random access flag is given. Performance was worse than letting the operating system use its own heuristics. That is hardly surprising or a bug. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0XBYmOOfHg372QQRAmKfAJ4pWVfZ8LS2ET+Y55FGT3Am9wCfZg4e McaZrcLUXQJhU7i1Gw5+cRk= =CQg5 -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
Have you ever actually used a version of Windows? ANY OS that attempts to read in a xGigibyte file into real memory to the detriment of the entire system load is not working correctly. Call it a bug or a feature it still sucks. I expect nothing less from Microsoft with each new version. I recommend you wait for, it looks like, maybe, Service Pack 6 before you get serious about Vista. Fred -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Roger Binns Sent: Wednesday, September 17, 2008 4:02 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Robert Simpson wrote: > The purpose of a cache is to improve performance and responsiveness. Any > cache that uses all physical memory, forces all other apps to the paging > file All current operating systems do this, using heuristics to determine how much of each running application to leave in memory while using "spare" memory for filesystem cache and dynamically changing allocation based on demand. Sometimes they mess up to the detriment of idle applications. However that is not the bit I am talking about as it is not controlled by flags when opening files and instead is part of the core operating system code. (You often see this issue when copying files larger than RAM). What is under discussion is how the operating system is using the cache that it does decide to allocate for SQLite. If you tell it that a file is sequential access then that means that read ahead is good and that data already read can be discarded. If you tell it that a file is random access then read ahead is bad (it has the disk occupied when the next random request comes in) and already read data should be kept. SQLite does not know and should be leaving it up to the operating system. Your tests prove that when that is done on Vista, performance of SQLite is better and other applications are less adversely affected. > The real frustration is that this seems to be a rather obvious bug in Vista, > and definitely not SQLite's responsibility. This is conflating two issues. One is the tradeoff between RAM usage for cache vs idle applications. You can argue that is a bug, or more accurately there are circumstances under which the tradeoff picks wrong values, and is very hard to get right. Example underlying details are at http://blogs.technet.com/markrussinovich/archive/2008/02/04/2826167.aspx The second issue is the performance differences when random access, sequential and no flags are given while opening the file and the resulting performance. In this case a file is accessed mostly sequentially but the random access flag is given. Performance was worse than letting the operating system use its own heuristics. That is hardly surprising or a bug. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0XBYmOOfHg372QQRAmKfAJ4pWVfZ8LS2ET+Y55FGT3Am9wCfZg4e McaZrcLUXQJhU7i1Gw5+cRk= =CQg5 -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Robert Simpson wrote: > The purpose of a cache is to improve performance and responsiveness. Any > cache that uses all physical memory, forces all other apps to the paging > file All current operating systems do this, using heuristics to determine how much of each running application to leave in memory while using "spare" memory for filesystem cache and dynamically changing allocation based on demand. Sometimes they mess up to the detriment of idle applications. However that is not the bit I am talking about as it is not controlled by flags when opening files and instead is part of the core operating system code. (You often see this issue when copying files larger than RAM). What is under discussion is how the operating system is using the cache that it does decide to allocate for SQLite. If you tell it that a file is sequential access then that means that read ahead is good and that data already read can be discarded. If you tell it that a file is random access then read ahead is bad (it has the disk occupied when the next random request comes in) and already read data should be kept. SQLite does not know and should be leaving it up to the operating system. Your tests prove that when that is done on Vista, performance of SQLite is better and other applications are less adversely affected. > The real frustration is that this seems to be a rather obvious bug in Vista, > and definitely not SQLite's responsibility. This is conflating two issues. One is the tradeoff between RAM usage for cache vs idle applications. You can argue that is a bug, or more accurately there are circumstances under which the tradeoff picks wrong values, and is very hard to get right. Example underlying details are at http://blogs.technet.com/markrussinovich/archive/2008/02/04/2826167.aspx The second issue is the performance differences when random access, sequential and no flags are given while opening the file and the resulting performance. In this case a file is accessed mostly sequentially but the random access flag is given. Performance was worse than letting the operating system use its own heuristics. That is hardly surprising or a bug. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0XBYmOOfHg372QQRAmKfAJ4pWVfZ8LS2ET+Y55FGT3Am9wCfZg4e McaZrcLUXQJhU7i1Gw5+cRk= =CQg5 -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
Sure. I just tell to do this test to check if the bug is related to this component, since it debuted on Vista. Virgilio Alexandre Fornazin High performance and realtime systems development Rua Brigadeiro Vicente Faria Lima, 268 Bela VistaLeme-SPCEP 13611-485 Phone: +55 19 3571-5573 Cell: +55 19 8111-4053 +55 11 8357 1491 Mail: [EMAIL PROTECTED] Web: http://www.fornazinconsultoria.com.br -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey Becker Sent: quarta-feira, 17 de setembro de 2008 13:41 To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations SQLite in general and the .Net provider in particular are most often shipped as components of other applications. I dont think having developers tell their end users to disable superfetch is a viable solution. As much as I hate to propose this maybe a runtime check is in order to see what the OS version is and not use the flag where it's known to be problematic. On Wed, Sep 17, 2008 at 12:14 PM, Virgilio Alexandre Fornazin <[EMAIL PROTECTED]> wrote: > Could not this bug be related with Vista feature called 'Superfetch' ? > It tries to keep in memory the most accessed files for user, avoiding > disk for read access. > > If you disable (or stop) this service, the problem remains or not ? > > > > > Virgilio Alexandre Fornazin > High performance and realtime systems development > > Rua Brigadeiro Vicente Faria Lima, 268 > Bela VistaLeme-SPCEP 13611-485 > Phone: +55 19 3571-5573 > Cell: +55 19 8111-4053 > +55 11 8357 1491 > Mail: [EMAIL PROTECTED] > Web: http://www.fornazinconsultoria.com.br > > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Jay A. Kreibich > Sent: quarta-feira, 17 de setembro de 2008 13:01 > To: General Discussion of SQLite Database > Subject: Re: [sqlite] Vista frustrations > > On Wed, Sep 17, 2008 at 01:17:51AM -0700, Roger Binns scratched on the wall: >> Robert Simpson wrote: >> > To me this seems like an obvious bug in Vista, >> >> Actually I'd argue that it is behaving as designed. > > You could argue it is behaving as designed, but I'd still argue it is > behaving poorly. > > Further, if a system component who's sole purpose is to increase > performance-- as all cache systems are-- has the overall effect of > decreasing performance, not only of the process it is trying to speed > up, but of the whole system, it is pretty easy to argue that's a serious > functional bug. > > > Given the speed of most storage systems, filesystem cache management > is an important component of overall system performance. However, if > the cache system is grabbing so much physical memory (and, apparently, > refusing to let go of it) that processes are forced to aggressively > page and the net result is a massive performance loss, then something > isn't right. > > As with so many things, cache management (and, indeed, the whole > concept of caches) tends to be a huge web of compromises. It is > extremely difficult, if not impossible, to cover all cases. But > these things are not exactly new, and it should be easy enough to > never get in a situation where things are actually made worse-- > especially that they're not made worse for the whole system. > >> Generally >> filesystem code will try to detect what is going on under the hood. In >> particular if it looks like you are doing sequential access(*) then they >> will start doing read ahead, whereas read ahead is a waste for random >> access. > > Not to get into a whole argument about cache strategies, but this > often not true. If we assume free memory isn't a big concern, > when a process opens a file for random-access we can either > read-ahead the whole thing or we can read blocks here and there until > (if the process touches the majority of the file) we have the whole > thing in memory. Both systems, in the end, will result in the same > memory usage. > > However, if I'm going to be doing random access on a file of moderate > or smaller size, it is much cheaper for the OS to just suck the whole > thing into memory via one bulk read operation than it is to grab it > piecemeal. > > The whole trick is defining "moderate" both in terms of first-return > read times (time to return the block the process actually asked for, > which might not be the first block pulled off disk) vs how likely the > process is to touch the majority of file blocks (something that is > somewhat less likely as the file gets bigger). > > As the file gets larger, there is also the real-world issue of how > much RAM the system has, and how much of it is ac
Re: [sqlite] Vista frustrations
SQLite in general and the .Net provider in particular are most often shipped as components of other applications. I dont think having developers tell their end users to disable superfetch is a viable solution. As much as I hate to propose this maybe a runtime check is in order to see what the OS version is and not use the flag where it's known to be problematic. On Wed, Sep 17, 2008 at 12:14 PM, Virgilio Alexandre Fornazin <[EMAIL PROTECTED]> wrote: > Could not this bug be related with Vista feature called 'Superfetch' ? > It tries to keep in memory the most accessed files for user, avoiding > disk for read access. > > If you disable (or stop) this service, the problem remains or not ? > > > > > Virgilio Alexandre Fornazin > High performance and realtime systems development > > Rua Brigadeiro Vicente Faria Lima, 268 > Bela VistaLeme-SPCEP 13611-485 > Phone: +55 19 3571-5573 > Cell: +55 19 8111-4053 > +55 11 8357 1491 > Mail: [EMAIL PROTECTED] > Web: http://www.fornazinconsultoria.com.br > > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Jay A. Kreibich > Sent: quarta-feira, 17 de setembro de 2008 13:01 > To: General Discussion of SQLite Database > Subject: Re: [sqlite] Vista frustrations > > On Wed, Sep 17, 2008 at 01:17:51AM -0700, Roger Binns scratched on the wall: >> Robert Simpson wrote: >> > To me this seems like an obvious bug in Vista, >> >> Actually I'd argue that it is behaving as designed. > > You could argue it is behaving as designed, but I'd still argue it is > behaving poorly. > > Further, if a system component who's sole purpose is to increase > performance-- as all cache systems are-- has the overall effect of > decreasing performance, not only of the process it is trying to speed > up, but of the whole system, it is pretty easy to argue that's a serious > functional bug. > > > Given the speed of most storage systems, filesystem cache management > is an important component of overall system performance. However, if > the cache system is grabbing so much physical memory (and, apparently, > refusing to let go of it) that processes are forced to aggressively > page and the net result is a massive performance loss, then something > isn't right. > > As with so many things, cache management (and, indeed, the whole > concept of caches) tends to be a huge web of compromises. It is > extremely difficult, if not impossible, to cover all cases. But > these things are not exactly new, and it should be easy enough to > never get in a situation where things are actually made worse-- > especially that they're not made worse for the whole system. > >> Generally >> filesystem code will try to detect what is going on under the hood. In >> particular if it looks like you are doing sequential access(*) then they >> will start doing read ahead, whereas read ahead is a waste for random >> access. > > Not to get into a whole argument about cache strategies, but this > often not true. If we assume free memory isn't a big concern, > when a process opens a file for random-access we can either > read-ahead the whole thing or we can read blocks here and there until > (if the process touches the majority of the file) we have the whole > thing in memory. Both systems, in the end, will result in the same > memory usage. > > However, if I'm going to be doing random access on a file of moderate > or smaller size, it is much cheaper for the OS to just suck the whole > thing into memory via one bulk read operation than it is to grab it > piecemeal. > > The whole trick is defining "moderate" both in terms of first-return > read times (time to return the block the process actually asked for, > which might not be the first block pulled off disk) vs how likely the > process is to touch the majority of file blocks (something that is > somewhat less likely as the file gets bigger). > > As the file gets larger, there is also the real-world issue of how > much RAM the system has, and how much of it is actually in-use with > process and OS pages. This is true of both sequential AND random > access, although memory usage is generally easier to control in > sequential patterns. > > This is where Vista appears to be breaking down and making very poor > decisions. It seems to be giving cache pages more priority than > process and OS system pages, and generally that should never happen. > If we're correctly understanding what is going on, Vista might very > well be paging out SQLite's internal page cache to fit a few extra file > blocks in RAM. How much sense does that make? > >> By us
Re: [sqlite] Vista frustrations
Note that Windows Server 2008 use the same 'core' as Windows Vista. If you´re detecting and redirecting by using GetVersion() or other approach you might test for Server 2008 too. Virgilio Alexandre Fornazin High performance and realtime systems development Rua Brigadeiro Vicente Faria Lima, 268 Bela VistaLeme-SPCEP 13611-485 Phone: +55 19 3571-5573 Cell: +55 19 8111-4053 +55 11 8357 1491 Mail: [EMAIL PROTECTED] Web: http://www.fornazinconsultoria.com.br -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Robert Simpson Sent: quarta-feira, 17 de setembro de 2008 13:30 To: 'General Discussion of SQLite Database' Subject: Re: [sqlite] Vista frustrations I've run the tests with superfetch and prefetch disabled and enabled. Results are consistent with or without these running. The only thing that has any affect is the FILE_FLAG_RANDOM_ACCESS flag. And only on Vista. For now I'm thinking of overriding the default Windows VFS and redirecting the open function just for Vista so it doesn't use that flag. Robert -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Virgilio Alexandre Fornazin Sent: Wednesday, September 17, 2008 9:14 AM To: 'General Discussion of SQLite Database' Subject: Re: [sqlite] Vista frustrations Could not this bug be related with Vista feature called 'Superfetch' ? It tries to keep in memory the most accessed files for user, avoiding disk for read access. If you disable (or stop) this service, the problem remains or not ? Virgilio Alexandre Fornazin High performance and realtime systems development Rua Brigadeiro Vicente Faria Lima, 268 Bela VistaLeme-SPCEP 13611-485 Phone: +55 19 3571-5573 Cell: +55 19 8111-4053 +55 11 8357 1491 Mail: [EMAIL PROTECTED] Web: http://www.fornazinconsultoria.com.br -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jay A. Kreibich Sent: quarta-feira, 17 de setembro de 2008 13:01 To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations On Wed, Sep 17, 2008 at 01:17:51AM -0700, Roger Binns scratched on the wall: > Robert Simpson wrote: > > To me this seems like an obvious bug in Vista, > > Actually I'd argue that it is behaving as designed. You could argue it is behaving as designed, but I'd still argue it is behaving poorly. Further, if a system component who's sole purpose is to increase performance-- as all cache systems are-- has the overall effect of decreasing performance, not only of the process it is trying to speed up, but of the whole system, it is pretty easy to argue that's a serious functional bug. Given the speed of most storage systems, filesystem cache management is an important component of overall system performance. However, if the cache system is grabbing so much physical memory (and, apparently, refusing to let go of it) that processes are forced to aggressively page and the net result is a massive performance loss, then something isn't right. As with so many things, cache management (and, indeed, the whole concept of caches) tends to be a huge web of compromises. It is extremely difficult, if not impossible, to cover all cases. But these things are not exactly new, and it should be easy enough to never get in a situation where things are actually made worse-- especially that they're not made worse for the whole system. > Generally > filesystem code will try to detect what is going on under the hood. In > particular if it looks like you are doing sequential access(*) then they > will start doing read ahead, whereas read ahead is a waste for random > access. Not to get into a whole argument about cache strategies, but this often not true. If we assume free memory isn't a big concern, when a process opens a file for random-access we can either read-ahead the whole thing or we can read blocks here and there until (if the process touches the majority of the file) we have the whole thing in memory. Both systems, in the end, will result in the same memory usage. However, if I'm going to be doing random access on a file of moderate or smaller size, it is much cheaper for the OS to just suck the whole thing into memory via one bulk read operation than it is to grab it piecemeal. The whole trick is defining "moderate" both in terms of first-return read times (time to return the block the process actually asked for, which might not be the first block pulled off disk) vs how likely the process is to touch the majority of file blocks (something that is somewhat less likely as the file gets bigger). As the file gets larger, there is also the real-world issue of how much RAM the system has, and how much of it is actually in-use with process and OS pages. This is true of both sequential AND random access, although memory usage is gen
Re: [sqlite] Vista frustrations
I've run the tests with superfetch and prefetch disabled and enabled. Results are consistent with or without these running. The only thing that has any affect is the FILE_FLAG_RANDOM_ACCESS flag. And only on Vista. For now I'm thinking of overriding the default Windows VFS and redirecting the open function just for Vista so it doesn't use that flag. Robert -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Virgilio Alexandre Fornazin Sent: Wednesday, September 17, 2008 9:14 AM To: 'General Discussion of SQLite Database' Subject: Re: [sqlite] Vista frustrations Could not this bug be related with Vista feature called 'Superfetch' ? It tries to keep in memory the most accessed files for user, avoiding disk for read access. If you disable (or stop) this service, the problem remains or not ? Virgilio Alexandre Fornazin High performance and realtime systems development Rua Brigadeiro Vicente Faria Lima, 268 Bela VistaLeme-SPCEP 13611-485 Phone: +55 19 3571-5573 Cell: +55 19 8111-4053 +55 11 8357 1491 Mail: [EMAIL PROTECTED] Web: http://www.fornazinconsultoria.com.br -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jay A. Kreibich Sent: quarta-feira, 17 de setembro de 2008 13:01 To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations On Wed, Sep 17, 2008 at 01:17:51AM -0700, Roger Binns scratched on the wall: > Robert Simpson wrote: > > To me this seems like an obvious bug in Vista, > > Actually I'd argue that it is behaving as designed. You could argue it is behaving as designed, but I'd still argue it is behaving poorly. Further, if a system component who's sole purpose is to increase performance-- as all cache systems are-- has the overall effect of decreasing performance, not only of the process it is trying to speed up, but of the whole system, it is pretty easy to argue that's a serious functional bug. Given the speed of most storage systems, filesystem cache management is an important component of overall system performance. However, if the cache system is grabbing so much physical memory (and, apparently, refusing to let go of it) that processes are forced to aggressively page and the net result is a massive performance loss, then something isn't right. As with so many things, cache management (and, indeed, the whole concept of caches) tends to be a huge web of compromises. It is extremely difficult, if not impossible, to cover all cases. But these things are not exactly new, and it should be easy enough to never get in a situation where things are actually made worse-- especially that they're not made worse for the whole system. > Generally > filesystem code will try to detect what is going on under the hood. In > particular if it looks like you are doing sequential access(*) then they > will start doing read ahead, whereas read ahead is a waste for random > access. Not to get into a whole argument about cache strategies, but this often not true. If we assume free memory isn't a big concern, when a process opens a file for random-access we can either read-ahead the whole thing or we can read blocks here and there until (if the process touches the majority of the file) we have the whole thing in memory. Both systems, in the end, will result in the same memory usage. However, if I'm going to be doing random access on a file of moderate or smaller size, it is much cheaper for the OS to just suck the whole thing into memory via one bulk read operation than it is to grab it piecemeal. The whole trick is defining "moderate" both in terms of first-return read times (time to return the block the process actually asked for, which might not be the first block pulled off disk) vs how likely the process is to touch the majority of file blocks (something that is somewhat less likely as the file gets bigger). As the file gets larger, there is also the real-world issue of how much RAM the system has, and how much of it is actually in-use with process and OS pages. This is true of both sequential AND random access, although memory usage is generally easier to control in sequential patterns. This is where Vista appears to be breaking down and making very poor decisions. It seems to be giving cache pages more priority than process and OS system pages, and generally that should never happen. If we're correctly understanding what is going on, Vista might very well be paging out SQLite's internal page cache to fit a few extra file blocks in RAM. How much sense does that make? > By using the sequential or random flags you are explicitly > telling the filesystem to ignore its heuristics and do as you say only. Even if that's true (most APIs present the flags as "hints" not absolute truths), the worst an incorr
Re: [sqlite] Vista frustrations
Could not this bug be related with Vista feature called 'Superfetch' ? It tries to keep in memory the most accessed files for user, avoiding disk for read access. If you disable (or stop) this service, the problem remains or not ? Virgilio Alexandre Fornazin High performance and realtime systems development Rua Brigadeiro Vicente Faria Lima, 268 Bela VistaLeme-SPCEP 13611-485 Phone: +55 19 3571-5573 Cell: +55 19 8111-4053 +55 11 8357 1491 Mail: [EMAIL PROTECTED] Web: http://www.fornazinconsultoria.com.br -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jay A. Kreibich Sent: quarta-feira, 17 de setembro de 2008 13:01 To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations On Wed, Sep 17, 2008 at 01:17:51AM -0700, Roger Binns scratched on the wall: > Robert Simpson wrote: > > To me this seems like an obvious bug in Vista, > > Actually I'd argue that it is behaving as designed. You could argue it is behaving as designed, but I'd still argue it is behaving poorly. Further, if a system component who's sole purpose is to increase performance-- as all cache systems are-- has the overall effect of decreasing performance, not only of the process it is trying to speed up, but of the whole system, it is pretty easy to argue that's a serious functional bug. Given the speed of most storage systems, filesystem cache management is an important component of overall system performance. However, if the cache system is grabbing so much physical memory (and, apparently, refusing to let go of it) that processes are forced to aggressively page and the net result is a massive performance loss, then something isn't right. As with so many things, cache management (and, indeed, the whole concept of caches) tends to be a huge web of compromises. It is extremely difficult, if not impossible, to cover all cases. But these things are not exactly new, and it should be easy enough to never get in a situation where things are actually made worse-- especially that they're not made worse for the whole system. > Generally > filesystem code will try to detect what is going on under the hood. In > particular if it looks like you are doing sequential access(*) then they > will start doing read ahead, whereas read ahead is a waste for random > access. Not to get into a whole argument about cache strategies, but this often not true. If we assume free memory isn't a big concern, when a process opens a file for random-access we can either read-ahead the whole thing or we can read blocks here and there until (if the process touches the majority of the file) we have the whole thing in memory. Both systems, in the end, will result in the same memory usage. However, if I'm going to be doing random access on a file of moderate or smaller size, it is much cheaper for the OS to just suck the whole thing into memory via one bulk read operation than it is to grab it piecemeal. The whole trick is defining "moderate" both in terms of first-return read times (time to return the block the process actually asked for, which might not be the first block pulled off disk) vs how likely the process is to touch the majority of file blocks (something that is somewhat less likely as the file gets bigger). As the file gets larger, there is also the real-world issue of how much RAM the system has, and how much of it is actually in-use with process and OS pages. This is true of both sequential AND random access, although memory usage is generally easier to control in sequential patterns. This is where Vista appears to be breaking down and making very poor decisions. It seems to be giving cache pages more priority than process and OS system pages, and generally that should never happen. If we're correctly understanding what is going on, Vista might very well be paging out SQLite's internal page cache to fit a few extra file blocks in RAM. How much sense does that make? > By using the sequential or random flags you are explicitly > telling the filesystem to ignore its heuristics and do as you say only. Even if that's true (most APIs present the flags as "hints" not absolute truths), the worst an incorrect flag should do is hurt the file access performance of the process that provided the hint. Even then, the lower end should be the same performance one would expect if there was no cache (e.g. constant misses). A poor or incorrect flag is no excuse to be overly aggressive with holding pages in RAM and killing the whole system. Even if a flag alters the read-ahead policy or cache replacement strategy, a flag should never override the decisions the cache system has to face when the system starts to run thin on free physical RAM-- especially on a file that is larger than than the RAM footprint
Re: [sqlite] Vista frustrations
On Wed, Sep 17, 2008 at 01:17:51AM -0700, Roger Binns scratched on the wall: > Robert Simpson wrote: > > To me this seems like an obvious bug in Vista, > > Actually I'd argue that it is behaving as designed. You could argue it is behaving as designed, but I'd still argue it is behaving poorly. Further, if a system component who's sole purpose is to increase performance-- as all cache systems are-- has the overall effect of decreasing performance, not only of the process it is trying to speed up, but of the whole system, it is pretty easy to argue that's a serious functional bug. Given the speed of most storage systems, filesystem cache management is an important component of overall system performance. However, if the cache system is grabbing so much physical memory (and, apparently, refusing to let go of it) that processes are forced to aggressively page and the net result is a massive performance loss, then something isn't right. As with so many things, cache management (and, indeed, the whole concept of caches) tends to be a huge web of compromises. It is extremely difficult, if not impossible, to cover all cases. But these things are not exactly new, and it should be easy enough to never get in a situation where things are actually made worse-- especially that they're not made worse for the whole system. > Generally > filesystem code will try to detect what is going on under the hood. In > particular if it looks like you are doing sequential access(*) then they > will start doing read ahead, whereas read ahead is a waste for random > access. Not to get into a whole argument about cache strategies, but this often not true. If we assume free memory isn't a big concern, when a process opens a file for random-access we can either read-ahead the whole thing or we can read blocks here and there until (if the process touches the majority of the file) we have the whole thing in memory. Both systems, in the end, will result in the same memory usage. However, if I'm going to be doing random access on a file of moderate or smaller size, it is much cheaper for the OS to just suck the whole thing into memory via one bulk read operation than it is to grab it piecemeal. The whole trick is defining "moderate" both in terms of first-return read times (time to return the block the process actually asked for, which might not be the first block pulled off disk) vs how likely the process is to touch the majority of file blocks (something that is somewhat less likely as the file gets bigger). As the file gets larger, there is also the real-world issue of how much RAM the system has, and how much of it is actually in-use with process and OS pages. This is true of both sequential AND random access, although memory usage is generally easier to control in sequential patterns. This is where Vista appears to be breaking down and making very poor decisions. It seems to be giving cache pages more priority than process and OS system pages, and generally that should never happen. If we're correctly understanding what is going on, Vista might very well be paging out SQLite's internal page cache to fit a few extra file blocks in RAM. How much sense does that make? > By using the sequential or random flags you are explicitly > telling the filesystem to ignore its heuristics and do as you say only. Even if that's true (most APIs present the flags as "hints" not absolute truths), the worst an incorrect flag should do is hurt the file access performance of the process that provided the hint. Even then, the lower end should be the same performance one would expect if there was no cache (e.g. constant misses). A poor or incorrect flag is no excuse to be overly aggressive with holding pages in RAM and killing the whole system. Even if a flag alters the read-ahead policy or cache replacement strategy, a flag should never override the decisions the cache system has to face when the system starts to run thin on free physical RAM-- especially on a file that is larger than than the RAM footprint of most machines. And in this case, I'm not even sure the flag is incorrect > Since SQLite cannot tell in advance whether access is almost entirely > random or almost entirely sequential, it makes far more sense to let the > operating system use its builtin heuristics and optimise accordingly. Since a full table scan is unlikely to result in a sequential file read for anything but the most pristine database (e.g. one that is bulk loaded and never touched), I'm not sure that's true. I would guess the file access patterns are almost always random, making the hint correct. On the other hand, given that SQLite does it's own cache management and tries to make few assumptions about the underlying system, I can also see leaving the OS to do what it wants. The hint may be correct, but it
Re: [sqlite] Vista frustrations
Possibly the reason a large number of us are still running Win 2000 :-) It seems to be the least Windows like Windoze ever released... When follow on support degrades to an untenable level, I'll either switch to Linux with a Windoze emulator or maybe run whatever the future ruler of the Universe (Google) tells me to :-) By then, I figure to be carrying around a holographic 42" 3-D display, surround sound cell phone anyhow. BTW, that will have the same form factor as the "Star trek Next Generation", Communicator. Fred -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Robert Simpson Sent: Wednesday, September 17, 2008 9:54 AM To: 'General Discussion of SQLite Database' Subject: Re: [sqlite] Vista frustrations The real frustration is that this seems to be a rather obvious bug in Vista, and definitely not SQLite's responsibility. IMO setting the flag is the "right thing to do" -- but at the same time, I don't expect any favors from Microsoft in fixing this any time soon. Meanwhile all those poor Vista people need SQLite to work well with their OS. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns Sent: Wednesday, September 17, 2008 1:18 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Robert Simpson wrote: > To me this seems like an obvious bug in Vista, Actually I'd argue that it is behaving as designed. Generally filesystem code will try to detect what is going on under the hood. In particular if it looks like you are doing sequential access(*) then they will start doing read ahead, whereas read ahead is a waste for random access. By using the sequential or random flags you are explicitly telling the filesystem to ignore its heuristics and do as you say only. Since SQLite cannot tell in advance whether access is almost entirely random or almost entirely sequential, it makes far more sense to let the operating system use its builtin heuristics and optimise accordingly. (*) Sun's ZFS can even detect strided sequential access - ie reading X amount of data every Y kilobytes. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0L0vmOOfHg372QQRAk+7AKCEloS0d+xB+M2C/Bap38ilZZ8tVACfciC4 vMfbYMNVV9k6CNR7hpSQo6A= =AXGU -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
It's been reproduced on Vista x64 and x86 OS's. I don't have a pre-SP1 installation to verify if it happens against pre-patched versions. Getting a database should be easy enough ... CREATE TABLE FOO(a, b, c, d, e, f); Then do this about 16 million times or so ... whatever it takes to pump the db up to the 4-5gb mark: INSERT INTO FOO VALUES(random(), random(), random(), randomblob(16), randomblob(16), randomblob(32)); -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sherief N. Farouk Sent: Wednesday, September 17, 2008 12:38 AM To: 'General Discussion of SQLite Database' Subject: Re: [sqlite] Vista frustrations > I ran some tests using the command-line sqlite3.exe, and observed that > Windows Vista (SP1) is actually trying to cache the entire 5gb file > into > memory during the table scan! The system slows to a complete crawl and > becomes unresponsive. The sqlite3.exe's memory remains very minimal, > but > Vista itself eats every last scrap of physical memory, forcing all > other > apps to the paging file trying to cache the contents of the database. [Sherief N. Farouk] Any chance you could make a small C program to reproduce this case (via procedurally creating a similarly-sized database)? Out of curiosity, was that on x86 or x64? - Sherief ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
The real frustration is that this seems to be a rather obvious bug in Vista, and definitely not SQLite's responsibility. IMO setting the flag is the "right thing to do" -- but at the same time, I don't expect any favors from Microsoft in fixing this any time soon. Meanwhile all those poor Vista people need SQLite to work well with their OS. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns Sent: Wednesday, September 17, 2008 1:18 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Robert Simpson wrote: > To me this seems like an obvious bug in Vista, Actually I'd argue that it is behaving as designed. Generally filesystem code will try to detect what is going on under the hood. In particular if it looks like you are doing sequential access(*) then they will start doing read ahead, whereas read ahead is a waste for random access. By using the sequential or random flags you are explicitly telling the filesystem to ignore its heuristics and do as you say only. Since SQLite cannot tell in advance whether access is almost entirely random or almost entirely sequential, it makes far more sense to let the operating system use its builtin heuristics and optimise accordingly. (*) Sun's ZFS can even detect strided sequential access - ie reading X amount of data every Y kilobytes. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0L0vmOOfHg372QQRAk+7AKCEloS0d+xB+M2C/Bap38ilZZ8tVACfciC4 vMfbYMNVV9k6CNR7hpSQo6A= =AXGU -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
The purpose of a cache is to improve performance and responsiveness. Any cache that uses all physical memory, forces all other apps to the paging file and turns the operating system into a brick is definitely not working as designed. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Roger Binns Sent: Wednesday, September 17, 2008 1:18 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Vista frustrations -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Robert Simpson wrote: > To me this seems like an obvious bug in Vista, Actually I'd argue that it is behaving as designed. Generally filesystem code will try to detect what is going on under the hood. In particular if it looks like you are doing sequential access(*) then they will start doing read ahead, whereas read ahead is a waste for random access. By using the sequential or random flags you are explicitly telling the filesystem to ignore its heuristics and do as you say only. Since SQLite cannot tell in advance whether access is almost entirely random or almost entirely sequential, it makes far more sense to let the operating system use its builtin heuristics and optimise accordingly. (*) Sun's ZFS can even detect strided sequential access - ie reading X amount of data every Y kilobytes. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0L0vmOOfHg372QQRAk+7AKCEloS0d+xB+M2C/Bap38ilZZ8tVACfciC4 vMfbYMNVV9k6CNR7hpSQo6A= =AXGU -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
"(*) Sun's ZFS can even detect strided sequential access - ie reading X amount of data every Y kilobytes." ... and so can the NT cache manager since the very first Windows NT release ;-) It's good to see that people are finally adapting these features 15 years later. F. On Wed, Sep 17, 2008 at 10:17 AM, Roger Binns <[EMAIL PROTECTED]> wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > Robert Simpson wrote: >> To me this seems like an obvious bug in Vista, > > Actually I'd argue that it is behaving as designed. Generally > filesystem code will try to detect what is going on under the hood. In > particular if it looks like you are doing sequential access(*) then they > will start doing read ahead, whereas read ahead is a waste for random > access. By using the sequential or random flags you are explicitly > telling the filesystem to ignore its heuristics and do as you say only. > > Since SQLite cannot tell in advance whether access is almost entirely > random or almost entirely sequential, it makes far more sense to let the > operating system use its builtin heuristics and optimise accordingly. > > (*) Sun's ZFS can even detect strided sequential access - ie reading X > amount of data every Y kilobytes. > > Roger > -BEGIN PGP SIGNATURE- > Version: GnuPG v1.4.6 (GNU/Linux) > > iD8DBQFI0L0vmOOfHg372QQRAk+7AKCEloS0d+xB+M2C/Bap38ilZZ8tVACfciC4 > vMfbYMNVV9k6CNR7hpSQo6A= > =AXGU > -END PGP SIGNATURE- > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Robert Simpson wrote: > To me this seems like an obvious bug in Vista, Actually I'd argue that it is behaving as designed. Generally filesystem code will try to detect what is going on under the hood. In particular if it looks like you are doing sequential access(*) then they will start doing read ahead, whereas read ahead is a waste for random access. By using the sequential or random flags you are explicitly telling the filesystem to ignore its heuristics and do as you say only. Since SQLite cannot tell in advance whether access is almost entirely random or almost entirely sequential, it makes far more sense to let the operating system use its builtin heuristics and optimise accordingly. (*) Sun's ZFS can even detect strided sequential access - ie reading X amount of data every Y kilobytes. Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFI0L0vmOOfHg372QQRAk+7AKCEloS0d+xB+M2C/Bap38ilZZ8tVACfciC4 vMfbYMNVV9k6CNR7hpSQo6A= =AXGU -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Vista frustrations
> I ran some tests using the command-line sqlite3.exe, and observed that > Windows Vista (SP1) is actually trying to cache the entire 5gb file > into > memory during the table scan! The system slows to a complete crawl and > becomes unresponsive. The sqlite3.exe's memory remains very minimal, > but > Vista itself eats every last scrap of physical memory, forcing all > other > apps to the paging file trying to cache the contents of the database. [Sherief N. Farouk] Any chance you could make a small C program to reproduce this case (via procedurally creating a similarly-sized database)? Out of curiosity, was that on x86 or x64? - Sherief ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Vista frustrations
I recently had a user of the ADO.NET provider report a problem on Vista. His database is 5gb, and he's doing a full table scan of 16 million rows. Yea. I ran some tests using the command-line sqlite3.exe, and observed that Windows Vista (SP1) is actually trying to cache the entire 5gb file into memory during the table scan! The system slows to a complete crawl and becomes unresponsive. The sqlite3.exe's memory remains very minimal, but Vista itself eats every last scrap of physical memory, forcing all other apps to the paging file trying to cache the contents of the database. It took about 500 seconds to scan the entire table, which isn't all that bad given the circumstances. Unfortunately Vista's cache remained high and continued to make the entire system unresponsive until I quit out of the command-line. After some research, I commented out the FILE_FLAG_RANDOM_ACCESS flag in os_win.c and re-ran the test. It completed in 99 seconds. System remained highly responsive, and Vista never blew out the memory trying to cache the entire file. Now, I know Windows CE benefits greatly from having this flag hint - but has anyone tested normal desktop performance with or without it? To me this seems like an obvious bug in Vista, but the chances of getting Microsoft to fix it are slim to none. Robert ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users