Re: [sqlite] Slow INDEX
> If you could use DTrace you could really find out, but since we have How about something like sysinternals diskmon? http://technet.microsoft.com/en-us/sysinternals/bb896646 That should give you (OP) some indication of what disk activity is going on. Paul. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
On Thu, Nov 3, 2011 at 12:39 PM, Fabianwrote: > I just tested it, and it made no difference. The root cause of the problem > is most likely not slow writes, because inserting duplicate values (which > are ignored instead of written to disk) are just as slow. If you could use DTrace you could really find out, but since we have to guess, the best (almost certainly correct) guess is that it's the *read* I/Os that are killing you, not the writes. Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
I just tested it, and it made no difference. The root cause of the problem is most likely not slow writes, because inserting duplicate values (which are ignored instead of written to disk) are just as slow. But your suggestion may help me with another problem: when I fill an empty database with million of rows, they are written to disk in the order I inserted them, not in the order a VACUUM would put them (sequential per table). If I can force an atomic commit by enabling a journal_mode (MEMORY for example), I would possibly end up with much better structured database file. So your idea is appreciated, but for the problem reported in this thread, it had no effect. 2011/11/3 nobre> > Have you tried setting journal_mode to the default DELETE option ? > Without atommic commits, maybe your inserts are going to disk one by one > instead of in a single step, when commiting your transactions, thus slowing > down disk writes. > > > Fabian-40 wrote: > > > > 2011/11/2 Black, Michael (IS) > > I do not use WAL, since I have turned 'journal_mode' off (to improve > > insert > > performance), and as far as I know WAL is only usefull when you need to > > keep a journal? > > > > I also have 'synchronous' off, so SQLite shouldn't be waiting for any > > filesystem flushes. I hoped it was writing all the newly inserted rows to > > disk using a single operation, as soon as I call 'commit'. But I observed > > it, and found out its already writing the rows as soon as soon as they > are > > inserted, not batching them for when I call commit. > > > > So that could be part of the problem. I don't have a real Windows machine > > at hand, so I will build one tomorrow, but if your expectations are > > correct, than it will be even slower than inside a virtual machine, > > because > > it will do individiual writes for the 1 million rows too, making > > performance even worse than it is now. > > > > Anothing thing is that I don't expect the slow performance have anything > > to > > do with slow disk writes, only with disk reads. I know this because when > I > > make the index UNIQUE, and try to insert 10.000 duplicate rows (which are > > all ignored), it has the same bad performance, even though there are zero > > bytes written to disk. So it points in the direction of the reads making > > it > > slow, not the writes. > > ___ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > > > > -- > View this message in context: > http://old.nabble.com/Slow-INDEX-tp32766886p32772266.html > Sent from the SQLite mailing list archive at Nabble.com. > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
Have you tried setting journal_mode to the default DELETE option ? Without atommic commits, maybe your inserts are going to disk one by one instead of in a single step, when commiting your transactions, thus slowing down disk writes. Fabian-40 wrote: > > 2011/11/2 Black, Michael (IS) > I do not use WAL, since I have turned 'journal_mode' off (to improve > insert > performance), and as far as I know WAL is only usefull when you need to > keep a journal? > > I also have 'synchronous' off, so SQLite shouldn't be waiting for any > filesystem flushes. I hoped it was writing all the newly inserted rows to > disk using a single operation, as soon as I call 'commit'. But I observed > it, and found out its already writing the rows as soon as soon as they are > inserted, not batching them for when I call commit. > > So that could be part of the problem. I don't have a real Windows machine > at hand, so I will build one tomorrow, but if your expectations are > correct, than it will be even slower than inside a virtual machine, > because > it will do individiual writes for the 1 million rows too, making > performance even worse than it is now. > > Anothing thing is that I don't expect the slow performance have anything > to > do with slow disk writes, only with disk reads. I know this because when I > make the index UNIQUE, and try to insert 10.000 duplicate rows (which are > all ignored), it has the same bad performance, even though there are zero > bytes written to disk. So it points in the direction of the reads making > it > slow, not the writes. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > -- View this message in context: http://old.nabble.com/Slow-INDEX-tp32766886p32772266.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
On Wed, Nov 2, 2011 at 2:13 PM, Fabianwrote: > 2011/11/2 Nico Williams >> But note that this can still fail you when the file is larger than >> available RAM. In that case such a flag would be very bad. And >> SQLite3 can't know how much RAM is available. The OS can know >> (sortof) and the user can know, but SQLite3 can't. So I take the >> above back -- such a flag would probably result in posts about how >> SQLite3 startup causes thrashing... > > If the flag would respect the cache_size pragma, it could work very well. I > currently set the cache_size to 300MB (72k x 4096 pages), and the database > size on disk is only 125 MB, so it would fit in perfectly. People that > never changed the default cache_size, will never experience any trashing, > because theyre database will not be pre-cached, since it doesnt fit into > cache. That's true, though the reading in of the file would have to be delayed till you set the cache_size. I'd be happy with such a thing. And if the cache_size is smaller than the file size, then make this do nothing. Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
2011/11/2 Nico Williams> > Incidentally, it pays to make the SQLite3 page size match the > filesystem preferred block size. > > I already have the page_size set to 4096, which should match the default NTFS cluster size. But note that this can still fail you when the file is larger than > available RAM. In that case such a flag would be very bad. And > SQLite3 can't know how much RAM is available. The OS can know > (sortof) and the user can know, but SQLite3 can't. So I take the > above back -- such a flag would probably result in posts about how > SQLite3 startup causes thrashing... > If the flag would respect the cache_size pragma, it could work very well. I currently set the cache_size to 300MB (72k x 4096 pages), and the database size on disk is only 125 MB, so it would fit in perfectly. People that never changed the default cache_size, will never experience any trashing, because theyre database will not be pre-cached, since it doesnt fit into cache. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
On Wed, Nov 2, 2011 at 1:20 PM, Fabianwrote: > Linux will not read the whole file in, but Windows eventually does. The > inserts go progressively faster when they are reaching halfway, and Windows > reads very large pages from disk, even if you request only 10 bytes. So in The largest read will depend on how many contiguous blocks are on disk. For a file with lots of random writes that could be very low. You'd need to get down and dirty with the filesystem to find out for sure. Or use DTrace (ah, but there's no DTrace on Windows). You're left to make assumptions, and you want to make the most optimistic ones and feel disappointed when it turns out that those assumptions were wrong :) Even if the file was all contiguous on disk and the OS was smart enough to realize that reading the whole thing in is the right thing to do, there's a limit to how far you can take this since the file could be too large to fit in RAM. This isn't SQLite3's fault... > reality a very large percentage of these 10K I/O's will come from a buffer > (either Windows one or your harddrive's buffer), and will not result in any > physical reads from disk. Ofcourse you're right that these random reads > will be slower than a sequential file-copy, because they are random, and > not large, continous blocks. The I/Os will be blocksize I/Os, not application write size. Incidentally, it pays to make the SQLite3 page size match the filesystem preferred block size. >>Actually, it might be nice if SQLite3 had a function or open >>flag by which to request that the whole thing be read into memory, >>because the OS certainly won't know to do it. > > I completely agree, because all the current methods (copy the file to > 'null', etc.) didn't work well. But note that this can still fail you when the file is larger than available RAM. In that case such a flag would be very bad. And SQLite3 can't know how much RAM is available. The OS can know (sortof) and the user can know, but SQLite3 can't. So I take the above back -- such a flag would probably result in posts about how SQLite3 startup causes thrashing... Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
2011/11/2 Nico Williams> > But that's NOT what SQLite3 is doing. SQLite3 is doing random I/O. > And the OS sees the random I/O pattern and concludes it's better to > not read the whole file in. So for those 10K inserts you pay -worst > case- 10K I/Os. At ~12ms per random I/O (likely the seek times for > your disks) you're talking 120s, so you're actually far from the worst > case -- even at 7ms seek time you're talking about twice the time > you've seen in the worst case. > > Linux will not read the whole file in, but Windows eventually does. The inserts go progressively faster when they are reaching halfway, and Windows reads very large pages from disk, even if you request only 10 bytes. So in reality a very large percentage of these 10K I/O's will come from a buffer (either Windows one or your harddrive's buffer), and will not result in any physical reads from disk. Ofcourse you're right that these random reads will be slower than a sequential file-copy, because they are random, and not large, continous blocks. >Actually, it might be nice if SQLite3 had a function or open >flag by which to request that the whole thing be read into memory, >because the OS certainly won't know to do it. I completely agree, because all the current methods (copy the file to 'null', etc.) didn't work well. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
2011/11/2 Black, Michael (IS) > Then, when you restart the file is on system disk and it is flushing each > insert to system disk on the WAL file slowing things down dramaticalliy. > I do not use WAL, since I have turned 'journal_mode' off (to improve insert performance), and as far as I know WAL is only usefull when you need to keep a journal? I also have 'synchronous' off, so SQLite shouldn't be waiting for any filesystem flushes. I hoped it was writing all the newly inserted rows to disk using a single operation, as soon as I call 'commit'. But I observed it, and found out its already writing the rows as soon as soon as they are inserted, not batching them for when I call commit. So that could be part of the problem. I don't have a real Windows machine at hand, so I will build one tomorrow, but if your expectations are correct, than it will be even slower than inside a virtual machine, because it will do individiual writes for the 1 million rows too, making performance even worse than it is now. Anothing thing is that I don't expect the slow performance have anything to do with slow disk writes, only with disk reads. I know this because when I make the index UNIQUE, and try to insert 10.000 duplicate rows (which are all ignored), it has the same bad performance, even though there are zero bytes written to disk. So it points in the direction of the reads making it slow, not the writes. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
On Wed, Nov 2, 2011 at 11:41 AM, Fabianwrote: > 2011/11/2 Mr. Puneet Kishor >> ahh, so you *are* getting expected behavior, just not what *you* expected. >> Did you have a different number in mind instead of a factor of 300? And, if >> so, why? > > To read an (un-cached) 150mb file from disk, doesn't take 30 seconds. And But that's NOT what SQLite3 is doing. SQLite3 is doing random I/O. And the OS sees the random I/O pattern and concludes it's better to not read the whole file in. So for those 10K inserts you pay -worst case- 10K I/Os. At ~12ms per random I/O (likely the seek times for your disks) you're talking 120s, so you're actually far from the worst case -- even at 7ms seek time you're talking about twice the time you've seen in the worst case. What you should do, given that this one file is critical to your app, is read the whole file into memory (if it were stored on contiguous blocks, which it won't be, that'd make for about one second to read it in). Actually, it might be nice if SQLite3 had a function or open flag by which to request that the whole thing be read into memory, because the OS certainly won't know to do it. Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
What happens if you run your test outside of VirtualBox? I'm thinking when you first write the file VirtualBox does it locally and then flushes the whole thing to system disk. Then, when you restart the file is on system disk and it is flushing each insert to system disk on the WAL file slowing things down dramaticalliy. If this is true your test will perform correctly on your main PC. I've heard about VM systems having problems writing to files like this. Michael D. Black Senior Scientist Advanced Analytics Directorate Advanced GEOINT Solutions Operating Unit Northrop Grumman Information Systems From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of Fabian [fabianpi...@gmail.com] Sent: Wednesday, November 02, 2011 11:57 AM To: General Discussion of SQLite Database Subject: EXT :Re: [sqlite] Slow INDEX 2011/11/2 Simon Slavin <slav...@bigfraud.org> > > So compare your 'restart-and-INSERT' test with one where you restart, log > in, then do a few random operations for a minute: start your web browser > and load a page. Open and close a few documents or control panels. Sit > and do nothing for a minute. /Then/ carry on with the test. > > To rule out your suggestion of background processes slowing down the operation, or any VirtualBox performance issues, I rebooted, and immediately copied the DB file (125 MB) to another location, and it completed within 5 seconds. So it doesn't seem VirtualBox or background processes are to blame. In the past I already did some tests with waiting for 15 minutes, and it had zero effect. Thanks for the suggestion anyway! ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
2011/11/2 Simon Slavin> > So compare your 'restart-and-INSERT' test with one where you restart, log > in, then do a few random operations for a minute: start your web browser > and load a page. Open and close a few documents or control panels. Sit > and do nothing for a minute. /Then/ carry on with the test. > > To rule out your suggestion of background processes slowing down the operation, or any VirtualBox performance issues, I rebooted, and immediately copied the DB file (125 MB) to another location, and it completed within 5 seconds. So it doesn't seem VirtualBox or background processes are to blame. In the past I already did some tests with waiting for 15 minutes, and it had zero effect. Thanks for the suggestion anyway! ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
On 2 Nov 2011, at 4:31pm, Fabian wrote: > 2011/11/2 Mr. Puneet Kishor> >> Others will have better answers, but methinks that when you reboot the >> computer, the operating system's caches are flushed out, which slows the >> operation. Try working with the db for a bit (SELECT, repeat INSERTs, etc.) >> and notice if the speed increases again to what you expect. > > The reason I reboot the PC for the test, is because I want to have the > caches flushed out, and I fully expect it to make things slower, but not by > the degree (factor 300) i'm experiencing. Doing lots of inserts under those conditions would require accessing almost every byte of the file. In other words, after your reboot, Windows is probably having to load the entire file into cache again. So let's build that into our expectations. But I agree with Fabian that loading a 150Meg file into cache shouldn't take more than a few seconds. One thing to test is to check to see that Windows is not still doing startup operations while you're doing your testing. Windows presents itself very quickly to the user. It shows a login screen then continues to start many processes in the background. Then once the user is logged in it shows the user their desktop almost immediately, but it hasn't yet started many background operations. If the user immediately starts doing things, they will be slowed down by the fact that the computer hasn't finished starting up yet. So compare your 'restart-and-INSERT' test with one where you restart, log in, then do a few random operations for a minute: start your web browser and load a page. Open and close a few documents or control panels. Sit and do nothing for a minute. /Then/ carry on with the test. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
2011/11/2 Paul Corke> > A stateful antivirus that does lots of heavy processing when you first > open the file? > > I'm running these tests on a completey clean Win7 install, without any anti-virus (or other software) installed. I should note that it runs virtualized (in VirtualBox) so I can quickly switch back to the clean state every time, but if VirtualBox performance was the root cause of these performance issues, I would expect it to slow down the initial insert (> 1M rows) too. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
On 11/02/2011 12:31 PM, Fabian wrote: 2011/11/2 Mr. Puneet KishorOthers will have better answers, but methinks that when you reboot the computer, the operating system's caches are flushed out, which slows the operation. Try working with the db for a bit (SELECT, repeat INSERTs, etc.) and notice if the speed increases again to what you expect. The reason I reboot the PC for the test, is because I want to have the caches flushed out, and I fully expect it to make things slower, but not by the degree (factor 300) i'm experiencing. In the past I've seen recommendations to open and read the file, perhaps by making a copy of the file, then try your inserts. If the time does goes back down, then you know it is file cache issues. David ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
On 02 November 2011 16:42, Fabian wrote: > Maybe there is a very simple explanation, I just can't think of any. A stateful antivirus that does lots of heavy processing when you first open the file? Have you tried: 1) Reboot 2) Wait 10 minutes (don't even touch the computer) 3) Test Paul. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
2011/11/2 Mr. Puneet Kishor> > ahh, so you *are* getting expected behavior, just not what *you* expected. > Did you have a different number in mind instead of a factor of 300? And, if > so, why? To read an (un-cached) 150mb file from disk, doesn't take 30 seconds. And SQLite shouldn't have to read the whole file (because of the INDEX), only the relevant rows in the INDEX, so theoreticly it should even be faster than reading the whole file. Because Windows has a aggresive caching technique (it reads much larger blocks from disk, even if you request only 10 bytes from the file), I'm not expecting SQLite to be faster than reading the whole file, but that it's so much slower just surprises me. Maybe there is a very simple explanation, I just can't think of any. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
On Nov 2, 2011, at 11:31 AM, Fabian wrote: > 2011/11/2 Mr. Puneet Kishor> >> >> Others will have better answers, but methinks that when you reboot the >> computer, the operating system's caches are flushed out, which slows the >> operation. Try working with the db for a bit (SELECT, repeat INSERTs, etc.) >> and notice if the speed increases again to what you expect. >> >> > The reason I reboot the PC for the test, is because I want to have the > caches flushed out, and I fully expect it to make things slower, but not by > the degree (factor 300) i'm experiencing. ahh, so you *are* getting expected behavior, just not what *you* expected. Did you have a different number in mind instead of a factor of 300? And, if so, why? I am genuinely curious -- I know nothing about Windows. I rarely reboot my computer -- my desktop iMac hasn't been rebooted in several weeks now. I did reboot my MacBook Air a few days ago for a software update, but usually that too goes through a few weeks before it is rebooted... when I do reboot them, I experience everything to be slow for the first 10-15 mins or so. -- Puneet Kishor ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
I would like to stick my neck out over the chopping block and agree. My experience is the opposite, but appears to support Puneet's assertion. With me, it takes my C# application 12 seconds to pass 103,00 records and insert 98,000 rows into the db from it. The next time I run the application (which starts with a fresh db,) it takes 7 seconds or less. This leads me to be believe the O/S still has the original file cached, so it's i/o performance is much improved. dvn On Wed, Nov 2, 2011 at 11:27 AM, Mr. Puneet Kishorwrote: > > On Nov 2, 2011, at 11:24 AM, Fabian wrote: > > > Now if I re-open the database, I can add an additional 10.000 rows very > > fast (<1 sec). But if I reboot the (Windows) PC, and insert an additional > > 10.000 rows, it takes at least 30 secs, which seems very slow, if I can > add > > the first 1 million in under 10 seconds. > > > Others will have better answers, but methinks that when you reboot the > computer, the operating system's caches are flushed out, which slows the > operation. Try working with the db for a bit (SELECT, repeat INSERTs, etc.) > and notice if the speed increases again to what you expect. > > > > -- > Puneet Kishor > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
2011/11/2 Mr. Puneet Kishor> > Others will have better answers, but methinks that when you reboot the > computer, the operating system's caches are flushed out, which slows the > operation. Try working with the db for a bit (SELECT, repeat INSERTs, etc.) > and notice if the speed increases again to what you expect. > > The reason I reboot the PC for the test, is because I want to have the caches flushed out, and I fully expect it to make things slower, but not by the degree (factor 300) i'm experiencing. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow INDEX
On Nov 2, 2011, at 11:24 AM, Fabian wrote: > Now if I re-open the database, I can add an additional 10.000 rows very > fast (<1 sec). But if I reboot the (Windows) PC, and insert an additional > 10.000 rows, it takes at least 30 secs, which seems very slow, if I can add > the first 1 million in under 10 seconds. Others will have better answers, but methinks that when you reboot the computer, the operating system's caches are flushed out, which slows the operation. Try working with the db for a bit (SELECT, repeat INSERTs, etc.) and notice if the speed increases again to what you expect. -- Puneet Kishor ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users