[sqlite] BUG: test script exclusive2.test needs do_not_use_codec
exclusive2.test reads directly from the database using [binary read]. It needs the following changes:

-source $testdir/tester.tcl
+source $testdir/tester.tcl
+
+# Do not use a codec for tests in this file, as the database file is
+# manipulated directly using tcl scripts (using binary read)
+#
+do_not_use_codec

~Noah Hart

--
View this message in context: http://old.nabble.com/BUG%3A-test-script-exclusive2.test-needs-do_not_use_codec-tp30927245p30927245.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] BUG: test script vacuum2.test needs do_not_use_codec
vacuum2.test reads directly from the database using hexio_read. It needs the following changes:

-source $testdir/tester.tcl
+source $testdir/tester.tcl
+
+# Do not use a codec for tests in this file, as the database file is
+# manipulated directly using tcl scripts (using the [hexio_read] command).
+#
+do_not_use_codec

Also, this file contains two tests named vacuum2-2.1.

~Noah Hart

--
View this message in context: http://old.nabble.com/BUG%3A-test-script-vacuum2.test-needs-do_not_use_codec-tp30927226p30927226.html
Sent from the SQLite mailing list archive at Nabble.com.
[sqlite] BUG: test scripts corrupt*.test need do_not_use_codec
corruptA.test & corruptD.test need the following changes:

-source $testdir/tester.tcl
+source $testdir/tester.tcl
+
+# Do not use a codec for tests in this file, as the database file is
+# manipulated directly using tcl scripts (using the [hexio_write] command).
+#
+do_not_use_codec

corrupt3.test currently skips its tests when compiled with a codec, but will run with the following changes:

-if {[sqlite3 -has-codec]} { finish_test ; return }
+# Do not use a codec for tests in this file, as the database file is
+# manipulated directly using tcl scripts (using the [hexio_write] command).
+#
+do_not_use_codec

~Noah Hart

--
View this message in context: http://old.nabble.com/BUG%3A-test-scripts-corrupt*.test-need-do_not_use_codec-tp30927026p30927026.html
Sent from the SQLite mailing list archive at Nabble.com.
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On 14 Feb 2011, at 8:50pm, Black, Michael (IS) wrote:

> And if you want to talk about data reliability...BACK UP YOUR DATA.

And keep the backups off-site. And once a year, try to actually restore one of them. I've earned /lots/ of money from companies with backup systems that claimed to work fine but were either faulty or being used to back up to media with low-level corruption. If there was any money in it I'd write a book, but there isn't. Companies prefer to do it wrong and then hire a consultant after it fails.

On 14 Feb 2011, at 9:13pm, Jim Wilcoxson wrote:

> I think what would be useful is for drives to have 3 settings:
>
> a) the drive can do whatever it wants to optimize performance
> b) the drive can reorder writes, but not across a sync (ACI, but no D)
> c) the drive has to respect all syncs (ACID)
>
> If the drive manufacturer wants to make a) the default, that's fine, but there
> should be an easy way to request the other 2 from an application.
> Users are not usually sophisticated enough to know when it's okay to
> cache writes and when it's not. For my use of SQLite, b) would probably be
> fine, but a) apparently is not, since it corrupts databases.

This is a job for someone who wants to contact lots of manufacturers and ask whether they provide models of their drives which conform. Some do, because they want to sell their drives for use in servers. Often these drives are physically identical to their mass-market drives, but they have fewer bad blocks and have jumpers in a different place or different firmware.

> Michael mentioned doing backups and forgetting about all of this, but
> many people are using USB drives to store their backups. So there is
> a bit of a Catch-22 here. Almost all modern-day filesystems are going
> to depend on some kind of journalling to prevent corrupted file
> systems, and as far as I know, journalling filesystems depend on syncs
> to maintain FS consistency.

If the data is backed up when SQLite has the files closed, things are fine.
The really hard part of this comes when you're working with systems that must be live at all times, which is why you'll never see a bank use SQLite to maintain its live transaction system. The big DBMSs have systems inside themselves that allow the backing up of an active, ever-changing database.

Simon.
Re: [sqlite] EXT :Re: UPDATE/INSERTing 1-2k rows slower than expected
I ran some tests on my USB stick and pulled-while-writing...no corruption at all. Integrity check worked fine every time. I checked, and write caching is not enabled for that drive (Windows XP 64).

For Linux, I believe you can force synchronous mode with a mount entry like this:

/dev/sdb1 /media/usb vfat user,sync,dirsync 0 0

Now if the drive still doesn't obey that...then we have a problem, as you mention, with pull-happy users.

Copy the old db to a new one...muck with the new one...copy to a 3rd, rename the first, rename the 3rd to the 1st. Or something like that. So you always have one good version lying around.

Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate

From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of Max Vlasov [max.vla...@gmail.com]
Sent: Monday, February 14, 2011 3:06 PM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected

On Mon, Feb 14, 2011 at 11:50 PM, Black, Michael (IS) <michael.bla...@ngc.com> wrote:

> And if you want to talk about data reliability...BACK UP YOUR DATA.
> The likely failure points I can think of are:
> #1 Power supply (redundant supplies available)
> #2 Hard drive smoked (and your data is toast anyways -- RAID can help).
> #3 Blue screen (or kernel lockup on Unix)
> #4 CPU smoked. (usually leads to #3)
> #5 RAM smoked. (usually leads to #3)
> #6 Motherboard smoked (usually just dies or #3)
>
> The only way to increase your reliability is to replicate and/or backup.
> All the whining about acid-tested drives is a waste of time. #3 through #6
> have no solution, though they shouldn't cause the hard drive corruption
> you're worried about. And since #1 and #2 have solutions, what's the
> problem?

I see some problem especially for sqlite since, in contrast to server-side databases, it's more consumer-oriented -- Firefox and Apple Mail, to name a few. And if we remember that it's very portable, we see other possibilities.
For example, there's already Portable Firefox, which is usually used from usb sticks. What's going to happen to the user history of such a browser if the user forgets to safely unmount the flash drive in the middle of a page reload? What about music-management software that is portable enough to be used right from an external hard drive? You don't usually use PostgreSQL or MySql for such applications.

Max
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On Mon, Feb 14, 2011 at 3:02 PM, Max Vlasov wrote:

> On Mon, Feb 14, 2011 at 8:42 PM, Jim Wilcoxson wrote:
>
>> On Mon, Feb 14, 2011 at 10:56 AM, Pavel Ivanov wrote:
>>
>> >> So my question is, does it maintain the other 3 parts of ACID, so that
>> >> the database will never be in a corrupted state after a power loss,
>> >> even though it may be missing some writes that were confirmed?
>> >
>> > Jim, I think the answer to your question is already in Max's tests:
>> > the USB drive is completely unreliable and can easily lead to database
>> > corruption. I'll explain. Max's tests showed that there were
>> > situations when database and journal were different not by one
>> > transaction but by several ones. So if one makes changes to several
>> > database pages (located in different disk sectors) and/or makes
>> > transactions touching several database pages (which makes multi-page
>> > journal files) then these pages can be way out of sync with each other
>> > (including pages inside journal). And this will easily lead to
>> > database corruption.
>>
>> You are right, I think. I wrote my own test program and ran it on a
>> Macbook Pro with a Seagate USB drive. Here is the Python test
>> program:
>>
>> [jim@mb backup]$ time py t.py
>> 4999 records inserted in 17.7488458157 seconds; 281.652117097 recs/sec
>
> Jim, your tests also show (this time on a different os) that either you
> have a fantastic hd with 18000 rpm, or someone at Seagate just _wants_ you to
> think you have a fantastic hd :)
>
> Just wondering, I know this maybe sounds fantastic, but I'm thinking whether
> some "acid-passed harddrives" page at sqlite.org could encourage manufacturers
> to hold the horses. The logic would be like this: if some model is present in
> either section, then googling it will make this page very high in the google
> results (due to the high pagerank of sqlite.org). So they would probably very
> quickly notice that this page can at least partly affect their sales.
> Unfortunately the technical side is more complex; the developers just can't
> rely on e-mails from users. This should be some sqlite-originated tests
> performed on a known configuration, and it'd better be an oss os with known
> tuning. Maybe some other, less fantastic form of such tests could be
> possible...
>
> Max

I think what would be useful is for drives to have 3 settings:

a) the drive can do whatever it wants to optimize performance
b) the drive can reorder writes, but not across a sync (ACI, but no D)
c) the drive has to respect all syncs (ACID)

If the drive manufacturer wants to make a) the default, that's fine, but there should be an easy way to request the other 2 from an application. Users are not usually sophisticated enough to know when it's okay to cache writes and when it's not. For my use of SQLite, b) would probably be fine, but a) apparently is not, since it corrupts databases.

Michael mentioned doing backups and forgetting about all of this, but many people are using USB drives to store their backups. So there is a bit of a Catch-22 here. Almost all modern-day filesystems are going to depend on some kind of journalling to prevent corrupted file systems, and as far as I know, journalling filesystems depend on syncs to maintain FS consistency.

Jim

--
HashBackup: easy onsite and offsite Unix backup
http://www.hashbackup.com
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On Mon, Feb 14, 2011 at 11:50 PM, Black, Michael (IS) <michael.bla...@ngc.com> wrote:

> And if you want to talk about data reliability...BACK UP YOUR DATA.
> The likely failure points I can think of are:
> #1 Power supply (redundant supplies available)
> #2 Hard drive smoked (and your data is toast anyways -- RAID can help).
> #3 Blue screen (or kernel lockup on Unix)
> #4 CPU smoked. (usually leads to #3)
> #5 RAM smoked. (usually leads to #3)
> #6 Motherboard smoked (usually just dies or #3)
>
> The only way to increase your reliability is to replicate and/or backup.
> All the whining about acid-tested drives is a waste of time. #3 through #6
> have no solution, though they shouldn't cause the hard drive corruption
> you're worried about. And since #1 and #2 have solutions, what's the
> problem?

I see some problem especially for sqlite since, in contrast to server-side databases, it's more consumer-oriented -- Firefox and Apple Mail, to name a few. And if we remember that it's very portable, we see other possibilities. For example, there's already Portable Firefox, which is usually used from usb sticks. What's going to happen to the user history of such a browser if the user forgets to safely unmount the flash drive in the middle of a page reload? What about music-management software that is portable enough to be used right from an external hard drive? You don't usually use PostgreSQL or MySql for such applications.

Max
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
And you think Jim's timings are wrong because...? I've already shown you can get speed just like he's showing. That's what you get on a good write-cache-enabled drive.

And if you want to talk about data reliability...BACK UP YOUR DATA.

The likely failure points I can think of are:
#1 Power supply (redundant supplies available)
#2 Hard drive smoked (and your data is toast anyways -- RAID can help).
#3 Blue screen (or kernel lockup on Unix)
#4 CPU smoked. (usually leads to #3)
#5 RAM smoked. (usually leads to #3)
#6 Motherboard smoked (usually just dies or #3)

The only way to increase your reliability is to replicate and/or backup. All the whining about acid-tested drives is a waste of time. #3 through #6 have no solution, though they shouldn't cause the hard drive corruption you're worried about. And since #1 and #2 have solutions, what's the problem?

Linux systems have write-cache enabled by default, and they are running on an awful lot of servers around the world with no data loss.

Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate

From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of Max Vlasov [max.vla...@gmail.com]
Sent: Monday, February 14, 2011 2:02 PM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected

On Mon, Feb 14, 2011 at 8:42 PM, Jim Wilcoxson wrote:

> On Mon, Feb 14, 2011 at 10:56 AM, Pavel Ivanov wrote:
>
> >> So my question is, does it maintain the other 3 parts of ACID, so that
> >> the database will never be in a corrupted state after a power loss,
> >> even though it may be missing some writes that were confirmed?
> >
> > Jim, I think the answer to your question is already in Max's tests:
> > the USB drive is completely unreliable and can easily lead to database
> > corruption. I'll explain. Max's tests showed that there were
> > situations when database and journal were different not by one
> > transaction but by several ones. So if one makes changes to several
> > database pages (located in different disk sectors) and/or makes
> > transactions touching several database pages (which makes multi-page
> > journal files) then these pages can be way out of sync with each other
> > (including pages inside journal). And this will easily lead to
> > database corruption.
>
> You are right, I think. I wrote my own test program and ran it on a
> Macbook Pro with a Seagate USB drive. Here is the Python test
> program:
>
> [jim@mb backup]$ time py t.py
> 4999 records inserted in 17.7488458157 seconds; 281.652117097 recs/sec

Jim, your tests also show (this time on a different os) that either you have a fantastic hd with 18000 rpm, or someone at Seagate just _wants_ you to think you have a fantastic hd :)

Just wondering, I know this maybe sounds fantastic, but I'm thinking whether some "acid-passed harddrives" page at sqlite.org could encourage manufacturers to hold the horses. The logic would be like this: if some model is present in either section, then googling it will make this page very high in the google results (due to the high pagerank of sqlite.org). So they would probably very quickly notice that this page can at least partly affect their sales. Unfortunately the technical side is more complex; the developers just can't rely on e-mails from users. This should be some sqlite-originated tests performed on a known configuration, and it'd better be an oss os with known tuning. Maybe some other, less fantastic form of such tests could be possible...

Max
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On Mon, Feb 14, 2011 at 8:42 PM, Jim Wilcoxson wrote:

> On Mon, Feb 14, 2011 at 10:56 AM, Pavel Ivanov wrote:
>
> >> So my question is, does it maintain the other 3 parts of ACID, so that
> >> the database will never be in a corrupted state after a power loss,
> >> even though it may be missing some writes that were confirmed?
> >
> > Jim, I think the answer to your question is already in Max's tests:
> > the USB drive is completely unreliable and can easily lead to database
> > corruption. I'll explain. Max's tests showed that there were
> > situations when database and journal were different not by one
> > transaction but by several ones. So if one makes changes to several
> > database pages (located in different disk sectors) and/or makes
> > transactions touching several database pages (which makes multi-page
> > journal files) then these pages can be way out of sync with each other
> > (including pages inside journal). And this will easily lead to
> > database corruption.
>
> You are right, I think. I wrote my own test program and ran it on a
> Macbook Pro with a Seagate USB drive. Here is the Python test
> program:
>
> [jim@mb backup]$ time py t.py
> 4999 records inserted in 17.7488458157 seconds; 281.652117097 recs/sec

Jim, your tests also show (this time on a different os) that either you have a fantastic hd with 18000 rpm, or someone at Seagate just _wants_ you to think you have a fantastic hd :)

Just wondering, I know this maybe sounds fantastic, but I'm thinking whether some "acid-passed harddrives" page at sqlite.org could encourage manufacturers to hold the horses. The logic would be like this: if some model is present in either section, then googling it will make this page very high in the google results (due to the high pagerank of sqlite.org). So they would probably very quickly notice that this page can at least partly affect their sales. Unfortunately the technical side is more complex; the developers just can't rely on e-mails from users. This should be some sqlite-originated tests performed on a known configuration, and it'd better be an oss os with known tuning. Maybe some other, less fantastic form of such tests could be possible...

Max
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On Mon, Feb 14, 2011 at 10:56 AM, Pavel Ivanov wrote:

>> So my question is, does it maintain the other 3 parts of ACID, so that
>> the database will never be in a corrupted state after a power loss,
>> even though it may be missing some writes that were confirmed?
>
> Jim, I think the answer to your question is already in Max's tests:
> the USB drive is completely unreliable and can easily lead to database
> corruption. I'll explain. Max's tests showed that there were
> situations when database and journal were different not by one
> transaction but by several ones. So if one makes changes to several
> database pages (located in different disk sectors) and/or makes
> transactions touching several database pages (which makes multi-page
> journal files) then these pages can be way out of sync with each other
> (including pages inside journal). And this will easily lead to
> database corruption.

You are right, I think. I wrote my own test program and ran it on a Macbook Pro with a Seagate USB drive. Here is the Python test program:

import os, sqlite3, time

path = '/Volumes/TM/file.db'
exists = os.path.exists(path)
con = sqlite3.connect(path)
if exists:
    k = con.execute('select max(k) from t').fetchone()[0]
    print 'Previous run:', k
    con.execute('drop table t')
    con.commit()
con.execute('create table t (k integer primary key)')
t0 = time.time()
try:
    for i in range(5000):
        con.execute('insert into t values(?)', (i,))
        con.commit()
    con.close()
except Exception, err:
    print err, 'at record', i
et = time.time() - t0
print i, 'records inserted in', et, 'seconds;', i/et, 'recs/sec'

Here are the results:

1. First run, no unplug:

[jim@mb backup]$ time py t.py
4999 records inserted in 17.7488458157 seconds; 281.652117097 recs/sec

real    0m17.794s
user    0m0.729s
sys     0m2.139s

2. Unplugged during the run:

[jim@mb backup]$ time py t.py
Previous run: 4999
disk I/O error at record 2681
2681 records inserted in 9.2398519516 seconds; 290.156164194 recs/sec

real    0m9.294s
user    0m0.410s
sys     0m1.193s

3. This shows that only 161 records made it to disk in the previous run, not 2681:

[jim@mb backup]$ time py t.py
Previous run: 161
disk I/O error at record 977
977 records inserted in 3.34149599075 seconds; 292.384010845 recs/sec

real    0m3.453s
user    0m0.162s
sys     0m0.432s

4. This time there was no unplug, but SQLite could not recover the db:

[jim@mb backup]$ time py t.py
Previous run: 161
database disk image is malformed at record 1323
1323 records inserted in 4.26392698288 seconds; 310.277358246 recs/sec

real    0m4.341s
user    0m0.193s
sys     0m0.535s

Jim

--
HashBackup: easy onsite and offsite Unix backup
http://www.hashbackup.com

> Pavel
>
> On Mon, Feb 14, 2011 at 8:49 AM, Jim Wilcoxson wrote:
>> On Mon, Feb 14, 2011 at 2:33 AM, Max Vlasov wrote:
>>>
>>> On Sun, Feb 13, 2011 at 8:53 PM, Jim Wilcoxson wrote:
>>>
>>> > > But I thought about how it would be possible to test this explanation.
>>> > > I'm going to do some tests that work like this. The same id updating, but
>>> > > in the middle of the 10,000 operations I will unplug the cord; sqlite will
>>> > > say that it can't commit a particular insert, and I can see what the value
>>> > > of this insert is. After replugging, two variants are possible:
>>> > > - I will find the previous value in the base. If for several dozen tests
>>> > > the result is the same, that would mean that the situation still needs
>>> > > to be explained.
>>> > > - I will find some value less than the previous to the one sqlite could
>>> > > not commit, at least in some of the tests (maybe -3, -4 to the failed
>>> > > one). In this case the explanation will be confirmed.
>>> > >
>>> > > How about this? Does it sound reasonable?
>>> >
>>> > Sounds like a clever test to me!
>>>
>>> Thanks for supporting :) now the results.
>>>
>>> I switched off the journal:
>>> PRAGMA journal_mode=OFF;
>>> As I described, the only record contained id=1 and the sql query was
>>> UPDATE TestTable SET Id=Id + 1
>>> The cord was unplugged in the middle of the 10,000 operations, when about
>>> 5000 records were updated. The hd was a bus-powered external hard drive, and
>>> I repeated the test several times. No test showed the expected value, which
>>> confirms Jim's explanation about postponed writing. The actual values are
>>> below (the first one is the expected value to be found after reopening, the
>>> next one is the actual value, and the difference is self-explanatory):
>>>
>>> 5094 -> 5086 = -8
>>> 5084 -> 5083 = -1
>>> 5070 -> 5049 = -21
>>> 5082 -> 5069 = -13
>>> 5095 -> 5086 = -9
>>> 5072 -> 5033 = -39
>>> 5118 -> 5053 = -65
>>> 5081 -> 5075 = -6
>>>
>>> So the maximum of non-flushed commits was 65.
>>>
>>> I also made a couple of tests with the journal on, to see what the difference
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On Mon, Feb 14, 2011 at 7:48 PM, Jim Wilcoxson wrote:

> On Mon, Feb 14, 2011 at 10:56 AM, Pavel Ivanov wrote:
>
> >> So my question is, does it maintain the other 3 parts of ACID, so that
> >> the database will never be in a corrupted state after a power loss,
> >> even though it may be missing some writes that were confirmed?
> >
> > Jim, I think the answer to your question is already in Max's tests:
> > the USB drive is completely unreliable and can easily lead to database
> > corruption. I'll explain. Max's tests showed that there were
> > situations when database and journal were different not by one
> > transaction but by several ones. So if one makes changes to several
> > database pages (located in different disk sectors) and/or makes
> > transactions touching several database pages (which makes multi-page
> > journal files) then these pages can be way out of sync with each other
> > (including pages inside journal). And this will easily lead to
> > database corruption.
> >
> > Pavel
>
> Maybe Max would be kind enough to re-run his test with an insert
> instead of an update and see if there ever is corruption, or if just
> the last n inserts are missing.

Jim, I agree it's interesting to know more about usb drive consistency in the context of sqlite; at least the testing is easy. Unfortunately, when I first started the tests there was already a malformed db error, but just once. I just managed to repeat this now.

The db contains a single empty table:

CREATE TABLE [TestTable] ([Id] INTEGER PRIMARY KEY AUTOINCREMENT)

The repeated query is:

INSERT INTO TestTable DEFAULT VALUES

I suppose the journal should be present, otherwise the ACID test makes no sense:

pragma journal_mode=delete;

The version performing the inserts was Sqlite 3.7.5, just in case some things were fixed recently.

So after about 500 inserts I pulled the cord and got "can't insert #511", and first there was an error in the system tray (Windows 7) about a postponed-writing error for the journal file. That surprised me a little, since it partly breaks the OS's innocence: it officially states that it works for "quick removal", but actually complains about a not-finished operation. Maybe being interrupted in the middle of a write to the journal file is what routes to this tray error; I don't know.

Ok, I replugged the drive. No journal file was present. On open, the 3.7.5 shell said "Error: database disk image is malformed" for both

pragma integrity_check;

and

pragma encoding;

3.6.10 correctly said UTF-8 on pragma encoding; and on pragma integrity_check; said:

*** in database main ***
On page 2 at right child: invalid page number 7

Btw, some of the tests (malformed and not) made the system itself suspect errors on this NTFS drive, suggesting checking the disk, and in the report there was a line about "fixing volume bit map" (I can't say it exactly in English, since the error was in Russian Windows). This returns some innocence, since probably even the windows ntfs driver suffers from this postponed writing inside the drive.

Max
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On 14 Feb 2011, at 4:48pm, Jim Wilcoxson wrote:

> I believe (though am not sure) that the drive can reorder individual
> sector/record writes between syncs however it likes, as long as it
> doesn't allow writes to cross a sync and also executes the sync groups
> (for lack of a better term - the groups of writes between syncs) in
> order.

At the low level of the device driver there is a distinction between telling the hard disk to 'sync' and to 'flush'. They both force all pending writes, but one returns immediately and the other waits until the operation is complete before returning. OS device-driver writers tend to choose one or the other rather than implement both, but which one to use is a matter of taste and prejudice.

The definitive way to do it is simply to stop giving the drive commands and wait until it says "I have zero writes pending." Unfortunately it's harder to detect this (you have to keep polling) than it is to just issue a 'sync'.

Add to this the problems that the drive you're using may be part of a RAID array, or even a mounted drive image (because you're running on a virtual machine), and you have a complete nightmare of consistency out there.

Simon.
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On Mon, Feb 14, 2011 at 10:56 AM, Pavel Ivanov wrote:

>> So my question is, does it maintain the other 3 parts of ACID, so that
>> the database will never be in a corrupted state after a power loss,
>> even though it may be missing some writes that were confirmed?
>
> Jim, I think the answer to your question is already in Max's tests:
> the USB drive is completely unreliable and can easily lead to database
> corruption. I'll explain. Max's tests showed that there were
> situations when database and journal were different not by one
> transaction but by several ones. So if one makes changes to several
> database pages (located in different disk sectors) and/or makes
> transactions touching several database pages (which makes multi-page
> journal files) then these pages can be way out of sync with each other
> (including pages inside journal). And this will easily lead to
> database corruption.
>
> Pavel

Hmm... I guess I am not quite convinced. :)

I believe (though am not sure) that the drive can reorder individual sector/record writes between syncs however it likes, as long as it doesn't allow writes to cross a sync and also executes the sync groups (for lack of a better term - the groups of writes between syncs) in order.

As an example, say you are inserting records on each commit instead of updating. Maybe the last record inserted was 1000 and the plug is pulled. When plugged back in, if you only have 800 records in the db, it's not so bad. However, if there is a corrupt db instead, it is much worse.

Maybe Max would be kind enough to re-run his test with an insert instead of an update and see if there ever is corruption, or if just the last n inserts are missing.

Jim

--
HashBackup: easy onsite and offsite Unix backup
http://www.hashbackup.com

> On Mon, Feb 14, 2011 at 8:49 AM, Jim Wilcoxson wrote:
>> On Mon, Feb 14, 2011 at 2:33 AM, Max Vlasov wrote:
>>>
>>> On Sun, Feb 13, 2011 at 8:53 PM, Jim Wilcoxson wrote:
>>>
>>> > > But I thought about how it would be possible to test this explanation.
>>> > > I'm going to do some tests that work like this. The same id updating, but
>>> > > in the middle of the 10,000 operations I will unplug the cord; sqlite will
>>> > > say that it can't commit a particular insert, and I can see what the value
>>> > > of this insert is. After replugging, two variants are possible:
>>> > > - I will find the previous value in the base. If for several dozen tests
>>> > > the result is the same, that would mean that the situation still needs
>>> > > to be explained.
>>> > > - I will find some value less than the previous to the one sqlite could
>>> > > not commit, at least in some of the tests (maybe -3, -4 to the failed
>>> > > one). In this case the explanation will be confirmed.
>>> > >
>>> > > How about this? Does it sound reasonable?
>>> >
>>> > Sounds like a clever test to me!
>>>
>>> Thanks for supporting :) now the results.
>>>
>>> I switched off the journal:
>>> PRAGMA journal_mode=OFF;
>>> As I described, the only record contained id=1 and the sql query was
>>> UPDATE TestTable SET Id=Id + 1
>>> The cord was unplugged in the middle of the 10,000 operations, when about
>>> 5000 records were updated. The hd was a bus-powered external hard drive, and
>>> I repeated the test several times. No test showed the expected value, which
>>> confirms Jim's explanation about postponed writing. The actual values are
>>> below (the first one is the expected value to be found after reopening, the
>>> next one is the actual value, and the difference is self-explanatory):
>>>
>>> 5094 -> 5086 = -8
>>> 5084 -> 5083 = -1
>>> 5070 -> 5049 = -21
>>> 5082 -> 5069 = -13
>>> 5095 -> 5086 = -9
>>> 5072 -> 5033 = -39
>>> 5118 -> 5053 = -65
>>> 5081 -> 5075 = -6
>>>
>>> So the maximum of non-flushed commits was 65.
>>>
>>> I also made a couple of tests with the journal on, to see what the
>>> difference between the expected, non-restored, and restored values is. One
>>> of the results:
>>> 5078 (expected) -> 5077 (non-restored) -> 5021 (restored)
>>> It seems that for a correctly operating os/hardware the numbers should be
>>> 5078 -> 5077 -> 5077, or no journal present, depending on the moment of
>>> unplugging. So this postponing also made possible the existence of some
>>> prior version of the journal file.
>>>
>>> So, the next question was 'where?'. Is software or hardware to blame?
>>> Richard always mentioned hardware in this context, so I decided to check
>>> this with another device. This time it was a harddisk box with separate bus
>>> and power (the hd was a Maxtor IDE 80Gb drive).
>>>
>>> The first variation was similar to the one with the bus-powered drive; this
>>> time I used the hard button on the box, which is equivalent to unplugging
>>> both connection and power. The difference for a single test was really
>>> impressive: 5355 -> 4445 = -910. And when I calculated
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On 14 Feb 2011, at 3:56pm, Pavel Ivanov wrote:

>> So my question is, does it maintain the other 3 parts of ACID, so that
>> the database will never be in a corrupted state after a power loss,
>> even though it may be missing some writes that were confirmed?
>
> Jim, I think the answer to your question is already in Max's tests:
> the USB drive is completely unreliable and can easily lead to database
> corruption. I'll explain. Max's tests showed that there were
> situations when database and journal were different not by one
> transaction but by several ones. So if one makes changes to several
> database pages (located in different disk sectors) and/or makes
> transactions touching several database pages (which makes multi-page
> journal files) then these pages can be way out of sync with each other
> (including pages inside the journal). And this will easily lead to
> database corruption.

It's worth noting that SQLite's journaling mechanism is designed to make up for some ACID frailties of the storage medium. Even if your drive doesn't support ACID correctly, SQLite (seen as a whole and used as intended) may still be able to manage ACID on it. You can see a discussion of this in section 2 of http://www.sqlite.org/atomiccommit.html

Technology has now moved on from when that page was written. For example, it's no longer possible for a drive to write part of a sector: the entire sector would be marked as corrupt now. However, it might be worth changing the page (or splitting section 2 off to another page) to list precisely the requirements the storage medium must meet for SQLite to accomplish this. For instance, it may be that if the drive doesn't support in-order writes, then even SQLite can't guarantee ACID.

I don't see a proper cure for this: software has to trust hardware to some extent, but unlike the old days, hardware is now a set of independent components, each of which can lie to the others.
For SQLite to support ACID it may be that you need to set up your computer as a server, using drives bought as server drives which are set up with write-through-caching and other slower but safer settings. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
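The two knobs Simon's point hinges on — the journal mechanism and the sync policy — are visible from any client. A minimal sketch using Python's built-in sqlite3 module (the file name and temp directory are incidental); the default rollback journal and `synchronous=FULL` are exactly the pieces of the atomic-commit protocol described on that page:

```python
import os
import sqlite3
import tempfile

# Throwaway on-disk database: journal pragmas behave differently for :memory:.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path, isolation_level=None)

# The rollback journal described in atomiccommit.html is the default mode.
journal_mode = con.execute("PRAGMA journal_mode").fetchone()[0]

# synchronous=FULL asks for a flush (fsync / FlushFileBuffers) at the critical
# points of the commit protocol; it is only as trustworthy as the drive's
# handling of those flushes, which is exactly what this thread is about.
con.execute("PRAGMA synchronous=FULL")
synchronous = con.execute("PRAGMA synchronous").fetchone()[0]
con.close()
```

Whatever these pragmas request, a drive that acknowledges flushes it has not performed (as in Max's tests) silently defeats them.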
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
> So my question is, does it maintain the other 3 parts of ACID, so that
> the database will never be in a corrupted state after a power loss,
> even though it may be missing some writes that were confirmed?

Jim, I think the answer to your question is already in Max's tests: the USB drive is completely unreliable and can easily lead to database corruption. I'll explain. Max's tests showed that there were situations when database and journal were different not by one transaction but by several ones. So if one makes changes to several database pages (located in different disk sectors) and/or makes transactions touching several database pages (which makes multi-page journal files) then these pages can be way out of sync with each other (including pages inside the journal). And this will easily lead to database corruption.

Pavel
Re: [sqlite] Question about having 2 db connections on same
>Date: Sun, 13 Feb 2011 22:14:29 -0500
>From: Pavel Ivanov
>
>> I do a transaction on connection 1 using BEGIN IMMEDIATE, some rows updated,
>> COMMIT. Then, when I attempt the same sequence on connection 2, the first call
>> to step to execute the begin transaction never returns.
>
>Do you check the result code from COMMIT, and is it successful? My guess is
>you have some SELECT statements which are not reset or finalized
>before COMMIT. In that situation your COMMIT should return
>SQLITE_BUSY. If that's true then the second BEGIN IMMEDIATE just can't
>finish because the other writing transaction is not finished yet.
>
>Pavel

Thanks, Pavel. My wrapper class for SQLite prevents a COMMIT with an outstanding statement. You are however correct; it was a prior failed COMMIT. It happened in the one place in my code where I do not check the return from COMMIT. It was a "constraint failure". Thanks also to the others who offered suggestions.
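The failure mode Pavel describes is easy to reproduce. A minimal sketch with Python's sqlite3 module (connection names `c1`/`c2` and the temp file are mine, not from the thread); while connection 1 holds an open write transaction, connection 2's BEGIN IMMEDIATE fails with SQLITE_BUSY, surfaced as "database is locked", and succeeds once the first transaction really commits:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
# isolation_level=None: the module issues no implicit transactions,
# so the BEGIN/COMMIT statements below reach SQLite verbatim.
c1 = sqlite3.connect(path, isolation_level=None)
c2 = sqlite3.connect(path, isolation_level=None, timeout=0.1)

c1.execute("CREATE TABLE t(i INTEGER)")
c1.execute("BEGIN IMMEDIATE")
c1.execute("INSERT INTO t VALUES (1)")

# Connection 2 cannot start its own write transaction while connection 1's
# transaction is still open: SQLITE_BUSY surfaces as OperationalError.
try:
    c2.execute("BEGIN IMMEDIATE")
    blocked = False
except sqlite3.OperationalError:
    blocked = True

# The moral of the thread: always check that COMMIT actually succeeded.
c1.execute("COMMIT")
c2.execute("BEGIN IMMEDIATE")   # succeeds now that the lock is released
c2.execute("COMMIT")
```

A COMMIT that fails (here it was a constraint failure; unfinalized SELECTs give the same SQLITE_BUSY symptom) leaves the transaction open, and every other connection blocks exactly as described.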
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On Mon, Feb 14, 2011 at 2:33 AM, Max Vlasov wrote:
>
> On Sun, Feb 13, 2011 at 8:53 PM, Jim Wilcoxson wrote:
>
> > > But I thought about how it would be possible to test this explanation. I'm
> > > going to do some tests that work like this. The same id updating, but in
> > > the middle of 10,000 operations I will unplug the cord; sqlite will say
> > > that it can't commit a particular insert and I can see what the value of
> > > this insert is. After replugging, two variants are possible:
> > > - I will find the previous value in the database. If for several dozen tests
> > > the result is the same, that would mean that the situation still needs
> > > to be explained.
> > > - I will find some value less than the one previous to the one sqlite could
> > > not commit, at least in some of the tests (maybe -3, -4 to the failed one).
> > > In this case the explanation will be confirmed.
> > >
> > > How about this? Does it sound reasonable?
> >
> > Sounds like a clever test to me!
>
> Thanks for supporting :) now the results.
>
> I switched off the journal:
> PRAGMA journal_mode=OFF;
> As I described, the only record contained id=1 and the sql query was
> UPDATE TestTable SET Id=Id + 1
> The cord was unplugged in the middle of the 10,000 operations, when about
> 5000 records were updated. The hd was a bus-powered external hard drive and I
> repeated the test several times. No test showed the expected value, which
> confirms Jim's explanation about postponed writing. The actual values are
> below (the first one is the value expected to be found after reopening, the
> next one is the actual value, and the difference is self-explanatory):
>
> 5094 -> 5086 = -8
> 5084 -> 5083 = -1
> 5070 -> 5049 = -21
> 5082 -> 5069 = -13
> 5095 -> 5086 = -9
> 5072 -> 5033 = -39
> 5118 -> 5053 = -65
> 5081 -> 5075 = -6
>
> So the maximum of non-flushed commits was 65.
>
> I also made a couple of tests with the journal on to see what the difference
> is between the expected value, the non-restored value, and the restored
> value. One of the results:
> 5078 (expected) -> 5077 (non-restored) -> 5021 (restored)
> It seems that for correctly operating os/hardware the numbers should be
> 5078->5077->5077, or no journal present, depending on the moment of
> unplugging. So this postponing also made possible the existence of some
> prior version of the journal file.
>
> So, the next question was 'where?'. Is software or hardware to blame?
> Richard always mentioned hardware in this context, so I decided to check
> this with another device. This time it was a harddisk box with separate bus
> and power (the hd was a Maxtor IDE 80Gb drive).
>
> The first variation was similar to the one with the bus-powered drive; this
> time I used the hard button on the box, which is equivalent to unplugging
> both connection and power. The difference for a single test was really
> impressive: 5355 -> 4445 = -910. And when I calculated numbers for a
> non-interrupted test, the drive showed about 300 commits per second.
>
> The second variation was just unplugging the cord but keeping the power
> intact, so that if it's the drive that caches, it would complete its
> operations. This time the results were perfect, for example 4822 -> 4822,
> and even 5371 -> 5372 = +1, which would actually mean the process was
> interrupted after all data was written but before sqlite made winSync; the
> os reported failure, but the data was already there.
>
> So the sad news about faulty hardware is probably true once again.
Hi Max - this turned out to be a good test you devised! You know, you can never learn *too* much. :)

Your test confirms that the USB drive is saying data is written to the drive when it really isn't, which seems bad. But it got me thinking a little more. Your drive has for sure lost the Durability property of ACID: when SQLite comes back from a commit, the data just written may or may not be on the drive, as you proved.

So my question is, does it maintain the other 3 parts of ACID, so that the database will never be in a corrupted state after a power loss, even though it may be missing some writes that were confirmed? There are many use cases where a drive like yours would be usable by SQLite: it would be as if the async vfs thing was being used. But if using this drive could lead to a corrupted database, the drive isn't safe to use at all with SQLite, IMO.

It would be interesting to know the difference, and even better, if SQLite or an SQLite app could somehow run a test on a drive to figure out whether the drive is:

a) perfectly safe: every commit is on the metal
b) somewhat safe: the last n commits may be missing, but the db will never be corrupted after a power loss
c) not safe: the db may be corrupted on power loss

I think the difference between a) and b) has to do with write ordering. If you issue the write sequence: a1 a2 a3 sync b1 b2 sync c1 c2 c3 sync then (I think) as long as all a's are written before b's and b's are written before c's, a power loss can only leave the file in the state it had at some earlier sync point: recent commits may be lost, but the database should not be corrupted.
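Jim's distinction between case b) (a lying but order-preserving cache) and case c) (a reordering cache) can be sketched in a toy model — purely illustrative, not a real drive test. Model the cache as a list of acknowledged-but-unsynced writes; a power cut drops some of them. An ordered cache can only lose a suffix, so what survives is always a prefix (some earlier consistent state); a reordering cache can lose an arbitrary subset, leaving "holes":

```python
import random

def after_power_loss(acked, preserves_order, rng):
    """Toy drive cache: `acked` writes were acknowledged but not yet synced.
    An order-preserving cache loses only a suffix; a reordering cache can
    lose an arbitrary subset of them."""
    if preserves_order:
        return acked[:rng.randrange(len(acked) + 1)]
    return [w for w in acked if rng.random() < 0.5]

def is_prefix(state, acked):
    return state == acked[:len(state)]

rng = random.Random(42)
acked = list(range(20))   # 20 commits the drive claimed were durable

# Case b): an ordered cache always leaves some earlier consistent state.
ordered_always_prefix = all(
    is_prefix(after_power_loss(acked, True, rng), acked) for _ in range(1000)
)

# Case c): a reordering cache can leave holes -- newer data present while
# older data is missing -- which is how a database file gets corrupted.
reordering_can_leave_holes = any(
    not is_prefix(after_power_loss(acked, False, rng), acked) for _ in range(1000)
)
```

In this model, Max's bus-powered drive behaved like case b) for the single-page updates tested: the surviving counter was always some earlier value, never an inconsistent mixture.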
Re: [sqlite] BUG (3.7.5): identifier quotation characters get part of column names for certain statements
Hi,

sorry for the noise. Again trying to supply attachments.

Bye.
--
Reinhard Nißl

--- src/select.c
+++ src/select.c
@@ -2710,10 +2710,53 @@
     }
   }

   /* If we reach this point, flattening is permitted. */
+  pList = p->pEList;
+  {
+    SrcList *pTabList = pSrc;   /* List of tables to select from */
+    int fullNames, shortNames;
+    fullNames = (db->flags & SQLITE_FullColNames)!=0;
+    shortNames = (db->flags & SQLITE_ShortColNames)!=0;
+    /* Generate alias names for columns that are consistent with
+    ** generateColumnNames(). */
+    for(i=0; i<pList->nExpr; i++){
+      if( pList->a[i].zName==0 ){
+        Expr *p;
+        p = pList->a[i].pExpr;
+
+        if( (p->op==TK_COLUMN || p->op==TK_AGG_COLUMN) && pTabList
+         && (shortNames || fullNames) ){
+          Table *pTab;
+          char *zCol;
+          int j, iCol = p->iColumn;
+          for(j=0; ALWAYS(j<pTabList->nSrc); j++){
+            if( pTabList->a[j].iCursor==p->iTable ) break;
+          }
+          assert( j<pTabList->nSrc );
+          pTab = pTabList->a[j].pTab;
+          if( iCol<0 ) iCol = pTab->iPKey;
+          assert( iCol==-1 || (iCol>=0 && iCol<pTab->nCol) );
+          if( iCol<0 ){
+            zCol = "rowid";
+          }else{
+            zCol = pTab->aCol[iCol].zName;
+          }
+          if( fullNames ){
+            pList->a[i].zName = sqlite3MPrintf(db, "%s.%s", pTab->zName, zCol);
+          }else{
+            pList->a[i].zName = sqlite3DbStrDup(db, zCol);
+          }
+        }else{
+          const char *zSpan = pList->a[i].zSpan;
+          if( ALWAYS(zSpan) ){
+            pList->a[i].zName = sqlite3DbStrDup(db, zSpan);
+          }
+        }
+      }
+    }
+  }
+
   /* Authorize the subquery */
   pParse->zAuthContext = pSubitem->zName;
   sqlite3AuthCheck(pParse, SQLITE_SELECT, 0, 0, 0);
   pParse->zAuthContext = zSavedAuthContext;
@@ -2884,19 +2927,10 @@
   **            \_____ outer query ______/
   **
   ** We look at every expression in the outer query and every place we see
   ** "a" we substitute "x*3" and every place we see "b" we substitute "y+10".
   */
-  pList = pParent->pEList;
-  for(i=0; i<pList->nExpr; i++){
-    if( pList->a[i].zName==0 ){
-      const char *zSpan = pList->a[i].zSpan;
-      if( ALWAYS(zSpan) ){
-        pList->a[i].zName = sqlite3DbStrDup(db, zSpan);
-      }
-    }
-  }
   substExprList(db, pParent->pEList, iParent, pSub->pEList);
   if( isAgg ){
     substExprList(db, pParent->pGroupBy, iParent, pSub->pEList);
     pParent->pHaving = substExpr(db, pParent->pHaving, iParent, pSub->pEList);
   }

--- src/select.c
+++ src/select.c
@@ -89284,7 +89284,7 @@
       Expr *pColExpr = p;  /* The expression that is the result column name */
       Table *pTab;         /* Table associated with this expression */
       while( pColExpr->op==TK_DOT ) pColExpr = pColExpr->pRight;
-      if( pColExpr->op==TK_COLUMN && ALWAYS(pColExpr->pTab!=0) ){
+      if( (pColExpr->op==TK_COLUMN || pColExpr->op==TK_AGG_COLUMN) && ALWAYS(pColExpr->pTab!=0) ){
         /* For columns use the column name name */
         int iCol = pColExpr->iColumn;
         pTab = pColExpr->pTab;
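For reference, the behaviour this patch targets is observable through the column names an API reports (cursor.description in Python's sqlite3): quoted identifiers selected through a subquery the optimizer can flatten. In builds where the bug is fixed — any reasonably recent SQLite; 3.7.5 is what the patch was written against — the quotation characters do not leak into the reported names:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(a INTEGER, b INTEGER)")
con.execute("INSERT INTO t VALUES (1, 2)")

# Quoted identifiers through a flattenable subquery: the bug report is about
# the quotation characters becoming part of the column names here.
cur = con.execute('SELECT "a", "b" FROM (SELECT a, b FROM t)')
names = [d[0] for d in cur.description]
```

With the bug present, `names` would come back with the quote characters attached instead of the bare column names.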
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
Also... here's another part of the benchmark, which shows triggers aren't as evil as I thought. The trigger for this example ran at about 2X the rate of the manual update.

F:\>d:batch 50 1000 12
using wal mode
using update trigger
Sqlite Version: 3.7.5
Inserting 50 rows using a bulk of 1000
commits per second: 71428.6

F:\>d:batch 50 1000 20
using wal mode
using manual update after insert
Sqlite Version: 3.7.5
Inserting 50 rows using a bulk of 1000
commits per second: 38461.5

Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate
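The trigger being benchmarked (the AddTrigger function in the C program posted later in this thread) just refreshes a single counter row after every insert. A minimal sketch of the same schema in Python, to show what the two flag settings are comparing:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(i INTEGER)")
con.execute("CREATE TABLE t2(n INTEGER)")
con.execute("INSERT INTO t2 VALUES (0)")

# Same shape as the benchmark's trigger: every insert into t updates
# the single counter row in t2 (the "manual update" variant issues an
# equivalent UPDATE statement from the application instead).
con.execute(
    "CREATE TRIGGER xx AFTER INSERT ON t FOR EACH ROW "
    "BEGIN UPDATE t2 SET n = NEW.i + 1; END"
)

for i in range(5):
    con.execute("INSERT INTO t VALUES (?)", (i,))

n = con.execute("SELECT n FROM t2").fetchone()[0]   # last NEW.i (4) + 1
```

The trigger fires inside the same statement, so it rides along in whatever transaction the insert belongs to — which is presumably why it batches so well here.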
Re: [sqlite] BUG (3.7.5): identifier quotation characters get part of column names for certain statements
Hi,

looks like the patch got stripped. I've attached it again. And please find attached a further patch which addresses TK_AGG_COLUMNS.

Bye.
--
Reinhard Nißl

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Nißl Reinhard
Sent: Saturday, 12 February 2011 23:42
To: General Discussion of SQLite Database
Subject: Re: [sqlite] BUG (3.7.5): identifier quotation characters get part of column names for certain statements

Hi,

it took me 6 hours to find the source location which behaves inconsistently. Please find attached a fix for this bug.

Bye.
--
Reinhard Nißl

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Nißl Reinhard
Sent: Friday, 11 February 2011 10:42
To: j...@kreibi.ch; General Discussion of SQLite Database
Subject: Re: [sqlite] BUG (3.7.5): identifier quotation characters get part of column names for certain statements

I'd like to create a patch which changes the behavior in that way, but I'm not that used to the sqlite3 internals. From a quick glance at the source, I think it has something to do with TK_COLUMN and TK_VARIABLE. It would be nice if you could give me a pointer where to place the change in the source code.
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
SOLVED!!

Marcus Grimm and I went back and forth a number of times trying to figure out why my benchmarks were so much faster than his. Found it...

My SATA RAID setup had "Enable Advanced Performance" on by default (I had never turned it on). My secondary tests on an IDE drive showed similar performance to Marcus's. Then I turned on the performance mode for that one and got a 10X performance improvement (advanced write caching... power-loss susceptible... UPS needed if you want reliability). The "Enable write caching" option was already turned on.

To check your setup: My Computer / Properties on drive / Hardware / Properties on drive / Policies -- and check "Enable advanced performance".

Here's the program we're benching with now. Now compilable on Unix, and on Windows with VS 2008/2010/tdm-gcc 4.5.1.

Some results on a Western Digital 7200RPM IDE drive with advanced performance turned on:

F:\>d:batch 1000 1 0
Sqlite Version: 3.7.5
Inserting 1000 rows using a bulk of 1
commits per second: 166.7

F:\>d:batch 1000 1 4
using wal mode
Sqlite Version: 3.7.5
Inserting 1000 rows using a bulk of 1
commits per second: 1000.0

F:\>d:batch 50 1000 0
Sqlite Version: 3.7.5
Inserting 50 rows using a bulk of 1000
commits per second: 71428.6

F:\>d:batch 50 1000 4
using wal mode
Sqlite Version: 3.7.5
Inserting 50 rows using a bulk of 1000
commits per second: 16.7

F:\>d:batch 50 1000 5   << note that in-memory is less than 2X disk-based now.
using mode: :memory:
using wal mode
Sqlite Version: 3.7.5
Inserting 50 rows using a bulk of 1000
commits per second: 25.0

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#include <sys/types.h>
#include <sys/timeb.h>
#else
#include <sys/time.h>
#endif
#include <time.h>
#include "sqlite3.h"

double elapsed()
{
#ifdef _WIN32
  struct _timeb timebuffer;
  _ftime(&timebuffer);
  return((double)timebuffer.time + timebuffer.millitm / 1000.0);
#else
  double t1;
  struct timeval tv;
  static long base_seconds;
  gettimeofday(&tv,NULL);
  if (base_seconds==0) base_seconds=tv.tv_sec;
  return (tv.tv_sec-base_seconds)+tv.tv_usec/(double)1000000;
#endif
}

int AddTrigger(sqlite3 *db)
{
  char SqlTxt[256];
  int rc;
  strcpy(SqlTxt,"CREATE TRIGGER XX AFTER INSERT ON t FOR EACH ROW BEGIN ");
  strcat(SqlTxt,"UPDATE t2 SET n = NEW.i+1;");
  strcat(SqlTxt,"END;");
  rc = sqlite3_exec(db,SqlTxt,NULL,NULL,NULL);
  if (rc != SQLITE_OK) {
    puts(sqlite3_errmsg(db));
  }
  return(0);
}

int main(int argc, char *argv[])
{
  sqlite3 *db;
  sqlite3_stmt *stmt=NULL;
  int rc;
  int n=0;
  int nrec=0;
  int interval=0;
  int flags=0;
  double t1;
  char SqlTxt[256];
  if (argc != 4) {
    fprintf(stderr,"Usage: %s nrecords commit_interval flags\n",argv[0]);
    fprintf(stderr,"Flag 0 = simple table\n");
    fprintf(stderr,"Flag 1 = in-memory database\n");
    fprintf(stderr,"Flag 2 = add index\n");
    fprintf(stderr,"Flag 4 = WAL mode\n");
    fprintf(stderr,"Flag 8 = Add an update trigger per insert\n");
    fprintf(stderr,"Flag 16 = Add a manual update per insert\n");
    fprintf(stderr,"Flag 32 = Synchronous=Off\n");
    fprintf(stderr,"Add flags to combine features\n");
    exit(-1);
  }
  nrec=atoi(argv[1]);
  interval=atoi(argv[2]);
  flags=atoi(argv[3]);
  if (flags & 1) {
    puts("using mode: :memory:");
    rc=sqlite3_open(":memory:",&db);
  } else {
    remove("batch.db");
    rc=sqlite3_open("batch.db",&db);
  }
  if (rc != SQLITE_OK) {
    puts(sqlite3_errmsg(db));
  }
  rc=sqlite3_exec(db,"create table t (i integer)",NULL,NULL,NULL);
  if (rc != SQLITE_OK) {
    puts(sqlite3_errmsg(db));
  }
  rc=sqlite3_exec(db,"create table t2 (n integer)",NULL,NULL,NULL);
  if (rc != SQLITE_OK) {
    puts(sqlite3_errmsg(db));
  }
  rc=sqlite3_exec(db,"insert into t2 values(0)",NULL,NULL,NULL);
  if (rc != SQLITE_OK) {
    puts(sqlite3_errmsg(db));
  }
  if (flags & 32) {
    rc=sqlite3_exec(db,"pragma synchronous=OFF",NULL,NULL,NULL);
    if (rc != SQLITE_OK) {
      puts(sqlite3_errmsg(db));
    }
    puts("using pragma synchronous=OFF");
  }
  if (flags & 2) {
    rc=sqlite3_exec(db,"create index tx on t (i)",NULL,NULL,NULL);
    if (rc != SQLITE_OK) {
      puts(sqlite3_errmsg(db));
    }
    puts("using index on t(i)");
  }
  if (flags & 4) {
    rc=sqlite3_exec(db,"pragma journal_mode=WAL",NULL,NULL,NULL);
    if (rc != SQLITE_OK) {
      puts(sqlite3_errmsg(db));
    }
    puts("using wal mode");
  }
  if (flags & 8) {
    AddTrigger(db);
    puts("using update trigger");
    /**
    if (interval != 1) {
      fprintf(stderr,"Can't do trigger and begin/commit together\n");
      exit(-1);
    }
    **/
  }
  if( flags & 16 ) puts("using manual update after insert");
  printf("Sqlite Version: %s\n", sqlite3_version);
  printf("Inserting %d rows using a bulk of %d\n", nrec, interval);
  sqlite3_prepare_v2(db,"insert into t values(?)",-1,&stmt,NULL);
  t1=elapsed();
  if (interval != 1) rc=sqlite3_exec(db,"begin",NULL,NULL,NULL);
  while(n < nrec) {
    ++n;
    if (interval != 1 && (n % interval)==0) {
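The begin/commit batching the harness uses — commit every `interval` rows rather than per insert — looks like this in outline. A hedged Python sketch of the same loop (not the original program; parameter values are arbitrary):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "batch.db")
con = sqlite3.connect(path, isolation_level=None)  # explicit BEGIN/COMMIT
con.execute("CREATE TABLE t(i INTEGER)")

nrec, interval = 1000, 100    # like running "batch 1000 100 0"
con.execute("BEGIN")
for n in range(1, nrec + 1):
    con.execute("INSERT INTO t VALUES (?)", (n,))
    if n % interval == 0:     # close this batch, open the next
        con.execute("COMMIT")
        con.execute("BEGIN")
con.execute("COMMIT")         # final (possibly empty) batch

count = con.execute("SELECT count(*) FROM t").fetchone()[0]
```

Each COMMIT is where the sync cost lands, which is why the commits-per-second numbers above swing so wildly with the bulk size and the drive's caching policy.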
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On 14 Feb 2011, at 10:30am, Max Vlasov wrote:

> On Mon, Feb 14, 2011 at 12:05 PM, Simon Slavin wrote:
>
>> Can you expand upon your conclusion ? I like your test, and am interested
>> to know precisely where you think the lag is happening.
>
> Simon, the conclusion came from the assumption that if the lag is at the OS
> level then unplugging the usb cord from a powered hd box should also lead
> to bad results, but it doesn't. I.e. only in the case of keeping power for
> the unplugged hd box is the last commit reported by sqlite as successful
> actually written. So it looks like windows takes very seriously the option
> "optimize for quick removal" for removable drives. But both examples show
> that the drive itself doesn't obey this option. On the other side, SATA/PATA
> drives connected directly to the motherboard have the option "enable
> writing cache", which is explained as a hardware-only option; maybe in this
> case one can actually affect how the drive is responding, but in contrast
> to usb tests, these ones are harder to make.

Okay. Nice piece of work.

Simon.
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On Mon, Feb 14, 2011 at 12:05 PM, Simon Slavin wrote:

> > The second variation was just unplugging the cord but keeping the power
> > intact, so that if it's the drive that caches, it would complete its
> > operations. This time the results were perfect, for example 4822 -> 4822,
> > and even 5371 -> 5372 = +1, which would actually mean the process was
> > interrupted after all data was written but before sqlite made winSync;
> > the os reported failure, but the data was already there.
> >
> > So the sad news about faulty hardware is probably true once again.
>
> Can you expand upon your conclusion ? I like your test, and am interested
> to know precisely where you think the lag is happening.

Simon, the conclusion came from the assumption that if the lag is at the OS level then unplugging the usb cord from a powered hd box should also lead to bad results, but it doesn't. I.e. only in the case of keeping power for the unplugged hd box is the last commit reported by sqlite as successful actually written. So it looks like windows takes very seriously the option "optimize for quick removal" for removable drives. But both examples show that the drive itself doesn't obey this option. On the other side, SATA/PATA drives connected directly to the motherboard have the option "enable writing cache", which is explained as a hardware-only option; maybe in this case one can actually affect how the drive is responding, but in contrast to usb tests, these ones are harder to make.

I also tried to find any jumper except Master/Slave/etc on the hd, but there are no others. There is a little circuit in the box itself, but it lacks any jumper.

Max
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
On 14 Feb 2011, at 7:33am, Max Vlasov wrote:

> The first variation was similar to the one with the bus-powered drive; this
> time I used the hard button on the box, which is equivalent to unplugging
> both connection and power. The difference for a single test was really
> impressive: 5355 -> 4445 = -910. And when I calculated numbers for a
> non-interrupted test, the drive showed about 300 commits per second.
>
> The second variation was just unplugging the cord but keeping the power
> intact, so that if it's the drive that caches, it would complete its
> operations. This time the results were perfect, for example 4822 -> 4822,
> and even 5371 -> 5372 = +1, which would actually mean the process was
> interrupted after all data was written but before sqlite made winSync;
> the os reported failure, but the data was already there.
>
> So the sad news about faulty hardware is probably true once again.

Can you expand upon your conclusion ? I like your test, and am interested to know precisely where you think the lag is happening.

You may be able to find either low-level OS settings or drive jumper positions which will change the results.

Simon.