On Tue, May 21, 2013 at 12:04 PM, Thanumalayan Sankaranarayana Pillai < madth...@cs.wisc.edu> wrote:
> Hi all, > > Did anyone look into this? I might be setting some config option wrong, > so it would be great if you sent me a "you did something wrong" reply if > you feel that I might have the wrong config (or might be doing something > totally idiotic). > There are no configuration options set wrong. You shouldn't ever have to set configuration options in order to get SQLite to work right. I think instead you are just running up against fundamental limits of modern consumer-grade hardware. I don't think there is anything anybody can do about it. > > I tested with a few other Linux machines and a few different SQLite > versions, and found that there is an upto-5-second delay (the default > filesystem commit interval) in the transactions getting durable (with power > crashes, that is). > > Thanks in advance, > Thanu > > On Sat, May 18, 2013 at 3:41 AM, thanumalayan mad <madth...@gmail.com > >wrote: > > > Hi All, > > > > I was testing out SQLite with a framework I developed. I believe, while > > running on Linux, transactions might not be durable when a power crash > > happens immediately after a commit. I observed this using "SQLite version > > 3.7.16.2 2013-04-12 11:52:43", and kernel "3.8.4-102.fc17.x86_64". Steps > to > > reproduce: > > > > 1. Use a Linux machine with an Ext4 filesystem (default mount options). > > 2. Create a database file, set journal_mode to DELETE, perform a > > transaction using "begin transaction ... commit;". > > 3. Pull the power plug immediately after "commit;" returns. > > 4. Put plug back in, power on the machine, open database file with > SQLite, > > and examine whether the transaction has been executed. > > > > Expected result: You always find that the transaction had been executed. > > Observed result: You sometimes find that the transaction did not execute. > > (To increase the chance that you end with the transaction not having > > executed, create an Ext4 partition in an unmounted hard disk, mount the > > partition with the "-o commit=30" mount option, and do the experiment > with > > the database in that partition). > > > > Of course, it's possible that the transactions are durable, and I'm just > > being totally bonkers, or running SQLite in some really wrong way. If > not, > > I believe that the problem lies in SQLite not doing an fsync() on the > > parent directory after unlink()-ing the rollback journal file. It seems > > that, in Ext4, unlink() is atomic, but not synchronous: the rollback > > journal can exist after rebooting if a power crash follows the unlink(). > > Also, ftruncate() doesn't seem to be synchronous either (non-durability > > even if the journal_mode is set to TRUNCATE). > > > > I haven't examined how other filesystems treat unlink(). I haven't also > > tested with other pragma options and other journal_modes. I expect it > > wouldn't be a problem with WAL (it should be correct even if the log > exists > > after the reboot); however, with things like VACUUM-ing, the database > size > > would probably not be reduced if a power crash happened, because of > > ftruncate() not being synchronous. More generally, I think some of the > code > > assumes that all non-write and non-create operations that modify a file > (in > > other words, unlink and ftruncate) are synchronous; but that's true in > > Linux only if the partition is mounted with "-o dirsync". One more random > > thought: if I'm not wrong, "SQLite crash-tests" currently tests > > partial-flushing only with write() calls.... > > > > Again, sorry for the spam if I'm just being bonkers. > > > > Thanks, > > Thanu > > > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- D. Richard Hipp d...@sqlite.org _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users