Hi all,

Did anyone look into this? I might be setting some config option wrong,
so it would be great if you sent me a "you did something wrong" reply if
you feel that I might have the wrong config (or might be doing something
totally idiotic).

I tested with a few other Linux machines and a few different SQLite
versions, and found that there is an upto-5-second delay (the default
filesystem commit interval) in the transactions getting durable (with power
crashes, that is).

Thanks in advance,
Thanu

On Sat, May 18, 2013 at 3:41 AM, thanumalayan mad <madth...@gmail.com>wrote:

> Hi All,
>
> I was testing out SQLite with a framework I developed. I believe, while
> running on Linux, transactions might not be durable when a power crash
> happens immediately after a commit. I observed this using "SQLite version
> 3.7.16.2 2013-04-12 11:52:43", and kernel "3.8.4-102.fc17.x86_64". Steps to
> reproduce:
>
> 1. Use a Linux machine with an Ext4 filesystem (default mount options).
> 2. Create a database file, set journal_mode to DELETE, perform a
> transaction using "begin transaction ... commit;".
> 3. Pull the power plug immediately after "commit;" returns.
> 4. Put plug back in, power on the machine, open database file with SQLite,
> and examine whether the transaction has been executed.
>
> Expected result: You always find that the transaction had been executed.
> Observed result: You sometimes find that the transaction did not execute.
> (To increase the chance that you end with the transaction not having
> executed, create an Ext4 partition in an unmounted hard disk, mount the
> partition with the "-o commit=30" mount option, and do the experiment with
> the database in that partition).
>
> Of course, it's possible that the transactions are durable, and I'm just
> being totally bonkers, or running SQLite in some really wrong way. If not,
> I believe that the problem lies in SQLite not doing an fsync() on the
> parent directory after unlink()-ing the rollback journal file. It seems
> that, in Ext4, unlink() is atomic, but not synchronous: the rollback
> journal can exist after rebooting if a power crash follows the unlink().
> Also, ftruncate() doesn't seem to be synchronous either (non-durability
> even if the journal_mode is set to TRUNCATE).
>
> I haven't examined how other filesystems treat unlink(). I haven't also
> tested with other pragma options and other journal_modes. I expect it
> wouldn't be a problem with WAL (it should be correct even if the log exists
> after the reboot); however, with things like VACUUM-ing, the database size
> would probably not be reduced if a power crash happened, because of
> ftruncate() not being synchronous. More generally, I think some of the code
> assumes that all non-write and non-create operations that modify a file (in
> other words, unlink and ftruncate) are synchronous; but that's true in
> Linux only if the partition is mounted with "-o dirsync". One more random
> thought: if I'm not wrong, "SQLite crash-tests" currently tests
> partial-flushing only with write() calls....
>
> Again, sorry for the spam if I'm just being bonkers.
>
> Thanks,
> Thanu
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to