Re: [sqlite] Database corruption check.

2019-04-15 Thread Richard Hipp
On 4/15/19, Tim Streater  wrote:
>>
>> This command was added to the command-line tool recently.
>
> 3.19.3 has it - that's almost two years ago.
>

The .selftest command was added on 2017-03-09 by
https://www.sqlite.org/src/timeline?c=f4fcd46f08ba59d2 and hence as
likely first in release 3.18.0 on 2017-03-28.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption check.

2019-04-15 Thread Tim Streater
On 15 Apr 2019, at 11:02, Simon Slavin  wrote:

> On 15 Apr 2019, at 10:36am, Lullaby Dayal  wrote:
>
>> Thank you very much for your response. The link seems to be helpful. But I
>> fail to run the .selftest command from my sqlite3 prompt. I got the error:
>> unknown command or invalid arguments error.
>
> This command was added to the command-line tool recently.

3.19.3 has it - that's almost two years ago.


-- 
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption check.

2019-04-15 Thread Dominique Devienne
On Mon, Apr 15, 2019 at 11:37 AM Lullaby Dayal 
wrote:

> [...]. But I fail to run the .selftest command from my sqlite3 prompt. I
> got the error:
> unknown command or invalid arguments error.
>

That code dates back to July 2017. So you must have a very old version.


> I am a newbie in SQLite. I am not sure how can I get this to working in our
> application running in QNX on an ARM board.
>

Well, try on desktop first, with up-to-date versions you download from
sqlite.org
Then you you worry about more exotic environments like QNX. --DD
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption check.

2019-04-15 Thread Simon Slavin
On 15 Apr 2019, at 10:36am, Lullaby Dayal  wrote:

> Thank you very much for your response. The link seems to be helpful. But I
> fail to run the .selftest command from my sqlite3 prompt. I got the error:
> unknown command or invalid arguments error.

This command was added to the command-line tool recently.  You can download a 
current version of the tool from



under 'precompiled binaries'.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption check.

2019-04-15 Thread Lullaby Dayal
Hi Richard,

Thank you very much for your response. The link seems to be helpful. But I
fail to run the .selftest command from my sqlite3 prompt. I got the error:
unknown command or invalid arguments error.

I am a newbie in SQLite. I am not sure how can I get this to working in our
application running in QNX on an ARM board.

Thank you!
Lullaby

On Sun, Apr 14, 2019, 5:38 PM Richard Hipp  On 4/14/19, Lullaby Dayal  wrote:
> >
> > For Sqlite database, as per my understanding, implementing pragma
> > integrity_check won't guarantee all errors to be detected.
>
> Maybe you are confused with "PRAGMA quick_check"?  The "PRAGMA
> integrity_check" takes a little longer, but does a better job.  There
> is very little that will slip past integrity_check.
>
> Both of these pragmas only check the meta-data and linkage. If a bit
> flips in the middle of a data field (perhaps due to a cosmic-ray hit
> or something) and that field is not indexed, then there is nothing
> that will detect that change.
>
> To verify the data, you can use checksums.  The
> https://www.sqlite.org/src/file/ext/misc/shathree.c extension
> implements SHA3 hash functions, for example.  The CLI uses those hash
> functions to checksum the data.  In the CLI you can type:
>
>  .selftest --init
>
> And that will create a new table named "selftest" that contains
> checksums for all other tables.  Then later to verify those checksums,
> run just:
>
>  .selftest
>
> You can look at the CLI source code
> (https://www.sqlite.org/src/artifact?ln=7748-7852=c1986496062f9dba)
> to see how this is implemented, and even copy/paste the CLI source
> code into your application, if you want.
>
>
> --
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption check.

2019-04-14 Thread Richard Hipp
On 4/14/19, Lullaby Dayal  wrote:
>
> For Sqlite database, as per my understanding, implementing pragma
> integrity_check won't guarantee all errors to be detected.

Maybe you are confused with "PRAGMA quick_check"?  The "PRAGMA
integrity_check" takes a little longer, but does a better job.  There
is very little that will slip past integrity_check.

Both of these pragmas only check the meta-data and linkage. If a bit
flips in the middle of a data field (perhaps due to a cosmic-ray hit
or something) and that field is not indexed, then there is nothing
that will detect that change.

To verify the data, you can use checksums.  The
https://www.sqlite.org/src/file/ext/misc/shathree.c extension
implements SHA3 hash functions, for example.  The CLI uses those hash
functions to checksum the data.  In the CLI you can type:

 .selftest --init

And that will create a new table named "selftest" that contains
checksums for all other tables.  Then later to verify those checksums,
run just:

 .selftest

You can look at the CLI source code
(https://www.sqlite.org/src/artifact?ln=7748-7852=c1986496062f9dba)
to see how this is implemented, and even copy/paste the CLI source
code into your application, if you want.


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Database corruption check.

2019-04-14 Thread Lullaby Dayal
Hi,

We are using sqlite for our embedded automotive system based on QNX. We
have a requirement to check whether database is corrupted on start-up and
replace it with default database if such a scenario happens.

For Sqlite database, as per my understanding, implementing pragma
integrity_check won't guarantee all errors to be detected. Are there any
recommendations/best practices you suggest to check for corruption? For
eg:- CRC or hashing or ping pong technique? Please share some thoughts on
this..

Thanks,
Lullaby
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-19 Thread Brendan Duddridge
I'm using SQLCipher in my main app and it's using SQLite 3.15.2. However,
my little sample app where I could also easily demonstrate the problem, is
using whatever the built-in SQLite version is in macOS Sierra 10.12.4. I
couldn't find SQLITE_MMAP_READWRITE in the Couchbase Lite source anywhere,
so I'm sure it's not setting it. All I know now is with mmap I/O on, I get
the corruption when the power goes out. With it off, I don't.

On Wed, Apr 19, 2017 at 12:21 AM, Dan Kennedy  wrote:

> On 04/19/2017 02:42 AM, Jens Alfke wrote:
>
>> On Apr 18, 2017, at 2:20 AM, Deon Brewis  wrote:
>>>
>>> It's not like it was subtle - it's a dead on repro. I was able to repro
>>> this by doing a power cycle 2 hours after shutting the app down. OSX didn't
>>> seem to have any interest in flushing mmap files until you soft reboot the
>>> machine.
>>>
>> OK, hang on — I just reread the docs on memory-mapped I/O in SQLite.
>> Memory-mapping is *only used for reads*, never for writes:
>>
>
> It was used for writes in versions before 3.10.0 (January 2016). And still
> is if SQLITE_MMAP_READWRITE is defined (not the default).
>
> Dan.
>
>
>
>
>
>
>
>> When updating the database file, SQLite always makes a copy of the page
 content into heap memory before modifying the page. This is necessary for
 two reasons. First, changes to the database are not supposed to be visible
 to other processes until after the transaction commits and so the changes
 must occur in private memory. Second, SQLite uses a read-only memory map to
 prevent stray pointers in the application from overwriting and corrupting
 the database file.

>>> — https://www.sqlite.org/mmap.html
>>
>> Therefore I can’t imagine how using it could trigger database corruption.
>> It doesn’t affect the way data is written at all!
>>
>> I accept that both of you have experimentally seen that memory-mapping
>> leads to corruption, so I can only assume that either the above
>> documentation is wrong, or that there’s some subtle bug in SQLite that
>> alters the way data is written when memory-mapping is enabled.
>>
>> —Jens
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-19 Thread Dan Kennedy

On 04/19/2017 02:42 AM, Jens Alfke wrote:

On Apr 18, 2017, at 2:20 AM, Deon Brewis  wrote:

It's not like it was subtle - it's a dead on repro. I was able to repro this by 
doing a power cycle 2 hours after shutting the app down. OSX didn't seem to 
have any interest in flushing mmap files until you soft reboot the machine.

OK, hang on — I just reread the docs on memory-mapped I/O in SQLite. 
Memory-mapping is *only used for reads*, never for writes:


It was used for writes in versions before 3.10.0 (January 2016). And 
still is if SQLITE_MMAP_READWRITE is defined (not the default).


Dan.








When updating the database file, SQLite always makes a copy of the page content 
into heap memory before modifying the page. This is necessary for two reasons. 
First, changes to the database are not supposed to be visible to other 
processes until after the transaction commits and so the changes must occur in 
private memory. Second, SQLite uses a read-only memory map to prevent stray 
pointers in the application from overwriting and corrupting the database file.

— https://www.sqlite.org/mmap.html

Therefore I can’t imagine how using it could trigger database corruption. It 
doesn’t affect the way data is written at all!

I accept that both of you have experimentally seen that memory-mapping leads to 
corruption, so I can only assume that either the above documentation is wrong, 
or that there’s some subtle bug in SQLite that alters the way data is written 
when memory-mapping is enabled.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Jens Alfke

> On Apr 18, 2017, at 9:46 PM, Deon Brewis  wrote:
> 
> I did report it:
> http://sqlite.1065341.n5.nabble.com/SQLITE-vs-OSX-mmap-inevitable-catalog-corruption-td85620.html

Thanks for the pointer to the thread. There was a reply by Dan Kennedy shortly 
thereafter:

>> As of yesterday, SQLite uses a read-only mapping in mmap mode. The db 
>> file is written using plain old write(), just as in non-mmap mode: 
>> 
>>http://sqlite.org/src/info/67c5d3c646c8198c
>> 
>> It would be interesting to know if this clears the problem in your 
>> environment. 

Have you re-run your test with versions of SQLite from after this change?

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Deon Brewis
> That is a hugely important piece of information, and while I thank you for 
> letting me know, I’m shocked that this hasn’t been conveyed to the SQLite 
> developers (or, apparently, to Apple.) Did you report it anywhere?

I did report it:
http://sqlite.1065341.n5.nabble.com/SQLITE-vs-OSX-mmap-inevitable-catalog-corruption-td85620.html

- Deon 

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Jens Alfke
Sent: Tuesday, April 18, 2017 10:47 AM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS


> On Apr 18, 2017, at 1:50 AM, Deon Brewis <de...@outlook.com> wrote:
> 
> Are you by change using memory mapped IO (MMAP_SIZE something other than 0)?
> 
> This does not work on OSX. Not even remotely. I tracked an issue down in 
> November 2015, and was able to trivially corrupt a database 100% of the time 
> using the repro steps below. This happens long after our app gets shut down 
> and SQLITE is flushed.  It got fixed when I set MMAP_SIZE to 0.

That is a hugely important piece of information, and while I thank you for 
letting me know, I’m shocked that this hasn’t been conveyed to the SQLite 
developers (or, apparently, to Apple.) Did you report it anywhere?

I’m the architect of Couchbase Lite and the lead developer for iOS and macOS. I 
enabled SQLite’s memory-mapped I/O at least two years ago. I never considered 
it could be problematic since (a) Brendan is IIRC the only iOS/Mac developer 
who’s reported database corruption, and (b) I assumed scenarios like this would 
be covered as part of SQLite testing, either by SQLite themselves or by Apple.

> “From the OSX documentation:
> Note that while fsync() will flush all data from the host to the drive (i.e. 
> the "permanent storage device"), the drive itself may not physically write 
> the data to the platters for quite some time and it may be written in an 
> out-of-order sequence. Specifically, if the drive loses power or the OS 
> crashes, the application may find that only some or none of their data was 
> written.  The disk drive may also re-order the data so that later writes may 
> be present, while earlier writes are not.

This is a statement about hard disk controller firmware and is true for any OS. 
(I used to work for Apple, and corresponded with filesystem architect Dominic 
Giampaolo about this back in the day.) Some disk controllers don’t flush all 
cached data to the platters when told to flush, apparently to boost benchmark 
scores. Darwin has a FULLFSYNC command (an option to ioctl) that triggers an 
expensive full reset of the disk controller, which does produce a full flush.

The filesystem periodically performs FULLFSYNCs to ensure durability of changes 
to filesystem metadata, so that the filesystem itself can’t become corrupted by 
power loss. So I’m surprised that, in your test, cutting power after 
macroscopic amounts of time (2 minutes) have passed since closing the SQLite 
file still resulted in data being lost.

This doesn’t seem like it would be specific to memory-mapped I/O, though. 
Darwin has a universal buffer cache (like many other kernels) so memory-mapped 
writes and ordinary file writes are treated the same way by the filesystem 
cache.

Of course it’s possible there’s some kind of OS bug involved here; if so, it 
seems pretty serious. I’ll be glad to file a bug report with Apple if it turns 
out to be so.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Jens Alfke

> On Apr 18, 2017, at 2:35 PM, Bob Friesenhahn  
> wrote:
> 
> If the filesystem implementation is not fully-coherent, then data written by 
> programmed file I/O may not be reflected in the memory mapped space, 
> resulting in programs using something other than what is in the file, 
> resulting in wrong data being written to the file.

Darwin is fully coherent this way (that’s basically what a Universal Buffer 
Cache guarantees.)

Even if it weren’t, the incoherency would cause problems while making changes, 
i.e. _before_ the power-loss, not afterwards. Which is not what’s seen.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Bob Friesenhahn

On Tue, 18 Apr 2017, Jens Alfke wrote:




On Apr 18, 2017, at 2:20 AM, Deon Brewis  wrote:

It's not like it was subtle - it's a dead on repro. I was able to repro this by 
doing a power cycle 2 hours after shutting the app down. OSX didn't seem to 
have any interest in flushing mmap files until you soft reboot the machine.


OK, hang on — I just reread the docs on memory-mapped I/O in SQLite. 
Memory-mapping is *only used for reads*, never for writes:


When updating the database file, SQLite always makes a copy of the page content 
into heap memory before modifying the page. This is necessary for two reasons. 
First, changes to the database are not supposed to be visible to other 
processes until after the transaction commits and so the changes must occur in 
private memory. Second, SQLite uses a read-only memory map to prevent stray 
pointers in the application from overwriting and corrupting the database file.

— https://www.sqlite.org/mmap.html

Therefore I can’t imagine how using it could trigger database corruption. It 
doesn’t affect the way data is written at all!


If the filesystem implementation is not fully-coherent, then data 
written by programmed file I/O may not be reflected in the memory 
mapped space, resulting in programs using something other than what is 
in the file, resulting in wrong data being written to the file.


GraphicsMagick has a configure test for this and some operating 
systems fail the test.  Sqlite3 is free to adopt this same test.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Brendan Duddridge
Hi Jens,

It would be a good test if you could independently verify my findings using
the sample app I wrote to see if you are able to reproduce the corruption
with memory mapped I/O turned ON and the fact that there's no corruption
when it's turned OFF. I know it seem strange given the documentation you
found.

Can I ask what the benefit of having memory mapped I/O is for SQLite? And
what are the drawbacks of turning it off? I understand what memory mapped
I/O is, reducing the amount of memory needed to load in the data from a
file. I use it in other parts of my app such as adding file attachments to
an email, but I use the higher level NSDataReadingzMappedAlways option on
NSData.

NSData *pdfData = [[NSData alloc] initWithContentsOfURL:pdfURL options:
NSDataReadingMappedAlways error:nil];


I'm just wondering if I'm going to encounter any other issues by disabling
it. Perhaps it needs to be conditionalized for Mac vs. iOS?

On Tue, Apr 18, 2017 at 1:42 PM, Jens Alfke  wrote:

>
> > On Apr 18, 2017, at 2:20 AM, Deon Brewis  wrote:
> >
> > It's not like it was subtle - it's a dead on repro. I was able to repro
> this by doing a power cycle 2 hours after shutting the app down. OSX didn't
> seem to have any interest in flushing mmap files until you soft reboot the
> machine.
>
> OK, hang on — I just reread the docs on memory-mapped I/O in SQLite.
> Memory-mapping is *only used for reads*, never for writes:
>
> >> When updating the database file, SQLite always makes a copy of the page
> content into heap memory before modifying the page. This is necessary for
> two reasons. First, changes to the database are not supposed to be visible
> to other processes until after the transaction commits and so the changes
> must occur in private memory. Second, SQLite uses a read-only memory map to
> prevent stray pointers in the application from overwriting and corrupting
> the database file.
> — https://www.sqlite.org/mmap.html
>
> Therefore I can’t imagine how using it could trigger database corruption.
> It doesn’t affect the way data is written at all!
>
> I accept that both of you have experimentally seen that memory-mapping
> leads to corruption, so I can only assume that either the above
> documentation is wrong, or that there’s some subtle bug in SQLite that
> alters the way data is written when memory-mapping is enabled.
>
> —Jens
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Jens Alfke

> On Apr 18, 2017, at 12:55 PM, Jeffrey Mattox  wrote:
> 
> I'm an iOS and macOS developer.  Mac app bundles are special in other ways 
> beside just having a bit set.

Brendan is not storing his database in the app bundle. Both he and Simon have 
already said that.

—Jens

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Jeffrey Mattox
I'm an iOS and macOS developer.  Mac app bundles are special in other ways 
beside just having a bit set.  For one, there's a security check somewhere that 
verifies that the app bundle has not been changed, as those files are expected 
to be read-only.  Apple says apps should put their data elsewhere (e.g., 
~/Documents/).  I'd never store a file that I write to in the app bundle.

Jeff


> On Apr 18, 2017, at 1:45 PM, Simon Slavin  wrote:
> 
> I don’t think this can be the problem.  As you wrote, a package is just a 
> folder with a bit set.  Not only do they not affect anything but it’s 
> difficult for your app to even tell whether the file is in a folder or a 
> package.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Jens Alfke

> On Apr 18, 2017, at 2:20 AM, Deon Brewis  wrote:
> 
> It's not like it was subtle - it's a dead on repro. I was able to repro this 
> by doing a power cycle 2 hours after shutting the app down. OSX didn't seem 
> to have any interest in flushing mmap files until you soft reboot the machine.

OK, hang on — I just reread the docs on memory-mapped I/O in SQLite. 
Memory-mapping is *only used for reads*, never for writes:

>> When updating the database file, SQLite always makes a copy of the page 
>> content into heap memory before modifying the page. This is necessary for 
>> two reasons. First, changes to the database are not supposed to be visible 
>> to other processes until after the transaction commits and so the changes 
>> must occur in private memory. Second, SQLite uses a read-only memory map to 
>> prevent stray pointers in the application from overwriting and corrupting 
>> the database file.
— https://www.sqlite.org/mmap.html

Therefore I can’t imagine how using it could trigger database corruption. It 
doesn’t affect the way data is written at all!

I accept that both of you have experimentally seen that memory-mapping leads to 
corruption, so I can only assume that either the above documentation is wrong, 
or that there’s some subtle bug in SQLite that alters the way data is written 
when memory-mapping is enabled.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Simon Slavin

On 18 Apr 2017, at 7:37pm, Brendan Duddridge  wrote:

> Well, perhaps the way I've been bundling the db.sqlite file within a
> package triggers this particular bug. Although a package is really just a
> sub-folder, so I don't see how that would make a difference.

I don’t think this can be the problem.  As you wrote, a package is just a 
folder with a bit set.  Not only do they not affect anything but it’s difficult 
for your app to even tell whether the file is in a folder or a package.

If you’re keeping the database inside the application’s bundle, that’s 
different.  But you’re not doing that.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Brendan Duddridge
> I’m the architect of Couchbase Lite and the lead developer for iOS and
> macOS. I enabled SQLite’s memory-mapped I/O at least two years ago. I never
> considered it could be problematic since (a) Brendan is IIRC the only
> iOS/Mac developer who’s reported database corruption, and (b) I assumed
> scenarios like this would be covered as part of SQLite testing, either by
> SQLite themselves or by Apple.
>

Well, perhaps the way I've been bundling the db.sqlite file within a
package triggers this particular bug. Although a package is really just a
sub-folder, so I don't see how that would make a difference. I'm not sure.
But I've been testing more and more now and by disabling the memory mapped
I/O, the database file corruption bug appears to be gone. It's all working
great now. And I could easily cause the corruption prior to that.

I just reverted back to the build that included the memory mapped I/O call
and I was again able to corrupt the database file with the power failure.
Then I again used the version that had memory mapped I/O disabled and I was
no longer able to corrupt the database. So I've tested it both ways
repeatedly and I think I'm going to conclude that the memory mapped I/O
call is the culprit.

Thanks again Deon for sharing your experience with memory mapped I/O in
SQLite on macOS. And thanks Jens for starting this thread to try and get to
the bottom of this problem.

Brendan
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Simon Slavin
On 18 Apr 2017, at 6:46pm, Jens Alfke  wrote:

> This is a statement about hard disk controller firmware and is true for any 
> OS. (I used to work for Apple, and corresponded with filesystem architect 
> Dominic Giampaolo about this back in the day.) Some disk controllers don’t 
> flush all cached data to the platters when told to flush, apparently to boost 
> benchmark scores. Darwin has a FULLFSYNC command (an option to ioctl) that 
> triggers an expensive full reset of the disk controller, which does produce a 
> full flush.

That’s what "PRAGMA fullfsync" is about.  From what I can see, SQLite uses it 
in a way that Apple’s documentation says should work.  But it seems that 
despite being all Apple says is necessary, it’s not doing the job.  Note, 
however, that SQLite documentation says that the default value for this flag is 
off, even when running on a Mac.

> [snip]
> 
> This doesn’t seem like it would be specific to memory-mapped I/O, though. 
> Darwin has a universal buffer cache (like many other kernels) so 
> memory-mapped writes and ordinary file writes are treated the same way by the 
> filesystem cache.

Need to check Darwin source code.  The documentation for some operating systems 
considers memory-mapping to be special, and separate from normal file access.  
So things which configure how normal file access is done sometimes don’t apply 
to memory-mapped file access or memory-mapped I/O in general.  Power-loss is a 
good test of memory-mapped changes and problems with them.

> Of course it’s possible there’s some kind of OS bug involved here; if so, it 
> seems pretty serious. I’ll be glad to file a bug report with Apple if it 
> turns out to be so.

People who’ve posted things connected to this don’t tend to mention that 
they’ve set "PRAGMA fullfsync = ON".  It’s the sort of thing you’d only know to 
do after significant debugging.  Maybe they’re not doing it and that’s the 
cause of their problems.

Simon.

Obligatory Disclosure: I’m a Macintosh specialist, a Macintosh programmer, and 
I know a lot about computer security.  But this stuff is at the edge of my area 
of competence.  I’ve never worked for Apple and don’t know the tiny details of 
this stuff.  Don't depend on the things I write here for critical decisions.  
If you’re a member of Apple’s Developer Programme perhaps you should get Apple 
involved.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Jens Alfke

> On Apr 18, 2017, at 1:50 AM, Deon Brewis  wrote:
> 
> Are you by change using memory mapped IO (MMAP_SIZE something other than 0)?
> 
> This does not work on OSX. Not even remotely. I tracked an issue down in 
> November 2015, and was able to trivially corrupt a database 100% of the time 
> using the repro steps below. This happens long after our app gets shut down 
> and SQLITE is flushed.  It got fixed when I set MMAP_SIZE to 0.

That is a hugely important piece of information, and while I thank you for 
letting me know, I’m shocked that this hasn’t been conveyed to the SQLite 
developers (or, apparently, to Apple.) Did you report it anywhere?

I’m the architect of Couchbase Lite and the lead developer for iOS and macOS. I 
enabled SQLite’s memory-mapped I/O at least two years ago. I never considered 
it could be problematic since (a) Brendan is IIRC the only iOS/Mac developer 
who’s reported database corruption, and (b) I assumed scenarios like this would 
be covered as part of SQLite testing, either by SQLite themselves or by Apple.

> “From the OSX documentation:
> Note that while fsync() will flush all data from the host to the drive (i.e. 
> the "permanent storage device"), the drive itself may not physically write 
> the data to the platters for quite some time and it may be written in an 
> out-of-order sequence. Specifically, if the drive loses power or the OS 
> crashes, the application may find that only some or none of their data was 
> written.  The disk drive may also re-order the data so that later writes may 
> be present, while earlier writes are not.

This is a statement about hard disk controller firmware and is true for any OS. 
(I used to work for Apple, and corresponded with filesystem architect Dominic 
Giampaolo about this back in the day.) Some disk controllers don’t flush all 
cached data to the platters when told to flush, apparently to boost benchmark 
scores. Darwin has a FULLFSYNC command (an option to ioctl) that triggers an 
expensive full reset of the disk controller, which does produce a full flush.

The filesystem periodically performs FULLFSYNCs to ensure durability of changes 
to filesystem metadata, so that the filesystem itself can’t become corrupted by 
power loss. So I’m surprised that, in your test, cutting power after 
macroscopic amounts of time (2 minutes) have passed since closing the SQLite 
file still resulted in data being lost.

This doesn’t seem like it would be specific to memory-mapped I/O, though. 
Darwin has a universal buffer cache (like many other kernels) so memory-mapped 
writes and ordinary file writes are treated the same way by the filesystem 
cache.

Of course it’s possible there’s some kind of OS bug involved here; if so, it 
seems pretty serious. I’ll be glad to file a bug report with Apple if it turns 
out to be so.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Simon Slavin

On 18 Apr 2017, at 11:52am, Richard Hipp  wrote:

> I was unaware that memory-mapped I/O was busted on MacOS.  I will
> investigate further, but probably the solution will be that we will
> completely disable memory-mapped I/O on MacOS, just as we have had to
> do for OpenBSD.

I’m guessing that disabling memory-mapping will also prevent the 
SQLITE_IOERR_VNODE error that was mentioned here last week.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Richard Hipp
On 4/18/17, Brendan Duddridge  wrote:
>
> I commented out this line of code in the Couchbase lite CBL_SQliteStorage.m
> source file and no more corruption
>
> //int err = sqlite3_config(SQLITE_CONFIG_MMAP_SIZE,
> (SInt64)kSQLiteMMapSize, (SInt64)-1);
>
>
> It would be really awesome of this were added to the How to Corrupt an
> SQLite Database web page.

I was unaware that memory-mapped I/O was busted on MacOS.  I will
investigate further, but probably the solution will be that we will
completely disable memory-mapped I/O on MacOS, just as we have had to
do for OpenBSD.  Thanks for letting us know that this is a problem.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Brendan Duddridge
Hi Deon,

BINGO

I commented out this line of code in the Couchbase lite CBL_SQliteStorage.m
source file and no more corruption

//int err = sqlite3_config(SQLITE_CONFIG_MMAP_SIZE,
(SInt64)kSQLiteMMapSize, (SInt64)-1);


It would be really awesome of this were added to the How to Corrupt an
SQLite Database web page. Although it would probably be prudent if someone
could verify these findings using my sample app and a patched version of
Couchbase Lite. I'll post the solution to the Couchbase Lite Github issue
that I started.

I used to be able to reproduce the corruption with a single power cut. But
now after commenting out that code, I'm unable to corrupt the database
after about 6 or 7 power cuts. So I think it's safe to say that this
problem is solved.

Big thanks to you Deon. You just saved my bacon! Thank you so much.


Brendan


On Tue, Apr 18, 2017 at 3:20 AM, Deon Brewis <de...@outlook.com> wrote:

> Yip. Tried that. At some point I had like 6 or 7 debug switches in the app
> to try all manner of FULLSYNC/WAL/SYNCHRONOUS combinations. At the end it
> was the MMAP_SIZE that did it.
>
> It's not like it was subtle - it's a dead on repro. I was able to repro
> this by doing a power cycle 2 hours after shutting the app down. OSX didn't
> seem to have any interest in flushing mmap files until you soft reboot the
> machine.
>
> The last time I tried this though was on Yosemite and Mavericks and
> whatever version of SQLITE was out at the time, so things may be different
> now. But it would be the first place I would look for corruption on OSX
> related to power cycling.
>
> - Deon
>
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Simon Slavin
> Sent: Tuesday, April 18, 2017 1:57 AM
> To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
> Subject: Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS
>
>
> On 18 Apr 2017, at 9:50am, Deon Brewis <de...@outlook.com> wrote:
>
> > “From the OSX documentation:
> >
> > Note that while fsync() will flush all data from the host to the drive
> (i.e. the "permanent storage device"),
>
> Deon,
>
> I’m not sure this is related, but have you seen
>
> <http://www.sqlite.org/pragma.html#pragma_fullfsync>
>
> <http://www.sqlite.org/pragma.html#pragma_checkpoint_fullfsync>
>
> ?
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Deon Brewis
Yip. Tried that. At some point I had like 6 or 7 debug switches in the app to 
try all manner of FULLSYNC/WAL/SYNCHRONOUS combinations. At the end it was the 
MMAP_SIZE that did it.

It's not like it was subtle - it's a dead on repro. I was able to repro this by 
doing a power cycle 2 hours after shutting the app down. OSX didn't seem to 
have any interest in flushing mmap files until you soft reboot the machine.

The last time I tried this though was on Yosemite and Mavericks and whatever 
version of SQLITE was out at the time, so things may be different now. But it 
would be the first place I would look for corruption on OSX related to power 
cycling.

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Simon Slavin
Sent: Tuesday, April 18, 2017 1:57 AM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS


On 18 Apr 2017, at 9:50am, Deon Brewis <de...@outlook.com> wrote:

> “From the OSX documentation:
> 
> Note that while fsync() will flush all data from the host to the drive (i.e. 
> the "permanent storage device"),

Deon,

I’m not sure this is related, but have you seen

<http://www.sqlite.org/pragma.html#pragma_fullfsync>

<http://www.sqlite.org/pragma.html#pragma_checkpoint_fullfsync>

?

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Simon Slavin

On 18 Apr 2017, at 10:07am, Brendan Duddridge  wrote:
> 
>// Enable memory-mapped I/O if available

Please try an alternative version of that code which definitely does not use 
memory mapping in any way.  You may be able to do it by changing that IFDEF 
sequence.  Or you should instead have SQLite execute this command

PRAGMA mmap_size=0;

before it opens any files.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Brendan Duddridge
Hi Deon,

That's interesting about the memory mapped IO. I just looked in the source
for Couchbase Lite and they do something like that on setup:

+ (void) firstTimeSetup {

// Test the version of the actual SQLite implementation at runtime.
Necessary because

// the app might be linked with a custom version of SQLite (like
SQLCipher) instead of the

// system library, so the actual version/features may differ from what
was declared in

// sqlite3.h at compile time.

Log(@"Couchbase Lite using SQLite version %s (%s)",

sqlite3_libversion(), sqlite3_sourceid());

#if 0

for (int i=0; true; i++) {

const char* opt = sqlite3_compileoption_get(i);

if (!opt)

break;

Log(@"SQLite option '%s'", opt);

}

#endif

sSQLiteVersion = sqlite3_libversion_number();

Assert(sSQLiteVersion >= 3007000,

   @"SQLite library is too old (%s); needs to be at least 3.7",
sqlite3_libversion());


// Enable memory-mapped I/O if available

#ifndef SQLITE_CONFIG_MMAP_SIZE

#define SQLITE_CONFIG_MMAP_SIZE22  /* sqlite3_int64, sqlite3_int64 */

#endif

int err = sqlite3_config(SQLITE_CONFIG_MMAP_SIZE, (SInt64)
kSQLiteMMapSize, (SInt64)-1);

if (err != SQLITE_OK)

Log(@"FYI, couldn't enable SQLite mmap: error %d", err);


sqlite3_config(SQLITE_CONFIG_LOG, errorLogCallback, NULL);

}

From your description, it sounds like that be causing this problem. Hmm...

Brendan


On Tue, Apr 18, 2017 at 2:50 AM, Deon Brewis <de...@outlook.com> wrote:

> Are you by change using memory mapped IO (MMAP_SIZE something other than
> 0)?
>
> This does not work on OSX. Not even remotely. I tracked an issue down in
> November 2015, and was able to trivially corrupt a database 100% of the
> time using the repro steps below. This happens long after our app gets shut
> down and SQLITE is flushed.  It got fixed when I set MMAP_SIZE to 0.
>
> Repro (100%):
>
> 1) Run our app
> 2) Shutdown our app
> 3) Wait for our app to cleanly shut down – nothing showing in Activity
> Monitor – and app.db-wal deleted from disk (i.e. SQLITE clean close)
> 4) Wait 2 minutes (so our app isn’t running in this 2 minute period at all)
>
> 5) Copy the our db file to a NAS
> 6) Hard reboot the machine (power cycle).
> 7) Copy the db file to the NAS again (no reopening the app, just copy the
> file back to the NAS after the reboot).
>
> Observe…
>
> The file from #5 still works fine.
> The file from #7 is corrupted.
>
> Note that the app or sqlite is nowhere involved in between #5 and #7
>
>
> I made this note in the bug when I fixed it - I believe it's related, but
> don't have the exact context:
> “From the OSX documentation:
>
> Note that while fsync() will flush all data from the host to the drive
> (i.e. the "permanent storage device"), the drive itself may not physically
> write the data to the platters for quite some time and it may be written in
> an out-of-order sequence. Specifically, if the drive loses power or the OS
> crashes, the application may find that only some or none of their data was
> written.  The disk drive may also re-order the data so that later writes
> may be present, while earlier writes are not.
>
> This is not a theoretical edge case.  This scenario is easily reproduced
> with real world workloads and drive power failures.”
>
> - Deon
>
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Brendan Duddridge
> Sent: Tuesday, April 18, 2017 1:36 AM
> To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
> Subject: Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS
>
> Hi Rowan,
>
> My apologies. My sample app does use NSDocument. But my production app
> doesn't. And they both exhibit the same problem.
>
> On Tue, Apr 18, 2017 at 12:29 AM, Rowan Worth <row...@dug.com> wrote:
>
> > Hey Brendan,
> >
> > I'm no OSX expert, but from what I've read this afternoon about
> > NSDocument and friends combined with what I know about sqlite I have
> > to say you are completely mad to continue passing sqlite databases to
> > NSDocument,
> > *especially* as you don't define your own sub-class to do any of the
> > file management.
> >
> > Relying on some NSDocument sub-class builtin to the system and then
> > also opening the DB with CouchbaseLite may well violate section 2.2.1
> > of "how to corrupt an sqlite database" (multiple copies of sqlite
> > linked into the same application). Even if not, it sounds like
> > NSDocument has a tendency to copy files around for eg. auto-save
> purposes.
> >
> > I'd be very *ve

Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Brendan Duddridge
Hi Simon,

I'm using Couchbase Lite actually, not Couchbase server. Couchbase Lite 1.4
to be specific. Couchbase Lite 2.0 is still under development and it does
in fact use SQLite under the hood as does Couchbase Lite 1.4. They had
worked on using ForestDB, but I think they've abandoned that for their
Couchbase Lite 2.0 version.

I've tried calling pragma fullfsync=1 just after the database file was
opened, but I still had the same problem with corruption after a power
failure.

There's a big discussion I've had with the Couchbase Lite developers here
on their Github issues:

https://github.com/couchbase/couchbase-lite-ios/issues/1482

(newer posts are at the bottom)

On Tue, Apr 18, 2017 at 2:52 AM, Simon Slavin <slav...@bigfraud.org> wrote:

>
> On 18 Apr 2017, at 6:01am, Brendan Duddridge <brend...@gmail.com> wrote:
>
> > Perhaps the only think I can think of that I'm doing that may be
> unorthodox
> > is by storing the SQLite files inside a macOS package. But at the unix
> > layer that's really just a directory, so I don't know how that could
> cause
> > a problem. Unless macOS treats the package in a way differently than a
> > normal folder and is causing things to get moved around or written when a
> > power failure occurs.
>
> You should be safe storing a SQLite database inside a package.  As you
> write, a package is really just a folder with one extra bit set.
>
> Any demonstration code which uses the SQLite API directly we can attempt
> to debug.
>
> If the problem you need solving is actually with Couchbase, you can solve
> it by upgrading to any version from 2.0 onwards.  Since that no longer uses
> SQLite, SQLite database corruption can no longer be a problem.  Current
> versions of Couchbase no longer use SQLite for a persistence layer and
> Couchbase developers are unlikely to assist with any problem you find with
> such an old version.
>
> If your program demonstrating the problem has access to the SQLite
> database solely via libsqlcrypt.a, can you explain what you’re using it for
> ?  Is your database actually direct access to a Lotus Notes file ?
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Simon Slavin

On 18 Apr 2017, at 9:50am, Deon Brewis  wrote:

> “From the OSX documentation:
> 
> Note that while fsync() will flush all data from the host to the drive (i.e. 
> the "permanent storage device"),

Deon,

I’m not sure this is related, but have you seen





?

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Brendan Duddridge
Hi Richard,

I just did another test with my PowerFailureTest app. I launched it and the
count of rows in it was 4402. Without even running the function to import
my data that writes to the SQLite file, I cut the power (holding down the
power button on my MBP).

When my Mac finally rebooted, I saw that the WAL file was 0 bytes, the SHM
file was 32768 bytes, and the db.sqlite file was 10.5 MB (as it was before
the test).

But now when I launch the test app, These messages display in the console
log:

*02:42:38.630| WARNING: SQLite error (code 11): database corruption at line
60553 of [2ef4f3a5b1] {at errorLogCallback:125}*

*02:42:38.631| WARNING: SQLite error (code 11): statement aborts at 18:
[SELECT revs.doc_id, docid, revid, sequence FROM revs, docs WHERE
docs.doc_id = revs.doc_id AND current=1 AND deleted=0 ORDER BY docid ASC,
revid DESC LIMIT ? OFFSET ?] database disk i {at errorLogCallback:125}*

*2017-04-18 02:42:38.630791-0600 PowerFailureTest[672:9264] Unknown error
calling sqlite3_step (11: database disk image is malformed) rs*

*2017-04-18 02:42:38.632178-0600 PowerFailureTest[672:9264] doc count: 408*

So there's definitely some corruption going on. The file was fine before I
cut the power.



On Tue, Apr 18, 2017 at 2:23 AM, Richard Hipp  wrote:

> On 4/18/17, Brendan Duddridge  wrote:
> > In both cases though I can cause SQLite file corruption by cutting the
> > power on my MacBook Pro.
>
> Corruption-by-power-loss problems tend to be very sensitive to timing.
> If you are able to reliably generate the problem by leisurely cutting
> power (holding down the power button) that suggests something else.
> You might be able to create the same effect by issuing a SIGKILL on
> the process (kill -9) at the same point that you would normally cut
> the power.
>
> I'm thinking you have some problem similar to that which is described
> in section 2.4 of https://www.sqlite.org/howtocorrupt.html
>
> --
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Simon Slavin

On 18 Apr 2017, at 6:01am, Brendan Duddridge <brend...@gmail.com> wrote:

> Perhaps the only think I can think of that I'm doing that may be unorthodox
> is by storing the SQLite files inside a macOS package. But at the unix
> layer that's really just a directory, so I don't know how that could cause
> a problem. Unless macOS treats the package in a way differently than a
> normal folder and is causing things to get moved around or written when a
> power failure occurs.

You should be safe storing a SQLite database inside a package.  As you write, a 
package is really just a folder with one extra bit set.

Any demonstration code which uses the SQLite API directly we can attempt to 
debug.

If the problem you need solving is actually with Couchbase, you can solve it by 
upgrading to any version from 2.0 onwards.  Since that no longer uses SQLite, 
SQLite database corruption can no longer be a problem.  Current versions of 
Couchbase no longer use SQLite for a persistence layer and Couchbase developers 
are unlikely to assist with any problem you find with such an old version.  

If your program demonstrating the problem has access to the SQLite database 
solely via libsqlcrypt.a, can you explain what you’re using it for ?  Is your 
database actually direct access to a Lotus Notes file ?

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Deon Brewis
Are you by change using memory mapped IO (MMAP_SIZE something other than 0)?

This does not work on OSX. Not even remotely. I tracked an issue down in 
November 2015, and was able to trivially corrupt a database 100% of the time 
using the repro steps below. This happens long after our app gets shut down and 
SQLITE is flushed.  It got fixed when I set MMAP_SIZE to 0.

Repro (100%):

1) Run our app
2) Shutdown our app
3) Wait for our app to cleanly shut down – nothing showing in Activity Monitor 
– and app.db-wal deleted from disk (i.e. SQLITE clean close)
4) Wait 2 minutes (so our app isn’t running in this 2 minute period at all)

5) Copy the our db file to a NAS
6) Hard reboot the machine (power cycle).
7) Copy the db file to the NAS again (no reopening the app, just copy the file 
back to the NAS after the reboot).

Observe…

The file from #5 still works fine.
The file from #7 is corrupted.

Note that the app or sqlite is nowhere involved in between #5 and #7


I made this note in the bug when I fixed it - I believe it's related, but don't 
have the exact context:
“From the OSX documentation:

Note that while fsync() will flush all data from the host to the drive (i.e. 
the "permanent storage device"), the drive itself may not physically write the 
data to the platters for quite some time and it may be written in an 
out-of-order sequence. Specifically, if the drive loses power or the OS 
crashes, the application may find that only some or none of their data was 
written.  The disk drive may also re-order the data so that later writes may be 
present, while earlier writes are not.

This is not a theoretical edge case.  This scenario is easily reproduced with 
real world workloads and drive power failures.”

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Brendan Duddridge
Sent: Tuesday, April 18, 2017 1:36 AM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

Hi Rowan,

My apologies. My sample app does use NSDocument. But my production app doesn't. 
And they both exhibit the same problem.

On Tue, Apr 18, 2017 at 12:29 AM, Rowan Worth <row...@dug.com> wrote:

> Hey Brendan,
>
> I'm no OSX expert, but from what I've read this afternoon about 
> NSDocument and friends combined with what I know about sqlite I have 
> to say you are completely mad to continue passing sqlite databases to 
> NSDocument,
> *especially* as you don't define your own sub-class to do any of the 
> file management.
>
> Relying on some NSDocument sub-class builtin to the system and then 
> also opening the DB with CouchbaseLite may well violate section 2.2.1 
> of "how to corrupt an sqlite database" (multiple copies of sqlite 
> linked into the same application). Even if not, it sounds like 
> NSDocument has a tendency to copy files around for eg. auto-save purposes.
>
> I'd be very *very* surprised if there's an sqlite bug here. I'd take 
> the advice given to you on bountysource and watch your application's 
> file system operations to begin to understand what is going on behind 
> the scenes.
>
> -Rowan
>
>
>
> On 18 April 2017 at 13:01, Brendan Duddridge <brend...@gmail.com> wrote:
>
> > Thanks Richard for your reply.
> >
> > Sorry about the COLLATE problem. That's a Couchbase Lite thing.
> >
> > I find it weird that a different WAL file is getting in there 
> > somehow
> when
> > a power failure occurs. I'm a bit stumped at how it can even write 
> > to the file system the moment the power shuts down so I don't even 
> > understand
> how
> > the database file can get corrupted.  Unless the corruption happens 
> > when the database file is opened up next and tries to use this 
> > improper WAL file.
> >
> > In my main project I'm using SQLCipher as my SQLite layer. But in 
> > the
> demo
> > project I posted, I'm just using the built-in macOS Sierra SQLite
> library.
> > In both cases though I can cause SQLite file corruption by cutting 
> > the power on my MacBook Pro.
> >
> > I'm going to do further investigations to see if there's any misuse 
> > of SQLite that I can find. Although the developers of Couchbase Lite 
> > are far smarter than I and I'm sure they're doing things by the book.
> >
> > Perhaps the only think I can think of that I'm doing that may be
> unorthodox
> > is by storing the SQLite files inside a macOS package. But at the 
> > unix layer that's really just a directory, so I don't know how that 
> > could
> cause
> > a problem. Unless macOS treats the package in a way differently than 
> > a normal folder and is causing things to get moved around

Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Brendan Duddridge
Hi Rowan,

My apologies. My sample app does use NSDocument. But my production app
doesn't. And they both exhibit the same problem.

On Tue, Apr 18, 2017 at 12:29 AM, Rowan Worth  wrote:

> Hey Brendan,
>
> I'm no OSX expert, but from what I've read this afternoon about NSDocument
> and friends combined with what I know about sqlite I have to say you are
> completely mad to continue passing sqlite databases to NSDocument,
> *especially* as you don't define your own sub-class to do any of the file
> management.
>
> Relying on some NSDocument sub-class builtin to the system and then also
> opening the DB with CouchbaseLite may well violate section 2.2.1 of "how to
> corrupt an sqlite database" (multiple copies of sqlite linked into the same
> application). Even if not, it sounds like NSDocument has a tendency to copy
> files around for eg. auto-save purposes.
>
> I'd be very *very* surprised if there's an sqlite bug here. I'd take the
> advice given to you on bountysource and watch your application's file
> system operations to begin to understand what is going on behind the
> scenes.
>
> -Rowan
>
>
>
> On 18 April 2017 at 13:01, Brendan Duddridge  wrote:
>
> > Thanks Richard for your reply.
> >
> > Sorry about the COLLATE problem. That's a Couchbase Lite thing.
> >
> > I find it weird that a different WAL file is getting in there somehow
> when
> > a power failure occurs. I'm a bit stumped at how it can even write to the
> > file system the moment the power shuts down so I don't even understand
> how
> > the database file can get corrupted.  Unless the corruption happens when
> > the database file is opened up next and tries to use this improper WAL
> > file.
> >
> > In my main project I'm using SQLCipher as my SQLite layer. But in the
> demo
> > project I posted, I'm just using the built-in macOS Sierra SQLite
> library.
> > In both cases though I can cause SQLite file corruption by cutting the
> > power on my MacBook Pro.
> >
> > I'm going to do further investigations to see if there's any misuse of
> > SQLite that I can find. Although the developers of Couchbase Lite are far
> > smarter than I and I'm sure they're doing things by the book.
> >
> > Perhaps the only think I can think of that I'm doing that may be
> unorthodox
> > is by storing the SQLite files inside a macOS package. But at the unix
> > layer that's really just a directory, so I don't know how that could
> cause
> > a problem. Unless macOS treats the package in a way differently than a
> > normal folder and is causing things to get moved around or written when a
> > power failure occurs.
> >
> > This problem has been plaguing me for quite a long time actually. I hope
> > that I can find a solution somehow.
> >
> > Thanks,
> >
> > Brendan
> >
> > > I worked around the "COLLATE JSON" problem (by writing my own JSON
> > > collation).  That allows me to analyze your database and WAL file.
> > >
> > > It appears that the WAL file is not the correct WAL file for that
> > > database.  It is as if someone has taken an unrelated WAL file and
> > > renamed it to have the same base name as your database.  Or the other
> > > way around - someone has renamed your database to have the same base
> > > name as the WAL file.
> > >
> > > Your demonstration application does not call SQLite directly.  Instead
> > > it appears to use two libraries that in turn call SQLite:
> > > libsqlcrypt.a and the CouchbaseLite.framework.  You do not provide
> > > sources to these other libraries, so I am unable to deduce what they
> > > are doing.
> > >
> > > So, in the absence of further evidence, I am going to diagnose this as
> > > a misuse of SQLite by one of the two libraries that you are linking -
> > > probably a misuse in the form of trying to rename or unlink or
> > > otherwise modify the database file using ordinary operating system
> > > calls while a connection to the database is open.
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Brendan Duddridge
Hey Rowan,

In my Mac app I'm actually not using NSDocument. I used to and then I
changed my document subclass to extend NSObject instead, just in case
NSDocument was doing bad things behind my back. I had also disabled
auto-save and versions anyway. For a while though I tried extending
NSPersistentDocument even though my app isn't using CoreData. But it also
didn't make a difference.

I'm still storing my SQLite database in a file package even without using
NSDocument. The strange thing though is this is never a problem on iOS
where my code extends UIDocument, but for all intents and purposes behaves
pretty much the same way, at least in terms of organization of the database
files. Only macOS seems to have the corruption problem.

On Tue, Apr 18, 2017 at 12:29 AM, Rowan Worth  wrote:

> Hey Brendan,
>
> I'm no OSX expert, but from what I've read this afternoon about NSDocument
> and friends combined with what I know about sqlite I have to say you are
> completely mad to continue passing sqlite databases to NSDocument,
> *especially* as you don't define your own sub-class to do any of the file
> management.
>
> Relying on some NSDocument sub-class builtin to the system and then also
> opening the DB with CouchbaseLite may well violate section 2.2.1 of "how to
> corrupt an sqlite database" (multiple copies of sqlite linked into the same
> application). Even if not, it sounds like NSDocument has a tendency to copy
> files around for eg. auto-save purposes.
>
> I'd be very *very* surprised if there's an sqlite bug here. I'd take the
> advice given to you on bountysource and watch your application's file
> system operations to begin to understand what is going on behind the
> scenes.
>
> -Rowan
>
>
>
> On 18 April 2017 at 13:01, Brendan Duddridge  wrote:
>
> > Thanks Richard for your reply.
> >
> > Sorry about the COLLATE problem. That's a Couchbase Lite thing.
> >
> > I find it weird that a different WAL file is getting in there somehow
> when
> > a power failure occurs. I'm a bit stumped at how it can even write to the
> > file system the moment the power shuts down so I don't even understand
> how
> > the database file can get corrupted.  Unless the corruption happens when
> > the database file is opened up next and tries to use this improper WAL
> > file.
> >
> > In my main project I'm using SQLCipher as my SQLite layer. But in the
> demo
> > project I posted, I'm just using the built-in macOS Sierra SQLite
> library.
> > In both cases though I can cause SQLite file corruption by cutting the
> > power on my MacBook Pro.
> >
> > I'm going to do further investigations to see if there's any misuse of
> > SQLite that I can find. Although the developers of Couchbase Lite are far
> > smarter than I and I'm sure they're doing things by the book.
> >
> > Perhaps the only think I can think of that I'm doing that may be
> unorthodox
> > is by storing the SQLite files inside a macOS package. But at the unix
> > layer that's really just a directory, so I don't know how that could
> cause
> > a problem. Unless macOS treats the package in a way differently than a
> > normal folder and is causing things to get moved around or written when a
> > power failure occurs.
> >
> > This problem has been plaguing me for quite a long time actually. I hope
> > that I can find a solution somehow.
> >
> > Thanks,
> >
> > Brendan
> >
> > > I worked around the "COLLATE JSON" problem (by writing my own JSON
> > > collation).  That allows me to analyze your database and WAL file.
> > >
> > > It appears that the WAL file is not the correct WAL file for that
> > > database.  It is as if someone has taken an unrelated WAL file and
> > > renamed it to have the same base name as your database.  Or the other
> > > way around - someone has renamed your database to have the same base
> > > name as the WAL file.
> > >
> > > Your demonstration application does not call SQLite directly.  Instead
> > > it appears to use two libraries that in turn call SQLite:
> > > libsqlcrypt.a and the CouchbaseLite.framework.  You do not provide
> > > sources to these other libraries, so I am unable to deduce what they
> > > are doing.
> > >
> > > So, in the absence of further evidence, I am going to diagnose this as
> > > a misuse of SQLite by one of the two libraries that you are linking -
> > > probably a misuse in the form of trying to rename or unlink or
> > > otherwise modify the database file using ordinary operating system
> > > calls while a connection to the database is open.
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users 

Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Richard Hipp
On 4/18/17, Brendan Duddridge  wrote:
> In both cases though I can cause SQLite file corruption by cutting the
> power on my MacBook Pro.

Corruption-by-power-loss problems tend to be very sensitive to timing.
If you are able to reliably generate the problem by leisurely cutting
power (holding down the power button) that suggests something else.
You might be able to create the same effect by issuing a SIGKILL on
the process (kill -9) at the same point that you would normally cut
the power.

I'm thinking you have some problem similar to that which is described
in section 2.4 of https://www.sqlite.org/howtocorrupt.html

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-18 Thread Rowan Worth
Hey Brendan,

I'm no OSX expert, but from what I've read this afternoon about NSDocument
and friends combined with what I know about sqlite I have to say you are
completely mad to continue passing sqlite databases to NSDocument,
*especially* as you don't define your own sub-class to do any of the file
management.

Relying on some NSDocument sub-class builtin to the system and then also
opening the DB with CouchbaseLite may well violate section 2.2.1 of "how to
corrupt an sqlite database" (multiple copies of sqlite linked into the same
application). Even if not, it sounds like NSDocument has a tendency to copy
files around for eg. auto-save purposes.

I'd be very *very* surprised if there's an sqlite bug here. I'd take the
advice given to you on bountysource and watch your application's file
system operations to begin to understand what is going on behind the scenes.

-Rowan



On 18 April 2017 at 13:01, Brendan Duddridge  wrote:

> Thanks Richard for your reply.
>
> Sorry about the COLLATE problem. That's a Couchbase Lite thing.
>
> I find it weird that a different WAL file is getting in there somehow when
> a power failure occurs. I'm a bit stumped at how it can even write to the
> file system the moment the power shuts down so I don't even understand how
> the database file can get corrupted.  Unless the corruption happens when
> the database file is opened up next and tries to use this improper WAL
> file.
>
> In my main project I'm using SQLCipher as my SQLite layer. But in the demo
> project I posted, I'm just using the built-in macOS Sierra SQLite library.
> In both cases though I can cause SQLite file corruption by cutting the
> power on my MacBook Pro.
>
> I'm going to do further investigations to see if there's any misuse of
> SQLite that I can find. Although the developers of Couchbase Lite are far
> smarter than I and I'm sure they're doing things by the book.
>
> Perhaps the only think I can think of that I'm doing that may be unorthodox
> is by storing the SQLite files inside a macOS package. But at the unix
> layer that's really just a directory, so I don't know how that could cause
> a problem. Unless macOS treats the package in a way differently than a
> normal folder and is causing things to get moved around or written when a
> power failure occurs.
>
> This problem has been plaguing me for quite a long time actually. I hope
> that I can find a solution somehow.
>
> Thanks,
>
> Brendan
>
> > I worked around the "COLLATE JSON" problem (by writing my own JSON
> > collation).  That allows me to analyze your database and WAL file.
> >
> > It appears that the WAL file is not the correct WAL file for that
> > database.  It is as if someone has taken an unrelated WAL file and
> > renamed it to have the same base name as your database.  Or the other
> > way around - someone has renamed your database to have the same base
> > name as the WAL file.
> >
> > Your demonstration application does not call SQLite directly.  Instead
> > it appears to use two libraries that in turn call SQLite:
> > libsqlcrypt.a and the CouchbaseLite.framework.  You do not provide
> > sources to these other libraries, so I am unable to deduce what they
> > are doing.
> >
> > So, in the absence of further evidence, I am going to diagnose this as
> > a misuse of SQLite by one of the two libraries that you are linking -
> > probably a misuse in the form of trying to rename or unlink or
> > otherwise modify the database file using ordinary operating system
> > calls while a connection to the database is open.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-17 Thread Brendan Duddridge
Thanks Richard for your reply.

Sorry about the COLLATE problem. That's a Couchbase Lite thing.

I find it weird that a different WAL file is getting in there somehow when
a power failure occurs. I'm a bit stumped at how it can even write to the
file system the moment the power shuts down so I don't even understand how
the database file can get corrupted.  Unless the corruption happens when
the database file is opened up next and tries to use this improper WAL file.

In my main project I'm using SQLCipher as my SQLite layer. But in the demo
project I posted, I'm just using the built-in macOS Sierra SQLite library.
In both cases though I can cause SQLite file corruption by cutting the
power on my MacBook Pro.

I'm going to do further investigations to see if there's any misuse of
SQLite that I can find. Although the developers of Couchbase Lite are far
smarter than I and I'm sure they're doing things by the book.

Perhaps the only think I can think of that I'm doing that may be unorthodox
is by storing the SQLite files inside a macOS package. But at the unix
layer that's really just a directory, so I don't know how that could cause
a problem. Unless macOS treats the package in a way differently than a
normal folder and is causing things to get moved around or written when a
power failure occurs.

This problem has been plaguing me for quite a long time actually. I hope
that I can find a solution somehow.

Thanks,

Brendan

> I worked around the "COLLATE JSON" problem (by writing my own JSON
> collation).  That allows me to analyze your database and WAL file.
>
> It appears that the WAL file is not the correct WAL file for that
> database.  It is as if someone has taken an unrelated WAL file and
> renamed it to have the same base name as your database.  Or the other
> way around - someone has renamed your database to have the same base
> name as the WAL file.
>
> Your demonstration application does not call SQLite directly.  Instead
> it appears to use two libraries that in turn call SQLite:
> libsqlcrypt.a and the CouchbaseLite.framework.  You do not provide
> sources to these other libraries, so I am unable to deduce what they
> are doing.
>
> So, in the absence of further evidence, I am going to diagnose this as
> a misuse of SQLite by one of the two libraries that you are linking -
> probably a misuse in the form of trying to rename or unlink or
> otherwise modify the database file using ordinary operating system
> calls while a connection to the database is open.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-16 Thread Richard Hipp
On 4/16/17, Richard Hipp  wrote:
> I cannot seem to find the implementation for COLLATE JSON anywhere in
> your source code.  Can you give me a hint as to which source file I
> should be looking in?

I worked around the "COLLATE JSON" problem (by writing my own JSON
collation).  That allows me to analyze your database and WAL file.

It appears that the WAL file is not the correct WAL file for that
database.  It is as if someone has taken an unrelated WAL file and
renamed it to have the same base name as your database.  Or the other
way around - someone has renamed your database to have the same base
name as the WAL file.

Your demonstration application does not call SQLite directly.  Instead
it appears to use two libraries that in turn call SQLite:
libsqlcrypt.a and the CouchbaseLite.framework.  You do not provide
sources to these other libraries, so I am unable to deduce what they
are doing.

So, in the absence of further evidence, I am going to diagnose this as
a misuse of SQLite by one of the two libraries that you are linking -
probably a misuse in the form of trying to rename or unlink or
otherwise modify the database file using ordinary operating system
calls while a connection to the database is open.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-16 Thread Richard Hipp
I cannot seem to find the implementation for COLLATE JSON anywhere in
your source code.  Can you give me a hint as to which source file I
should be looking in?

On 4/16/17, Brendan Duddridge  wrote:
> Hi,
>
> I know this is an old thread, but I just found it now when I was doing some
> research on this topic. Thanks Jens for starting this thread. When Jens
> said he knew a developer who could create a corrupted database by turning
> off the power, I'm pretty sure he was talking about me. This has been an
> ongoing problem for me for a while now. My customers keep getting corrupted
> SQLite databases due to power failures or forced shutdowns.
>
> I wrote a small sample app to demonstrate the problem just to make sure it
> wasn't my own app causing the problem in some complex way.
>
> Someone on this thread said to post a sample of a corrupted database. So
> here's a link to download a file that has become corrupted:
>
> https://www.dropbox.com/s/5xwsfhorrfvefjy/corrupted-sqlite-powerfailure-mac.zip?dl=0
>
> The way it started out was an SQLite file with a table in it called docs
> (along with other tables) with 408 rows in it. I imported a bunch of data
> into the SQLite file so that there should be 4402 rows in the docs table,
> along with data stored in other tables.
>
> After the import and the data was committed to the database (while the app
> was still launched but not doing anything), I shutdown my MacBook Pro
> forcibly by holding the power key down until it shut off. When I restarted
> and opened the database file using the Mac SQLiteManager app, The docs
> table now contained only 2631 rows, with a lot of NULL data displayed in
> the docs table. That should not be possible with the library I'm using.
>
> If you're interested in trying out the sample application I wrote to see
> the corruption for yourself, you can download it here:
>
> https://www.dropbox.com/s/q2r4bz7n1d5fgag/PowerFailureTest.zip?dl=0
>
> You'll need a Mac and Xcode 8.x to run it. I would just have only Xcode and
> the sample app running when I caused the Mac's power to be killed.
>
> When the database is opened, it's using WAL journal_mode and also has
> pragma fullfsync=1 enabled.
>
> Any ideas of settings I could try to resolve this problem would be greatly
> appreciated by me and my customers.
>
> Thanks,
>
> Brendan Duddridge
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2017-04-16 Thread Brendan Duddridge
Hi,

I know this is an old thread, but I just found it now when I was doing some
research on this topic. Thanks Jens for starting this thread. When Jens
said he knew a developer who could create a corrupted database by turning
off the power, I'm pretty sure he was talking about me. This has been an
ongoing problem for me for a while now. My customers keep getting corrupted
SQLite databases due to power failures or forced shutdowns.

I wrote a small sample app to demonstrate the problem just to make sure it
wasn't my own app causing the problem in some complex way.

Someone on this thread said to post a sample of a corrupted database. So
here's a link to download a file that has become corrupted:

https://www.dropbox.com/s/5xwsfhorrfvefjy/corrupted-sqlite-powerfailure-mac.zip?dl=0

The way it started out was an SQLite file with a table in it called docs
(along with other tables) with 408 rows in it. I imported a bunch of data
into the SQLite file so that there should be 4402 rows in the docs table,
along with data stored in other tables.

After the import and the data was committed to the database (while the app
was still launched but not doing anything), I shutdown my MacBook Pro
forcibly by holding the power key down until it shut off. When I restarted
and opened the database file using the Mac SQLiteManager app, The docs
table now contained only 2631 rows, with a lot of NULL data displayed in
the docs table. That should not be possible with the library I'm using.

If you're interested in trying out the sample application I wrote to see
the corruption for yourself, you can download it here:

https://www.dropbox.com/s/q2r4bz7n1d5fgag/PowerFailureTest.zip?dl=0

You'll need a Mac and Xcode 8.x to run it. I would just have only Xcode and
the sample app running when I caused the Mac's power to be killed.

When the database is opened, it's using WAL journal_mode and also has
pragma fullfsync=1 enabled.

Any ideas of settings I could try to resolve this problem would be greatly
appreciated by me and my customers.

Thanks,

Brendan Duddridge
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-16 Thread Richard Hipp
On 11/16/16, Simon Slavin  wrote:
>
> There's also a tool from the SQLite team which can analyze a corrupt SQLite
> file and tell some things about how it is corrupt.  Unfortunately I can't
> remember what it's called or where to find it.  But I heard about it on this
> list and I hope someone can.
>

make showdb

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-16 Thread Simon Slavin

On 16 Nov 2016, at 7:25am, Jens Alfke  wrote:

> The case I’m concerned about happens on Macs, as I said, and the developer 
> says he can trigger it in his app via a power failure or a forced shutdown 
> (holding the power button for 5 seconds.) This version of our library is one 
> I wrote, in Objective-C; it uses a thin Obj-C wrapper around SQLite, but it 
> doesn’t do anything magic, and although it started out as 3rd party code I’ve 
> tweaked it enough that I know it inside and out.

Ah, that's enough information that someone around here should be able to help.  
Can you use the shell tool to run

PRAGMA integrity_check

on one such corrupt file and post what it finds ?  If the output is long, 
posting a summary is fine.

There's also a tool from the SQLite team which can analyze a corrupt SQLite 
file and tell some things about how it is corrupt.  Unfortunately I can't 
remember what it's called or where to find it.  But I heard about it on this 
list and I hope someone can.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-15 Thread Jens Alfke

> On Nov 15, 2016, at 1:46 PM, Simon Slavin  wrote:
> 
> Do these people admit they're letting their phones run out of power ?  

There’s nothing wrong with letting your phone run out of power, and software 
should be resilient to it. I don’t think that’s the problem, though, because 
mobile OS’s perform a clean shutdown when power drops below a minimum 
threshold, so it’s not like pulling the plug on a desktop computer (or holding 
down the power button on a laptop because the kernel froze.)

> If not, I suspect some sort of programming error somewhere.  Not necessarily 
> in your own programming but sometimes in whatever database API you're using 
> which eventually results in calls to the SQLite API.
> So is your software in C or C++ and calling the SQLite API directly ?

It’s possible. Our Android library is written in Java and uses some Java 
wrapper library around SQLite. I don’t know the details myself.

The case I’m concerned about happens on Macs, as I said, and the developer says 
he can trigger it in his app via a power failure or a forced shutdown (holding 
the power button for 5 seconds.) This version of our library is one I wrote, in 
Objective-C; it uses a thin Obj-C wrapper around SQLite, but it doesn’t do 
anything magic, and although it started out as 3rd party code I’ve tweaked it 
enough that I know it inside and out.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-15 Thread R Smith


On 2016/11/15 10:34 PM, Jens Alfke wrote:



On Nov 15, 2016, at 10:57 AM, Simon Slavin  wrote:

sqlite> PRAGMA checkpoint_fullfsync;
1

I wasn’t aware of that pragma. Just tried it on my Mac (10.12.1), and its value 
is 1 even if I don’t first set pragma fullfsync; i.e. it defaults to 1. 
(Contradicting the docs.)

So it appears that on macOS, SQLite does use F_FULLFSYNC when checkpointing, 
but not at other times that it fsyncs. What does that mean in actual use, 
assuming that I always use WAL mode? Is there still an opportunity for 
corruption in the face of power failures?

(Sorry to be frothing at the mouth about this; but my team’s dealing with a few 
users/customers whose apps encounter db corruption, on Android as well as 
macOS, and we’re getting really frustrated trying to figure out what’s going 
on.)


Quite OK to be unsettled by learning that a flaw in the system that you 
assumed did not exist, might be the cause of your troubles. I think 
however something is missing in the complete understanding, so to be clear:


Calling F_FULLFSYNC when checkpointing or otherwise invokes a contract 
between the running software (your system) and the Operating System 
whereby the Operating system promises to A - put the current buffer's 
worth of written data INTO the BUS feeding the writable media, AND B - 
then asking said media to confirm the writing has happened (committed)  
BEFORE handing back control (moving your thread pointer along). This is 
not exactly the same for all OSes, but more or less similar.


This can slow down things, but sometimes the security is worth the price 
paid in latency. However, Hard drive manufacturers tend to sometimes lie 
about having committed your data. It is a well known and almost 
universally used method in standard desktop / laptop computers for the 
harddrive to tell the OS that: "YES indeed, I have committed" when in 
fact it is still piping data into the platters. Yes SSD's are better at 
this by simple virtue of lower latency from buffer-to-silicone, but they 
are not above lying either.


This means that unless you have a SERVER quality drive with typically 
its own battery-backup that guarantees ANY buffered writes to reach the 
platters, there simply is zero guarantee that all writes WILL go to 
disk, and any normal system that guarantees it lies.


This does not mean however that you should be experiencing corruption. 
SQLite might not be able to guarantee all writes reaching the disk, but 
in most cases, if a final write did not happen, the usual last step in 
committing a transaction is deleting / truncating a journal file or 
writing a checkpoint marker or such, which, if it did not happen, should 
have the entire write roll back (next time you open the DB) and leave 
you in a non-corrupt state. IF this does not happen it means a write may 
have happened out of order (not very common but can happen) or some 
other worse problem occurred - most importantly, FULL_FSYNC isn't the 
wild goose to be chasing. Whether or not any write happened is never an 
acceptable cause of corruption, so trying to wrestle with the thing that 
promises to make writes happen "more" as a causal relation to a 
corruption problem, is simply moot. (This is vigorously tested with 
every release of SQLite too).


If you can get the DB files (journals and all) from such a system where 
a user claims to be able to reproduce the corruption reliably, that 
would be an easy thing to check and the Devs here might learn something 
from it. You can simply make something that copy all the DB files before 
opening them at startup, until you have produced a corrupt DB, then 
those last copied files will be the corrupted DB files that can be 
investigated.


Good luck!
Ryan

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-15 Thread Jean-Christophe Deschamps


At 22:41 15/11/2016, you wrote:

So if you're truly worried about flush-to-disk what do you do 
?  Solution 1 is to buy hard disks rated for servers -- sometimes 
called "enterprise-class hard drives" -- and to set the DIP switches 
to tell them they're being used on a server.  Those things are 
intended for use in servers, and ACID can be important to a server, so 
they support it properly and do not lie.


It's even both best and worse than that.

I enjoy running an old and ugly diy machine with 8 SAS HP disks (15k 
rpm) arranged in RAID6 behind a serious hardware controler having a 
good share of battery-backed RAM. Those enterprise-class disks don't 
have any switch and will lie to you and the OS as much as they can if 
you let them do and don't ask for acknowledgement that each write is 
final, but the key piece is the Areca controler which actually manages 
the RAID array and knows when/if some write has actually landed on 
surface. Should the power supply vanish, it keeps a track of what still 
needs to be written and will silently update the disks when power comes 
back, even before the OS is loaded.


So no, even very expensive hard disks themselves don't take any step to 
avoid lying to you/OS if you/OS don't care over "details", but the 
controler (if a serious one) will fill the gap and insure that 
everything is fine.


To be honest I'm almost certain that there can exist extreme situations 
where, for instance, the RAID array is in tatters (e.g. more than 2 of 
6 disks simultaneously failing) and won't accept writes while the 
controler battery is going dangerously low.
But if your needs are thusly mission critical you probably can afford 
enough redundant sophisticated hardware to make the probability of a 
disaster (e.g. a one-bit flip) as low as required.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-15 Thread Bob Friesenhahn

On Tue, 15 Nov 2016, Simon Slavin wrote:


Modern storage subsystems (hard disk or SSD) intended for use in a 
normal user computer always lie to the OS about flushing to disk. 
The apparent increase in speed from doing this is so big that every 
manufacturer has to do it, or risk having every review harp on about 
how slow their computer performs in real life tasks.  There is no 
way to get these things to be honest.  Because not one person in ten 
thousand cares.  Most people are more interested in how fast they 
can launch Microsoft Word.


I don't think that things are as bad as you say.  Some modern 
filesystems (e.g. zfs) depend on hardware cache flush to work yet 
there has not been a rash of corrupted filesystems.  Many people use 
these filesystems on non-enterprise hardware.


There are some devices which do fail to flush their cache or write 
data properly.  Some SSDs likely re-write data while in use or idle 
due to wear leveling in a way which causes a possibility of loss.


Enterprise disks are more tolerant of vibration, are specified to 
have fewer uncorrected bit errors, and try for a bounded time to 
recover bad sectors.


MacOS's target market is not storage.  The useful mass storage 
offerings for hardware running MacOS is rather limited.


Operating systems where fsync() or fdatasync() do not truely commit 
data to hardware are broken.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-15 Thread Simon Slavin

On 15 Nov 2016, at 8:34pm, Jens Alfke  wrote:

> (Sorry to be frothing at the mouth about this; but my team’s dealing with a 
> few users/customers whose apps encounter db corruption, on Android as well as 
> macOS, and we’re getting really frustrated trying to figure out what’s going 
> on.)

Do these people admit they're letting their phones run out of power ?  If not, 
I suspect some sort of programming error somewhere.  Not necessarily in your 
own programming but sometimes in whatever database API you're using which 
eventually results in calls to the SQLite API.

So is your software in C or C++ and calling the SQLite API directly ?

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-15 Thread Simon Slavin

On 15 Nov 2016, at 8:18pm, Jens Alfke  wrote:

> The only way to guarantee a true barrier is to really-and-truly flush the 
> disk controller, which requires not simply flushing but resetting it. That’s 
> what F_FULLFSYNC on macOS does. (Unfortunately it makes the disk controller 
> unresponsive for a macroscopic amount of time while it resets, which is why 
> it’s slow. Worse, it adds latency to _all_ I/O on the system. Flush too often 
> on a desktop computer and your music player will glitch, video playback will 
> drop frames, etc.)

It's worse than that.

Modern storage subsystems (hard disk or SSD) intended for use in a normal user 
computer always lie to the OS about flushing to disk.  The apparent increase in 
speed from doing this is so big that every manufacturer has to do it, or risk 
having every review harp on about how slow their computer performs in real life 
tasks.  There is no way to get these things to be honest.  Because not one 
person in ten thousand cares.  Most people are more interested in how fast they 
can launch Microsoft Word.

So if you're truly worried about flush-to-disk what do you do ?  Solution 1 is 
to buy hard disks rated for servers -- sometimes called "enterprise-class hard 
drives" -- and to set the DIP switches to tell them they're being used on a 
server.  Those things are intended for use in servers, and ACID can be 
important to a server, so they support it properly and do not lie.

Solution 2 works on any hard drive.  To make it flush cache just don't do 
anything that involves disk access for a couple of seconds.  They note the 
inactivity, then decide they have time to write away cache.

But if all the contact you have with the hard disk is through the OS ?  Forget 
it.  If power to the computer is but you're probably depending on the amount of 
power left in the system to flush data to disk.  And given the tiny power 
consumption those things have these days most of the time that's enough.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-15 Thread Simon Slavin

On 15 Nov 2016, at 8:34pm, Jens Alfke  wrote:

> On Nov 15, 2016, at 10:57 AM, Simon Slavin  wrote:
> 
>> sqlite> PRAGMA checkpoint_fullfsync;
>> 1
> 
> I wasn’t aware of that pragma. Just tried it on my Mac (10.12.1), and its 
> value is 1 even if I don’t first set pragma fullfsync; i.e. it defaults to 1. 
> (Contradicting the docs.)
> 
> So it appears that on macOS, SQLite does use F_FULLFSYNC when checkpointing, 
> but not at other times that it fsyncs.

Yet the sequence I posted was from my own Macintosh, running 10.12.[something I 
can't talk about].  And as you see from my post the pragma starts off as 0 but 
if you set it to 1 it'll stay at 1.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-15 Thread Jens Alfke

> On Nov 15, 2016, at 10:57 AM, Simon Slavin  wrote:
> 
> sqlite> PRAGMA checkpoint_fullfsync;
> 1

I wasn’t aware of that pragma. Just tried it on my Mac (10.12.1), and its value 
is 1 even if I don’t first set pragma fullfsync; i.e. it defaults to 1. 
(Contradicting the docs.)

So it appears that on macOS, SQLite does use F_FULLFSYNC when checkpointing, 
but not at other times that it fsyncs. What does that mean in actual use, 
assuming that I always use WAL mode? Is there still an opportunity for 
corruption in the face of power failures?

(Sorry to be frothing at the mouth about this; but my team’s dealing with a few 
users/customers whose apps encounter db corruption, on Android as well as 
macOS, and we’re getting really frustrated trying to figure out what’s going 
on.)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-15 Thread Jens Alfke

> On Nov 15, 2016, at 10:57 AM, Simon Slavin  wrote:
> 
> My understanding is that F_FULLFSYNC still works the way you describe on a 
> Mac and SQLite still uses it the way the documentation says.  But I'm not in 
> touch with either development group.


This seems like a strange decision, considering that in other respects SQLite 
(wisely) defaults to settings that favor safety and durability over 
performance, e.g. synchronous mode.

So perhaps the “How To Corrupt A SQLite Database” page should be updated to add 
“Use SQLite on a Mac without knowing the secret that you should enable `pragma 
fullfsync`?” :-(


Incidentally, that page (https://www.sqlite.org/howtocorrupt.html 
) says:

>> Actually, if one is only concerned with atomic and consistent writes and is 
>> willing to forego durable writes, the sync operation does not need to wait 
>> until the content is completely stored on persistent media. Instead, the 
>> sync operation can be thought of as an I/O barrier. As long as all writes 
>> that occur before the sync are completed before any write that happens after 
>> the sync, no database corruption will occur.

So … based on my understanding (what I was told by a filesystem guru, albeit 10 
years ago), a typical fsync does _not_ act as an I/O barrier. This is because 
it’s not really syncing (as the same page says, "most consumer-grade mass 
storage devices lie about syncing”). Disk controllers don’t write physical 
sectors in the order in which the writes are received; instead they buffer 
them, and then optimize the order of writes to minimize seeks. So if power is 
abruptly lost, it is not true that there’s a barrier before which all writes 
have been persisted and after which none have; instead the pattern is likely to 
be random. 

The only way to guarantee a true barrier is to really-and-truly flush the disk 
controller, which requires not simply flushing but resetting it. That’s what 
F_FULLFSYNC on macOS does. (Unfortunately it makes the disk controller 
unresponsive for a macroscopic amount of time while it resets, which is why 
it’s slow. Worse, it adds latency to _all_ I/O on the system. Flush too often 
on a desktop computer and your music player will glitch, video playback will 
drop frames, etc.)

I’m totally willing to believe that these behaviors are different with SSDs, 
especially ones that use newer types of interfaces or are soldered directly to 
the motherboard (as on some MacBooks.) I’d love to be educated about the 
current state of the art :)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-15 Thread Simon Slavin

On 15 Nov 2016, at 6:11pm, Jens Alfke  wrote:

> I verified in the built-in sqlite3 tool on macOS 10.12.1 that the result of 
> `pragma fullfsync` is 0.

The default setting is 0.  But you can change it.  On my Mac running 10.12.x,

SQLite version 3.14.0 2016-07-26 15:17:14
Enter ".help" for usage hints.
sqlite> PRAGMA fullfsync;
0
sqlite> PRAGMA fullfsync=1;
sqlite> PRAGMA fullfsync;
1
sqlite> PRAGMA checkpoint_fullfsync;
1
sqlite> 

My understanding is that F_FULLFSYNC still works the way you describe on a Mac 
and SQLite still uses it the way the documentation says.  But I'm not in touch 
with either development group.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Database corruption, and PRAGMA fullfsync on macOS

2016-11-15 Thread Jens Alfke
I’m seeing conflicting information about SQLite’s use of F_FULLFSYNC on macOS 
when committing transactions. This is making me nervous about durability and 
the possibility of database corruption.

The SQLite docs for PRAGMA fullfsync 
(https://www.sqlite.org/pragma.html#pragma_fullfsync 
) say:
>  This flag determines whether or not the F_FULLFSYNC syncing method is used 
> on systems that support it. The default value of the fullfsync flag is off. 
> Only Mac OS X supports F_FULLFSYNC.

I verified in the built-in sqlite3 tool on macOS 10.12.1 that the result of 
`pragma fullfsync` is 0.

My understanding is that issuing a F_FULLFSYNC ioctl call is important for 
ensuring that all writes have been flushed to physical storage, since disk 
controllers may lie and ignore a regular sync instruction, leaving some writes 
in their volatile cache memory. (I was working at Apple, on a project using 
SQLite, during the development of OS X 10.4 circa 2006; there were a number of 
databases that got corrupted by kernel panics or forced power-off, until Apple 
had the F_FULLFSYNC call added to SQLite.)

Apple’s documentation for Core Data (which uses SQLite) says: 
(https://developer.apple.com/library/content/documentation/Cocoa/Conceptual/CoreData/PersistentStoreFeatures.html
 
)
> In OS X the fsync command does not guarantee that bytes are written, so 
> SQLite sends a F_FULLFSYNC request to the kernel to ensure that the bytes are 
> actually written through to the drive platter. This request causes the kernel 
> to flush all buffers to the drives and causes the drives to flush their track 
> caches. Without this, there is a significantly large window of time within 
> which data will reside in volatile memory. If system failure occurs you risk 
> data corruption.


This contradicts the SQLite docs, but it does match my understanding, and until 
today I thought that SQLite on macOS enabled fullfsync by default.

So which of these is true?
• F_FULLFSYNC is no longer necessary for full durability (due to some change in 
the way SQLite commits?)
or
• SQLite by default sacrifices durability and data integrity for performance by 
skipping the [slow] F_FULLFSYNC call

This is disturbing because we do have one developer using our framework who has 
been reporting occasional database corruption. He says it’s happened to users, 
and he claims to be able to make it happen by forcibly powering-down his laptop 
(though we haven’t been able to reproduce it.)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption question

2016-07-20 Thread Tim Streater
On 20 Jul 2016 at 20:29, Richard Hipp  wrote: 

> On 7/19/16, Doug Nebeker  wrote:
>>  there is only one process using the database file (though multiple
>> threads, but they each have their own database handle).
>
> That threading mode (https://www.sqlite.org/threadsafe.html) are you
> using?  Are you sure that you are using the threading mode that you
> think you are using?  If you have not done so already, have you tried
> setting the threading mode to SERIALIZED to see if that clears your
> problem?

Is there a way to determine which threading mode is in use? I'm running the 
SQLite built-in to an IDE (Xojo) which uses SQLite 3.9.2. I couldn't see a 
pragma to use.

--
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption question

2016-07-20 Thread Richard Hipp
On 7/19/16, Doug Nebeker  wrote:
>  there is only one process using the database file (though multiple
> threads, but they each have their own database handle).

That threading mode (https://www.sqlite.org/threadsafe.html) are you
using?  Are you sure that you are using the threading mode that you
think you are using?  If you have not done so already, have you tried
setting the threading mode to SERIALIZED to see if that clears your
problem?

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption question

2016-07-20 Thread Simon Slavin

On 19 Jul 2016, at 5:43pm, Doug Nebeker  wrote:

> Would it be possible for a file scanning process (anti-virus, backup, etc) to 
> grab hold of a database file at just the right moment, momentarily blocking a 
> write or delete, and causing corruption?

It might prevent the program from accessing the file in a way that the program 
interprets as corruption.  It should not actually corrupt the file on disk.  In 
that if you later reopen the file and run

PRAGMA integrity_check

it should not find a problem.

Does that command find a problem with your file ?  Once you have noticed the 
problem are you restoring to an uncorrupt version before continuing use of the 
file ?

> There is a greater chance that this is my bug or environmental than being in 
> SQLite, so I'm looking for anything I can do to decrease these occurrences.

Are you using any PRAGMAs after you open the file or special modes in your open 
command ?

Do you check the result codes returned by all commands run on the file to make 
sure they're SQLITE_OK ?  The command which notices that the file is corrupt is 
run sometime /after/ the command which corrupts it.  It's probably not the 
command which causes the corruption.

When you make changes to the file do you use _exec() or 
_prepare,_step,_finalize ?

Is the database short enough to allow you to run 'PRAGMA integrity_check' 
occasionally (when your app starts or quits ?  At 3am every morning ?) and 
check the result ?

Do not forget that your mention of 'environmental' includes the possibility of 
a faulty hard disk.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Database corruption question

2016-07-20 Thread Doug Nebeker
Been using SQLite for a long time and a huge fan.  We occasionally see database 
corruption on a local NTFS Windows drive and I've been trying to figure it out.

I finally have some logs from the SQLITE_CONFIG_LOG callback that may be of 
help:

(11) database corruption at line 78267 of [3d862f207e]
statement aborts at 29: [DELETE FROM StatData WHERE StatID IN (SELECT StatID 
FROM Statistic WHERE OwnerType IN (4) AND 
OwningComputer='f7ab745b-1aa7-4159-bbf7-b0b4d1262804') AND Date < 1350715577;]

This is for version 3.11, and the line is for a OP_NotExists case.
There was no power outage or other OS problem, this was not a restored database 
but had been in use for at least 12 hours.  It was not using WAL, and there is 
only one process using the database file (though multiple threads, but they 
each have their own database handle).

I've been through https://www.sqlite.org/howtocorrupt.html a number of times 
over the years :)

Would it be possible for a file scanning process (anti-virus, backup, etc) to 
grab hold of a database file at just the right moment, momentarily blocking a 
write or delete, and causing corruption?

There is a greater chance that this is my bug or environmental than being in 
SQLite, so I'm looking for anything I can do to decrease these occurrences.

Thanks for any input.

Doug
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Database corruption using zipvfs on android 64 bits (x86_64)

2015-11-07 Thread Dan Kennedy
On 11/06/2015 11:57 PM, Alexandre Mainville wrote:
> Hi,
>
> I am experiencing database corruption using zipvfs on android 64
> bits (x86_64).
> The same program executing on 32 bits runs without problems.
>
> The program creates a new database on every run and does a series of
> inserts and updates (always the same series of operations).
>
> Both programs produce the same SQLITE_DEBUG trace up until the point the 64
> bit version outputs:
> zipvfs database is corrupt. Line 1100 of [9d6c1880fb]
> because of a zero size payload in a slot header.
>
> I am waiting on a device to test behavior with arm64-v8a.
>
> I am using sqlite 3.8.8.3 and zipvfs 3.8.8.
>
> If anyone has experienced the same or has some insight on this, it would be
> greatly appreciated.


Are you able to run this on a workstation? If so, try running the 
program under valgrind ( http://valgrind.org ).

One way to debug this might be to run the zipvfs integrity-check (i.e. 
ZIPVFS_CTRL_INTEGRITY_CHECK) after each transaction. If you can figure 
out which transaction is introducing the corruption and take a copy of 
the database before and after it is executed, we might be able to figure 
out what is going on based on that and the log below.

Another would be to capture the statements executed using an 
sqlite3_trace() callback, then try to use them to reproduce the 
corruption using the shell tool. Even if that doesn't produce the 
corruption, it will be worth running such a script under valgrind to 
search for uninitialized variable problems etc.

I can't see any bugfixes in the changelog since 3.8.8 to do with corrupt 
databases, but it might be worth upgrading to the latest zipvfs and 
sqlite to check if the problem is still present anyway.

Regards,
Dan.








[sqlite] Database corruption using zipvfs on android 64 bits (x86_64)

2015-11-06 Thread Dominique Pellé
Alexandre Mainville wrote:

> Hi,
>
> I am experiencing database corruption using zipvfs on android 64
> bits (x86_64).
> The same program executing on 32 bits runs without problems.
>
> The program creates a new database on every run and does a series of
> inserts and updates (always the same series of operations).
>
> Both programs produce the same SQLITE_DEBUG trace up until the point the 64
> bit version outputs:
> zipvfs database is corrupt. Line 1100 of [9d6c1880fb]
> because of a zero size payload in a slot header.
>
> I am waiting on a device to test behavior with arm64-v8a.
>
> I am using sqlite 3.8.8.3 and zipvfs 3.8.8.
>
> If anyone has experienced the same or has some insight on this, it would be
> greatly appreciated.
>
> Thanks
>
> Alex

I'm not sure whether it's the same problem, but on our
project we had memory corruptions on Android with zipvfs.
The bug happened only on Android. Valgrind and asan on
Linux was showing no bug.

The bug turned out not to be in SQLite, but in zlib.
Android was using zlib-1.2.7. After using zlib-1.2.8
instead, corruption did not happen anymore.

Reading the changes in zlib-1.2.8 at http://www.zlib.net I saw:

"Version 1.2.8 fixes a very rare bug in decompression.
All users are encouraged to upgrade immediately."

Can you check whether you use zlib-1.2.7 or zlib-1.2.8?

Dominique


[sqlite] Database corruption using zipvfs on android 64 bits (x86_64)

2015-11-06 Thread Alexandre Mainville
Hi,

I am experiencing database corruption using zipvfs on android 64
bits (x86_64).
The same program executing on 32 bits runs without problems.

The program creates a new database on every run and does a series of
inserts and updates (always the same series of operations).

Both programs produce the same SQLITE_DEBUG trace up until the point the 64
bit version outputs:
zipvfs database is corrupt. Line 1100 of [9d6c1880fb]
because of a zero size payload in a slot header.

I am waiting on a device to test behavior with arm64-v8a.

I am using sqlite 3.8.8.3 and zipvfs 3.8.8.

If anyone has experienced the same or has some insight on this, it would be
greatly appreciated.

Thanks

Alex

Here is the trace:

write page[1] to (456,127,0)
write page[2] to (589,20,17)
read page[1] from (24,127)
read page[1] from (24,127)
write page[1] to (632,126,0)
write page[2] to (764,33,4)
read page[1] from (24,126)
read page[1] from (24,126)
write page[1] to (807,509,0)
write page[3] to (1322,20,17)
write page[4] to (1365,20,17)
write page[5] to (1408,20,17)
write page[6] to (1451,20,17)
write page[7] to (1494,390,0)
write page[8] to (1890,383,0)
write page[9] to (2279,20,17)
write page[10] to (2322,20,17)
write page[11] to (2365,20,17)
write page[12] to (2408,20,17)
write page[13] to (2451,319,0)
write page[14] to (2776,20,17)
write page[15] to (2819,20,17)
write page[16] to (2862,20,17)
write page[17] to (2905,20,17)
write page[18] to (2948,379,0)
write page[19] to (,20,17)
write page[20] to (3376,20,17)
write page[21] to (3419,20,17)
write page[22] to (3462,20,17)
write page[23] to (3505,410,0)
write page[24] to (3921,20,17)
write page[25] to (3964,412,0)
write page[26] to (4382,20,17)
write page[27] to (4425,281,0)
write page[28] to (4712,20,17)
write page[29] to (4755,20,17)
write page[30] to (4798,384,0)
write page[31] to (5188,153,0)
write page[32] to (5347,20,17)
write page[33] to (5390,20,17)
write page[34] to (5433,20,17)
write page[35] to (5476,20,17)
write page[36] to (5519,427,0)
write page[37] to (5952,20,17)
write page[38] to (5995,20,17)
write page[39] to (6038,20,17)
write page[40] to (6081,20,17)
write page[41] to (6124,20,17)
write page[42] to (6167,363,0)
write page[43] to (6536,20,17)
write page[44] to (6579,20,17)
write page[45] to (6622,399,0)
write page[46] to (7027,20,17)
write page[47] to (7070,20,17)
write page[48] to (7113,20,17)
write page[49] to (7156,20,17)
write page[50] to (7199,370,0)
write page[51] to (7575,20,17)
write page[52] to (7618,20,17)
write page[53] to (7661,20,17)
write page[54] to (7704,385,0)
write page[55] to (8095,20,17)
write page[56] to (8138,20,17)
write page[57] to (8181,20,17)
write page[58] to (8224,20,17)
write page[59] to (8267,416,0)
write page[60] to (8689,20,17)
write page[61] to (8732,20,17)
write page[62] to (8775,20,17)
write page[63] to (8818,355,0)
write page[64] to (9179,20,17)
write page[65] to (9222,20,17)
write page[66] to (9265,20,17)
write page[67] to (9308,412,0)
write page[68] to (9726,20,17)
write page[69] to (9769,20,17)
write page[70] to (9812,432,0)
moving page[2] from (764,33,4) to (10250,33,0)
write page[71] to (10293,20,17)
write page[72] to (10336,20,17)
write page[73] to (10379,20,17)
write page[74] to (10422,20,17)
write page[75] to (10465,442,0)
moving page[1] from (807,509,0) to (10913,509,0)
write page[76] to (11428,20,17)
write page[77] to (11471,20,17)
write page[78] to (11514,275,0)
write page[79] to (11795,20,17)
write page[80] to (11838,20,17)
write page[81] to (11881,20,17)
read page[1] from (24,509)
read page[1] from (24,509)
read page[1] from (24,509)
write page[1] to (10913,509,0)
write page[11] to (11924,199,0)
write page[82] to (12129,806,0)
write page[83] to (12941,803,0)
write page[84] to (13750,804,0)
write page[85] to (14560,804,0)
read page[1] from (24,509)
read page[1] from (24,509)
read page[1] from (24,509)
write page[1] to (10913,509,0)
write page[11] to (15370,202,0)
write page[86] to (15578,804,0)
write page[87] to (16388,806,0)
write page[88] to (17200,808,0)
write page[89] to (18014,807,0)
write page[90] to (11924,199,0)
write page[91] to (18827,710,0)
read page[1] from (24,509)
read page[1] from (24,509)
read page[1] from (24,509)
write page[1] to (10913,509,0)
write page[11] to (19543,203,0)
write page[92] to (19752,808,0)
write page[93] to (20566,810,0)
write page[94] to (21382,808,0)
write page[95] to (22196,808,0)
write page[96] to (23010,633,0)
read page[1] from (24,509)
read page[1] from (24,509)
read page[1] from (24,509)
write page[1] to (10913,509,0)
write page[96] to (23649,751,0)
write page[97] to (24406,804,0)
write page[98] to (25216,808,0)
write page[99] to (26030,806,0)
write page[100] to (26842,805,0)
write page[101] to (27653,808,0)
read page[1] from (24,509)
read page[1] from (24,509)
read page[1] from (24,509)
write page[1] to (28467,510,0)
write page[11] to (19543,203,0)
write page[102] to (28983,806,0)
write page[103] to (29795,804,0)
write page[104] to (30605,807,0)
write page[105] to 

Re: [sqlite] Database corruption issue

2014-04-14 Thread Grzegorz Sikorski

Hi,

Actually we had data=writeback originally and problem also appeared. Our 
system has to be extremely stable and power loss can not cause any data 
damage at any time. We may deal with some latest data loss in such case, 
but no corruption is allowed. Thus, we decided to put journaling 
filesystem (ext4). Then we were searching for the best mount option to 
make sure we do not corrupt anything. We do not mind to switch the 
option to data=writeback (or even to different FS, maybe ext3?) if you 
suggest it is better tested, but we need to be sure that it would not 
cause any (more;) damage on power loss.


According to patches, I only recently applied 
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e44718318004a5618d1dfe2d080e2862532d8e5f, 
since I found it. Can you (or anyone) suggest any other patches critical 
to the stability? I will add the one you mentioned as well.


Thanks,
Greg

On 13/04/14 15:01, Florian Weimer wrote:

* Grzegorz Sikorski:


We do fsck on the startup. Occasionally, there are some errors, so we
decided to do 'fsck -p' to fix them before mounting the
filesystem. Here is how we then mount the actual filesystem:
/dev/mmcblk0p2 on /media/DATA type ext4
(rw,relatime,barrier=1,journal_checksum,nodelalloc,data=journal,usrquota)

I am not an expert in ext4, so I am not sure it is the best way to
mount, but as far as I read on some forums, this is the safest (and
slowest!) option.

It is, but it also less tested than data=writeback.  Have you
backported any ext4 fixes (such as commit 2d859db3e4) to your kernel?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



--
Asylia ExchangeDefender Message Security: Click below to verify authenticity
http://www.exchangedefender.com/verify.asp?id=s3E9gj19024173=g.sikor...@camlintechnologies.com


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption issue

2014-04-13 Thread Florian Weimer
* Grzegorz Sikorski:

> We do fsck on the startup. Occasionally, there are some errors, so we
> decided to do 'fsck -p' to fix them before mounting the
> filesystem. Here is how we then mount the actual filesystem:
> /dev/mmcblk0p2 on /media/DATA type ext4
> (rw,relatime,barrier=1,journal_checksum,nodelalloc,data=journal,usrquota)
>
> I am not an expert in ext4, so I am not sure it is the best way to
> mount, but as far as I read on some forums, this is the safest (and
> slowest!) option.

It is, but it also less tested than data=writeback.  Have you
backported any ext4 fixes (such as commit 2d859db3e4) to your kernel?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption issue

2014-04-11 Thread Grzegorz Sikorski

Hi,

Thanks you for your response, that is really helpful. Actually, I will 
do completely the other way, so setup my database with frequent 
checkpoints and process exiting occasionally, but for test purposes to 
track the issue down. At the moment the problem is hardly reproducible 
(one occurrence per hundreds resets), but your advise should give me 
more reproducible scenario. I suspect some synchronization issue on very 
low lever driver, involving interrupts from the SD controller. If anyone 
have another idea how to improve this test and make the corruption more 
possible, that would be great.


Regards,
Greg

On 11/04/14 12:22, Richard Hipp wrote:

On Fri, Apr 11, 2014 at 6:55 AM, Grzegorz Sikorski
wrote:


Hi Richard,

We do fsck on the startup. Occasionally, there are some errors, so we
decided to do 'fsck -p' to fix them before mounting the filesystem. Here is
how we then mount the actual filesystem:
/dev/mmcblk0p2 on /media/DATA type ext4 (rw,relatime,barrier=1,
journal_checksum,nodelalloc,data=journal,usrquota)

I am not an expert in ext4, so I am not sure it is the best way to mount,
but as far as I read on some forums, this is the safest (and slowest!)
option.



If EXT4 is also corrupting, that is evidence to support my theory that the
SD card is not synching correctly and is performing writes out of order.
And if that is the case, then there is nothing you can do to prevent
corruption on power-loss, to SQLite or any other database.  All you can do
is try to minimize your exposure.

(1)  Using WAL mode is good.  But the usage pattern makes a difference
too.  Please make sure you have at least one process that opens the
database (in WAL mode) and keeps it open all the time.  That processes
doesn't have to do anything with the database other than hold it open.
SQLite deletes the WAL file when the last connection closes.  By having one
process hold the database open, that prevents the WAL file from being
deleted, which reduces the risk of corruption.

(2)  Make sure you are not checkpointing too frequently.  Remember that the
power-loss corruption vulnerability only occurs during checkpoint.  You
have to checkpoint *sometime*.  Just try to make sure it isn't too often,
so that the risk of taking a power loss during a checkpoint is reduced.

(3) Make periodic backups of the database file using the SQLite backup
API.  Then on startup check the database for corruption (using PRAGMA
integrity_check or PRAGMA quick_check) and move to the backup if corruption
is seen.





--
Asylia ExchangeDefender Message Security: Click below to verify authenticity
http://www.exchangedefender.com/verify.asp?id=s3BCvUaq031855=g.sikor...@camlintechnologies.com


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption issue

2014-04-11 Thread Stephan Beal
On Fri, Apr 11, 2014 at 1:38 PM, Simon Slavin  wrote:

> I seem to recall that the sqlite3_open() call dos not really open the
> database.  The open actually happens when the data is
>
first accessed.  So to do the above "one process that opens the
> database" does one need to do some kind of access (presumably a harmless
> SELECT) ?
>

There was recently a post to this effect, where Richard said:


On Mon, Sep 2, 2013 at 10:52 PM, Richard Hipp  wrote:

> sqlite3_open() and sqlite3_open_v2() defer the actual opening of the
> database file until you do something with the database.  This is to give
> you the opportunity to issue PRAGMA statements that might influence the
> opening process.
>
> Running a very simple statement like
>
>  SELECT 1 FROM sqlite_master LIMIT 1;
>


-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do." -- Bigby Wolf
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption issue

2014-04-11 Thread Simon Slavin

On 11 Apr 2014, at 11:49am, Grzegorz Sikorski  wrote:

> No, the journal files (-wal/-shm) are not modified at all. The only think 
> that may be done (but rarely) is changing its privileges to rw-rw-r--. We 
> need to be able to open database in read-only mode by group and as far as I 
> know there is no other way to change privileges in SQLite3.

I'm trying to remember whether this works the way you'd expect it to.  I don't 
remember how SQLite reacts to having read-only access to a journal file for a 
database in WAL mode.  I hope an expert will post saying whether this works the 
way one might expect.

Apart from that, that's the only thing in your answer that suggests a problem.  
The rest of what you're doing should be okay.  The most important thing you 
wrote Dr Hipp already commented on: the fact that you occasionally have to do 
'fsck -p' suggests that your corruption is at the storage-device level and not 
caused by anything in SQLite.

On 11 Apr 2014, at 12:22pm, Richard Hipp  wrote:

> (1)  Using WAL mode is good.  But the usage pattern makes a difference
> too.  Please make sure you have at least one process that opens the
> database (in WAL mode) and keeps it open all the time.  That processes
> doesn't have to do anything with the database other than hold it open.
> SQLite deletes the WAL file when the last connection closes.  By having one
> process hold the database open, that prevents the WAL file from being
> deleted, which reduces the risk of corruption.

I seem to recall that the sqlite3_open() call dos not really open the database. 
 The open actually happens when the data is first accessed.  So to do the above 
"one process that opens the
database" does one need to do some kind of access (presumably a harmless 
SELECT) ?

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption issue

2014-04-11 Thread Richard Hipp
On Fri, Apr 11, 2014 at 6:55 AM, Grzegorz Sikorski
wrote:

> Hi Richard,
>
> We do fsck on the startup. Occasionally, there are some errors, so we
> decided to do 'fsck -p' to fix them before mounting the filesystem. Here is
> how we then mount the actual filesystem:
> /dev/mmcblk0p2 on /media/DATA type ext4 (rw,relatime,barrier=1,
> journal_checksum,nodelalloc,data=journal,usrquota)
>
> I am not an expert in ext4, so I am not sure it is the best way to mount,
> but as far as I read on some forums, this is the safest (and slowest!)
> option.
>


If EXT4 is also corrupting, that is evidence to support my theory that the
SD card is not synching correctly and is performing writes out of order.
And if that is the case, then there is nothing you can do to prevent
corruption on power-loss, to SQLite or any other database.  All you can do
is try to minimize your exposure.

(1)  Using WAL mode is good.  But the usage pattern makes a difference
too.  Please make sure you have at least one process that opens the
database (in WAL mode) and keeps it open all the time.  That processes
doesn't have to do anything with the database other than hold it open.
SQLite deletes the WAL file when the last connection closes.  By having one
process hold the database open, that prevents the WAL file from being
deleted, which reduces the risk of corruption.

(2)  Make sure you are not checkpointing too frequently.  Remember that the
power-loss corruption vulnerability only occurs during checkpoint.  You
have to checkpoint *sometime*.  Just try to make sure it isn't too often,
so that the risk of taking a power loss during a checkpoint is reduced.

(3) Make periodic backups of the database file using the SQLite backup
API.  Then on startup check the database for corruption (using PRAGMA
integrity_check or PRAGMA quick_check) and move to the backup if corruption
is seen.


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption issue

2014-04-11 Thread Grzegorz Sikorski

Hi Richard,

We do fsck on the startup. Occasionally, there are some errors, so we 
decided to do 'fsck -p' to fix them before mounting the filesystem. Here 
is how we then mount the actual filesystem:
/dev/mmcblk0p2 on /media/DATA type ext4 
(rw,relatime,barrier=1,journal_checksum,nodelalloc,data=journal,usrquota)


I am not an expert in ext4, so I am not sure it is the best way to 
mount, but as far as I read on some forums, this is the safest (and 
slowest!) option.


Best regards,
Greg


On 10/04/14 18:48, Richard Hipp wrote:

On Thu, Apr 10, 2014 at 12:04 PM, Grzegorz Sikorski 

Re: [sqlite] Database corruption issue

2014-04-11 Thread Grzegorz Sikorski

Hi Simon,

See my comments below.

Thank you for your time,
Greg

On 10/04/14 18:25, Simon Slavin wrote:

On 10 Apr 2014, at 5:04pm, Grzegorz Sikorski  wrote:


The problem is we occasionally observe database corruption and whole database 
becomes rubbish. The problem is very rare and occurs only (as far as we were 
able to confirm so far) if there is a power lose during write to the database.

After power is lost, presumably you then reboot the device and restart the app. 
 Does your rebooting procedure, or anything else done before the database is 
reopened, mess with the database or journal file in any way ?  Especially, does 
it rename, move or delete the journal file ?
No, the journal files (-wal/-shm) are not modified at all. The only 
think that may be done (but rarely) is changing its privileges to 
rw-rw-r--. We need to be able to open database in read-only mode by 
group and as far as I know there is no other way to change privileges in 
SQLite3. Actually, I think this is kind of missing API/bug, but minor 
and I did not report it so far. SQLite3 should at least respect 
process's umask when it creates -shm/-wal files (during openning DB in 
read-only or read-write mode), but this is not the case, unfortunately 
(at least it did not work for me). But as far as I know filesystem, this 
should not hurt the files itself.


Does you app use any PRAGMAs when it opens the database file (apart from you 
having put it in WAL mode once upon a time) ?

Yes. For readers I do:
PRAGMA cache_size=8192;
and for writer (just in case the database is empty):
PRAGMA journal_mode=wal;


When the database is corrupted, I assume that the file length remains the same 
(or at least is plausible for an uncorrupted database) but that some of the 
file is overwritten with bytes which shouldn't be there.  Do these bytes look 
like data which could be from elsewhere in the database file, or elsewhere on 
the storage medium, or something which might have been in memory, or complete 
gibberish ?
Unfortunately, I am not really familiar with internal structures of 
SQLite3 DB. If it helps, here is an example output from 'pragma 
integrity_check;' command:

sqlite> pragma integrity_check;
*** in database main ***
On tree page 9173 cell 3: Rowid 13335 out of order (previous was 
1397130453176)
On tree page 9173 cell 30: Rowid 1397130684092 out of order (previous 
was 1397667554004)
On tree page 9173 cell 36: Rowid 1397130694597 out of order (previous 
was 1534569645968)
On tree page 9173 cell 39: Rowid 1397130866645 out of order (previous 
was 1397130961910)
On tree page 9173 cell 40: Rowid 1397130803109 out of order (previous 
was 1397130866645)

On tree page 9173 cell 45: Rowid 3 out of order (previous was 1397130807113)
On tree page 9173 cell 50: Rowid 3 out of order (previous was 1397130906159)
On tree page 9173 cell 53: Rowid 1397130914407 out of order (previous 
was 178832756909971)

On tree page 9173 cell 60: Rowid 0 out of order (previous was 1397130922549)
On tree page 9173 cell 67: Rowid 1397130948151 out of order (previous 
was 1397198039630)
On tree page 9173 cell 68: Rowid 1397130932767 out of order (previous 
was 1397130948151)
On tree page 9173 cell 70: Rowid 1397130935768 out of order (previous 
was 1397399370224)
On tree page 9173 cell 73: Rowid 1397130939282 out of order (previous 
was 3596154193322)
On tree page 9173 cell 77: Rowid 3 out of order (min less than parent 
min of 1397130450131)

Corruption detected in cell 0 on page 9173
Corruption detected in cell 3 on page 9173
Corruption detected in cell 26 on page 9173
Corruption detected in cell 50 on page 9173
Corruption detected in cell 67 on page 9173
Corruption detected in cell 69 on page 9173
Corruption detected in cell 76 on page 9173
Multiple uses for byte 265 of page 9173
Fragmentation of 11 bytes reported as 0 on page 9173
Corruption detected in cell 0 on page 9176
Fragmentation of 965 bytes reported as 0 on page 9176

This is an example log, there is many other scenarios, which I can share 
on request.


If you can't answer some of the above, don't worry.  It might help, might not.


We admit there may be a problem with SD drivers, POSIX mutex implementation or ext4 
driver (we are running relatively old version of Linux kernel), however we have never 
observed corruption of any other file on the same partition, even though this partition 
is heavily used for other purposes (syslog, all temporary data, configuration, some 
exported data files in zip format). We also use POSIX mutexes in other programs and have 
never discovered any issue (apart from invalid use in developer's code;) Does anyone know 
if there is any "unstable/unknown" part of SQLite3 code which may cause this 
issue?

Thank you for the details which has saved a lot of laborious back-and-forth.  
I'm not aware of any issues involving corruption as you describe, though 
someone else who reads this list may be.  Or may be able to solve your problem 
another 

Re: [sqlite] Database corruption issue

2014-04-10 Thread Richard Hipp
On Thu, Apr 10, 2014 at 12:04 PM, Grzegorz Sikorski  wrote:

> Hi,
>
> We developed an application which base on SQLite3. It is running on ARM
> processor with Linux 2.6.37 (no, there is no easy way to upgrade it). We
> put our database on microSD card (industrial grade, very reliable)
> formatted with ext4 partition. The database is used in WAL mode with
> default settings. Database schema is relatively simple: two tables, from
> which one stores only point names and the other is heavily used to store
> incoming data points. To speed up searching the data table, we setup R*tree
> indexes separately for each name (let say around 200 R*tree indexes). The
> problem is we occasionally observe database corruption and whole database
> becomes rubbish. The problem is very rare and occurs only (as far as we
> were able to confirm so far) if there is a power lose during write to the
> database. We have chosen WAL mode to allow read access to database during
> writing and also for performance reasons (database is on SD card and we run
> on embedded system). It is not yet confirmed for sure, but most likely the
> problem occurs only if reader and writer access the database at the same
> time and power lose occurs.
>
> We admit there may be a problem with SD drivers, POSIX mutex
> implementation or ext4 driver (we are running relatively old version of
> Linux kernel), however we have never observed corruption of any other file
> on the same partition, even though this partition is heavily used for other
> purposes (syslog, all temporary data, configuration, some exported data
> files in zip format). We also use POSIX mutexes in other programs and have
> never discovered any issue (apart from invalid use in developer's code;)
> Does anyone know if there is any "unstable/unknown" part of SQLite3 code
> which may cause this issue? We recently upgraded SQLite to the latest
> version, but it did not help. I am aware we use sort of new (maybe edge
> case) features, like WAL mode and R*tree indexes. Any advice will be
> appreciated.
>

There are no unstable/unknown parts of the code.  SQLite3 is used daily by
billions, literally.

For further information on how to corrupt an SQLite database see
http://www.sqlite.org/howtocorrupt.html

My guess is that the cause of corruption is that your hardware/os
combination is ignoring fsync()s during a checkpoint operation, causing
writes to occur out-of-order.   During a checkpoint operation, SQLite
copies content from the WAL file into the main database.  Then it
fsync()s.  Then it truncates and overwrites the header of the WAL file.
Now suppose writes are occurring out-of-order and there is a power loss.
Some of the content that overwrites the main database makes it onto the SD
card, but some does not, resulting in hopelessly scrambed main database.
But the truncation and overwrite of the WAL file does occur, so that the
checkpoint cannot be repeated to clear the error.  You are left with a
corrupted database file and no way to recover it.  fsync() is very
important.

Log files and ZIP archives do not commonly go corrupt on a power loss
because they are append-only.  They are not written randomly.

One other thing:  Are you sure that the power loss didn't corrupt the
filesystem?  Did you run fsck?


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption issue

2014-04-10 Thread Simon Slavin

On 10 Apr 2014, at 5:04pm, Grzegorz Sikorski  wrote:

> The problem is we occasionally observe database corruption and whole database 
> becomes rubbish. The problem is very rare and occurs only (as far as we were 
> able to confirm so far) if there is a power lose during write to the database.

After power is lost, presumably you then reboot the device and restart the app. 
 Does your rebooting procedure, or anything else done before the database is 
reopened, mess with the database or journal file in any way ?  Especially, does 
it rename, move or delete the journal file ?

Does you app use any PRAGMAs when it opens the database file (apart from you 
having put it in WAL mode once upon a time) ?

When the database is corrupted, I assume that the file length remains the same 
(or at least is plausible for an uncorrupted database) but that some of the 
file is overwritten with bytes which shouldn't be there.  Do these bytes look 
like data which could be from elsewhere in the database file, or elsewhere on 
the storage medium, or something which might have been in memory, or complete 
gibberish ?

If you can't answer some of the above, don't worry.  It might help, might not.

> We admit there may be a problem with SD drivers, POSIX mutex implementation 
> or ext4 driver (we are running relatively old version of Linux kernel), 
> however we have never observed corruption of any other file on the same 
> partition, even though this partition is heavily used for other purposes 
> (syslog, all temporary data, configuration, some exported data files in zip 
> format). We also use POSIX mutexes in other programs and have never 
> discovered any issue (apart from invalid use in developer's code;) Does 
> anyone know if there is any "unstable/unknown" part of SQLite3 code which may 
> cause this issue?

Thank you for the details which has saved a lot of laborious back-and-forth.  
I'm not aware of any issues involving corruption as you describe, though 
someone else who reads this list may be.  Or may be able to solve your problem 
another way.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Database corruption issue

2014-04-10 Thread Grzegorz Sikorski

Hi,

We developed an application which base on SQLite3. It is running on ARM 
processor with Linux 2.6.37 (no, there is no easy way to upgrade it). We 
put our database on microSD card (industrial grade, very reliable) 
formatted with ext4 partition. The database is used in WAL mode with 
default settings. Database schema is relatively simple: two tables, from 
which one stores only point names and the other is heavily used to store 
incoming data points. To speed up searching the data table, we setup 
R*tree indexes separately for each name (let say around 200 R*tree 
indexes). The problem is we occasionally observe database corruption and 
whole database becomes rubbish. The problem is very rare and occurs only 
(as far as we were able to confirm so far) if there is a power lose 
during write to the database. We have chosen WAL mode to allow read 
access to database during writing and also for performance reasons 
(database is on SD card and we run on embedded system). It is not yet 
confirmed for sure, but most likely the problem occurs only if reader 
and writer access the database at the same time and power lose occurs.


We admit there may be a problem with SD drivers, POSIX mutex 
implementation or ext4 driver (we are running relatively old version of 
Linux kernel), however we have never observed corruption of any other 
file on the same partition, even though this partition is heavily used 
for other purposes (syslog, all temporary data, configuration, some 
exported data files in zip format). We also use POSIX mutexes in other 
programs and have never discovered any issue (apart from invalid use in 
developer's code;) Does anyone know if there is any "unstable/unknown" 
part of SQLite3 code which may cause this issue? We recently upgraded 
SQLite to the latest version, but it did not help. I am aware we use 
sort of new (maybe edge case) features, like WAL mode and R*tree 
indexes. Any advice will be appreciated.


Best regards,
Greg

--
Asylia ExchangeDefender Message Security: Click below to verify authenticity
http://www.exchangedefender.com/verify.asp?id=s3AG4bIR004615=g.sikor...@camlintechnologies.com


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database Corruption - Table data being overwritten

2012-06-06 Thread Black, Michael (IS)
That sounds like it might a corrupt index since the data doesn't show up in a 
dump.



Does your select work if you drop the index?



And have you done a "pragma integrity_check" ?





Michael D. Black

Senior Scientist

Advanced Analytics Directorate

Advanced GEOINT Solutions Operating Unit

Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of herb...@gdls.com [herb...@gdls.com]
Sent: Tuesday, June 05, 2012 1:38 PM
To: sqlite-users@sqlite.org
Subject: EXT :[sqlite] Database Corruption - Table data being overwritten

Hello,

I have run into a problem using SQLite in an embedded environment.

Currently using version 3.7.5.  (I know, I want to update, just have been
locked in at this version for the time being)

The power is routinely cut from the system, so I have JOURNAL_MODE =
PERSIST because there were issues with the -journal file not being
completely cleaned up with a power loss when the mode was left defaulted
to DELETE.

Everything works fine for a while, but then something will happen that
corrupts the database.  Rows from one table seem to be overwriting valid
Rows in another table.  No UPDATEs or DELETEs are ever performed on this
database, so no data should ever be over-written.

Even more bizarre, a .dump of "table1" will show:
  INSERT INTO "table1" VALUES(28274, 6, 10308);

But this query returns nothing:
  SELECT * FROM table1 WHERE table1_id=28274;

The invalid table1 data (28274, 6, 10308) seems to be a direct copy of
valid data from table2.  The .dump of table2 has:
  INSERT INTO "table2" VALUES(28274, 6, 10308, ' 277');

"28274" doesn't even make sense as a valid table1_id (which is never
manipulated and set to auto-increment).

"SELECT * FROM sqlite_sequence WHERE name='table1'" gives:
   table1|11985

So, "28274" is completely invalid.

What could cause such corruption?  A journal file problem?  Is there a
better option than "PERSIST" for systems that like to have the power
pulled?

Thanks for any help!

Patrick Herbst.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database Corruption - Table data being overwritten

2012-06-06 Thread Simon Slavin

On 6 Jun 2012, at 1:03pm, herb...@gdls.com wrote:

> Even more bizarre, a .dump of "table1" will show:
>  INSERT INTO "table1" VALUES(28274, 6, 10308);
> 
> But this query returns nothing:
>  SELECT * FROM table1 WHERE table1_id=28274;

There should be (at least) two copies of that number in the data file.  One is 
in the copy of all rows.  The other is in the index SQLite makes up so it can 
check for keys.  I would guess that one copy is there and one isn't.

> The invalid table1 data (28274, 6, 10308) seems to be a direct copy of 
> valid data from table2.  The .dump of table2 has:
>  INSERT INTO "table2" VALUES(28274, 6, 10308, ' 277');
> 
> "28274" doesn't even make sense as a valid table1_id (which is never 
> manipulated and set to auto-increment).
> 
> "SELECT * FROM sqlite_sequence WHERE name='table1'" gives:
>   table1|11985
> 
> So, "28274" is completely invalid. 

Having now learned that your database is corrupt, please cure the corruption 
before continuing.  Pointless doing more tests when you already know it's 
corrupt.  Perhaps use .dump and .read or something.

> What could cause such corruption?  A journal file problem?

Well, SQLite /should/ recover from sudden loss of power properly but it won't 
do so if fsync() doesn't work on your platform, and it may not.  Read section 
3.1 in



and note that Flash memory implementations are especially bad at syncing.  Or 
continue and read the rest of that page.

> Is there a 
> better option than "PERSIST" for systems that like to have the power 
> pulled?

Would it be possible in your implementation to close the database after any 
significant changes ?  Perhaps close and reopen immediately after long runs of 
inserting.  Or close and reopen after each 1000 operations, or each minute, or 
something ?

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Database Corruption - Table data being overwritten

2012-06-06 Thread herbstp
Hello,

I have run into a problem using SQLite in an embedded environment.

Currently using version 3.7.5.  (I know, I want to update, just have been 
locked in at this version for the time being)

The power is routinely cut from the system, so I have JOURNAL_MODE = 
PERSIST because there were issues with the -journal file not being 
completely cleaned up with a power loss when the mode was left defaulted 
to DELETE.

Everything works fine for a while, but then something will happen that 
corrupts the database.  Rows from one table seem to be overwriting valid 
Rows in another table.  No UPDATEs or DELETEs are ever performed on this 
database, so no data should ever be over-written.

Even more bizarre, a .dump of "table1" will show:
  INSERT INTO "table1" VALUES(28274, 6, 10308);

But this query returns nothing:
  SELECT * FROM table1 WHERE table1_id=28274;

The invalid table1 data (28274, 6, 10308) seems to be a direct copy of 
valid data from table2.  The .dump of table2 has:
  INSERT INTO "table2" VALUES(28274, 6, 10308, ' 277');

"28274" doesn't even make sense as a valid table1_id (which is never 
manipulated and set to auto-increment).

"SELECT * FROM sqlite_sequence WHERE name='table1'" gives:
   table1|11985

So, "28274" is completely invalid. 

What could cause such corruption?  A journal file problem?  Is there a 
better option than "PERSIST" for systems that like to have the power 
pulled?

Thanks for any help!

Patrick Herbst.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database Corruption - Table data being overwritten

2012-06-06 Thread Richard Hipp
On Wed, Jun 6, 2012 at 8:03 AM,  wrote:

> Hello,
>
> I have run into a problem using SQLite in an embedded environment.
>
> Currently using version 3.7.5.  (I know, I want to update, just have been
> locked in at this version for the time being)
>
> The power is routinely cut from the system, so I have JOURNAL_MODE =
> PERSIST because there were issues with the -journal file not being
> completely cleaned up with a power loss when the mode was left defaulted
> to DELETE.
>

Have you read http://www.sqlite.org/howtocorrupt.html and the various other
pages that it links to?

Have you tried using WAL mode instead?  WAL mode tends to be more resistant
to issue with disk drives lying about having flushed their track cache to
oxide.


>
> Everything works fine for a while, but then something will happen that
> corrupts the database.  Rows from one table seem to be overwriting valid
> Rows in another table.  No UPDATEs or DELETEs are ever performed on this
> database, so no data should ever be over-written.
>
> Even more bizarre, a .dump of "table1" will show:
>  INSERT INTO "table1" VALUES(28274, 6, 10308);
>
> But this query returns nothing:
>  SELECT * FROM table1 WHERE table1_id=28274;
>
> The invalid table1 data (28274, 6, 10308) seems to be a direct copy of
> valid data from table2.  The .dump of table2 has:
>  INSERT INTO "table2" VALUES(28274, 6, 10308, ' 277');
>
> "28274" doesn't even make sense as a valid table1_id (which is never
> manipulated and set to auto-increment).
>
> "SELECT * FROM sqlite_sequence WHERE name='table1'" gives:
>   table1|11985
>
> So, "28274" is completely invalid.
>
> What could cause such corruption?  A journal file problem?  Is there a
> better option than "PERSIST" for systems that like to have the power
> pulled?
>
> Thanks for any help!
>
> Patrick Herbst.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Database Corruption - Table data being overwritten

2012-06-06 Thread herbstp
Hello,

I have run into a problem using SQLite in an embedded environment.

Currently using version 3.7.5.  (I know, I want to update, just have been 
locked in at this version for the time being)

The power is routinely cut from the system, so I have JOURNAL_MODE = 
PERSIST because there were issues with the -journal file not being 
completely cleaned up with a power loss when the mode was left defaulted 
to DELETE.

Everything works fine for a while, but then something will happen that 
corrupts the database.  Rows from one table seem to be overwriting valid 
Rows in another table.  No UPDATEs or DELETEs are ever performed on this 
database, so no data should ever be over-written.

Even more bizarre, a .dump of "table1" will show:
  INSERT INTO "table1" VALUES(28274, 6, 10308);

But this query returns nothing:
  SELECT * FROM table1 WHERE table1_id=28274;

The invalid table1 data (28274, 6, 10308) seems to be a direct copy of 
valid data from table2.  The .dump of table2 has:
  INSERT INTO "table2" VALUES(28274, 6, 10308, ' 277');

"28274" doesn't even make sense as a valid table1_id (which is never 
manipulated and set to auto-increment).

"SELECT * FROM sqlite_sequence WHERE name='table1'" gives:
   table1|11985

So, "28274" is completely invalid. 

What could cause such corruption?  A journal file problem?  Is there a 
better option than "PERSIST" for systems that like to have the power 
pulled?

Thanks for any help!

Patrick Herbst.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption with fts table after closing/reopening the database

2012-02-10 Thread Simon Slavin

On 10 Feb 2012, at 11:01am, Julien LF wrote:

>> Did you try surrounding them strings with single quotes, instead ?
> 
> I got the same problem using single quotes.
> However the machine I was performing those tests on is a vmware virtual 
> machine.
> Performing the same steps on physical hardware (identical os & sqlite
> version) worked fine, so I guess there is some inconsistency somewhere
> between NetBSD and vmware at the disk writing level.
> 
> Sorry for the noise.

Actually I'm a little concerned at that.  SQLite may be marvellous but it's not 
mystical.  Many other programs use similar patterns of disk reading and 
writing.  You seem to have found a serious fault in the virtual machine system.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption with fts table after closing/reopening the database

2012-02-10 Thread Julien LF
> Did you try surrounding them strings with single quotes, instead ?

I got the same problem using single quotes.
However the machine I was performing those tests on is a vmware virtual machine.
Performing the same steps on physical hardware (identical os & sqlite
version) worked fine, so I guess there is some inconsistency somewhere
between NetBSD and vmware at the disk writing level.

Sorry for the noise.

Julien
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption with fts table after closing/reopening the database

2012-02-10 Thread Kevin Benson
On Fri, Feb 10, 2012 at 5:20 AM, Julien LF  wrote:

> Hello,
>
> I'm testing the fts extensions and experiencing a 'database disk image
> is malformed' problem that I can reproduce the following way:
> - Create a database
> - Open the database to insert/update rows. 'match' queries work at this
> point
> - Close the database
> - Open the database. 'match' queries still work at this point
> - Update a row
> - 'match' queries generate a 'database disk image is malformed'
> message, but normal selects work fine.
>
> This is sqlite 3.7.7.1 on NetBSD 5.1. I tried both fts3 and fts4.
>
> You'll find the schema below, followed by the complete detailed steps
> to produce the error. Is there something I'm missing?
>
> Regards,
> Julien
>
> CREATE VIRTUAL TABLE page USING fts4 (
>url,
>title,
>content
> );
>
> vnetbsd$ sqlite3 wiki.db < ./schema.sql
> vnetbsd$ sqlite3 ./wiki.db
> SQLite version 3.7.7.1 2011-06-28 17:39:05
> Enter ".help" for instructions
> Enter SQL statements terminated with a ";"
> sqlite> insert into page(url,title,content)
> values("intro","intro","intro");
> sqlite> select * from page where url match 'intro';
> intro|intro|intro
> sqlite> update page set url="intro", title="intro", content="introm"
> where rowid=1;
> sqlite> select * from page where url match 'intro';
> intro|intro|introm
> sqlite> .q
> vnetbsd$ sqlite3 ./wiki.db
> SQLite version 3.7.7.1 2011-06-28 17:39:05
> Enter ".help" for instructions
> Enter SQL statements terminated with a ";"
> sqlite> select * from page where url match 'intro';
> intro|intro|introm
> sqlite> update page set url="intro", title="intro", content="intro"
> where rowid=1;
> sqlite> select * from page where url match 'intro';
> Error: database disk image is malformed
> sqlite> select * from page;
> intro|intro|intro
>
Did you try surrounding them strings with single quotes, instead ?
--
   --
  --
 --ô¿ô--
K e V i N
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Database corruption with fts table after closing/reopening the database

2012-02-10 Thread Julien LF
Hello,

I'm testing the fts extensions and experiencing a 'database disk image
is malformed' problem that I can reproduce the following way:
- Create a database
- Open the database to insert/update rows. 'match' queries work at this point
- Close the database
- Open the database. 'match' queries still work at this point
- Update a row
- 'match' queries generate a 'database disk image is malformed'
message, but normal selects work fine.

This is sqlite 3.7.7.1 on NetBSD 5.1. I tried both fts3 and fts4.

You'll find the schema below, followed by the complete detailed steps
to produce the error. Is there something I'm missing?

Regards,
Julien

CREATE VIRTUAL TABLE page USING fts4 (
url,
title,
content
);

vnetbsd$ sqlite3 wiki.db < ./schema.sql
vnetbsd$ sqlite3 ./wiki.db
SQLite version 3.7.7.1 2011-06-28 17:39:05
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> insert into page(url,title,content) values("intro","intro","intro");
sqlite> select * from page where url match 'intro';
intro|intro|intro
sqlite> update page set url="intro", title="intro", content="introm"
where rowid=1;
sqlite> select * from page where url match 'intro';
intro|intro|introm
sqlite> .q
vnetbsd$ sqlite3 ./wiki.db
SQLite version 3.7.7.1 2011-06-28 17:39:05
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select * from page where url match 'intro';
intro|intro|introm
sqlite> update page set url="intro", title="intro", content="intro"
where rowid=1;
sqlite> select * from page where url match 'intro';
Error: database disk image is malformed
sqlite> select * from page;
intro|intro|intro
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Help sqlite database corruption

2011-06-21 Thread Black, Michael (IS)
As I said before...welcome to DLL hell...



You guessed itthe original DLL will give an error if any of the DLLs it 
depends on are not loadable.



Since sqlite.interop.dll is non-standard you should include ALL DLLs that it 
depends on in your installation.  Subject to Microsoft's restrictions:

http://msdn.microsoft.com/en-us/library/8kche8ah.aspx



How are you making your installation distribution?



Michael D. Black

Senior Scientist

NG Information Systems

Advanced Analytics Directorate




From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Singh, Manpreet [manpreet.sing...@ca.com]
Sent: Tuesday, June 21, 2011 2:55 AM
To: sqlite-...@sqlite.org; sqlite-users@sqlite.org
Subject: EXT :Re: [sqlite] Help sqlite database corruption

Anyone with clues to the problem "malformed disk image"?



From: Singh, Manpreet
Sent: Tuesday, May 24, 2011 3:10 PM
To: sqlite-...@sqlite.org; d...@hwaci.com; sqlite-users@sqlite.org
Cc: Singh, Abhijeet; Singh, Satbeer; Prasanth, Neelapalem
Subject: RE: Help sqlite database corruption



Hello,



Resending with some more finding any help would be appreciated. One of
the tables in the database is getting corrupted.

1. Sqlite>select count(*) from $table_name

..

7981508|35001|284

7981509|35002|284

7981510|35003|284

7981511|35004|284

SQL error: database disk image is malformed

sqlite>

2. When integrity check is performed following is the error:

*** in database main ***

Page 703104: sqlite3BtreeInitPage() returns error code 11

On tree page 680368 cell 75: Child page depth differs

Page 703365: sqlite3BtreeInitPage() returns error code 11

Page 703624: sqlite3BtreeInitPage() returns error code 11

Page 703884: sqlite3BtreeInitPage() returns error code 11



Note:

1. $table_name has 2 unique indexes and 3 columns

2. Using  Sqlite version 3.5.6

3. We are able to .dump to sql file and get the db back



Thanks and Regards,

Manpreet Singh

CA Subject Matter Expert

SQL Certified (Oracle 9i)

MCP (SQL 2000 Administration)

MS Certified Technology Specialist -  SQL Server 2005



From: Singh, Manpreet
Sent: Wednesday, May 18, 2011 12:04 PM
To: 'sqlite-...@sqlite.org'
Subject: Help sqlite database corruption



Hello,



Looking for help, sqlite database getting corrupted multiple times we
need to use .dump to recover and then back to database.

As per our observations index gets corrupted for one of the tables
(table goes more than 7900k records) and when queried it give "malformed
disk image" for that database.



Thanks and Regards,

Manpreet Singh



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Help sqlite database corruption

2011-06-21 Thread Singh, Manpreet
Anyone with clues to the problem "malformed disk image"?

 

From: Singh, Manpreet 
Sent: Tuesday, May 24, 2011 3:10 PM
To: sqlite-...@sqlite.org; d...@hwaci.com; sqlite-users@sqlite.org
Cc: Singh, Abhijeet; Singh, Satbeer; Prasanth, Neelapalem
Subject: RE: Help sqlite database corruption

 

Hello,

 

Resending with some more finding any help would be appreciated. One of
the tables in the database is getting corrupted.

1. Sqlite>select count(*) from $table_name

..

7981508|35001|284

7981509|35002|284

7981510|35003|284

7981511|35004|284

SQL error: database disk image is malformed

sqlite>

2. When integrity check is performed following is the error:

*** in database main ***

Page 703104: sqlite3BtreeInitPage() returns error code 11

On tree page 680368 cell 75: Child page depth differs

Page 703365: sqlite3BtreeInitPage() returns error code 11

Page 703624: sqlite3BtreeInitPage() returns error code 11

Page 703884: sqlite3BtreeInitPage() returns error code 11

 

Note: 

1. $table_name has 2 unique indexes and 3 columns

2. Using  Sqlite version 3.5.6

3. We are able to .dump to sql file and get the db back

 

Thanks and Regards,

Manpreet Singh

CA Subject Matter Expert

SQL Certified (Oracle 9i)

MCP (SQL 2000 Administration)

MS Certified Technology Specialist -  SQL Server 2005

 

From: Singh, Manpreet 
Sent: Wednesday, May 18, 2011 12:04 PM
To: 'sqlite-...@sqlite.org'
Subject: Help sqlite database corruption

 

Hello,

 

Looking for help, sqlite database getting corrupted multiple times we
need to use .dump to recover and then back to database.

As per our observations index gets corrupted for one of the tables
(table goes more than 7900k records) and when queried it give "malformed
disk image" for that database.

 

Thanks and Regards,

Manpreet Singh

 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Help sqlite database corruption

2011-05-24 Thread Richard Hipp
On Tue, May 24, 2011 at 5:39 AM, Singh, Manpreet wrote:

> Hello,
>
>
>
> Resending with some more finding any help would be appreciated. One of
> the tables in the database is getting corrupted.
>

http://www.sqlite.org/howtocorrupt.html
http://www.sqlite.org/lockingv3.html#how_to_corrupt
http://www.sqlite.org/atomiccommit.html#sect_9_0

2. Using  Sqlite version 3.5.6
>

SQLite version 3.7.6.3 is faster and has fewer bugs.  Recommend that you
upgrade.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Help sqlite database corruption

2011-05-24 Thread Singh, Manpreet
Hello,

 

Resending with some more finding any help would be appreciated. One of
the tables in the database is getting corrupted.

1. Sqlite>select count(*) from $table_name

..

7981508|35001|284

7981509|35002|284

7981510|35003|284

7981511|35004|284

SQL error: database disk image is malformed

sqlite>

2. When integrity check is performed following is the error:

*** in database main ***

Page 703104: sqlite3BtreeInitPage() returns error code 11

On tree page 680368 cell 75: Child page depth differs

Page 703365: sqlite3BtreeInitPage() returns error code 11

Page 703624: sqlite3BtreeInitPage() returns error code 11

Page 703884: sqlite3BtreeInitPage() returns error code 11

 

Note: 

1. $table_name has 2 unique indexes and 3 columns

2. Using  Sqlite version 3.5.6

3. We are able to .dump to sql file and get the db back

 

Thanks and Regards,

Manpreet Singh

CA Subject Matter Expert

SQL Certified (Oracle 9i)

MCP (SQL 2000 Administration)

MS Certified Technology Specialist -  SQL Server 2005

 

From: Singh, Manpreet 
Sent: Wednesday, May 18, 2011 12:04 PM
To: 'sqlite-...@sqlite.org'
Subject: Help sqlite database corruption

 

Hello,

 

Looking for help, sqlite database getting corrupted multiple times we
need to use .dump to recover and then back to database.

As per our observations index gets corrupted for one of the tables
(table goes more than 7900k records) and when queried it give "malformed
disk image" for that database.

 

Thanks and Regards,

Manpreet Singh

 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SQLite database corruption

2010-07-20 Thread Grzegorz Russek
Hi,

We're experiencing strange database corruptions using SQLite database (we
are using it for over three years). We are reporting this now because one of
databases got corrupted on less sensitive data which we can give you
for analysis and because we wanted to reject all other possibilities of
error on our side, we also separated SQLite library from the rest of
application to exclude memory management problems. But when problem occurred in
our experimental project which was using .NET Compact Framework it was sure
that there is something not right.

First of all, we are using SQLite 3.6.23.1 on Pocket PC 2003 SE/Windows
Mobile 5.0 - 6.5 platform. SQLite is compiled using eVC 4.0 SP4. Described
situation occurs mostly on newer versions of OS, but it may be hardware
dependent, because we have few types of devices on which this problem did
not occur, but on most of available devices this problem happens, so this
problem can include other developers using Windows Mobile and SQlite.

We were unable to create tests that would trigger this issue, but it mostly
happens when we synchronize data between server and mobile device.
Synchronization downloads from server SQLite database with updated data and
copy that data using single query per table. If table is big, then we use
limit and offset to copy in chunks. Downloaded database has similar
structure and identical settings (encoding, page size, etc.) Query looks
like that:

REPLACE INTO "main"."ProductPriceList" (
"Product_Id",
"PriceList_Code",
"NettoVal",
"Deleted")
SELECT
"BigSync_TH_SYNC"."ProductPriceList"."Product_Id",
"BigSync_TH_SYNC"."ProductPriceList"."PriceList_Code",
"BigSync_TH_SYNC"."ProductPriceList"."NettoVal",
"BigSync_TH_SYNC"."ProductPriceList"."Deleted"
FROM "BigSync_TH_SYNC"."ProductPriceList" *LIMIT 16536 OFFSET 0*;

In this case we are positive that this corruption occurred while executing
this query, because at the moment only synchronization modifies that table.
We don't know whether limit offset was applied, because we don't know how
big change was in database. (If synchronization is successful we remove
incoming database). Database also did not report any error, until something
else referenced that table.

This issue mostly takes place on this query:

REPLACE INTO "main"."Clients" (
"Id_Client",
"Code",
"ShortName",
"Name",
"Street",
"No",
"City",
"Region_Id",
"Latitude",
"Longitude",
"GPSData",
"Modified",
"Deleted",
...)
SELECT
"BigSync_TH_SYNC"."Clients"."Id_Client",
"BigSync_TH_SYNC"."Clients"."Code",
"BigSync_TH_SYNC"."Clients"."ShortName",
"BigSync_TH_SYNC"."Clients"."Name",
"BigSync_TH_SYNC"."Clients"."Street",
"BigSync_TH_SYNC"."Clients"."No",
"BigSync_TH_SYNC"."Clients"."City",
"main"."Clients"."Region_Id",
"main"."Clients"."Latitude",
"main"."Clients"."Longitude",
"main"."Clients"."GPSData",
"main"."Clients"."Modified",
"BigSync_TH_SYNC"."Clients"."Deleted",
...
FROM "BigSync_TH_SYNC"."Clients"
LEFT OUTER JOIN "main"."Clients" ON
"main"."Clients"."Id_Client"="BigSync_TH_SYNC"."Clients"."Id_Client";

Database in attachment has corrupted index on table ProductPriceList. Data
in other tables was wiped, because those are real client
data. Fortunately this did not change output of SQLite analyzer  and
integrity check. (also in attachment).

Hope that you will be able to resolve our issue. I hope that information I
provided will be useful. If you have any questions please feel free to ask.

Cheers

PS. I'm also curious if there is any faster way to transfer data between
databases. On desktop this is not a problem, but on mobile if table Clients
has 5000 rows replace into takes from 1 to 6 minutes, depending on processor
and flash speed.
Analyzing table AbsenceReasons...
Analyzing table Absences...
Analyzing table Addresses...
Analyzing table Cache_Clients...
Analyzing table Cache_Group...
Analyzing table Cache_Product_Filter...
Analyzing table Cache_Promotions_Products...
Analyzing table ClientCategories...
Analyzing table ClientGroupDiscounts...
Analyzing table ClientGroups...
Analyzing table ClientHardware...
Analyzing table ClientLicences...
Analyzing table ClientPackages...
Analyzing table ClientPrices...
Analyzing table ClientProductDiscounts...
Analyzing table ClientPromotions...
Analyzing table ClientSpecialPrices...
Analyzing table ClientStandards...
Analyzing table Client_Tasks...
Analyzing table Clients...
Analyzing table ClientsCatalog...
Analyzing table Corporations...
Analyzing table DocumentTypes...
Analyzing table FreebeProducts...
Analyzing table GasUsage...
Analyzing table Groups...
Analyzing table GroupsGroups...
Analyzing table Hardware...
Analyzing table HardwareTypes...
Analyzing table InvoiceHeads...
Analyzing table InvoicePositions...
Analyzing table InvoiceTypes...
Analyzing table KP...
Analyzing table LastOrdered...
Analyzing table Licences...
Analyzing table 

Re: [sqlite] Database corruption on Linux ext3

2010-07-17 Thread Chris Wedgwood
On Thu, Jul 15, 2010 at 10:20:22PM +0200, Florian Weimer wrote:

> It's a generic Linux problem, not an ext3-specific issue.  Until
> recently, the Linux block layer had no concept of a sync operation.
> Linux basically assumed that all writes were synchronous and ordered,
> which they are not if your hardware has (non-transparent) write
> caches.  It did not disable write caching by default, either.

For quite some time, ext3 in mainline and most distro's defaulted to
nobarrier (barrier=0) --- the only exception to this I'm aware of was
SLES.

Mainline switched to having barriers on by default, I assume the
distro's have or will follow.

Some other filesystems (XFS for example) have had barriers enabled by
default.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption on Linux ext3

2010-07-15 Thread Florian Weimer
* D. Richard Hipp:

> An appliance manufacturer has discovered a database corruption issue  
> on Linux using ext3.   The issue is documented here:
>
>  http://www.sqlite.org/draft/lockingv3.html#ext3-barrier-problem

It's a generic Linux problem, not an ext3-specific issue.  Until
recently, the Linux block layer had no concept of a sync operation.
Linux basically assumed that all writes were synchronous and ordered,
which they are not if your hardware has (non-transparent) write
caches.  It did not disable write caching by default, either.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption on Linux ext3

2010-07-14 Thread Jim Wilcoxson
On Wed, Jul 14, 2010 at 1:35 AM, Roger Binns  wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On 07/13/2010 05:30 PM, Jim Wilcoxson wrote:
> > I don't think this would work, because the problem described is that the
> > writes aren't making it to disk.   If pages don't make it to disk, the
> old
> > pages will be present, with the old, and valid checksums.
>
> You are assuming the checksums are stored in the page they checksum.  That
> would only detect corruption of that page.  You could have pages that store
> the checksums of numerous other pages, so both the checksum page and the
> data page would have to fail to make it to disk.  Yes, there are scenarios
> where you could still get old apparently valid pages, but those are harder
> to happen.
>

It seems there are several level of checking possible:

- checksum on the page itself lets you detect some errors, with no extra I/O
- checksum pages for a group of pages lets you detect missing writes within
the group, with some extra I/O
- checksum of all checksum pages lets you detect missing writes for an
entire commit, with even more extra I/O

How much extra I/O depends on the size of the db, page size, and how much
memory is available for caching checksum pages.

Scott mentioned that a detection system without the ability to correct might
not be useful, but I think it is useful.  Not as good as correction of
course, but useful because:

- it might prevent the application program from issuing a bogus error
message like "the row you asked for isn't in the database"; lots of time
could be spent in the weeds chasing down a misleading error

- some applications might have backup copies of the database; they could
display an error message and revert to a backup

Jim
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption on Linux ext3

2010-07-13 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/13/2010 05:30 PM, Jim Wilcoxson wrote:
> I don't think this would work, because the problem described is that the
> writes aren't making it to disk.   If pages don't make it to disk, the old
> pages will be present, with the old, and valid checksums.

You are assuming the checksums are stored in the page they checksum.  That
would only detect corruption of that page.  You could have pages that store
the checksums of numerous other pages, so both the checksum page and the
data page would have to fail to make it to disk.  Yes, there are scenarios
where you could still get old apparently valid pages, but those are harder
to happen.

> SQLite would have to verify the checksum on every page when the
> database is opened and a hot journal exists, which could be quite a lot of
> overhead for a large database.

I would want checksums looked at when reading pages always.  For more
valuable data I would want a thorough open check.  I already use the pragmas
(integrity/quick check) on opens as appropriate.

> I think a checksum on every page, and maybe even an error-correction code,
> is a great idea as an optional feature.

It would have to be optional, much like WAL for backwards compatibility
issues, and because not everyone would want this.

> But it would only detect hardware problems and bit rot.

Plus normal errors, such as the undetectable one every 20TB of I/O rate I
hope I remembered correctly from the ZFS paper.  I don't know what the error
rate of Flash is, nor how that interacts with SATA signalling error rates,
but all these numbers are non-zero and we keep doing more and more I/O, have
larger datasets, and have things around for longer.  Plus we do like buying
cheap storage :-)

> This problem of not doing writes, or doing them in
> the wrong order, is a different animal IMO.

There will always be loopholes you could construct with that that are very
hard to detect.  Hopefully checksumming or some sort of similar data
integrity scheme would make it easier to detect some of the scenarios.  ie
things could be made better than they are now.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkw9TJYACgkQmOOfHg372QRnkACfdJp+FZHvxXeLNN0DMCHSNJ1+
3E8AnR1PXa4PTQ1ridHBaAC7WpSSwnyM
=5sbp
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption on Linux ext3

2010-07-13 Thread Scott Hess
On Tue, Jul 13, 2010 at 10:03 PM, Scott Hess  wrote:
> On Tue, Jul 13, 2010 at 8:24 PM, Simon Slavin  wrote:
>> It might be useful to figure out whether we're aiming for
>> detection or correction.  By 'correction' I don't mean recovery
>> of all information, I mean restoring the database to some state
>> it was in just after a COMMIT took effect.  There's no point in
>> implementing a detection system if the users consider "This
>> database is corrupt" something worth complaining about.  On the
>> other hand, implementing a correction system may well slow down
>> every write operation and perhaps '_open' too.  It's not worth
>> doing that if slowing down SQLite will decrease usability.
>
> The best case is a system where corruption cannot happen.  Since
> that's clearly impossible ...
>
> Second-best would be an ability to rollback to a priori valid state.

[Sigh, did not mean some whizzy technical term, there.  Meant "a prior
valid state."]

-scott
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption on Linux ext3

2010-07-13 Thread Scott Hess
On Tue, Jul 13, 2010 at 8:24 PM, Simon Slavin  wrote:
> It might be useful to figure out whether we're aiming for
> detection or correction.  By 'correction' I don't mean recovery
> of all information, I mean restoring the database to some state
> it was in just after a COMMIT took effect.  There's no point in
> implementing a detection system if the users consider "This
> database is corrupt" something worth complaining about.  On the
> other hand, implementing a correction system may well slow down
> every write operation and perhaps '_open' too.  It's not worth
> doing that if slowing down SQLite will decrease usability.

The best case is a system where corruption cannot happen.  Since
that's clearly impossible ...

Second-best would be an ability to rollback to a priori valid state.
WAL should provide some of this, because over time the system should
tend towards having everything stably written to the disk.  You just
have to think about how much you're willing to lose.

Currently, there is a gray area where you've seen something which is
incorrect, but you cannot reason about how extensive the damage is.
Checksums can provide you with some ability to detect corruption more
quickly, and perhaps a way to reason about it.  You could have a
function like iscorrupt(rowid) which could be used to salvage
uncorrupted rows, and then carefully inspect corrupted rows.  Or maybe
you could delete the corrupt rows and then the table would be clean.
Perhaps there could even be an ON CORRUPTION trigger involved.

[Obviously, here I'm not thinking about the case where a DBA sits down
and reconstructs things.  I'm thinking about what you do when a
database in the field has corruption.  For instance, Chrome has a
history database, and being able to easily delete corrupted rows and
reconstruct indices would be an improvement over nuking the database
from orbit.  Without explicit database support, though, I've always
been nervous about attempting such things in an ad-hoc fashion.]

-scott
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption on Linux ext3

2010-07-13 Thread Simon Slavin

On 14 Jul 2010, at 1:30am, Jim Wilcoxson wrote:

> This problem of not doing writes, or doing them in
> the wrong order, is a different animal IMO.

If writes are not happening, or are happening in the wrong order, you're in 
trouble.  It's almost impossible to figure out how to even detect that hardware 
problem without a time-consuming scan of each unit that should be written 
which, in SQLite, means reading every page.  Since SQLite doesn't run a server 
process, it has no opportunity to use slack time to check integrity.

Under the conditions described on the web page, this problem can happen only 
because of a power failure or an OS (not an application) crash.  Under these 
conditions, the ext3 file system doesn't support ACID at all: any system that 
relies on ACID is not going to work.  And if the file system doesn't support 
ACID, the software can't.  I don't see any fast way of solving that kind of 
problem.

It might be useful to figure out whether we're aiming for detection or 
correction.  By 'correction' I don't mean recovery of all information, I mean 
restoring the database to some state it was in just after a COMMIT took effect. 
 There's no point in implementing a detection system if the users consider 
"This database is corrupt" something worth complaining about.  On the other 
hand, implementing a correction system may well slow down every write operation 
and perhaps '_open' too.  It's not worth doing that if slowing down SQLite will 
decrease usability.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption on Linux ext3

2010-07-13 Thread Jim Wilcoxson
On Tue, Jul 13, 2010 at 8:06 PM, Roger Binns  wrote:

>
> On 07/13/2010 04:57 PM, Simon Slavin wrote:
> > One on each page and one for the entire file that checksums the page
> checksums ?
>
> One for each page plus one of the header would make the most sense, but the
>

I don't think this would work, because the problem described is that the
writes aren't making it to disk.   If pages don't make it to disk, the old
pages will be present, with the old, and valid checksums.

The only way I can see checksums helping with this problem is if there is a
checksum over the entire file (or checksums of checksums of each page).
Then if you do any writes, but not all writes, the overall checksum will be
invalid.  SQLite would have to verify the checksum on every page when the
database is opened and a hot journal exists, which could be quite a lot of
overhead for a large database.

Plus, SQLite would have to keep a list of the checksums for every page, and
at commit time, recompute the overall hash/checksum.  This could be lots of
memory for a large database.  A 1GB database for example would require 1M
20-byte SHA1 hashes, so 20MB.  If a bit for every page in the database was a
scalability problem in earlier versions of SQLite, I'm guessing that 20
bytes for every page would be unworkable.

I think a checksum on every page, and maybe even an error-correction code,
is a great idea as an optional feature.  But it would only detect hardware
problems and bit rot.  This problem of not doing writes, or doing them in
the wrong order, is a different animal IMO.

Jim
--
HashBackup: easy onsite and offsite Unix backup
http://sites.google.com/site/hashbackup
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption on Linux ext3

2010-07-13 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/13/2010 04:57 PM, Simon Slavin wrote:
> One on each page and one for the entire file that checksums the page 
> checksums ?

One for each page plus one of the header would make the most sense, but the
overriding concern would be something that is as backwards and forwards
compatible as possible.  The journal would also need to be covered, as well
as some indication linking the database and the journal.

Currently if you wrote a stray zero somewhere in the file then the chances
of it being detected are virtually zero.  It may even be possible to have
entire sectors/512 bytes revert to all zeroes or 0xff without it being detected.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkw8/48ACgkQmOOfHg372QTZ8ACfVIvqoavqysTwII+c8N13gLp2
7XUAoIa/D3j8+DmuYtT/6I9TAjOBGgP7
=5lT2
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption on Linux ext3

2010-07-13 Thread Simon Slavin

On 13 Jul 2010, at 9:26pm, Roger Binns wrote:

> On 07/13/2010 12:59 PM, D. Richard Hipp wrote:
>> You are encouraged to submit comments, insights, criticism, and  
>> analysis to this mailing list.  Thanks.
> 
> Have you considered adding internal checksums to SQLite files so that at the
> very least corruption can be detected?

One on each page and one for the entire file that checksums the page checksums 
?  I've seen that done elsewhere.  Perhaps instead one for each thing that 
needs its own pages (e.g. one for each table and index).  Or just one for each 
table which includes the index data in with it.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database corruption on Linux ext3

2010-07-13 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/13/2010 12:59 PM, D. Richard Hipp wrote:
> You are encouraged to submit comments, insights, criticism, and  
> analysis to this mailing list.  Thanks.

Have you considered adding internal checksums to SQLite files so that at the
very least corruption can be detected?  I think this is one of the best
features of the current crop of version control systems.

IIRC one of the original papers describing ZFS claimed an undetected error
rate of once every 20TB of activity with hard drives.  That is not an
unreasonable amount of disk access in one day now.

Errors could of course instead be detected by using better filesystems,
drives, protocols etc, but we often don't have the luxury of dictating what
systems SQLite databases will be used on.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkw8y/IACgkQmOOfHg372QTWfgCgjcPCvOLQuPv1Xqc6XM3TkXLY
X80An2TFiktmQxOY0sMxO1VLD74YzYpX
=tJ9V
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Database corruption on Linux ext3

2010-07-13 Thread D. Richard Hipp
An appliance manufacturer has discovered a database corruption issue  
on Linux using ext3.   The issue is documented here:

 http://www.sqlite.org/draft/lockingv3.html#ext3-barrier-problem

You are encouraged to submit comments, insights, criticism, and  
analysis to this mailing list.  Thanks.

D. Richard Hipp
d...@hwaci.com



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] database corruption problem

2010-06-08 Thread Dan Kennedy

On Jun 9, 2010, at 12:51 AM, Dave Segleau wrote:

>
> On 6/8/2010 9:25 AM, Dan Kennedy wrote:
>>
>> Those pragmas should not cause a problem. Simon's referring
>> to "PRAGMA synchronous". The docs for which explain the
>> risks assumed by changing the default setting.
>>
>>http://www.sqlite.org/pragma.html#pragma_synchronous
>>
>>
> As Dan said, "PRAGMA synchronous" can cause database corruption under
> certain conditions of application or system failure.

System failure. Not application failure.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] database corruption problem

2010-06-08 Thread Dave Segleau

On 6/8/2010 9:25 AM, Dan Kennedy wrote:
>
> Those pragmas should not cause a problem. Simon's referring
> to "PRAGMA synchronous". The docs for which explain the
> risks assumed by changing the default setting.
>
> http://www.sqlite.org/pragma.html#pragma_synchronous
>   
>
As Dan said, "PRAGMA synchronous" can cause database corruption under 
certain conditions of application or system failure.

Just as an FYI, "PRAGMA synchronous" will not cause database corruption 
in the latest Oracle Berkeley DB 11gR2 release that supports the SQLite 
API. The Berkeley DB write-ahead logging and recovery will ensure that 
the database remains consistent even in the event of application or 
system failure.

Regards,

Dave
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] database corruption problem

2010-06-08 Thread Simon Slavin

On 8 Jun 2010, at 5:25pm, Dan Kennedy wrote:

> Those pragmas should not cause a problem. Simon's referring
> to "PRAGMA synchronous". The docs for which explain the
> risks assumed by changing the default setting.
> 
>   http://www.sqlite.org/pragma.html#pragma_synchronous

You're both right.  I didn't bother to look up Dan's precise combination of 
PRAGMAs to see if the documentation contained any warnings.  But I did remember 
that some PRAGMAs can cause corruption under some circumstances.  However, Dan 
has already said he has read the 'how to corrupt your database' text which 
discusses everything he should be worrying about.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


  1   2   >