Re: [sqlite] How good is pragma integrity_check

2014-03-24 Thread Simon Slavin

On 24 Mar 2014, at 1:34pm, Tim Streater  wrote:

> My app does some periodic housekeeping, which includes occasional vacuuming 
> of some files. From the logs, I deduce that the OS completely lost this file 
> [1], which got recreated at the next housekeeping and then vacuumed, thus 
> producing an uncorrupt but useless file, as of course it didn't have any 
> tables.

Strongly suspect broken hard disk (physically broken, and about to make loud 
noises soon) or corrupt disk format.  After taking a backup, try running an 
operating system level format check using CHKDSK, SCANDISK, Disk Utility, or 
whatever that platform has.

> I'm considering changing the app to use one or other of the flags 
> SQLITE_OPEN_READWRITE and SQLITE_OPEN_CREATE for when I expect the file to be 
> there or not, so the app gets immediate notification.

Good.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How good is pragma integrity_check

2014-03-24 Thread Tim Streater
On 24 Mar 2014 at 08:36, Dominique Devienne  wrote: 

> On Mon, Mar 24, 2014 at 6:23 AM, Roger Binns  wrote:

>> So you don't have to use it.  Those of us who would rather proactively
>> know about data corruption (to the best of SQLite's ability to detect it)
>> are happy to take whatever hit there would be.  After all, fast queries on
>> corrupted data are pointless.
>
> +1
>
>> And as we see in messages on this list, finally spotting corruption long
>> after it first happened is very hard to recover from and hard to nail down
>> the cause for.
>
> Indeed. Fail-fast on corruptions trumps speed for many use cases. --DD

I eventually found out what happened in my case. The user sent me the 
"corrupted" file, and on testing it I found it wasn't corrupted at all, just 
completely empty. In fact I was able to make a bit-for-bit duplicate of the 
file he sent by doing this:

  Second-Mini% php -a
  Interactive shell

  php > $dbh = new sqlite3 ('wiggy');
  php > $dbh->exec ('vacuum');
  php > 

My app does some periodic housekeeping, which includes occasional vacuuming of 
some files. From the logs, I deduce that the OS completely lost this file [1], 
which got recreated at the next housekeeping and then vacuumed, thus producing 
an uncorrupt but useless file, as of course it didn't have any tables.

I'm considering changing the app to use one or other of the flags 
SQLITE_OPEN_READWRITE and SQLITE_OPEN_CREATE for when I expect the file to be 
there or not, so the app gets immediate notification.

[1] The user's OS crashed "badly" (his term) the next day.

--
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How good is pragma integrity_check

2014-03-24 Thread Simon Slavin

On 24 Mar 2014, at 12:10pm, Markus Schaber  wrote:

> Not agreed. Another way to solve this problem is to include the file offset
> or sector number into the checksum - this will also detect "movements" of
> data to the wrong place, without the need for an atomically writeable
> "external" storage.

Store the page number (or some equivalent) inside the page.  Neat.  
Unfortunately it doesn't catch the situation where a SQLite page gets its 
sector changed to a sector outside the SQLite file -- to, perhaps to a sector 
which is currently not being used where corruption will never be noticed.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How good is pragma integrity_check

2014-03-24 Thread Markus Schaber
Hi, Simon,

Von: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org]
> On 24 Mar 2014, at 5:23am, Roger Binns  wrote:
> > On 21/03/14 15:24, Simon Slavin wrote:
> >> Checksums stored with the page index lists,
> >
> > SQLite already has the ability to carve out data on each page for
> > other uses.  For example the encryption extension uses this.
> 
> This would be better than nothing, but there is a problem with writing the
> checksum in the same operation that writes the page.  The two standard
> unlikely scenarios for data corruption (cosmic rays and static discharge)
> flip bits inside wires lines or inside RAM bits, and it's possible that the
> bit that gets flipped will be for the address of the sector being
> written/read rather than in the data being written/read.  So by storing the
> checksum in the page it checks you will get a sector that passes the checksum
> test, but has been written to the wrong place in storage and therefore
> corrupts the file.

Agreed.

> So the checksum should be stored away from the page it checks. 

Not agreed. Another way to solve this problem is to include the file offset
or sector number into the checksum - this will also detect "movements" of
data to the wrong place, without the need for an atomically writeable
"external" storage.

> My
> understanding of the SQLite file format suggests that the checksum could
> usefully be stored with the pointer to the page in the page store (TABLE /
> INDEX page list or whatever it's called) since it would be needed at the same
> time and the two could be retrieved in one operation.
> 
> > It couldn't be on by default for backwards compatibility reasons.
> > (WAL is another example of that.)
> 
> 
> Agreed.  Backward compatibility would be a problem for SQLite3 so it's not
> going to happen either way unless the file format is revised and more PRAGMAs
> added.  Maybe in SQLite4.





Best regards

Markus Schaber

CODESYS(r) a trademark of 3S-Smart Software Solutions GmbH

Inspiring Automation Solutions

3S-Smart Software Solutions GmbH
Dipl.-Inf. Markus Schaber | Product Development Core Technology
Memminger Str. 151 | 87439 Kempten | Germany
Tel. +49-831-54031-979 | Fax +49-831-54031-50

E-Mail: m.scha...@codesys.com | Web: http://www.codesys.com | CODESYS store: 
http://store.codesys.com
CODESYS forum: http://forum.codesys.com

Managing Directors: Dipl.Inf. Dieter Hess, Dipl.Inf. Manfred Werner | Trade 
register: Kempten HRB 6186 | Tax ID No.: DE 167014915
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How good is pragma integrity_check

2014-03-24 Thread Simon Slavin

On 24 Mar 2014, at 5:23am, Roger Binns  wrote:

> On 21/03/14 15:24, Simon Slavin wrote:
>> Checksums stored with the page index lists,
> 
> SQLite already has the ability to carve out data on each page for other
> uses.  For example the encryption extension uses this.

This would be better than nothing, but there is a problem with writing the 
checksum in the same operation that writes the page.  The two standard unlikely 
scenarios for data corruption (cosmic rays and static discharge) flip bits 
inside wires lines or inside RAM bits, and it's possible that the bit that gets 
flipped will be for the address of the sector being written/read rather than in 
the data being written/read.  So by storing the checksum in the page it checks 
you will get a sector that passes the checksum test, but has been written to 
the wrong place in storage and therefore corrupts the file.

So the checksum should be stored away from the page it checks.  My 
understanding of the SQLite file format suggests that the checksum could 
usefully be stored with the pointer to the page in the page store (TABLE / 
INDEX page list or whatever it's called) since it would be needed at the same 
time and the two could be retrieved in one operation.

> It couldn't be on by default for backwards compatibility reasons.  (WAL is
> another example of that.)


Agreed.  Backward compatibility would be a problem for SQLite3 so it's not 
going to happen either way unless the file format is revised and more PRAGMAs 
added.  Maybe in SQLite4.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How good is pragma integrity_check

2014-03-24 Thread Dominique Devienne
On Mon, Mar 24, 2014 at 6:23 AM, Roger Binns  wrote:
> -BEGIN PGP SIGNED MESSAGE-
> On 21/03/14 15:24, Simon Slavin wrote:
>> Checksums stored with the page index lists,
>
> SQLite already has the ability to carve out data on each page for other
> uses.  For example the encryption extension uses this.
>
>> Nevertheless, the basic SQLite engine is so efficient that any
>> slow-down would be noticed and might cause complaints.
>
> It couldn't be on by default for backwards compatibility reasons.  (WAL is
> another example of that.)
>
> So you don't have to use it.  Those of us who would rather proactively
> know about data corruption (to the best of SQLite's ability to detect it)
> are happy to take whatever hit there would be.  After all, fast queries on
> corrupted data are pointless.

+1

> And as we see in messages on this list, finally spotting corruption long
> after it first happened is very hard to recover from and hard to nail down
> the cause for.

Indeed. Fail-fast on corruptions trumps speed for many use cases. --DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How good is pragma integrity_check

2014-03-23 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 21/03/14 15:24, Simon Slavin wrote:
> Checksums stored with the page index lists,

SQLite already has the ability to carve out data on each page for other
uses.  For example the encryption extension uses this.

> Nevertheless, the basic SQLite engine is so efficient that any
> slow-down would be noticed and might cause complaints.

It couldn't be on by default for backwards compatibility reasons.  (WAL is
another example of that.)

So you don't have to use it.  Those of us who would rather proactively
know about data corruption (to the best of SQLite's ability to detect it)
are happy to take whatever hit there would be.  After all, fast queries on
corrupted data are pointless.

And as we see in messages on this list, finally spotting corruption long
after it first happened is very hard to recover from and hard to nail down
the cause for.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.14 (GNU/Linux)

iEYEARECAAYFAlMvwWMACgkQmOOfHg372QSnOQCdEpBWBvcNsntkZ6WPvDs0yAju
fc0AoJzagj56DyoYrhmeE73rwHhe+D2f
=ZAfw
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How good is pragma integrity_check

2014-03-21 Thread Simon Slavin

On 21 Mar 2014, at 7:57pm, Roger Binns  wrote:

> It also doesn't check the data, just the structure of the data.  There was
> a feature request ticket for several years for checksums to at least catch
> unexpected changes to the data itself:
> 
>  https://www.sqlite.org/src/tktview?name=72b01a982a
> 
> Sadly it was rejected a few weeks ago without explanation.

There are problems with the particular formulation of checksums described in 
that ticket.  It will miss corruption caused the majority of faults which will 
corrupt a database.

Checksums stored with the page index lists, and then a checksum (of checksums 
?) across the whole table or index would be an improvement, and would involve 
the minimum slow-down caused by the additional work and storage required.  
Nevertheless, the basic SQLite engine is so efficient that any slow-down would 
be noticed and might cause complaints.

I suspect that the transition to SQLite4 is the first practical chance to 
introduce checksums.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How good is pragma integrity_check

2014-03-21 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 20/03/14 18:06, Simon Slavin wrote:
> All useful as far as SQLite itself goes, and better than nothing.
> Unfortunately, failing hard disks do weird things in weird orders.  And
> the interaction between the physical hard disk and the on-board cache
> makes it impossible to find out what's really on the disk.  There's
> only one way to check whether the whole file is readable from a disk:
> read the whole file from the disk.  Then do an integrity check on the
> copy you just made.

It also doesn't check the data, just the structure of the data.  There was
a feature request ticket for several years for checksums to at least catch
unexpected changes to the data itself:

  https://www.sqlite.org/src/tktview?name=72b01a982a

Sadly it was rejected a few weeks ago without explanation.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.14 (GNU/Linux)

iEYEARECAAYFAlMsmbQACgkQmOOfHg372QQy5gCfVa599WN9XWUB1Q8ABKYPJmQ6
QBwAnjuHzmK2oGfnhHDqdjlhD/5CoYe/
=1Y2C
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How good is pragma integrity_check

2014-03-21 Thread Tim Streater
On 21 Mar 2014 at 01:06, Simon Slavin  wrote: 

> On 20 Mar 2014, at 11:33pm, Richard Hipp  wrote:
>
>> On Thu, Mar 20, 2014 at 7:18 PM, Tim Streater  wrote:
>>
>>> I had a case where attempts to access a table in a user's db gave "no such
>>> table", where 60 mins previously (according to the log) querying that table
>>> gave no problems. About a day and a half later the user's machine had what
>>> he described as a "bad crash", so I'm suspecting he has a failing disk.
>>> [snip]
>>
>> PRAGMA integrity_check catches most things.  PRAGMA quick_check also does a
>> good job, and it is much faster.  The only difference is that quick_check
>> does not verify that indices agree with tables, whereas a full
>> integrity_check does.
>
> All useful as far as SQLite itself goes, and better than nothing. 
> Unfortunately, failing hard disks do weird things in weird orders.  And the
> interaction between the physical hard disk and the on-board cache makes it
> impossible to find out what's really on the disk.  There's only one way to
> check whether the whole file is readable from a disk: read the whole file from
> the disk.  Then do an integrity check on the copy you just made.
>
> Practically speaking, it's impossible.  Concentrate on having really good
> backups instead.

Well quite. But although the question of backups is a matter for the user, as 
the software is on his machine, I'll probably beef up the startup checks and 
add periodic checks of each db (the app does some periodic checks anyway). I'm 
hoping the user will forward me his damaged db so I can use it for testing.



--
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How good is pragma integrity_check

2014-03-20 Thread Simon Slavin

On 20 Mar 2014, at 11:33pm, Richard Hipp  wrote:

> On Thu, Mar 20, 2014 at 7:18 PM, Tim Streater  wrote:
> 
>> I had a case where attempts to access a table in a user's db gave "no such
>> table", where 60 mins previously (according to the log) querying that table
>> gave no problems. About a day and a half later the user's machine had what
>> he described as a "bad crash", so I'm suspecting he has a failing disk. 
>> [snip]
> 
> PRAGMA integrity_check catches most things.  PRAGMA quick_check also does a
> good job, and it is much faster.  The only difference is that quick_check
> does not verify that indices agree with tables, whereas a full
> integrity_check does.

All useful as far as SQLite itself goes, and better than nothing.  
Unfortunately, failing hard disks do weird things in weird orders.  And the 
interaction between the physical hard disk and the on-board cache makes it 
impossible to find out what's really on the disk.  There's only one way to 
check whether the whole file is readable from a disk: read the whole file from 
the disk.  Then do an integrity check on the copy you just made.

Practically speaking, it's impossible.  Concentrate on having really good 
backups instead.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How good is pragma integrity_check

2014-03-20 Thread Richard Hipp
On Thu, Mar 20, 2014 at 7:18 PM, Tim Streater  wrote:

> I had a case where attempts to access a table in a user's db gave "no such
> table", where 60 mins previously (according to the log) querying that table
> gave no problems. About a day and a half later the user's machine had what
> he described as a "bad crash", so I'm suspecting he has a failing disk. Due
> to a separate coincidence, this meant the app could not restart.
>
> Now, I want to protect against this or give better warning, and I am
> wondering whether corruption looking like a missing table would be caught
> by the integrity_check pragma.
>

PRAGMA integrity_check catches most things.  PRAGMA quick_check also does a
good job, and it is much faster.  The only difference is that quick_check
does not verify that indices agree with tables, whereas a full
integrity_check does.

There is also PRAGMA foreign_key_check to verify referential integrity.


>
> --
> Cheers  --  Tim
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How good is pragma integrity_check

2014-03-20 Thread Tim Streater
I had a case where attempts to access a table in a user's db gave "no such 
table", where 60 mins previously (according to the log) querying that table 
gave no problems. About a day and a half later the user's machine had what he 
described as a "bad crash", so I'm suspecting he has a failing disk. Due to a 
separate coincidence, this meant the app could not restart.

Now, I want to protect against this or give better warning, and I am wondering 
whether corruption looking like a missing table would be caught by the 
integrity_check pragma.

--
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users