Re: [sqlite] corrupt database recovery

2009-05-28 Thread Gene Allen
We weren't logging the encryption failure.

>>>> The feof() does return true until you attempt to read PAST the end of a
>> file.
The "does" in the above line should be "doesn't"

>> while code
Yeah, this code was written as a 'do...while', instead of a 'while do.'

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Machin
Sent: Thursday, May 28, 2009 7:19 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] corrupt database recovery

On 29/05/2009 9:34 AM, Gene Allen wrote:
> Yeah.  
> 
> Since my code works in blocks, read/compress/encrypt/write, loop.  Almost
> all the real data was being written to the compressed file, however any
> finalization and flushing of the stream wasn't occurring (since the
encrypt
> was failing)

and the encrypt failure wasn't logged?

 > so the last bit of any SQLite database wouldn't be written.

If so, pragma integrity_check should report that some of the pages 
actually written contain pointers to pages that are past the end of the 
file, shouldn't it?

[snip]

>> Well...a more structured test exposed the problem and it was this:
>>
>> The feof() does return true until you attempt to read PAST the end of a
>> file.

If feof doesn't continue to return true, it is broken.

> So the code worked great until the file's length was a multiple of
> the
>> buffer size (in my case 262,144 bytes).  As you can imagine that doesn't
>> happen too often in the real world.
>>
>> Since I assumed that a feof would return true where there wasn't any more
>> data in the file, I would start another pass at reading a chunk of data
>> (which wouldn't find anything) and run thru the compression/encryption
> code.
>> The compression code worked handled it correctly, but the encryption
>> required that a DWORD boundary (blowfish) and since 0 is on such a
> boundary
>> but at the wrong end...it would fail.  

Silently? Unlogged?

In any case, I would have thought using feof() was not needed ...
long time since I've written C in earnest, but isn't something like this 
the standard idiom:

#define BUFSIZ 262144
buff char[BUFSIZ];
size_t nbytes;
FILE *f;
f = fopen("filename", "rb");
while ((nbytes = fread(buff, 1, BUFSIZ, f)) {
do_something(buff, nbytes);
}

??

HTH,

John
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-05-28 Thread John Machin
On 29/05/2009 10:18 AM, John Machin wrote:
> On 29/05/2009 9:34 AM, Gene Allen wrote:
>> Yeah.  
>>
>> Since my code works in blocks, read/compress/encrypt/write, loop.  Almost
>> all the real data was being written to the compressed file, however any
>> finalization and flushing of the stream wasn't occurring (since the encrypt
>> was failing)
> 
> and the encrypt failure wasn't logged?
> 
>  > so the last bit of any SQLite database wouldn't be written.
> 
> If so, pragma integrity_check should report that some of the pages 
> actually written contain pointers to pages that are past the end of the 
> file, shouldn't it?

Your output from the integrity_check shows complaints about invalid page 
  numbers in the range 462 to 773. At the default page size of 1024, 
those page numbers span (773-462+1)*1024 = 319488 bytes so you are 
missing more than a 262144-byte chunk [unless your page size is 512!].

This doesn't seem to gel with the combination of "almost all the real 
data was being written" and the hypothesis that the database was 
corrupted merely by truncation.

What are the page size, the expected size of the database, and the 
actual (truncated) size of the database?

What evidence do you have that the feof problem actually happened in 
this case?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-05-28 Thread John Machin
On 29/05/2009 9:34 AM, Gene Allen wrote:
> Yeah.  
> 
> Since my code works in blocks, read/compress/encrypt/write, loop.  Almost
> all the real data was being written to the compressed file, however any
> finalization and flushing of the stream wasn't occurring (since the encrypt
> was failing)

and the encrypt failure wasn't logged?

 > so the last bit of any SQLite database wouldn't be written.

If so, pragma integrity_check should report that some of the pages 
actually written contain pointers to pages that are past the end of the 
file, shouldn't it?

[snip]

>> Well...a more structured test exposed the problem and it was this:
>>
>> The feof() does return true until you attempt to read PAST the end of a
>> file.

If feof doesn't continue to return true, it is broken.

> So the code worked great until the file's length was a multiple of
> the
>> buffer size (in my case 262,144 bytes).  As you can imagine that doesn't
>> happen too often in the real world.
>>
>> Since I assumed that a feof would return true where there wasn't any more
>> data in the file, I would start another pass at reading a chunk of data
>> (which wouldn't find anything) and run thru the compression/encryption
> code.
>> The compression code worked handled it correctly, but the encryption
>> required that a DWORD boundary (blowfish) and since 0 is on such a
> boundary
>> but at the wrong end...it would fail.  

Silently? Unlogged?

In any case, I would have thought using feof() was not needed ...
long time since I've written C in earnest, but isn't something like this 
the standard idiom:

#define BUFSIZ 262144
buff char[BUFSIZ];
size_t nbytes;
FILE *f;
f = fopen("filename", "rb");
while ((nbytes = fread(buff, 1, BUFSIZ, f)) {
do_something(buff, nbytes);
}

??

HTH,

John
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-05-28 Thread Allen Fowler



> 
> Since my code works in blocks, read/compress/encrypt/write, loop.  Almost
> all the real data was being written to the compressed file, however any
> finalization and flushing of the stream wasn't occurring (since the encrypt
> was failing) so the last bit of any SQLite database wouldn't be written.  
> 
> I'm guessing that also contributes to how infrequent I saw a problem.
> 

What size blocks where you using? 

So, what _is_ stored in the final "block size" length of the DB file?  It must 
be something not-so-critical as you are able to dump the DB, right?

Perhaps adding a full-file verify pass to your compress/encrypt code would be a 
good idea.


  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-05-28 Thread Gene Allen
Yeah.  

Since my code works in blocks, read/compress/encrypt/write, loop.  Almost
all the real data was being written to the compressed file, however any
finalization and flushing of the stream wasn't occurring (since the encrypt
was failing) so the last bit of any SQLite database wouldn't be written.  

I'm guessing that also contributes to how infrequent I saw a problem.

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Allen Fowler
Sent: Thursday, May 28, 2009 5:17 PM
To: General Discussion of SQLite Database; kennethinbox-sql...@yahoo.com
Subject: Re: [sqlite] corrupt database recovery





> I think I found my defect: my old stress tests was based on doing
> compression/encryptions/decryption/decompression passes on files of random
> sizes; so I would do about a 10 million passes or so and say...that's
pretty
> good.
> 
> Well...a more structured test exposed the problem and it was this:
> 
> The feof() does return true until you attempt to read PAST the end of a
> file. So the code worked great until the file's length was a multiple of
the
> buffer size (in my case 262,144 bytes).  As you can imagine that doesn't
> happen too often in the real world.
> 
> Since I assumed that a feof would return true where there wasn't any more
> data in the file, I would start another pass at reading a chunk of data
> (which wouldn't find anything) and run thru the compression/encryption
code.
> The compression code worked handled it correctly, but the encryption
> required that a DWORD boundary (blowfish) and since 0 is on such a
boundary
> but at the wrong end...it would fail.  
> 



But was causing the DB to be corrupt, but a dump to work fine?

What structure was getting damaged?



  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-05-28 Thread Allen Fowler




> I think I found my defect: my old stress tests was based on doing
> compression/encryptions/decryption/decompression passes on files of random
> sizes; so I would do about a 10 million passes or so and say...that's pretty
> good.
> 
> Well...a more structured test exposed the problem and it was this:
> 
> The feof() does return true until you attempt to read PAST the end of a
> file. So the code worked great until the file's length was a multiple of the
> buffer size (in my case 262,144 bytes).  As you can imagine that doesn't
> happen too often in the real world.
> 
> Since I assumed that a feof would return true where there wasn't any more
> data in the file, I would start another pass at reading a chunk of data
> (which wouldn't find anything) and run thru the compression/encryption code.
> The compression code worked handled it correctly, but the encryption
> required that a DWORD boundary (blowfish) and since 0 is on such a boundary
> but at the wrong end...it would fail.  
> 



But was causing the DB to be corrupt, but a dump to work fine?

What structure was getting damaged?



  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-05-28 Thread Gene Allen
I have not! but I just might...thanks for the tip.

I think I found my defect: my old stress tests was based on doing
compression/encryptions/decryption/decompression passes on files of random
sizes; so I would do about a 10 million passes or so and say...that's pretty
good.

Well...a more structured test exposed the problem and it was this:

The feof() does return true until you attempt to read PAST the end of a
file. So the code worked great until the file's length was a multiple of the
buffer size (in my case 262,144 bytes).  As you can imagine that doesn't
happen too often in the real world.

Since I assumed that a feof would return true where there wasn't any more
data in the file, I would start another pass at reading a chunk of data
(which wouldn't find anything) and run thru the compression/encryption code.
The compression code worked handled it correctly, but the encryption
required that a DWORD boundary (blowfish) and since 0 is on such a boundary
but at the wrong end...it would fail.  

I fixed the code to test for 0 bytes being read, instead of solely relying
on feof() and all seems well. :)

I would like to say thank you for all your help and advice.

Gene

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Ken
Sent: Thursday, May 28, 2009 11:32 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] corrupt database recovery


Gene,

Im sure others have suggested, but have you tried running your code through
valgrind? 
Can you remove the custom VFS ?


--- On Wed, 5/27/09, Gene Allen <g...@bystorm.com> wrote:

> From: Gene Allen <g...@bystorm.com>
> Subject: Re: [sqlite] corrupt database recovery
> To: mgr...@medcom-online.de, "'General Discussion of SQLite Database'"
<sqlite-users@sqlite.org>
> Date: Wednesday, May 27, 2009, 4:51 PM
> I've reviewed the code over and over
> again, but am yet to find it.  But it
> is a good bit of very complicated code (blowfish and
> compression code).
> 
> That's why I was asking about the format of the
> database.  Since I'm able to
> successfully do a .dump but the integrity_check whines, I'm
> thinking that
> maybe if I focus on where what could cause that, it might
> give me a clue.  
> 
> For example, if the tree is stored at the end of file maybe
> I'm falling out
> the loop without writing the final bytes.  But if the
> tree is not localized
> to one area of the file, I'll have to rethink my approach.
> 
> 
> 
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org]
> On Behalf Of Marcus Grimm
> Sent: Wednesday, May 27, 2009 2:44 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] corrupt database recovery
> 
> so, if you think it is a coding error on your side it will
> be a hard work to find the source. I can only image side
> effects
> on an used API like sqlite3 with the classics:
> 
> - stack frame overload in a function that calls sqlite
> functions
> - using a local stack variable as a return pointer and
> reuse
>   this external.
> - memory overwriting on malloc'ed pointers or reusing of
> allready
>   free'd memory pages.
> 
> one and two might be found using a code review and
> identify
> local variables or particular arrays that are written:
> So wherever you call anything from sqlite check the stack
> declaration and review the usage of these variables...
> of course just don't do number two... ;)
> 
> memory errors might be detected using special debug
> versions of the
> malloc/free library, by code review, or manually by adding
> some
> test variables whereever you malloc or free a pointer.
> 
> I'm sure you allready did some of these... anyway, good
> luck
> 
> are you using threads? would be another beautiful issue to
> trace...;)
> 
> Marcus
> 
> 
> 
> > It is a server class machine running Windows 2003 with
> 4 GB.  No, it's a
> > local drive with 20GB free on it.
> >
> > I'm sure that it's a coding error on my part. 
> SQLite is very stable, in
> > my
> > opinion.  I'm just trying to get a rough idea on
> where I'm screwing up the
> > database.
> >
> >
> > -Original Message-
> > From: sqlite-users-boun...@sqlite.org
> > [mailto:sqlite-users-boun...@sqlite.org]
> On Behalf Of John Elrick
> > Sent: Wednesday, May 27, 2009 12:58 PM
> > To: General Discussion of SQLite Database
> > Subject: Re: [sqlite] corrupt database recovery
> >
> > What platform?  Any chance they are using a
> network drive?
> >
> >
> > John
> >
> > Gene wrote:
> >> My code is outside the database layer.  So I

Re: [sqlite] corrupt database recovery

2009-05-28 Thread Ken

Gene,

Im sure others have suggested, but have you tried running your code through 
valgrind? 
Can you remove the custom VFS ?


--- On Wed, 5/27/09, Gene Allen <g...@bystorm.com> wrote:

> From: Gene Allen <g...@bystorm.com>
> Subject: Re: [sqlite] corrupt database recovery
> To: mgr...@medcom-online.de, "'General Discussion of SQLite Database'" 
> <sqlite-users@sqlite.org>
> Date: Wednesday, May 27, 2009, 4:51 PM
> I've reviewed the code over and over
> again, but am yet to find it.  But it
> is a good bit of very complicated code (blowfish and
> compression code).
> 
> That's why I was asking about the format of the
> database.  Since I'm able to
> successfully do a .dump but the integrity_check whines, I'm
> thinking that
> maybe if I focus on where what could cause that, it might
> give me a clue.  
> 
> For example, if the tree is stored at the end of file maybe
> I'm falling out
> the loop without writing the final bytes.  But if the
> tree is not localized
> to one area of the file, I'll have to rethink my approach.
> 
> 
> 
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org]
> On Behalf Of Marcus Grimm
> Sent: Wednesday, May 27, 2009 2:44 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] corrupt database recovery
> 
> so, if you think it is a coding error on your side it will
> be a hard work to find the source. I can only image side
> effects
> on an used API like sqlite3 with the classics:
> 
> - stack frame overload in a function that calls sqlite
> functions
> - using a local stack variable as a return pointer and
> reuse
>   this external.
> - memory overwriting on malloc'ed pointers or reusing of
> allready
>   free'd memory pages.
> 
> one and two might be found using a code review and
> identify
> local variables or particular arrays that are written:
> So wherever you call anything from sqlite check the stack
> declaration and review the usage of these variables...
> of course just don't do number two... ;)
> 
> memory errors might be detected using special debug
> versions of the
> malloc/free library, by code review, or manually by adding
> some
> test variables whereever you malloc or free a pointer.
> 
> I'm sure you allready did some of these... anyway, good
> luck
> 
> are you using threads? would be another beautiful issue to
> trace...;)
> 
> Marcus
> 
> 
> 
> > It is a server class machine running Windows 2003 with
> 4 GB.  No, it's a
> > local drive with 20GB free on it.
> >
> > I'm sure that it's a coding error on my part. 
> SQLite is very stable, in
> > my
> > opinion.  I'm just trying to get a rough idea on
> where I'm screwing up the
> > database.
> >
> >
> > -Original Message-
> > From: sqlite-users-boun...@sqlite.org
> > [mailto:sqlite-users-boun...@sqlite.org]
> On Behalf Of John Elrick
> > Sent: Wednesday, May 27, 2009 12:58 PM
> > To: General Discussion of SQLite Database
> > Subject: Re: [sqlite] corrupt database recovery
> >
> > What platform?  Any chance they are using a
> network drive?
> >
> >
> > John
> >
> > Gene wrote:
> >> My code is outside the database layer.  So I
> do all my database work,
> >> then
> >> compress and encrypt it.  No errors are
> returned anywhere.  I'm guessing
> >> that it's going to be an uninitialized variable or
> byte alignment
> >> problems
> >> somewhere.
> >>
> >> This code is running on hundreds of machines
> without a problem and I've
> >> never reproduced it but every now and again I get
> a support ticket
> >> showing
> > a
> >> corrupt database.  So I'm trying to figure
> out WHERE to look.
> >>
> >> -Original Message-
> >> From: sqlite-users-boun...@sqlite.org
> >> [mailto:sqlite-users-boun...@sqlite.org]
> On Behalf Of John Elrick
> >> Sent: Wednesday, May 27, 2009 10:59 AM
> >> To: General Discussion of SQLite Database
> >> Subject: Re: [sqlite] corrupt database recovery
> >>
> >> Gene Allen wrote:
> >>
> >>> Ok...it's happened again and I've decided that
> I need to track this
> >>> down
> >>> once and for all!
> >>>
> >>> Here is what I'm seeing: I get errors when I
> do a integrity_check (see
> >>> below), but I can .dump it to a text file and
> then .read it into
> >>> another
> >>> database ok.
> >>>
>

Re: [sqlite] corrupt database recovery

2009-05-27 Thread Marcus Grimm
I might have missed something, but:
What are you compressing?
ehm... Are compressing the database file while sqlite is running ?


> The problem is happening in the field and the program deals with a corrupt
> database by renaming it, logging the error and creating a new database.
> So
> I only end up with the corrupt one.
>
> I would GUESS that the database doesn't have any free pages except those
> at
> the end since it's an auditing product so we never delete anything...but I
> could be wrong.
>
> The only reason I think I'm 'dropping a byte' is that all the
> compression/encryption code is byte based and the database gets corrupt
> (invalid pages) but I'm able to .dump it successfully.That and I try
> to
> take the blame first, since it's usually my fault. (just ask my wife!)
>
> Right now I'm running a large test on the encryption/compression code to
> see
> if I can get it to break.
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Marcus Grimm
> Sent: Wednesday, May 27, 2009 5:09 PM
> To: 'General Discussion of SQLite Database'
> Subject: Re: [sqlite] corrupt database recovery
>
> sorry, I don't know where sqlite stores any kind of data like
> index or raw table, my suspect is that, since sqlite
> usually looks for a free page, the pages are randomly
> distributed over the file or added at the end if no free
> page has been found.
>
> what I don't understand:
> what makes you thinking that you may miss some bytes of a tree
> (and thus of a page) ?
> That is sqlite internal, except you are using your own
> vfs, I guess, ...do you ?
>
> by the way: did your application crash prior or nearby you had this
> corrupted DB ?
>
>> I've reviewed the code over and over again, but am yet to find it.  But
>> it
>> is a good bit of very complicated code (blowfish and compression code).
>>
>> That's why I was asking about the format of the database.  Since I'm
>> able
>> to
>> successfully do a .dump but the integrity_check whines, I'm thinking
>> that
>> maybe if I focus on where what could cause that, it might give me a
>> clue.
>>
>> For example, if the tree is stored at the end of file maybe I'm falling
>> out
>> the loop without writing the final bytes.  But if the tree is not
>> localized
>> to one area of the file, I'll have to rethink my approach.
>>
>>
>> -----Original Message-
>> From: sqlite-users-boun...@sqlite.org
>> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Marcus Grimm
>> Sent: Wednesday, May 27, 2009 2:44 PM
>> To: General Discussion of SQLite Database
>> Subject: Re: [sqlite] corrupt database recovery
>>
>> so, if you think it is a coding error on your side it will
>> be a hard work to find the source. I can only image side effects
>> on an used API like sqlite3 with the classics:
>>
>> - stack frame overload in a function that calls sqlite functions
>> - using a local stack variable as a return pointer and reuse
>>   this external.
>> - memory overwriting on malloc'ed pointers or reusing of allready
>>   free'd memory pages.
>>
>> one and two might be found using a code review and identify
>> local variables or particular arrays that are written:
>> So wherever you call anything from sqlite check the stack
>> declaration and review the usage of these variables...
>> of course just don't do number two... ;)
>>
>> memory errors might be detected using special debug versions of the
>> malloc/free library, by code review, or manually by adding some
>> test variables whereever you malloc or free a pointer.
>>
>> I'm sure you allready did some of these... anyway, good luck
>>
>> are you using threads? would be another beautiful issue to trace...;)
>>
>> Marcus
>>
>>
>>
>>> It is a server class machine running Windows 2003 with 4 GB.  No, it's
>>> a
>>> local drive with 20GB free on it.
>>>
>>> I'm sure that it's a coding error on my part.  SQLite is very stable,
>>> in
>>> my
>>> opinion.  I'm just trying to get a rough idea on where I'm screwing up
>>> the
>>> database.
>>>
>>>
>>> -Original Message-
>>> From: sqlite-users-boun...@sqlite.org
>>> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
>>> Sent: Wednesday, May 27, 2009 12:58 PM
>>> To: General Discussion of SQLite Database
>>> Subject: Re: [sqlite] corrupt database recovery
>>>
>>> What platfo

Re: [sqlite] corrupt database recovery

2009-05-27 Thread Gene
That's pretty much my test harness.  Right now, I'm running a test on every
file size from 0 to a 10 MB. 

If that works correctly (which I expect it will), I think I'm going to have
to spend the time and effort setting up a massive test environment and
stress the mess out of the whole system.  I'm trying to avoid that since it
would take a least a week to set that up.

Thank you for the support, I need it. 

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Doug
Sent: Wednesday, May 27, 2009 5:20 PM
To: 'General Discussion of SQLite Database'
Subject: Re: [sqlite] corrupt database recovery

I'm sorry that I can't help with the SQLite part, but couldn't you write a
simple test harness that would create a file (of random size?) with random
contents, encrypt/compress to a secondary file, then decompress/decrypt to a
third file.  Compare first and third files.  If they match, delete and run
test again.  If they don't match you now have your debugging input.  I'd
think letting something like that run for a day or so would flush out
boundary/buffer errors.  It's not deterministic, but should be easy enough
to bang out and let run on a secondary machine while you continue your main
search.

Good luck either way.

Doug


> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Gene Allen
> Sent: Wednesday, May 27, 2009 4:52 PM
> To: mgr...@medcom-online.de; 'General Discussion of SQLite Database'
> Subject: Re: [sqlite] corrupt database recovery
> 
> I've reviewed the code over and over again, but am yet to find it.  But
> it
> is a good bit of very complicated code (blowfish and compression code).
> 
> That's why I was asking about the format of the database.  Since I'm
> able to
> successfully do a .dump but the integrity_check whines, I'm thinking
> that
> maybe if I focus on where what could cause that, it might give me a
> clue.
> 
> For example, if the tree is stored at the end of file maybe I'm falling
> out
> the loop without writing the final bytes.  But if the tree is not
> localized
> to one area of the file, I'll have to rethink my approach.
> 
> 
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Marcus Grimm
> Sent: Wednesday, May 27, 2009 2:44 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] corrupt database recovery
> 
> so, if you think it is a coding error on your side it will
> be a hard work to find the source. I can only image side effects
> on an used API like sqlite3 with the classics:
> 
> - stack frame overload in a function that calls sqlite functions
> - using a local stack variable as a return pointer and reuse
>   this external.
> - memory overwriting on malloc'ed pointers or reusing of allready
>   free'd memory pages.
> 
> one and two might be found using a code review and identify
> local variables or particular arrays that are written:
> So wherever you call anything from sqlite check the stack
> declaration and review the usage of these variables...
> of course just don't do number two... ;)
> 
> memory errors might be detected using special debug versions of the
> malloc/free library, by code review, or manually by adding some
> test variables whereever you malloc or free a pointer.
> 
> I'm sure you allready did some of these... anyway, good luck
> 
> are you using threads? would be another beautiful issue to trace...;)
> 
> Marcus
> 
> 
> 
> > It is a server class machine running Windows 2003 with 4 GB.  No,
> it's a
> > local drive with 20GB free on it.
> >
> > I'm sure that it's a coding error on my part.  SQLite is very stable,
> in
> > my
> > opinion.  I'm just trying to get a rough idea on where I'm screwing
> up the
> > database.
> >
> >
> > -----Original Message-
> > From: sqlite-users-boun...@sqlite.org
> > [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
> > Sent: Wednesday, May 27, 2009 12:58 PM
> > To: General Discussion of SQLite Database
> > Subject: Re: [sqlite] corrupt database recovery
> >
> > What platform?  Any chance they are using a network drive?
> >
> >
> > John
> >
> > Gene wrote:
> >> My code is outside the database layer.  So I do all my database
> work,
> >> then
> >> compress and encrypt it.  No errors are returned anywhere.  I'm
> guessing
> >> that it's going to be an uninitialized variable or byte alignment
> >> problems
> >> somewhere.
> >>
> >> This code is running on hundreds of machines without a 

Re: [sqlite] corrupt database recovery

2009-05-27 Thread Gene
The problem is happening in the field and the program deals with a corrupt
database by renaming it, logging the error and creating a new database.  So
I only end up with the corrupt one.

I would GUESS that the database doesn't have any free pages except those at
the end since it's an auditing product so we never delete anything...but I
could be wrong.

The only reason I think I'm 'dropping a byte' is that all the
compression/encryption code is byte based and the database gets corrupt
(invalid pages) but I'm able to .dump it successfully.That and I try to
take the blame first, since it's usually my fault. (just ask my wife!)

Right now I'm running a large test on the encryption/compression code to see
if I can get it to break.  

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Marcus Grimm
Sent: Wednesday, May 27, 2009 5:09 PM
To: 'General Discussion of SQLite Database'
Subject: Re: [sqlite] corrupt database recovery

sorry, I don't know where sqlite stores any kind of data like
index or raw table, my suspect is that, since sqlite
usually looks for a free page, the pages are randomly
distributed over the file or added at the end if no free
page has been found.

what I don't understand:
what makes you thinking that you may miss some bytes of a tree
(and thus of a page) ?
That is sqlite internal, except you are using your own
vfs, I guess, ...do you ?

by the way: did your application crash prior or nearby you had this
corrupted DB ?

> I've reviewed the code over and over again, but am yet to find it.  But it
> is a good bit of very complicated code (blowfish and compression code).
>
> That's why I was asking about the format of the database.  Since I'm able
> to
> successfully do a .dump but the integrity_check whines, I'm thinking that
> maybe if I focus on where what could cause that, it might give me a clue.
>
> For example, if the tree is stored at the end of file maybe I'm falling
> out
> the loop without writing the final bytes.  But if the tree is not
> localized
> to one area of the file, I'll have to rethink my approach.
>
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Marcus Grimm
> Sent: Wednesday, May 27, 2009 2:44 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] corrupt database recovery
>
> so, if you think it is a coding error on your side it will
> be a hard work to find the source. I can only image side effects
> on an used API like sqlite3 with the classics:
>
> - stack frame overload in a function that calls sqlite functions
> - using a local stack variable as a return pointer and reuse
>   this external.
> - memory overwriting on malloc'ed pointers or reusing of allready
>   free'd memory pages.
>
> one and two might be found using a code review and identify
> local variables or particular arrays that are written:
> So wherever you call anything from sqlite check the stack
> declaration and review the usage of these variables...
> of course just don't do number two... ;)
>
> memory errors might be detected using special debug versions of the
> malloc/free library, by code review, or manually by adding some
> test variables whereever you malloc or free a pointer.
>
> I'm sure you allready did some of these... anyway, good luck
>
> are you using threads? would be another beautiful issue to trace...;)
>
> Marcus
>
>
>
>> It is a server class machine running Windows 2003 with 4 GB.  No, it's a
>> local drive with 20GB free on it.
>>
>> I'm sure that it's a coding error on my part.  SQLite is very stable, in
>> my
>> opinion.  I'm just trying to get a rough idea on where I'm screwing up
>> the
>> database.
>>
>>
>> -Original Message-----
>> From: sqlite-users-boun...@sqlite.org
>> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
>> Sent: Wednesday, May 27, 2009 12:58 PM
>> To: General Discussion of SQLite Database
>> Subject: Re: [sqlite] corrupt database recovery
>>
>> What platform?  Any chance they are using a network drive?
>>
>>
>> John
>>
>> Gene wrote:
>>> My code is outside the database layer.  So I do all my database work,
>>> then
>>> compress and encrypt it.  No errors are returned anywhere.  I'm
>>> guessing
>>> that it's going to be an uninitialized variable or byte alignment
>>> problems
>>> somewhere.
>>>
>>> This code is running on hundreds of machines without a problem and I've
>>> never reproduced it but every now and again I get a support ticket
>>> showing
>> a
>>> corrupt database

Re: [sqlite] corrupt database recovery

2009-05-27 Thread Doug
I'm sorry that I can't help with the SQLite part, but couldn't you write a
simple test harness that would create a file (of random size?) with random
contents, encrypt/compress to a secondary file, then decompress/decrypt to a
third file.  Compare first and third files.  If they match, delete and run
test again.  If they don't match you now have your debugging input.  I'd
think letting something like that run for a day or so would flush out
boundary/buffer errors.  It's not deterministic, but should be easy enough
to bang out and let run on a secondary machine while you continue your main
search.

Good luck either way.

Doug


> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Gene Allen
> Sent: Wednesday, May 27, 2009 4:52 PM
> To: mgr...@medcom-online.de; 'General Discussion of SQLite Database'
> Subject: Re: [sqlite] corrupt database recovery
> 
> I've reviewed the code over and over again, but am yet to find it.  But
> it
> is a good bit of very complicated code (blowfish and compression code).
> 
> That's why I was asking about the format of the database.  Since I'm
> able to
> successfully do a .dump but the integrity_check whines, I'm thinking
> that
> maybe if I focus on where what could cause that, it might give me a
> clue.
> 
> For example, if the tree is stored at the end of file maybe I'm falling
> out
> the loop without writing the final bytes.  But if the tree is not
> localized
> to one area of the file, I'll have to rethink my approach.
> 
> 
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Marcus Grimm
> Sent: Wednesday, May 27, 2009 2:44 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] corrupt database recovery
> 
> so, if you think it is a coding error on your side it will
> be a hard work to find the source. I can only image side effects
> on an used API like sqlite3 with the classics:
> 
> - stack frame overload in a function that calls sqlite functions
> - using a local stack variable as a return pointer and reuse
>   this external.
> - memory overwriting on malloc'ed pointers or reusing of allready
>   free'd memory pages.
> 
> one and two might be found using a code review and identify
> local variables or particular arrays that are written:
> So wherever you call anything from sqlite check the stack
> declaration and review the usage of these variables...
> of course just don't do number two... ;)
> 
> memory errors might be detected using special debug versions of the
> malloc/free library, by code review, or manually by adding some
> test variables whereever you malloc or free a pointer.
> 
> I'm sure you allready did some of these... anyway, good luck
> 
> are you using threads? would be another beautiful issue to trace...;)
> 
> Marcus
> 
> 
> 
> > It is a server class machine running Windows 2003 with 4 GB.  No,
> it's a
> > local drive with 20GB free on it.
> >
> > I'm sure that it's a coding error on my part.  SQLite is very stable,
> in
> > my
> > opinion.  I'm just trying to get a rough idea on where I'm screwing
> up the
> > database.
> >
> >
> > -----Original Message-
> > From: sqlite-users-boun...@sqlite.org
> > [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
> > Sent: Wednesday, May 27, 2009 12:58 PM
> > To: General Discussion of SQLite Database
> > Subject: Re: [sqlite] corrupt database recovery
> >
> > What platform?  Any chance they are using a network drive?
> >
> >
> > John
> >
> > Gene wrote:
> >> My code is outside the database layer.  So I do all my database
> work,
> >> then
> >> compress and encrypt it.  No errors are returned anywhere.  I'm
> guessing
> >> that it's going to be an uninitialized variable or byte alignment
> >> problems
> >> somewhere.
> >>
> >> This code is running on hundreds of machines without a problem and
> I've
> >> never reproduced it but every now and again I get a support ticket
> >> showing
> > a
> >> corrupt database.  So I'm trying to figure out WHERE to look.
> >>
> >> -Original Message-
> >> From: sqlite-users-boun...@sqlite.org
> >> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
> >> Sent: Wednesday, May 27, 2009 10:59 AM
> >> To: General Discussion of SQLite Database
> >> Subject: Re: [sqlite] corrupt database recovery
> >>
> >> Gene Allen wrote:
> >>
> >>> Ok...it's happened agai

Re: [sqlite] corrupt database recovery

2009-05-27 Thread Marcus Grimm
sorry, I don't know where sqlite stores any kind of data like
index or raw table, my suspect is that, since sqlite
usually looks for a free page, the pages are randomly
distributed over the file or added at the end if no free
page has been found.

what I don't understand:
what makes you thinking that you may miss some bytes of a tree
(and thus of a page) ?
That is sqlite internal, except you are using your own
vfs, I guess, ...do you ?

by the way: did your application crash prior or nearby you had this
corrupted DB ?

> I've reviewed the code over and over again, but am yet to find it.  But it
> is a good bit of very complicated code (blowfish and compression code).
>
> That's why I was asking about the format of the database.  Since I'm able
> to
> successfully do a .dump but the integrity_check whines, I'm thinking that
> maybe if I focus on where what could cause that, it might give me a clue.
>
> For example, if the tree is stored at the end of file maybe I'm falling
> out
> the loop without writing the final bytes.  But if the tree is not
> localized
> to one area of the file, I'll have to rethink my approach.
>
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Marcus Grimm
> Sent: Wednesday, May 27, 2009 2:44 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] corrupt database recovery
>
> so, if you think it is a coding error on your side it will
> be a hard work to find the source. I can only image side effects
> on an used API like sqlite3 with the classics:
>
> - stack frame overload in a function that calls sqlite functions
> - using a local stack variable as a return pointer and reuse
>   this external.
> - memory overwriting on malloc'ed pointers or reusing of allready
>   free'd memory pages.
>
> one and two might be found using a code review and identify
> local variables or particular arrays that are written:
> So wherever you call anything from sqlite check the stack
> declaration and review the usage of these variables...
> of course just don't do number two... ;)
>
> memory errors might be detected using special debug versions of the
> malloc/free library, by code review, or manually by adding some
> test variables whereever you malloc or free a pointer.
>
> I'm sure you allready did some of these... anyway, good luck
>
> are you using threads? would be another beautiful issue to trace...;)
>
> Marcus
>
>
>
>> It is a server class machine running Windows 2003 with 4 GB.  No, it's a
>> local drive with 20GB free on it.
>>
>> I'm sure that it's a coding error on my part.  SQLite is very stable, in
>> my
>> opinion.  I'm just trying to get a rough idea on where I'm screwing up
>> the
>> database.
>>
>>
>> -Original Message-----
>> From: sqlite-users-boun...@sqlite.org
>> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
>> Sent: Wednesday, May 27, 2009 12:58 PM
>> To: General Discussion of SQLite Database
>> Subject: Re: [sqlite] corrupt database recovery
>>
>> What platform?  Any chance they are using a network drive?
>>
>>
>> John
>>
>> Gene wrote:
>>> My code is outside the database layer.  So I do all my database work,
>>> then
>>> compress and encrypt it.  No errors are returned anywhere.  I'm
>>> guessing
>>> that it's going to be an uninitialized variable or byte alignment
>>> problems
>>> somewhere.
>>>
>>> This code is running on hundreds of machines without a problem and I've
>>> never reproduced it but every now and again I get a support ticket
>>> showing
>> a
>>> corrupt database.  So I'm trying to figure out WHERE to look.
>>>
>>> -Original Message-
>>> From: sqlite-users-boun...@sqlite.org
>>> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
>>> Sent: Wednesday, May 27, 2009 10:59 AM
>>> To: General Discussion of SQLite Database
>>> Subject: Re: [sqlite] corrupt database recovery
>>>
>>> Gene Allen wrote:
>>>
>>>> Ok...it's happened again and I've decided that I need to track this
>>>> down
>>>> once and for all!
>>>>
>>>> Here is what I'm seeing: I get errors when I do a integrity_check (see
>>>> below), but I can .dump it to a text file and then .read it into
>>>> another
>>>> database ok.
>>>>
>>>> It seems to me that I'm screwing up an index or something.  Are
>>>> indexes
>>>> stored at the end of the dat

Re: [sqlite] corrupt database recovery

2009-05-27 Thread Gene Allen
I've reviewed the code over and over again, but am yet to find it.  But it
is a good bit of very complicated code (blowfish and compression code).

That's why I was asking about the format of the database.  Since I'm able to
successfully do a .dump but the integrity_check whines, I'm thinking that
maybe if I focus on where what could cause that, it might give me a clue.  

For example, if the tree is stored at the end of file maybe I'm falling out
the loop without writing the final bytes.  But if the tree is not localized
to one area of the file, I'll have to rethink my approach. 


-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Marcus Grimm
Sent: Wednesday, May 27, 2009 2:44 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] corrupt database recovery

so, if you think it is a coding error on your side it will
be a hard work to find the source. I can only image side effects
on an used API like sqlite3 with the classics:

- stack frame overload in a function that calls sqlite functions
- using a local stack variable as a return pointer and reuse
  this external.
- memory overwriting on malloc'ed pointers or reusing of allready
  free'd memory pages.

one and two might be found using a code review and identify
local variables or particular arrays that are written:
So wherever you call anything from sqlite check the stack
declaration and review the usage of these variables...
of course just don't do number two... ;)

memory errors might be detected using special debug versions of the
malloc/free library, by code review, or manually by adding some
test variables whereever you malloc or free a pointer.

I'm sure you allready did some of these... anyway, good luck

are you using threads? would be another beautiful issue to trace...;)

Marcus



> It is a server class machine running Windows 2003 with 4 GB.  No, it's a
> local drive with 20GB free on it.
>
> I'm sure that it's a coding error on my part.  SQLite is very stable, in
> my
> opinion.  I'm just trying to get a rough idea on where I'm screwing up the
> database.
>
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
> Sent: Wednesday, May 27, 2009 12:58 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] corrupt database recovery
>
> What platform?  Any chance they are using a network drive?
>
>
> John
>
> Gene wrote:
>> My code is outside the database layer.  So I do all my database work,
>> then
>> compress and encrypt it.  No errors are returned anywhere.  I'm guessing
>> that it's going to be an uninitialized variable or byte alignment
>> problems
>> somewhere.
>>
>> This code is running on hundreds of machines without a problem and I've
>> never reproduced it but every now and again I get a support ticket
>> showing
> a
>> corrupt database.  So I'm trying to figure out WHERE to look.
>>
>> -Original Message-
>> From: sqlite-users-boun...@sqlite.org
>> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
>> Sent: Wednesday, May 27, 2009 10:59 AM
>> To: General Discussion of SQLite Database
>> Subject: Re: [sqlite] corrupt database recovery
>>
>> Gene Allen wrote:
>>
>>> Ok...it's happened again and I've decided that I need to track this
>>> down
>>> once and for all!
>>>
>>> Here is what I'm seeing: I get errors when I do a integrity_check (see
>>> below), but I can .dump it to a text file and then .read it into
>>> another
>>> database ok.
>>>
>>> It seems to me that I'm screwing up an index or something.  Are indexes
>>> stored at the end of the database file?  All I can think of is that my
>>> compression/encryption routines are messing something up and I'm trying
> to
>>> figure out 'where' to look.
>>>
>>> I guess the real question is, what would I have to do to make an
>>> integrity_check fail, but still let a dump work correctly?
>>>
>>> Many thanks for any advice on tracking down this ugliness.
>>>
>>>
>>
>> SNIP
>>
>> Personally, I'd refactor the code to allow me to verify the operation of
>> the compression/encryption routines independently of the database
>> operation.  How are you injecting the compression/encryption into the
>> database layer?
>>
>>
>> John
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>> ___

Re: [sqlite] corrupt database recovery

2009-05-27 Thread Marcus Grimm
so, if you think it is a coding error on your side it will
be a hard work to find the source. I can only image side effects
on an used API like sqlite3 with the classics:

- stack frame overload in a function that calls sqlite functions
- using a local stack variable as a return pointer and reuse
  this external.
- memory overwriting on malloc'ed pointers or reusing of allready
  free'd memory pages.

one and two might be found using a code review and identify
local variables or particular arrays that are written:
So wherever you call anything from sqlite check the stack
declaration and review the usage of these variables...
of course just don't do number two... ;)

memory errors might be detected using special debug versions of the
malloc/free library, by code review, or manually by adding some
test variables whereever you malloc or free a pointer.

I'm sure you allready did some of these... anyway, good luck

are you using threads? would be another beautiful issue to trace...;)

Marcus



> It is a server class machine running Windows 2003 with 4 GB.  No, it's a
> local drive with 20GB free on it.
>
> I'm sure that it's a coding error on my part.  SQLite is very stable, in
> my
> opinion.  I'm just trying to get a rough idea on where I'm screwing up the
> database.
>
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
> Sent: Wednesday, May 27, 2009 12:58 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] corrupt database recovery
>
> What platform?  Any chance they are using a network drive?
>
>
> John
>
> Gene wrote:
>> My code is outside the database layer.  So I do all my database work,
>> then
>> compress and encrypt it.  No errors are returned anywhere.  I'm guessing
>> that it's going to be an uninitialized variable or byte alignment
>> problems
>> somewhere.
>>
>> This code is running on hundreds of machines without a problem and I've
>> never reproduced it but every now and again I get a support ticket
>> showing
> a
>> corrupt database.  So I'm trying to figure out WHERE to look.
>>
>> -Original Message-
>> From: sqlite-users-boun...@sqlite.org
>> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
>> Sent: Wednesday, May 27, 2009 10:59 AM
>> To: General Discussion of SQLite Database
>> Subject: Re: [sqlite] corrupt database recovery
>>
>> Gene Allen wrote:
>>
>>> Ok...it's happened again and I've decided that I need to track this
>>> down
>>> once and for all!
>>>
>>> Here is what I'm seeing: I get errors when I do a integrity_check (see
>>> below), but I can .dump it to a text file and then .read it into
>>> another
>>> database ok.
>>>
>>> It seems to me that I'm screwing up an index or something.  Are indexes
>>> stored at the end of the database file?  All I can think of is that my
>>> compression/encryption routines are messing something up and I'm trying
> to
>>> figure out 'where' to look.
>>>
>>> I guess the real question is, what would I have to do to make an
>>> integrity_check fail, but still let a dump work correctly?
>>>
>>> Many thanks for any advice on tracking down this ugliness.
>>>
>>>
>>
>> SNIP
>>
>> Personally, I'd refactor the code to allow me to verify the operation of
>> the compression/encryption routines independently of the database
>> operation.  How are you injecting the compression/encryption into the
>> database layer?
>>
>>
>> John
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>> 
>>
>>
>> No virus found in this incoming message.
>> Checked by AVG - www.avg.com
>> Version: 8.0.238 / Virus Database: 270.12.39/2133 - Release Date:
>> 05/25/09
> 08:16:00
>>
>>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-05-27 Thread Gene Allen
It is a server class machine running Windows 2003 with 4 GB.  No, it's a
local drive with 20GB free on it.

I'm sure that it's a coding error on my part.  SQLite is very stable, in my
opinion.  I'm just trying to get a rough idea on where I'm screwing up the
database.  


-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
Sent: Wednesday, May 27, 2009 12:58 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] corrupt database recovery

What platform?  Any chance they are using a network drive?


John

Gene wrote:
> My code is outside the database layer.  So I do all my database work, then
> compress and encrypt it.  No errors are returned anywhere.  I'm guessing
> that it's going to be an uninitialized variable or byte alignment problems
> somewhere.
>
> This code is running on hundreds of machines without a problem and I've
> never reproduced it but every now and again I get a support ticket showing
a
> corrupt database.  So I'm trying to figure out WHERE to look.
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
> Sent: Wednesday, May 27, 2009 10:59 AM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] corrupt database recovery
>
> Gene Allen wrote:
>   
>> Ok...it's happened again and I've decided that I need to track this down
>> once and for all!
>>
>> Here is what I'm seeing: I get errors when I do a integrity_check (see
>> below), but I can .dump it to a text file and then .read it into another
>> database ok.
>>
>> It seems to me that I'm screwing up an index or something.  Are indexes
>> stored at the end of the database file?  All I can think of is that my
>> compression/encryption routines are messing something up and I'm trying
to
>> figure out 'where' to look.  
>>
>> I guess the real question is, what would I have to do to make an
>> integrity_check fail, but still let a dump work correctly?
>>
>> Many thanks for any advice on tracking down this ugliness.
>>   
>> 
>
> SNIP
>
> Personally, I'd refactor the code to allow me to verify the operation of 
> the compression/encryption routines independently of the database 
> operation.  How are you injecting the compression/encryption into the 
> database layer?
>
>
> John
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>   
> 
>
>
> No virus found in this incoming message.
> Checked by AVG - www.avg.com 
> Version: 8.0.238 / Virus Database: 270.12.39/2133 - Release Date: 05/25/09
08:16:00
>
>   

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-05-27 Thread John Elrick
What platform?  Any chance they are using a network drive?


John

Gene wrote:
> My code is outside the database layer.  So I do all my database work, then
> compress and encrypt it.  No errors are returned anywhere.  I'm guessing
> that it's going to be an uninitialized variable or byte alignment problems
> somewhere.
>
> This code is running on hundreds of machines without a problem and I've
> never reproduced it but every now and again I get a support ticket showing a
> corrupt database.  So I'm trying to figure out WHERE to look.
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
> Sent: Wednesday, May 27, 2009 10:59 AM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] corrupt database recovery
>
> Gene Allen wrote:
>   
>> Ok...it's happened again and I've decided that I need to track this down
>> once and for all!
>>
>> Here is what I'm seeing: I get errors when I do a integrity_check (see
>> below), but I can .dump it to a text file and then .read it into another
>> database ok.
>>
>> It seems to me that I'm screwing up an index or something.  Are indexes
>> stored at the end of the database file?  All I can think of is that my
>> compression/encryption routines are messing something up and I'm trying to
>> figure out 'where' to look.  
>>
>> I guess the real question is, what would I have to do to make an
>> integrity_check fail, but still let a dump work correctly?
>>
>> Many thanks for any advice on tracking down this ugliness.
>>   
>> 
>
> SNIP
>
> Personally, I'd refactor the code to allow me to verify the operation of 
> the compression/encryption routines independently of the database 
> operation.  How are you injecting the compression/encryption into the 
> database layer?
>
>
> John
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>   
> 
>
>
> No virus found in this incoming message.
> Checked by AVG - www.avg.com 
> Version: 8.0.238 / Virus Database: 270.12.39/2133 - Release Date: 05/25/09 
> 08:16:00
>
>   

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-05-27 Thread Jim Wilcoxson
What platform is this?  I just posted a note today that my Linux box
running 2.6-25 and ext3 isn't doing fsyncs like it should, so I would
be susceptible to DB corruption if my machine crashed during DB I/O.
I posted a C program you could run on the specific machine with a
corrupt DB to see if it's really doing synchronous I/O.

The other thing you might want to check is hardware.  Running memtest
on the machine overnight could show something, unless you are on
server-class hardware with ECC memory.

Jim


On 5/27/09, Gene <g...@bystorm.com> wrote:
> My code is outside the database layer.  So I do all my database work, then
> compress and encrypt it.  No errors are returned anywhere.  I'm guessing
> that it's going to be an uninitialized variable or byte alignment problems
> somewhere.
>
> This code is running on hundreds of machines without a problem and I've
> never reproduced it but every now and again I get a support ticket showing a
> corrupt database.  So I'm trying to figure out WHERE to look.
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
> Sent: Wednesday, May 27, 2009 10:59 AM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] corrupt database recovery
>
> Gene Allen wrote:
>> Ok...it's happened again and I've decided that I need to track this down
>> once and for all!
>>
>> Here is what I'm seeing: I get errors when I do a integrity_check (see
>> below), but I can .dump it to a text file and then .read it into another
>> database ok.
>>
>> It seems to me that I'm screwing up an index or something.  Are indexes
>> stored at the end of the database file?  All I can think of is that my
>> compression/encryption routines are messing something up and I'm trying to
>> figure out 'where' to look.
>>
>> I guess the real question is, what would I have to do to make an
>> integrity_check fail, but still let a dump work correctly?
>>
>> Many thanks for any advice on tracking down this ugliness.
>>
>
> SNIP
>
> Personally, I'd refactor the code to allow me to verify the operation of
> the compression/encryption routines independently of the database
> operation.  How are you injecting the compression/encryption into the
> database layer?
>
>
> John
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
Software first.  Software lasts!
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-05-27 Thread Gene
My code is outside the database layer.  So I do all my database work, then
compress and encrypt it.  No errors are returned anywhere.  I'm guessing
that it's going to be an uninitialized variable or byte alignment problems
somewhere.

This code is running on hundreds of machines without a problem and I've
never reproduced it but every now and again I get a support ticket showing a
corrupt database.  So I'm trying to figure out WHERE to look.

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Elrick
Sent: Wednesday, May 27, 2009 10:59 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] corrupt database recovery

Gene Allen wrote:
> Ok...it's happened again and I've decided that I need to track this down
> once and for all!
>
> Here is what I'm seeing: I get errors when I do a integrity_check (see
> below), but I can .dump it to a text file and then .read it into another
> database ok.
>
> It seems to me that I'm screwing up an index or something.  Are indexes
> stored at the end of the database file?  All I can think of is that my
> compression/encryption routines are messing something up and I'm trying to
> figure out 'where' to look.  
>
> I guess the real question is, what would I have to do to make an
> integrity_check fail, but still let a dump work correctly?
>
> Many thanks for any advice on tracking down this ugliness.
>   

SNIP

Personally, I'd refactor the code to allow me to verify the operation of 
the compression/encryption routines independently of the database 
operation.  How are you injecting the compression/encryption into the 
database layer?


John
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-05-27 Thread John Elrick
Gene Allen wrote:
> Ok...it's happened again and I've decided that I need to track this down
> once and for all!
>
> Here is what I'm seeing: I get errors when I do a integrity_check (see
> below), but I can .dump it to a text file and then .read it into another
> database ok.
>
> It seems to me that I'm screwing up an index or something.  Are indexes
> stored at the end of the database file?  All I can think of is that my
> compression/encryption routines are messing something up and I'm trying to
> figure out 'where' to look.  
>
> I guess the real question is, what would I have to do to make an
> integrity_check fail, but still let a dump work correctly?
>
> Many thanks for any advice on tracking down this ugliness.
>   

SNIP

Personally, I'd refactor the code to allow me to verify the operation of 
the compression/encryption routines independently of the database 
operation.  How are you injecting the compression/encryption into the 
database layer?


John
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-05-27 Thread Gene Allen
Ok...it's happened again and I've decided that I need to track this down
once and for all!

Here is what I'm seeing: I get errors when I do a integrity_check (see
below), but I can .dump it to a text file and then .read it into another
database ok.

It seems to me that I'm screwing up an index or something.  Are indexes
stored at the end of the database file?  All I can think of is that my
compression/encryption routines are messing something up and I'm trying to
figure out 'where' to look.  

I guess the real question is, what would I have to do to make an
integrity_check fail, but still let a dump work correctly?

Many thanks for any advice on tracking down this ugliness.

Gene

>>Output details
sqlite> pragma integrity_check;
*** in database main ***
On tree page 3 cell 26: invalid page number 469
On tree page 3 cell 26: Child page depth differs
On tree page 3 cell 27: invalid page number 490
On tree page 3 cell 28: invalid page number 511
On tree page 3 cell 29: invalid page number 533
On tree page 3 cell 30: invalid page number 554
On tree page 3 cell 31: invalid page number 576
On tree page 3 cell 32: invalid page number 598
On tree page 3 cell 33: invalid page number 620
On tree page 3 cell 34: invalid page number 642
On tree page 3 cell 35: invalid page number 666
On tree page 3 cell 36: invalid page number 688
On tree page 3 cell 37: invalid page number 709
On tree page 3 cell 38: invalid page number 730
On tree page 3 cell 39: invalid page number 752
On page 3 at right child: invalid page number 773
On tree page 419 cell 86: invalid page number 462
On tree page 419 cell 86: Child page depth differs
On tree page 419 cell 87: invalid page number 463

I said...h...so I do a .dump to see if I can 
sqlite> .output c:\\a.sql
sqlite> .dump
sqlite> .quit

resulting in:
05/27/2009  10:08 AM   496,051 a.sql
   1 File(s)496,051 bytes
   0 Dir(s)  104,974,647,296 bytes free


-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Machin
Sent: Saturday, April 25, 2009 8:15 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] corrupt database recovery

On 26/04/2009 5:47 AM, Gene wrote:
> Every now and again, we have a database that gets corrupt in the field
(bad
> coding on our end, not sqlite). 
> 

Hi Gene,

The obvious question: why not fix the bad code?

What does 'PRAGMA integrity_check;' say about these corrupt databases?

> When we get one of these corrupt databases, we recover what we can by get
> one row at a time by rowid, like 'select * from mytable where rowid = 1'
and
> we inc the row number every time.  Each row we successfully recover we
> insert into a clean database.

How do you know when to stop incrementing the row number?

Does 'select min(rowid), max(rowid) from mytable' give you believable 
answers?

What happens when you do 'select * from mytable' ?

Approximately how many rows are there?

How many 'select * from mytable where rowid = ' attempts 
fail, and for what reasons (previously deleted, some other result code(s))?

Are the failures restricted to a relatively small range of rowids?

> Works pretty well, except over time we've added more and more columns,
each
> one that has to be pulled and bound to get it into the new table.

What is causing this "pull and bind" problem, the fact that some columns 
  weren't present initially? or just the sheer number of columns i.e. 
you need to "pull and bind" all columns (not just the later additions)? 
In any case, please explain what you mean by "pulled" and "bound".

>  We tried
> to do an 'attach' so we could do a 'select into' the clean database
directly
> from the corrupt one.  But as soon as we attempt to 'attach' the corrupt
> database, we understandable get a 'database is malformed' error.

It's not quite so understandable why 'select * from mytable where rowid 
= 1' doesn't get an error.

> Is there an easier way to pull the good records out of a corrupt database
> and put them into a new one without binding each column by hand?

Can you give us an example of a row or two of (a) what you get from the 
'select * from mytable where rowid = ' (b) the insert 
statement that you need to do to insert that data into the clean 
database? Doesn't have to be real data -- e.g. assume 3 columns 
initially, now grown to 5.

What rules/procedure/recipe do you follow when producing (b) from (a) by 
hand?

Assuming that 'select * from mytable' doesn't work, and subject to 
understanding the pulling and binding by hand thing, I would have 
thought the solution would look something like this:

Write a script that loops around doing 'select * from mytable where 
rowid = ?' on a connection to your corrupt database and doing 'insert 
into mytable values(?,?,?,?,? etc etc etc)' on a connection to your 
clean database.

HTH,

Re: [sqlite] corrupt database recovery

2009-04-26 Thread Jim Dodgen
Cool solution.

On Sun, Apr 26, 2009 at 6:39 AM, Jim Wilcoxson  wrote:
> You could do a binary search to find the highest accessible rowid:
>
>    select rowid where rowid = 2147483647 (fails)
>    select rowid where rowid = 1073741824 (fails)
>    ...
>    select rowid where rowid = 65536 (fails)
>    select rowid where rowid = 32768 (works!)
>    select rowid where rowid = 49152 (works!)
>
> Within 32 selects, you will find the highest accessible rowid.  Then
> do an insert from select * where rowid <= XXX.
>
> Jim
>
> On 4/25/09, Gene  wrote:
>> You are exactly right John...that is indeed what the code looks
>> like...except we have over 25 columns (it's a flat table).
>>
>> We've already fixed the bad code, but there are some customers who have old
>> versions...it didn't break very often with the old code, but it does still
>> did.
>>
>> I haven't tried a select Min or max on the row id but a select count(*)
>> returns an error...that's how I know I need to do the row by row recovery
>> method.  Select * from mytable also returns an error.
>>
>> The tables usually have tens of thousands of rows, sometimes over a couple
>> hundred thousand but that's rare.
>>
>> What seems to work is that I do a select * from myTable where rowId = 'X'
>> incing X until I get an error.  After I get the error, every row higher then
>> X also returns an error. So as soon as I get an error, I stop trying to
>> recover more rows.
>
> --
> Software first.  Software lasts!
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
Jim Dodgen
j...@dodgen.us
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-04-26 Thread Jim Wilcoxson
You could do a binary search to find the highest accessible rowid:

select rowid where rowid = 2147483647 (fails)
select rowid where rowid = 1073741824 (fails)
...
select rowid where rowid = 65536 (fails)
select rowid where rowid = 32768 (works!)
select rowid where rowid = 49152 (works!)

Within 32 selects, you will find the highest accessible rowid.  Then
do an insert from select * where rowid <= XXX.

Jim

On 4/25/09, Gene  wrote:
> You are exactly right John...that is indeed what the code looks
> like...except we have over 25 columns (it's a flat table).
>
> We've already fixed the bad code, but there are some customers who have old
> versions...it didn't break very often with the old code, but it does still
> did.
>
> I haven't tried a select Min or max on the row id but a select count(*)
> returns an error...that's how I know I need to do the row by row recovery
> method.  Select * from mytable also returns an error.
>
> The tables usually have tens of thousands of rows, sometimes over a couple
> hundred thousand but that's rare.
>
> What seems to work is that I do a select * from myTable where rowId = 'X'
> incing X until I get an error.  After I get the error, every row higher then
> X also returns an error. So as soon as I get an error, I stop trying to
> recover more rows.

-- 
Software first.  Software lasts!
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-04-26 Thread Tito Ciuro
Hi Gene,

On Apr 25, 2009, at 3:47 PM, Gene wrote:

> Every now and again, we have a database that gets corrupt in the  
> field (bad
> coding on our end, not sqlite).

How do you corrupt a database with bad coding? Just curious...

-- Tito
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-04-25 Thread Gene Allen
It's not hard to update it...I'm just whining about that it's out of date
and I hate having to deal with matching everything up so I was hoping that I
could get past the Attach. 

I think your python script below might ease my maintenance whoas a bit.
I'll port it to C# and just build my insert on the fly.   

Thank you. 

Gene

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Machin
Sent: Saturday, April 25, 2009 9:30 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] corrupt database recovery

On 26/04/2009 11:28 AM, Gene wrote:
> You are exactly right John...that is indeed what the code looks
> like...except we have over 25 columns (it's a flat table).

I presume that you are referring to this:
"""
  Write a script that loops around doing 'select * from mytable where
  rowid = ?' on a connection to your corrupt database and doing 'insert
  into mytable values(?,?,?,?,? etc etc etc)' on a connection to your
  clean database.
"""

If that's correct, then surely the only maintenance you need to do to 
the above when an extra column is added to your table is to add an extra 
two characters ',?' to the insert statement ... you don't even have to 
do that e.g.

# Python
TABLE_NAME = "mytable"
NUMBER_OF_COLS = 25
question_marks = ",".join("?" * NUMBER_OF_COLS)
insert_sql = "insert into %s values (%s)" % (TABLE_NAME, question_marks)

AND the output from the select should be able to be pumped straight into 
the insert with no changes at all.

AND there might even be a pragma or suchlike that will enable you to 
easily find the number of columns on the fly in your script ...


> 
> We've already fixed the bad code, but there are some customers who have
old
> versions...it didn't break very often with the old code, but it does still
> did.
> 
> I haven't tried a select Min or max on the row id but a select count(*)
> returns an error...that's how I know I need to do the row by row recovery
> method.  Select * from mytable also returns an error.
> 
> The tables usually have tens of thousands of rows, sometimes over a couple
> hundred thousand but that's rare.
> 
> What seems to work is that I do a select * from myTable where rowId = 'X'
> incing X until I get an error.  After I get the error, every row higher
then
> X also returns an error. So as soon as I get an error, I stop trying to
> recover more rows.

Does this usually mean that you are able to recover almost all of the rows?

> The pull and bind code is just ugly and we don't update our 'recovery
> utility' as quickly as we make changes to the database so it tends to get
> out of data.  That's all.

I don't understand what is "the pull and bind" code and why you would 
need anything other that what I've outlined.

Cheers,
John

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-04-25 Thread John Machin
On 26/04/2009 11:28 AM, Gene wrote:
> You are exactly right John...that is indeed what the code looks
> like...except we have over 25 columns (it's a flat table).

I presume that you are referring to this:
"""
  Write a script that loops around doing 'select * from mytable where
  rowid = ?' on a connection to your corrupt database and doing 'insert
  into mytable values(?,?,?,?,? etc etc etc)' on a connection to your
  clean database.
"""

If that's correct, then surely the only maintenance you need to do to 
the above when an extra column is added to your table is to add an extra 
two characters ',?' to the insert statement ... you don't even have to 
do that e.g.

# Python
TABLE_NAME = "mytable"
NUMBER_OF_COLS = 25
question_marks = ",".join("?" * NUMBER_OF_COLS)
insert_sql = "insert into %s values (%s)" % (TABLE_NAME, question_marks)

AND the output from the select should be able to be pumped straight into 
the insert with no changes at all.

AND there might even be a pragma or suchlike that will enable you to 
easily find the number of columns on the fly in your script ...


> 
> We've already fixed the bad code, but there are some customers who have old
> versions...it didn't break very often with the old code, but it does still
> did.
> 
> I haven't tried a select Min or max on the row id but a select count(*)
> returns an error...that's how I know I need to do the row by row recovery
> method.  Select * from mytable also returns an error.
> 
> The tables usually have tens of thousands of rows, sometimes over a couple
> hundred thousand but that's rare.
> 
> What seems to work is that I do a select * from myTable where rowId = 'X'
> incing X until I get an error.  After I get the error, every row higher then
> X also returns an error. So as soon as I get an error, I stop trying to
> recover more rows.

Does this usually mean that you are able to recover almost all of the rows?

> The pull and bind code is just ugly and we don't update our 'recovery
> utility' as quickly as we make changes to the database so it tends to get
> out of data.  That's all.

I don't understand what is "the pull and bind" code and why you would 
need anything other that what I've outlined.

Cheers,
John

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-04-25 Thread Gene
You are exactly right John...that is indeed what the code looks
like...except we have over 25 columns (it's a flat table).

We've already fixed the bad code, but there are some customers who have old
versions...it didn't break very often with the old code, but it does still
did.

I haven't tried a select Min or max on the row id but a select count(*)
returns an error...that's how I know I need to do the row by row recovery
method.  Select * from mytable also returns an error.

The tables usually have tens of thousands of rows, sometimes over a couple
hundred thousand but that's rare.

What seems to work is that I do a select * from myTable where rowId = 'X'
incing X until I get an error.  After I get the error, every row higher then
X also returns an error. So as soon as I get an error, I stop trying to
recover more rows.

The pull and bind code is just ugly and we don't update our 'recovery
utility' as quickly as we make changes to the database so it tends to get
out of data.  That's all.

I haven't tried a PRAGMA integrity_check; in a long time so I can't remember
what it tells me.  I'll run it again.

Thanks for you comments John!

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Machin
Sent: Saturday, April 25, 2009 8:15 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] corrupt database recovery

On 26/04/2009 5:47 AM, Gene wrote:
> Every now and again, we have a database that gets corrupt in the field
(bad
> coding on our end, not sqlite). 
> 

Hi Gene,

The obvious question: why not fix the bad code?

What does 'PRAGMA integrity_check;' say about these corrupt databases?

> When we get one of these corrupt databases, we recover what we can by get
> one row at a time by rowid, like 'select * from mytable where rowid = 1'
and
> we inc the row number every time.  Each row we successfully recover we
> insert into a clean database.

How do you know when to stop incrementing the row number?

Does 'select min(rowid), max(rowid) from mytable' give you believable 
answers?

What happens when you do 'select * from mytable' ?

Approximately how many rows are there?

How many 'select * from mytable where rowid = ' attempts 
fail, and for what reasons (previously deleted, some other result code(s))?

Are the failures restricted to a relatively small range of rowids?

> Works pretty well, except over time we've added more and more columns,
each
> one that has to be pulled and bound to get it into the new table.

What is causing this "pull and bind" problem, the fact that some columns 
  weren't present initially? or just the sheer number of columns i.e. 
you need to "pull and bind" all columns (not just the later additions)? 
In any case, please explain what you mean by "pulled" and "bound".

>  We tried
> to do an 'attach' so we could do a 'select into' the clean database
directly
> from the corrupt one.  But as soon as we attempt to 'attach' the corrupt
> database, we understandable get a 'database is malformed' error.

It's not quite so understandable why 'select * from mytable where rowid 
= 1' doesn't get an error.

> Is there an easier way to pull the good records out of a corrupt database
> and put them into a new one without binding each column by hand?

Can you give us an example of a row or two of (a) what you get from the 
'select * from mytable where rowid = ' (b) the insert 
statement that you need to do to insert that data into the clean 
database? Doesn't have to be real data -- e.g. assume 3 columns 
initially, now grown to 5.

What rules/procedure/recipe do you follow when producing (b) from (a) by 
hand?

Assuming that 'select * from mytable' doesn't work, and subject to 
understanding the pulling and binding by hand thing, I would have 
thought the solution would look something like this:

Write a script that loops around doing 'select * from mytable where 
rowid = ?' on a connection to your corrupt database and doing 'insert 
into mytable values(?,?,?,?,? etc etc etc)' on a connection to your 
clean database.

HTH,
John

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] corrupt database recovery

2009-04-25 Thread John Machin
On 26/04/2009 5:47 AM, Gene wrote:
> Every now and again, we have a database that gets corrupt in the field (bad
> coding on our end, not sqlite). 
> 

Hi Gene,

The obvious question: why not fix the bad code?

What does 'PRAGMA integrity_check;' say about these corrupt databases?

> When we get one of these corrupt databases, we recover what we can by get
> one row at a time by rowid, like 'select * from mytable where rowid = 1' and
> we inc the row number every time.  Each row we successfully recover we
> insert into a clean database.

How do you know when to stop incrementing the row number?

Does 'select min(rowid), max(rowid) from mytable' give you believable 
answers?

What happens when you do 'select * from mytable' ?

Approximately how many rows are there?

How many 'select * from mytable where rowid = ' attempts 
fail, and for what reasons (previously deleted, some other result code(s))?

Are the failures restricted to a relatively small range of rowids?

> Works pretty well, except over time we've added more and more columns, each
> one that has to be pulled and bound to get it into the new table.

What is causing this "pull and bind" problem, the fact that some columns 
  weren't present initially? or just the sheer number of columns i.e. 
you need to "pull and bind" all columns (not just the later additions)? 
In any case, please explain what you mean by "pulled" and "bound".

>  We tried
> to do an 'attach' so we could do a 'select into' the clean database directly
> from the corrupt one.  But as soon as we attempt to 'attach' the corrupt
> database, we understandable get a 'database is malformed' error.

It's not quite so understandable why 'select * from mytable where rowid 
= 1' doesn't get an error.

> Is there an easier way to pull the good records out of a corrupt database
> and put them into a new one without binding each column by hand?

Can you give us an example of a row or two of (a) what you get from the 
'select * from mytable where rowid = ' (b) the insert 
statement that you need to do to insert that data into the clean 
database? Doesn't have to be real data -- e.g. assume 3 columns 
initially, now grown to 5.

What rules/procedure/recipe do you follow when producing (b) from (a) by 
hand?

Assuming that 'select * from mytable' doesn't work, and subject to 
understanding the pulling and binding by hand thing, I would have 
thought the solution would look something like this:

Write a script that loops around doing 'select * from mytable where 
rowid = ?' on a connection to your corrupt database and doing 'insert 
into mytable values(?,?,?,?,? etc etc etc)' on a connection to your 
clean database.

HTH,
John

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users