Re: [sqlite] Header corruption

2018-04-17 Thread Deon Brewis
Right, that's the bug we found. The statements are in the reverse order.  It's 
not quite that simple as the code below, but it ends up happening in that 
order. 

It's one of the most fundamental things a senior dev asks themselves on every 
call that that differentiates them from a junior dev - "What is the lifetime 
semantics of this thing I'm handing over / getting back?".

You don't even need the documentation for this one after looking at how OpenSSL 
gets the socket handle in the first place - the bug is obvious, I just wasn't 
aware this codepath even existed. (Big product, big team - you know how it 
goes).

- Deon

-Original Message-
From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On Behalf Of 
Olivier Mascia
Sent: Tuesday, April 17, 2018 2:28 PM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Header corruption

> Le 17 avr. 2018 à 22:07, Deon Brewis <de...@outlook.com> a écrit :
> 
> closesocket(_socket); // posix socket
> SSL_shutdown(_ssl); // openssl (_ssl was initialized using the _socket 
> above)

These two statements are inherently wrong, in this order. First you 
SSL_shutdown(), then you closesocket(). Not the reverse. And in Windows code it 
even is good citizenship to call shutdown() properly in between.  AFAIK it 
should also be on Linux, though I'm told it is not commonly seen.  But please 
check the respective OS'es SDK and OpenSSL documentation for the details.

Called in the right order there is no file descriptor re-use syndrome to fear, 
because the closesocket() rightfully comes as the very last step.

--
Best regards, Meilleures salutations, Met vriendelijke groeten, Olivier Mascia 
(from mobile device)


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-04-17 Thread Deon Brewis
Fair enough. Agreed, just less of a chance (and we haven't seen it) because 
Windows seems to provide more spacing between these values.

- Deon

-Original Message-
From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On Behalf Of 
Warren Young
Sent: Tuesday, April 17, 2018 1:54 PM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Header corruption

On Apr 17, 2018, at 2:07 PM, Deon Brewis <de...@outlook.com> wrote:
> 
> One just has to be in the mindset that on unix based platforms, a socket is a 
> file handle. (Not instinctive if you're coming from a Windows background).

You’re either using Winsock 1.1 or are dragging forward obsolete knowledge from 
the DOS-based versions of Windows:

   https://msdn.microsoft.com/library/windows/desktop/ms740522.aspx

Unless your code accidentally works on Windows because of differing semantics, 
there was a risk of seeing this bug bite on Windows, too.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-04-17 Thread Deon Brewis
Yes, but the socket values differ by the thousands in Windows from handle 
values returned by CreateFile. 

On MAC they don't differ at all - it immediately gets re-used.

- Deon

-Original Message-
From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On Behalf Of 
Peter Da Silva
Sent: Tuesday, April 17, 2018 1:13 PM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Header corruption

On 4/17/18, 3:08 PM, "sqlite-users on behalf of Deon Brewis" 
<sqlite-users-boun...@mailinglists.sqlite.org on behalf of de...@outlook.com> 
wrote:
> So this was a special case of re-using the File handle as per the corruption 
> guide. One just has to be in the mindset that on unix based platforms, a 
> socket is a file handle. (Not instinctive if you're coming from a Windows 
> background).

Even on Windows, wouldn't doing an ssl_shutdown on a socket you'd already 
closed still have a risk of unexpected behavior?
 


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-04-17 Thread Olivier Mascia
> Le 17 avr. 2018 à 22:07, Deon Brewis  a écrit :
> 
> closesocket(_socket); // posix socket
> SSL_shutdown(_ssl); // openssl (_ssl was initialized using the _socket above)

These two statements are inherently wrong, in this order. First you 
SSL_shutdown(), then you closesocket(). Not the reverse. And in Windows code it 
even is good citizenship to call shutdown() properly in between.  AFAIK it 
should also be on Linux, though I'm told it is not commonly seen.  But please 
check the respective OS'es SDK and OpenSSL documentation for the details.

Called in the right order there is no file descriptor re-use syndrome to fear, 
because the closesocket() rightfully comes as the very last step.

-- 
Best regards, Meilleures salutations, Met vriendelijke groeten,  
Olivier Mascia (from mobile device)


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-04-17 Thread Warren Young
On Apr 17, 2018, at 2:07 PM, Deon Brewis  wrote:
> 
> One just has to be in the mindset that on unix based platforms, a socket is a 
> file handle. (Not instinctive if you're coming from a Windows background).

You’re either using Winsock 1.1 or are dragging forward obsolete knowledge from 
the DOS-based versions of Windows:

   https://msdn.microsoft.com/library/windows/desktop/ms740522.aspx

Unless your code accidentally works on Windows because of differing semantics, 
there was a risk of seeing this bug bite on Windows, too.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-04-17 Thread Simon Slavin
On 17 Apr 2018, at 9:13pm, Peter Da Silva  wrote:

> Even on Windows, wouldn't doing an ssl_shutdown on a socket you'd already 
> closed still have a risk of unexpected behavior?

Under Windows an attempting to shutdown a connection which is already shutdown 
returns SOCKET_ERROR .  You can then make further calls to learn that the exact 
problem was an attempt to close something that isn't an open socket.

The major problem is analogous to one with SQLite: most software doesn't check 
the return value because if the return value is not 0 there's nothing simple 
that can be done about it.  It's simpler, and 99% of the time perfectly okay, 
to just quit as if no error had occurred.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-04-17 Thread Peter Da Silva
On 4/17/18, 3:08 PM, "sqlite-users on behalf of Deon Brewis" 
 
wrote:
> So this was a special case of re-using the File handle as per the corruption 
> guide. One just has to be in the mindset that on unix based platforms, a 
> socket is a file handle. (Not instinctive if you're coming from a Windows 
> background).

Even on Windows, wouldn't doing an ssl_shutdown on a socket you'd already 
closed still have a risk of unexpected behavior?
 


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-04-17 Thread Deon Brewis
After months, I managed to track this down. I'd like to extend a BIG thanks to 
Richard Hip and Dan Kennedy for their help in helping me instrument and 
understand the SQLITE internal data structures better, as well as giving me a 
way to programmatically do this, as well as to teach me about the showdb tool. 
It has been invaluable to help me understand and categorize the corruption 
which in turn helped me to know what common pattern to look out for.

Turns out, in some shutdown scenarios we would call what ends up being this:

closesocket(_socket); // posix socket
SSL_shutdown(_ssl); // openssl (_ssl was initialized using the _socket above)


If in between those two commands, we opened a new SQLITE connection on another 
thread, SQLITE may get the same file handle value as what the old socket value 
used to be.

SSL_shutdown then sends out a sequence resembling the following, to what used 
to be the socket:
150301002071476f3be1f3fa76f22b9addbe0f520ebbe007fcc1d6536c19ec9d69c5334799

However, since the old socket handle value is now being used as a file handle 
value, the sequence ends up in the database file instead.

So this was a special case of re-using the File handle as per the corruption 
guide. One just has to be in the mindset that on unix based platforms, a socket 
is a file handle. (Not instinctive if you're coming from a Windows background).

Thanks again guys!


I'll leave the following search terms here as well in case someone searches for 
this in the future:
1503010020
15030100
352518400 (decimal version)
OpenSSL

- Deon

-Original Message-
From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On Behalf Of 
Deon Brewis
Sent: Tuesday, February 6, 2018 8:57 AM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: [sqlite] Header corruption

I’m trying to track down SQLITE corruptions that seems to corrupt our databases 
in a consistent way. (Running on 3.20.1).

This isn’t related to a crash or hardware failure. The app is running and 
reading and writing to the database fine, and then suddenly we start getting a 
SQLITE_NOTADB errors. The first error we got was on trying to execute a ‘BEGIN 
TRANSACTION’, and everything thereafter started failing.

When our database headers are fine, I see this:
h: 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00 ; SQLite format 3.
0010h: 10 00 02 02 00 40 20 20 00 00 10 9B 00 01 34 42 ; .@  ...›..4B

On corruption #1, I see this:
h: 15 03 01 00 20 01 91 21 85 44 17 2C ED BE 21 FA ;  .‘!…D.,í¾!ú
0010h: 9E 74 E7 EA 9A 0A 15 2C 99 B0 B7 3B C6 F0 35 FB ; žtçêš..,™°·;Æð5û

On corruption #2, I see this:
h: 15 03 01 00 20 0F 1E CB B1 FF 9C 1E D0 D6 BB 22 ;  ..˱ÿœ.ÐÖ»"
0010h: 15 64 D3 F4 DD 38 FB DF A3 E0 47 B8 D7 F6 21 BC ; .dÓôÝ8ûߣàG¸×ö!¼


This bit pattern of “15 03 01 00” for the first 4 characters where it should be 
“SQLi” is a bit too much of a coincidence to ignore or write off as disk error.

The rest of the header and file after the first 25-ish bytes or so, seem fine. 
It’s just these first 25  bytes that gets corrupted. If I restore Offsets 0 to 
Offsets 27 out of a “working” database, I get a usable file.

Under what circumstances would SQLITE even attempt to re-write the header 
string?

- Deon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=02%7C01%7C%7Cfdf8fd84436048062e6c08d56d82ae47%7C84df9e7fe9f640afb435%7C1%7C0%7C636535330385784749=qjk7S7H6FW%2FaZIBD2XKPULL4v1wH7p2UN4GaKkx1O7I%3D=0
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-02-08 Thread Jens Alfke


> On Feb 8, 2018, at 12:43 AM, Eduardo  wrote:
> 
> Profiling shouldn't help, it isn't the rigth tool. Use Xcode analyzer or from
> command line scan-build and scan-view. If you can, use the last version of
> clang-analyzer.

The Clang address sanitizer would be the best tool for diagnosing memory 
corruption. It's a lifesaver.

In Xcode it's very easy to enable: go to the scheme editor, select Run (or 
Test) from the list on the left, click the Diagnostics tab, and click the 
"Address Sanitizer" checkbox. I also recommend checking "Detect use of stack 
after return" and "Malloc Scribble". Then press the Run (or Test) button. I 
tend to leave this on all the time while developing; it slows down the program, 
but not enough to get in the way.

It can also be used from the command-line, but I have no knowledge of how to do 
that. (I do know it requires recompiling with a special compiler flag, since it 
instruments the machine code.)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-02-08 Thread Eduardo
On Wed, 7 Feb 2018 17:29:54 +
Deon Brewis  escribió:

> Oh yeah, I don’t think this is a SQLITE bug or anything. 
> 
> I think something in our code is writing to memory after freed. I'm just 
> trying to track it down at the point that it happens. We've tried all 
> Profiling tools on both OSX and Windows without luck, so my next step is 
> trying to find the writing thread at the point of corruption.
> 

Profiling shouldn't help, it isn't the rigth tool. Use Xcode analyzer or from
command line scan-build and scan-view. If you can, use the last version of
clang-analyzer.

In your project directory type:

%mkdir review
%scan-build make -o review/ -V

in directory review is the scan-build html output. Use:
 
%scan-view review/ to see it again


> Dan Kennedy's suggestion seems like that would we that way to do that.
> 
> - Deon

-- 
Eduardo 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-02-07 Thread Deon Brewis
Oh yeah, I don’t think this is a SQLITE bug or anything. 

I think something in our code is writing to memory after freed. I'm just trying 
to track it down at the point that it happens. We've tried all Profiling tools 
on both OSX and Windows without luck, so my next step is trying to find the 
writing thread at the point of corruption.

Dan Kennedy's suggestion seems like that would we that way to do that.

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Simon Slavin
Sent: Wednesday, February 7, 2018 8:32 AM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Header corruption

On 7 Feb 2018, at 3:16pm, Deon Brewis <de...@outlook.com> wrote:

> So this looks more like something is overwriting the memory of Page1 before 
> SQLITE writes it back to disk.

That is almost always what people eventually admit to after reporting a problem 
like this.  Some part of their code is stomping on memory or a file handle 
which SQLite thought it had exclusive rights to.  Some of them discover it 
using a runtime profiler tool which looks for use of released memory or 
double-release of file handles, but I don't know enough about Windows to 
suggest anything.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=02%7C01%7C%7C883405b8374547bace6508d56e486bb8%7C84df9e7fe9f640afb435%7C1%7C0%7C636536179674888635=ZZtP6DzeVTBCnnWqxu0hWRMVLS6NWTFLrBI0%2Buu6G4I%3D=0
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-02-07 Thread Simon Slavin
On 7 Feb 2018, at 3:16pm, Deon Brewis  wrote:

> So this looks more like something is overwriting the memory of Page1 before 
> SQLITE writes it back to disk.

That is almost always what people eventually admit to after reporting a problem 
like this.  Some part of their code is stomping on memory or a file handle 
which SQLite thought it had exclusive rights to.  Some of them discover it 
using a runtime profiler tool which looks for use of released memory or 
double-release of file handles, but I don't know enough about Windows to 
suggest anything.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-02-07 Thread Dan Kennedy

On 02/07/2018 10:16 PM, Deon Brewis wrote:

That's an interesting thought, but we see this on Windows as well, and on 
Windows these files are opened with SQLITE_OPEN_EXCLUSIVE.

So this looks more like something is overwriting the memory of Page1 before 
SQLITE writes it back to disk.

I don't think SQLITE verifies the memory during every write. There's a:
if( memcmp(page1, zMagicHeader, 16)!=0 ){

during lockBtree() but that seems to be on read, not write.


Is there somewhere I can add a check of the header buffer before the write to 
disk?

Or does SQLITE literally never write this part of the header again once written 
the first time? (Seek past the zMagicHeader?).


If it does rewrite it, it writes all of the first database page at the 
same time.


You're using wal mode, so whenever page 1 (the page containing the db 
header) is written, it is appended to the wal file. Then, later on, a 
checkpoint copies it from the wal file into the db file.


Writing the page to the wal file happens here:

  http://www.sqlite.org/src/artifact/5a3f464edd645?ln=3404

You could add (say):

  assert( p->pgno!=1 || 0==memcmp(pData, "SQLite", 6) );

Then copying from the wal file into the db file is here:

  http://www.sqlite.org/src/artifact/5a3f464edd645?ln=1861

Add (again, say):

  assert( iDbpage!=1 || 0==memcmp(zBuf, "SQLite", 6) );

Dan.






- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Dan Kennedy
Sent: Tuesday, February 6, 2018 9:08 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Header corruption

On 02/06/2018 11:57 PM, Deon Brewis wrote:

I’m trying to track down SQLITE corruptions that seems to corrupt our databases 
in a consistent way. (Running on 3.20.1).

This isn’t related to a crash or hardware failure. The app is running and 
reading and writing to the database fine, and then suddenly we start getting a 
SQLITE_NOTADB errors. The first error we got was on trying to execute a ‘BEGIN 
TRANSACTION’, and everything thereafter started failing.

When our database headers are fine, I see this:
h: 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00 ; SQLite format 3.
0010h: 10 00 02 02 00 40 20 20 00 00 10 9B 00 01 34 42 ; .@
...›..4B

On corruption #1, I see this:
h: 15 03 01 00 20 01 91 21 85 44 17 2C ED BE 21 FA ; 
.‘!…D.,í¾!ú
0010h: 9E 74 E7 EA 9A 0A 15 2C 99 B0 B7 3B C6 F0 35 FB ;
žtçêš..,™°·;Æð5û

On corruption #2, I see this:
h: 15 03 01 00 20 0F 1E CB B1 FF 9C 1E D0 D6 BB 22 ;  ..˱ÿœ.ÐÖ»"
0010h: 15 64 D3 F4 DD 38 FB DF A3 E0 47 B8 D7 F6 21 BC ;
.dÓôÝ8ûߣàG¸×ö!¼


This bit pattern of “15 03 01 00” for the first 4 characters where it should be 
“SQLi” is a bit too much of a coincidence to ignore or write off as disk error.

The rest of the header and file after the first 25-ish bytes or so, seem fine. 
It’s just these first 25  bytes that gets corrupted. If I restore Offsets 0 to 
Offsets 27 out of a “working” database, I get a usable file.

Under what circumstances would SQLITE even attempt to re-write the header 
string?

Is this on UNIX?

Quite often this sort of thing occurs because some other module is writing to a 
file-descriptor owned by SQLite. Usually because it closed its own fd, then 
SQLite opened the db file and was assigned the same integer fd value, then the 
rogue module wrote to the fd anyway. In other words, some other module is doing:

close(fd);
write(fd, "1503010020...", 25);

and between those two calls SQLite is calling open() and is being assigned a 
file-descriptor with the same integer value as fd.

Dan.






- Deon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmaili
nglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=
02%7C01%7C%7C12190af418bc4a32192b08d56d8432f3%7C84df9e7fe9f640afb435aa
aa%7C1%7C0%7C636535336909688339=OMCv%2BtfSJ4PvlFTFw8Cqcc
1eizCrnIgFnNSJ7x5XUoo%3D=0


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=02%7C01%7C%7C12190af418bc4a32192b08d56d8432f3%7C84df9e7fe9f640afb435%7C1%7C0%7C636535336909688339=OMCv%2BtfSJ4PvlFTFw8Cqcc1eizCrnIgFnNSJ7x5XUoo%3D=0
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-02-07 Thread Richard Hipp
On 2/7/18, Deon Brewis  wrote:
>
> I don't think SQLITE verifies the memory during every write. There's a:
> if( memcmp(page1, zMagicHeader, 16)!=0 ){
>
> during lockBtree() but that seems to be on read, not write.
>
>
> Is there somewhere I can add a check of the header buffer before the write
> to disk?

There does not appear to be a single place to put this code as the
header can be written from several places.

Search for instances of "sqlite3OsWrite(pPager->fd, ...)".  You want
to be sure that the last field is "fd" and not "jfd" or "sjfd".  I
think there are four separate places you will need to insert your
patch.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-02-07 Thread Deon Brewis
That's an interesting thought, but we see this on Windows as well, and on 
Windows these files are opened with SQLITE_OPEN_EXCLUSIVE. 

So this looks more like something is overwriting the memory of Page1 before 
SQLITE writes it back to disk.

I don't think SQLITE verifies the memory during every write. There's a:
if( memcmp(page1, zMagicHeader, 16)!=0 ){

during lockBtree() but that seems to be on read, not write. 


Is there somewhere I can add a check of the header buffer before the write to 
disk?

Or does SQLITE literally never write this part of the header again once written 
the first time? (Seek past the zMagicHeader?).

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Dan Kennedy
Sent: Tuesday, February 6, 2018 9:08 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Header corruption

On 02/06/2018 11:57 PM, Deon Brewis wrote:
> I’m trying to track down SQLITE corruptions that seems to corrupt our 
> databases in a consistent way. (Running on 3.20.1).
>
> This isn’t related to a crash or hardware failure. The app is running and 
> reading and writing to the database fine, and then suddenly we start getting 
> a SQLITE_NOTADB errors. The first error we got was on trying to execute a 
> ‘BEGIN TRANSACTION’, and everything thereafter started failing.
>
> When our database headers are fine, I see this:
> h: 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00 ; SQLite format 3.
> 0010h: 10 00 02 02 00 40 20 20 00 00 10 9B 00 01 34 42 ; .@  
> ...›..4B
>
> On corruption #1, I see this:
> h: 15 03 01 00 20 01 91 21 85 44 17 2C ED BE 21 FA ;  
> .‘!…D.,í¾!ú
> 0010h: 9E 74 E7 EA 9A 0A 15 2C 99 B0 B7 3B C6 F0 35 FB ; 
> žtçêš..,™°·;Æð5û
>
> On corruption #2, I see this:
> h: 15 03 01 00 20 0F 1E CB B1 FF 9C 1E D0 D6 BB 22 ;  ..˱ÿœ.ÐÖ»"
> 0010h: 15 64 D3 F4 DD 38 FB DF A3 E0 47 B8 D7 F6 21 BC ; 
> .dÓôÝ8ûߣàG¸×ö!¼
>
>
> This bit pattern of “15 03 01 00” for the first 4 characters where it should 
> be “SQLi” is a bit too much of a coincidence to ignore or write off as disk 
> error.
>
> The rest of the header and file after the first 25-ish bytes or so, seem 
> fine. It’s just these first 25  bytes that gets corrupted. If I restore 
> Offsets 0 to Offsets 27 out of a “working” database, I get a usable file.
>
> Under what circumstances would SQLITE even attempt to re-write the header 
> string?

Is this on UNIX?

Quite often this sort of thing occurs because some other module is writing to a 
file-descriptor owned by SQLite. Usually because it closed its own fd, then 
SQLite opened the db file and was assigned the same integer fd value, then the 
rogue module wrote to the fd anyway. In other words, some other module is doing:

   close(fd);
   write(fd, "1503010020...", 25);

and between those two calls SQLite is calling open() and is being assigned a 
file-descriptor with the same integer value as fd.

Dan.





>
> - Deon
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmaili
> nglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=
> 02%7C01%7C%7C12190af418bc4a32192b08d56d8432f3%7C84df9e7fe9f640afb435aa
> aa%7C1%7C0%7C636535336909688339=OMCv%2BtfSJ4PvlFTFw8Cqcc
> 1eizCrnIgFnNSJ7x5XUoo%3D=0


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=02%7C01%7C%7C12190af418bc4a32192b08d56d8432f3%7C84df9e7fe9f640afb435%7C1%7C0%7C636535336909688339=OMCv%2BtfSJ4PvlFTFw8Cqcc1eizCrnIgFnNSJ7x5XUoo%3D=0
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-02-06 Thread Dan Kennedy

On 02/06/2018 11:57 PM, Deon Brewis wrote:

I’m trying to track down SQLITE corruptions that seems to corrupt our databases 
in a consistent way. (Running on 3.20.1).

This isn’t related to a crash or hardware failure. The app is running and 
reading and writing to the database fine, and then suddenly we start getting a 
SQLITE_NOTADB errors. The first error we got was on trying to execute a ‘BEGIN 
TRANSACTION’, and everything thereafter started failing.

When our database headers are fine, I see this:
h: 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00 ; SQLite format 3.
0010h: 10 00 02 02 00 40 20 20 00 00 10 9B 00 01 34 42 ; .@  ...›..4B

On corruption #1, I see this:
h: 15 03 01 00 20 01 91 21 85 44 17 2C ED BE 21 FA ;  .‘!…D.,í¾!ú
0010h: 9E 74 E7 EA 9A 0A 15 2C 99 B0 B7 3B C6 F0 35 FB ; žtçêš..,™°·;Æð5û

On corruption #2, I see this:
h: 15 03 01 00 20 0F 1E CB B1 FF 9C 1E D0 D6 BB 22 ;  ..˱ÿœ.ÐÖ»"
0010h: 15 64 D3 F4 DD 38 FB DF A3 E0 47 B8 D7 F6 21 BC ; .dÓôÝ8ûߣàG¸×ö!¼


This bit pattern of “15 03 01 00” for the first 4 characters where it should be 
“SQLi” is a bit too much of a coincidence to ignore or write off as disk error.

The rest of the header and file after the first 25-ish bytes or so, seem fine. 
It’s just these first 25  bytes that gets corrupted. If I restore Offsets 0 to 
Offsets 27 out of a “working” database, I get a usable file.

Under what circumstances would SQLITE even attempt to re-write the header 
string?


Is this on UNIX?

Quite often this sort of thing occurs because some other module is 
writing to a file-descriptor owned by SQLite. Usually because it closed 
its own fd, then SQLite opened the db file and was assigned the same 
integer fd value, then the rogue module wrote to the fd anyway. In other 
words, some other module is doing:


  close(fd);
  write(fd, "1503010020...", 25);

and between those two calls SQLite is calling open() and is being 
assigned a file-descriptor with the same integer value as fd.


Dan.







- Deon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Header corruption

2018-02-06 Thread David Raymond
Things stored in the first 25 bytes include page size, WAL status, and the file 
change counter. So at least part of the header there gets changed with every 
committed write transaction.

http://www.sqlite.org/fileformat2.html


-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Deon Brewis
Sent: Tuesday, February 06, 2018 11:57 AM
To: SQLite mailing list
Subject: [sqlite] Header corruption

I’m trying to track down SQLITE corruptions that seems to corrupt our databases 
in a consistent way. (Running on 3.20.1).

This isn’t related to a crash or hardware failure. The app is running and 
reading and writing to the database fine, and then suddenly we start getting a 
SQLITE_NOTADB errors. The first error we got was on trying to execute a ‘BEGIN 
TRANSACTION’, and everything thereafter started failing.

When our database headers are fine, I see this:
h: 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00 ; SQLite format 3.
0010h: 10 00 02 02 00 40 20 20 00 00 10 9B 00 01 34 42 ; .@  ...›..4B

On corruption #1, I see this:
h: 15 03 01 00 20 01 91 21 85 44 17 2C ED BE 21 FA ;  .‘!…D.,í¾!ú
0010h: 9E 74 E7 EA 9A 0A 15 2C 99 B0 B7 3B C6 F0 35 FB ; žtçêš..,™°·;Æð5û

On corruption #2, I see this:
h: 15 03 01 00 20 0F 1E CB B1 FF 9C 1E D0 D6 BB 22 ;  ..˱ÿœ.ÐÖ»"
0010h: 15 64 D3 F4 DD 38 FB DF A3 E0 47 B8 D7 F6 21 BC ; .dÓôÝ8ûߣàG¸×ö!¼


This bit pattern of “15 03 01 00” for the first 4 characters where it should be 
“SQLi” is a bit too much of a coincidence to ignore or write off as disk error.

The rest of the header and file after the first 25-ish bytes or so, seem fine. 
It’s just these first 25  bytes that gets corrupted. If I restore Offsets 0 to 
Offsets 27 out of a “working” database, I get a usable file.

Under what circumstances would SQLITE even attempt to re-write the header 
string?

- Deon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Header corruption

2018-02-06 Thread Deon Brewis
I’m trying to track down SQLITE corruptions that seems to corrupt our databases 
in a consistent way. (Running on 3.20.1).

This isn’t related to a crash or hardware failure. The app is running and 
reading and writing to the database fine, and then suddenly we start getting a 
SQLITE_NOTADB errors. The first error we got was on trying to execute a ‘BEGIN 
TRANSACTION’, and everything thereafter started failing.

When our database headers are fine, I see this:
h: 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00 ; SQLite format 3.
0010h: 10 00 02 02 00 40 20 20 00 00 10 9B 00 01 34 42 ; .@  ...›..4B

On corruption #1, I see this:
h: 15 03 01 00 20 01 91 21 85 44 17 2C ED BE 21 FA ;  .‘!…D.,í¾!ú
0010h: 9E 74 E7 EA 9A 0A 15 2C 99 B0 B7 3B C6 F0 35 FB ; žtçêš..,™°·;Æð5û

On corruption #2, I see this:
h: 15 03 01 00 20 0F 1E CB B1 FF 9C 1E D0 D6 BB 22 ;  ..˱ÿœ.ÐÖ»"
0010h: 15 64 D3 F4 DD 38 FB DF A3 E0 47 B8 D7 F6 21 BC ; .dÓôÝ8ûߣàG¸×ö!¼


This bit pattern of “15 03 01 00” for the first 4 characters where it should be 
“SQLi” is a bit too much of a coincidence to ignore or write off as disk error.

The rest of the header and file after the first 25-ish bytes or so, seem fine. 
It’s just these first 25  bytes that gets corrupted. If I restore Offsets 0 to 
Offsets 27 out of a “working” database, I get a usable file.

Under what circumstances would SQLITE even attempt to re-write the header 
string?

- Deon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users