Re: [sqlite] Header corruption
Right, that's the bug we found. The statements are in the reverse order. It's not quite that simple as the code below, but it ends up happening in that order. It's one of the most fundamental things a senior dev asks themselves on every call that that differentiates them from a junior dev - "What is the lifetime semantics of this thing I'm handing over / getting back?". You don't even need the documentation for this one after looking at how OpenSSL gets the socket handle in the first place - the bug is obvious, I just wasn't aware this codepath even existed. (Big product, big team - you know how it goes). - Deon -Original Message- From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On Behalf Of Olivier Mascia Sent: Tuesday, April 17, 2018 2:28 PM To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org> Subject: Re: [sqlite] Header corruption > Le 17 avr. 2018 à 22:07, Deon Brewis <de...@outlook.com> a écrit : > > closesocket(_socket); // posix socket > SSL_shutdown(_ssl); // openssl (_ssl was initialized using the _socket > above) These two statements are inherently wrong, in this order. First you SSL_shutdown(), then you closesocket(). Not the reverse. And in Windows code it even is good citizenship to call shutdown() properly in between. AFAIK it should also be on Linux, though I'm told it is not commonly seen. But please check the respective OS'es SDK and OpenSSL documentation for the details. Called in the right order there is no file descriptor re-use syndrome to fear, because the closesocket() rightfully comes as the very last step. -- Best regards, Meilleures salutations, Met vriendelijke groeten, Olivier Mascia (from mobile device) ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
Fair enough. Agreed, just less of a chance (and we haven't seen it) because Windows seems to provide more spacing between these values. - Deon -Original Message- From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On Behalf Of Warren Young Sent: Tuesday, April 17, 2018 1:54 PM To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org> Subject: Re: [sqlite] Header corruption On Apr 17, 2018, at 2:07 PM, Deon Brewis <de...@outlook.com> wrote: > > One just has to be in the mindset that on unix based platforms, a socket is a > file handle. (Not instinctive if you're coming from a Windows background). You’re either using Winsock 1.1 or are dragging forward obsolete knowledge from the DOS-based versions of Windows: https://msdn.microsoft.com/library/windows/desktop/ms740522.aspx Unless your code accidentally works on Windows because of differing semantics, there was a risk of seeing this bug bite on Windows, too. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
Yes, but the socket values differ by the thousands in Windows from handle values returned by CreateFile. On MAC they don't differ at all - it immediately gets re-used. - Deon -Original Message- From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On Behalf Of Peter Da Silva Sent: Tuesday, April 17, 2018 1:13 PM To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org> Subject: Re: [sqlite] Header corruption On 4/17/18, 3:08 PM, "sqlite-users on behalf of Deon Brewis" <sqlite-users-boun...@mailinglists.sqlite.org on behalf of de...@outlook.com> wrote: > So this was a special case of re-using the File handle as per the corruption > guide. One just has to be in the mindset that on unix based platforms, a > socket is a file handle. (Not instinctive if you're coming from a Windows > background). Even on Windows, wouldn't doing an ssl_shutdown on a socket you'd already closed still have a risk of unexpected behavior? ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
> Le 17 avr. 2018 à 22:07, Deon Brewisa écrit : > > closesocket(_socket); // posix socket > SSL_shutdown(_ssl); // openssl (_ssl was initialized using the _socket above) These two statements are inherently wrong, in this order. First you SSL_shutdown(), then you closesocket(). Not the reverse. And in Windows code it even is good citizenship to call shutdown() properly in between. AFAIK it should also be on Linux, though I'm told it is not commonly seen. But please check the respective OS'es SDK and OpenSSL documentation for the details. Called in the right order there is no file descriptor re-use syndrome to fear, because the closesocket() rightfully comes as the very last step. -- Best regards, Meilleures salutations, Met vriendelijke groeten, Olivier Mascia (from mobile device) ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
On Apr 17, 2018, at 2:07 PM, Deon Brewiswrote: > > One just has to be in the mindset that on unix based platforms, a socket is a > file handle. (Not instinctive if you're coming from a Windows background). You’re either using Winsock 1.1 or are dragging forward obsolete knowledge from the DOS-based versions of Windows: https://msdn.microsoft.com/library/windows/desktop/ms740522.aspx Unless your code accidentally works on Windows because of differing semantics, there was a risk of seeing this bug bite on Windows, too. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
On 17 Apr 2018, at 9:13pm, Peter Da Silvawrote: > Even on Windows, wouldn't doing an ssl_shutdown on a socket you'd already > closed still have a risk of unexpected behavior? Under Windows an attempting to shutdown a connection which is already shutdown returns SOCKET_ERROR . You can then make further calls to learn that the exact problem was an attempt to close something that isn't an open socket. The major problem is analogous to one with SQLite: most software doesn't check the return value because if the return value is not 0 there's nothing simple that can be done about it. It's simpler, and 99% of the time perfectly okay, to just quit as if no error had occurred. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
On 4/17/18, 3:08 PM, "sqlite-users on behalf of Deon Brewis"wrote: > So this was a special case of re-using the File handle as per the corruption > guide. One just has to be in the mindset that on unix based platforms, a > socket is a file handle. (Not instinctive if you're coming from a Windows > background). Even on Windows, wouldn't doing an ssl_shutdown on a socket you'd already closed still have a risk of unexpected behavior? ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
After months, I managed to track this down. I'd like to extend a BIG thanks to Richard Hip and Dan Kennedy for their help in helping me instrument and understand the SQLITE internal data structures better, as well as giving me a way to programmatically do this, as well as to teach me about the showdb tool. It has been invaluable to help me understand and categorize the corruption which in turn helped me to know what common pattern to look out for. Turns out, in some shutdown scenarios we would call what ends up being this: closesocket(_socket); // posix socket SSL_shutdown(_ssl); // openssl (_ssl was initialized using the _socket above) If in between those two commands, we opened a new SQLITE connection on another thread, SQLITE may get the same file handle value as what the old socket value used to be. SSL_shutdown then sends out a sequence resembling the following, to what used to be the socket: 150301002071476f3be1f3fa76f22b9addbe0f520ebbe007fcc1d6536c19ec9d69c5334799 However, since the old socket handle value is now being used as a file handle value, the sequence ends up in the database file instead. So this was a special case of re-using the File handle as per the corruption guide. One just has to be in the mindset that on unix based platforms, a socket is a file handle. (Not instinctive if you're coming from a Windows background). Thanks again guys! I'll leave the following search terms here as well in case someone searches for this in the future: 1503010020 15030100 352518400 (decimal version) OpenSSL - Deon -Original Message- From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On Behalf Of Deon Brewis Sent: Tuesday, February 6, 2018 8:57 AM To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org> Subject: [sqlite] Header corruption I’m trying to track down SQLITE corruptions that seems to corrupt our databases in a consistent way. (Running on 3.20.1). This isn’t related to a crash or hardware failure. The app is running and reading and writing to the database fine, and then suddenly we start getting a SQLITE_NOTADB errors. The first error we got was on trying to execute a ‘BEGIN TRANSACTION’, and everything thereafter started failing. When our database headers are fine, I see this: h: 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00 ; SQLite format 3. 0010h: 10 00 02 02 00 40 20 20 00 00 10 9B 00 01 34 42 ; .@ ...›..4B On corruption #1, I see this: h: 15 03 01 00 20 01 91 21 85 44 17 2C ED BE 21 FA ; .‘!…D.,í¾!ú 0010h: 9E 74 E7 EA 9A 0A 15 2C 99 B0 B7 3B C6 F0 35 FB ; žtçêš..,™°·;Æð5û On corruption #2, I see this: h: 15 03 01 00 20 0F 1E CB B1 FF 9C 1E D0 D6 BB 22 ; ..˱ÿœ.ÐÖ»" 0010h: 15 64 D3 F4 DD 38 FB DF A3 E0 47 B8 D7 F6 21 BC ; .dÓôÝ8ûߣàG¸×ö!¼ This bit pattern of “15 03 01 00” for the first 4 characters where it should be “SQLi” is a bit too much of a coincidence to ignore or write off as disk error. The rest of the header and file after the first 25-ish bytes or so, seem fine. It’s just these first 25 bytes that gets corrupted. If I restore Offsets 0 to Offsets 27 out of a “working” database, I get a usable file. Under what circumstances would SQLITE even attempt to re-write the header string? - Deon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=02%7C01%7C%7Cfdf8fd84436048062e6c08d56d82ae47%7C84df9e7fe9f640afb435%7C1%7C0%7C636535330385784749=qjk7S7H6FW%2FaZIBD2XKPULL4v1wH7p2UN4GaKkx1O7I%3D=0 ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
> On Feb 8, 2018, at 12:43 AM, Eduardowrote: > > Profiling shouldn't help, it isn't the rigth tool. Use Xcode analyzer or from > command line scan-build and scan-view. If you can, use the last version of > clang-analyzer. The Clang address sanitizer would be the best tool for diagnosing memory corruption. It's a lifesaver. In Xcode it's very easy to enable: go to the scheme editor, select Run (or Test) from the list on the left, click the Diagnostics tab, and click the "Address Sanitizer" checkbox. I also recommend checking "Detect use of stack after return" and "Malloc Scribble". Then press the Run (or Test) button. I tend to leave this on all the time while developing; it slows down the program, but not enough to get in the way. It can also be used from the command-line, but I have no knowledge of how to do that. (I do know it requires recompiling with a special compiler flag, since it instruments the machine code.) —Jens ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
On Wed, 7 Feb 2018 17:29:54 + Deon Brewisescribió: > Oh yeah, I don’t think this is a SQLITE bug or anything. > > I think something in our code is writing to memory after freed. I'm just > trying to track it down at the point that it happens. We've tried all > Profiling tools on both OSX and Windows without luck, so my next step is > trying to find the writing thread at the point of corruption. > Profiling shouldn't help, it isn't the rigth tool. Use Xcode analyzer or from command line scan-build and scan-view. If you can, use the last version of clang-analyzer. In your project directory type: %mkdir review %scan-build make -o review/ -V in directory review is the scan-build html output. Use: %scan-view review/ to see it again > Dan Kennedy's suggestion seems like that would we that way to do that. > > - Deon -- Eduardo ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
Oh yeah, I don’t think this is a SQLITE bug or anything. I think something in our code is writing to memory after freed. I'm just trying to track it down at the point that it happens. We've tried all Profiling tools on both OSX and Windows without luck, so my next step is trying to find the writing thread at the point of corruption. Dan Kennedy's suggestion seems like that would we that way to do that. - Deon -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Simon Slavin Sent: Wednesday, February 7, 2018 8:32 AM To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org> Subject: Re: [sqlite] Header corruption On 7 Feb 2018, at 3:16pm, Deon Brewis <de...@outlook.com> wrote: > So this looks more like something is overwriting the memory of Page1 before > SQLITE writes it back to disk. That is almost always what people eventually admit to after reporting a problem like this. Some part of their code is stomping on memory or a file handle which SQLite thought it had exclusive rights to. Some of them discover it using a runtime profiler tool which looks for use of released memory or double-release of file handles, but I don't know enough about Windows to suggest anything. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=02%7C01%7C%7C883405b8374547bace6508d56e486bb8%7C84df9e7fe9f640afb435%7C1%7C0%7C636536179674888635=ZZtP6DzeVTBCnnWqxu0hWRMVLS6NWTFLrBI0%2Buu6G4I%3D=0 ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
On 7 Feb 2018, at 3:16pm, Deon Brewiswrote: > So this looks more like something is overwriting the memory of Page1 before > SQLITE writes it back to disk. That is almost always what people eventually admit to after reporting a problem like this. Some part of their code is stomping on memory or a file handle which SQLite thought it had exclusive rights to. Some of them discover it using a runtime profiler tool which looks for use of released memory or double-release of file handles, but I don't know enough about Windows to suggest anything. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
On 02/07/2018 10:16 PM, Deon Brewis wrote: That's an interesting thought, but we see this on Windows as well, and on Windows these files are opened with SQLITE_OPEN_EXCLUSIVE. So this looks more like something is overwriting the memory of Page1 before SQLITE writes it back to disk. I don't think SQLITE verifies the memory during every write. There's a: if( memcmp(page1, zMagicHeader, 16)!=0 ){ during lockBtree() but that seems to be on read, not write. Is there somewhere I can add a check of the header buffer before the write to disk? Or does SQLITE literally never write this part of the header again once written the first time? (Seek past the zMagicHeader?). If it does rewrite it, it writes all of the first database page at the same time. You're using wal mode, so whenever page 1 (the page containing the db header) is written, it is appended to the wal file. Then, later on, a checkpoint copies it from the wal file into the db file. Writing the page to the wal file happens here: http://www.sqlite.org/src/artifact/5a3f464edd645?ln=3404 You could add (say): assert( p->pgno!=1 || 0==memcmp(pData, "SQLite", 6) ); Then copying from the wal file into the db file is here: http://www.sqlite.org/src/artifact/5a3f464edd645?ln=1861 Add (again, say): assert( iDbpage!=1 || 0==memcmp(zBuf, "SQLite", 6) ); Dan. - Deon -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Dan Kennedy Sent: Tuesday, February 6, 2018 9:08 AM To: sqlite-users@mailinglists.sqlite.org Subject: Re: [sqlite] Header corruption On 02/06/2018 11:57 PM, Deon Brewis wrote: I’m trying to track down SQLITE corruptions that seems to corrupt our databases in a consistent way. (Running on 3.20.1). This isn’t related to a crash or hardware failure. The app is running and reading and writing to the database fine, and then suddenly we start getting a SQLITE_NOTADB errors. The first error we got was on trying to execute a ‘BEGIN TRANSACTION’, and everything thereafter started failing. When our database headers are fine, I see this: h: 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00 ; SQLite format 3. 0010h: 10 00 02 02 00 40 20 20 00 00 10 9B 00 01 34 42 ; .@ ...›..4B On corruption #1, I see this: h: 15 03 01 00 20 01 91 21 85 44 17 2C ED BE 21 FA ; .‘!…D.,í¾!ú 0010h: 9E 74 E7 EA 9A 0A 15 2C 99 B0 B7 3B C6 F0 35 FB ; žtçêš..,™°·;Æð5û On corruption #2, I see this: h: 15 03 01 00 20 0F 1E CB B1 FF 9C 1E D0 D6 BB 22 ; ..˱ÿœ.ÐÖ»" 0010h: 15 64 D3 F4 DD 38 FB DF A3 E0 47 B8 D7 F6 21 BC ; .dÓôÝ8ûߣàG¸×ö!¼ This bit pattern of “15 03 01 00” for the first 4 characters where it should be “SQLi” is a bit too much of a coincidence to ignore or write off as disk error. The rest of the header and file after the first 25-ish bytes or so, seem fine. It’s just these first 25 bytes that gets corrupted. If I restore Offsets 0 to Offsets 27 out of a “working” database, I get a usable file. Under what circumstances would SQLITE even attempt to re-write the header string? Is this on UNIX? Quite often this sort of thing occurs because some other module is writing to a file-descriptor owned by SQLite. Usually because it closed its own fd, then SQLite opened the db file and was assigned the same integer fd value, then the rogue module wrote to the fd anyway. In other words, some other module is doing: close(fd); write(fd, "1503010020...", 25); and between those two calls SQLite is calling open() and is being assigned a file-descriptor with the same integer value as fd. Dan. - Deon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmaili nglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users= 02%7C01%7C%7C12190af418bc4a32192b08d56d8432f3%7C84df9e7fe9f640afb435aa aa%7C1%7C0%7C636535336909688339=OMCv%2BtfSJ4PvlFTFw8Cqcc 1eizCrnIgFnNSJ7x5XUoo%3D=0 ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=02%7C01%7C%7C12190af418bc4a32192b08d56d8432f3%7C84df9e7fe9f640afb435%7C1%7C0%7C636535336909688339=OMCv%2BtfSJ4PvlFTFw8Cqcc1eizCrnIgFnNSJ7x5XUoo%3D=0 ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
On 2/7/18, Deon Brewiswrote: > > I don't think SQLITE verifies the memory during every write. There's a: > if( memcmp(page1, zMagicHeader, 16)!=0 ){ > > during lockBtree() but that seems to be on read, not write. > > > Is there somewhere I can add a check of the header buffer before the write > to disk? There does not appear to be a single place to put this code as the header can be written from several places. Search for instances of "sqlite3OsWrite(pPager->fd, ...)". You want to be sure that the last field is "fd" and not "jfd" or "sjfd". I think there are four separate places you will need to insert your patch. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
That's an interesting thought, but we see this on Windows as well, and on Windows these files are opened with SQLITE_OPEN_EXCLUSIVE. So this looks more like something is overwriting the memory of Page1 before SQLITE writes it back to disk. I don't think SQLITE verifies the memory during every write. There's a: if( memcmp(page1, zMagicHeader, 16)!=0 ){ during lockBtree() but that seems to be on read, not write. Is there somewhere I can add a check of the header buffer before the write to disk? Or does SQLITE literally never write this part of the header again once written the first time? (Seek past the zMagicHeader?). - Deon -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Dan Kennedy Sent: Tuesday, February 6, 2018 9:08 AM To: sqlite-users@mailinglists.sqlite.org Subject: Re: [sqlite] Header corruption On 02/06/2018 11:57 PM, Deon Brewis wrote: > I’m trying to track down SQLITE corruptions that seems to corrupt our > databases in a consistent way. (Running on 3.20.1). > > This isn’t related to a crash or hardware failure. The app is running and > reading and writing to the database fine, and then suddenly we start getting > a SQLITE_NOTADB errors. The first error we got was on trying to execute a > ‘BEGIN TRANSACTION’, and everything thereafter started failing. > > When our database headers are fine, I see this: > h: 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00 ; SQLite format 3. > 0010h: 10 00 02 02 00 40 20 20 00 00 10 9B 00 01 34 42 ; .@ > ...›..4B > > On corruption #1, I see this: > h: 15 03 01 00 20 01 91 21 85 44 17 2C ED BE 21 FA ; > .‘!…D.,í¾!ú > 0010h: 9E 74 E7 EA 9A 0A 15 2C 99 B0 B7 3B C6 F0 35 FB ; > žtçêš..,™°·;Æð5û > > On corruption #2, I see this: > h: 15 03 01 00 20 0F 1E CB B1 FF 9C 1E D0 D6 BB 22 ; ..˱ÿœ.ÐÖ»" > 0010h: 15 64 D3 F4 DD 38 FB DF A3 E0 47 B8 D7 F6 21 BC ; > .dÓôÝ8ûߣàG¸×ö!¼ > > > This bit pattern of “15 03 01 00” for the first 4 characters where it should > be “SQLi” is a bit too much of a coincidence to ignore or write off as disk > error. > > The rest of the header and file after the first 25-ish bytes or so, seem > fine. It’s just these first 25 bytes that gets corrupted. If I restore > Offsets 0 to Offsets 27 out of a “working” database, I get a usable file. > > Under what circumstances would SQLITE even attempt to re-write the header > string? Is this on UNIX? Quite often this sort of thing occurs because some other module is writing to a file-descriptor owned by SQLite. Usually because it closed its own fd, then SQLite opened the db file and was assigned the same integer fd value, then the rogue module wrote to the fd anyway. In other words, some other module is doing: close(fd); write(fd, "1503010020...", 25); and between those two calls SQLite is calling open() and is being assigned a file-descriptor with the same integer value as fd. Dan. > > - Deon > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmaili > nglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users= > 02%7C01%7C%7C12190af418bc4a32192b08d56d8432f3%7C84df9e7fe9f640afb435aa > aa%7C1%7C0%7C636535336909688339=OMCv%2BtfSJ4PvlFTFw8Cqcc > 1eizCrnIgFnNSJ7x5XUoo%3D=0 ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=02%7C01%7C%7C12190af418bc4a32192b08d56d8432f3%7C84df9e7fe9f640afb435%7C1%7C0%7C636535336909688339=OMCv%2BtfSJ4PvlFTFw8Cqcc1eizCrnIgFnNSJ7x5XUoo%3D=0 ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
On 02/06/2018 11:57 PM, Deon Brewis wrote: I’m trying to track down SQLITE corruptions that seems to corrupt our databases in a consistent way. (Running on 3.20.1). This isn’t related to a crash or hardware failure. The app is running and reading and writing to the database fine, and then suddenly we start getting a SQLITE_NOTADB errors. The first error we got was on trying to execute a ‘BEGIN TRANSACTION’, and everything thereafter started failing. When our database headers are fine, I see this: h: 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00 ; SQLite format 3. 0010h: 10 00 02 02 00 40 20 20 00 00 10 9B 00 01 34 42 ; .@ ...›..4B On corruption #1, I see this: h: 15 03 01 00 20 01 91 21 85 44 17 2C ED BE 21 FA ; .‘!…D.,í¾!ú 0010h: 9E 74 E7 EA 9A 0A 15 2C 99 B0 B7 3B C6 F0 35 FB ; žtçêš..,™°·;Æð5û On corruption #2, I see this: h: 15 03 01 00 20 0F 1E CB B1 FF 9C 1E D0 D6 BB 22 ; ..˱ÿœ.ÐÖ»" 0010h: 15 64 D3 F4 DD 38 FB DF A3 E0 47 B8 D7 F6 21 BC ; .dÓôÝ8ûߣàG¸×ö!¼ This bit pattern of “15 03 01 00” for the first 4 characters where it should be “SQLi” is a bit too much of a coincidence to ignore or write off as disk error. The rest of the header and file after the first 25-ish bytes or so, seem fine. It’s just these first 25 bytes that gets corrupted. If I restore Offsets 0 to Offsets 27 out of a “working” database, I get a usable file. Under what circumstances would SQLITE even attempt to re-write the header string? Is this on UNIX? Quite often this sort of thing occurs because some other module is writing to a file-descriptor owned by SQLite. Usually because it closed its own fd, then SQLite opened the db file and was assigned the same integer fd value, then the rogue module wrote to the fd anyway. In other words, some other module is doing: close(fd); write(fd, "1503010020...", 25); and between those two calls SQLite is calling open() and is being assigned a file-descriptor with the same integer value as fd. Dan. - Deon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Header corruption
Things stored in the first 25 bytes include page size, WAL status, and the file change counter. So at least part of the header there gets changed with every committed write transaction. http://www.sqlite.org/fileformat2.html -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Deon Brewis Sent: Tuesday, February 06, 2018 11:57 AM To: SQLite mailing list Subject: [sqlite] Header corruption I’m trying to track down SQLITE corruptions that seems to corrupt our databases in a consistent way. (Running on 3.20.1). This isn’t related to a crash or hardware failure. The app is running and reading and writing to the database fine, and then suddenly we start getting a SQLITE_NOTADB errors. The first error we got was on trying to execute a ‘BEGIN TRANSACTION’, and everything thereafter started failing. When our database headers are fine, I see this: h: 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00 ; SQLite format 3. 0010h: 10 00 02 02 00 40 20 20 00 00 10 9B 00 01 34 42 ; .@ ...›..4B On corruption #1, I see this: h: 15 03 01 00 20 01 91 21 85 44 17 2C ED BE 21 FA ; .‘!…D.,í¾!ú 0010h: 9E 74 E7 EA 9A 0A 15 2C 99 B0 B7 3B C6 F0 35 FB ; žtçêš..,™°·;Æð5û On corruption #2, I see this: h: 15 03 01 00 20 0F 1E CB B1 FF 9C 1E D0 D6 BB 22 ; ..˱ÿœ.ÐÖ»" 0010h: 15 64 D3 F4 DD 38 FB DF A3 E0 47 B8 D7 F6 21 BC ; .dÓôÝ8ûߣàG¸×ö!¼ This bit pattern of “15 03 01 00” for the first 4 characters where it should be “SQLi” is a bit too much of a coincidence to ignore or write off as disk error. The rest of the header and file after the first 25-ish bytes or so, seem fine. It’s just these first 25 bytes that gets corrupted. If I restore Offsets 0 to Offsets 27 out of a “working” database, I get a usable file. Under what circumstances would SQLITE even attempt to re-write the header string? - Deon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Header corruption
I’m trying to track down SQLITE corruptions that seems to corrupt our databases in a consistent way. (Running on 3.20.1). This isn’t related to a crash or hardware failure. The app is running and reading and writing to the database fine, and then suddenly we start getting a SQLITE_NOTADB errors. The first error we got was on trying to execute a ‘BEGIN TRANSACTION’, and everything thereafter started failing. When our database headers are fine, I see this: h: 53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00 ; SQLite format 3. 0010h: 10 00 02 02 00 40 20 20 00 00 10 9B 00 01 34 42 ; .@ ...›..4B On corruption #1, I see this: h: 15 03 01 00 20 01 91 21 85 44 17 2C ED BE 21 FA ; .‘!…D.,í¾!ú 0010h: 9E 74 E7 EA 9A 0A 15 2C 99 B0 B7 3B C6 F0 35 FB ; žtçêš..,™°·;Æð5û On corruption #2, I see this: h: 15 03 01 00 20 0F 1E CB B1 FF 9C 1E D0 D6 BB 22 ; ..˱ÿœ.ÐÖ»" 0010h: 15 64 D3 F4 DD 38 FB DF A3 E0 47 B8 D7 F6 21 BC ; .dÓôÝ8ûߣàG¸×ö!¼ This bit pattern of “15 03 01 00” for the first 4 characters where it should be “SQLi” is a bit too much of a coincidence to ignore or write off as disk error. The rest of the header and file after the first 25-ish bytes or so, seem fine. It’s just these first 25 bytes that gets corrupted. If I restore Offsets 0 to Offsets 27 out of a “working” database, I get a usable file. Under what circumstances would SQLITE even attempt to re-write the header string? - Deon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users