Re: [sqlite] iOS Watchdog and database corruption

2018-02-21 Thread Richard Hipp
On 2/21/18, Deon Brewis  wrote:
> I do.
>
> I'll have to request permission from the customer though to share it - who
> will potentially be looking at the file? (Just so I can share names and
> background with the customer to put him at ease).

You can send corrupt database files (and corresponding journals)
directly to my private email and they will be shared only among the
SQLite developers: me, Dan, and Joe.
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] iOS Watchdog and database corruption

2018-02-21 Thread Deon Brewis
I do. 

I'll have to request permission from the customer though to share it - who will 
potentially be looking at the file? (Just so I can share names and background 
with the customer to put him at ease).

- Deon



Re: [sqlite] iOS Watchdog and database corruption

2018-02-21 Thread Jens Alfke


> On Feb 21, 2018, at 7:45 AM, Simon Slavin  wrote:
> 
> My concern was that it's abnormal for "sqlite3LeaveMutexAndCloseZombie" to 
> take five seconds to execute.

As of a few weeks ago, I know all about this function ;-) It's called when the 
last statement is closed on a "zombie" database connection that's already had 
sqlite3_close_v2 called on it; it performs the actual close that was deferred. 
It's taking a long time because it's calling sqlite3WalCheckpoint.
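
To make the cost concrete: the deferred close is slow when a large WAL is
still waiting to be checkpointed. One possible mitigation, which is an
assumption of this sketch and not something proposed in this thread, is to
run the checkpoint at a moment of your own choosing so that little work is
left for close. Python's built-in sqlite3 module is used only to keep the
example self-contained; the function name, file name, and table `t` are
hypothetical, and the same PRAGMA can be issued through the C API.

```python
import os
import sqlite3
import tempfile


def checkpoint_before_close():
    """Fill a WAL, checkpoint it explicitly, and report the WAL file sizes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.db")  # hypothetical database file
        conn = sqlite3.connect(path, isolation_level=None)  # autocommit mode
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute("CREATE TABLE t(x INTEGER)")

        # One explicit transaction, so all rows land in the WAL with one commit.
        conn.execute("BEGIN")
        conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
        conn.execute("COMMIT")
        wal_before = os.path.getsize(path + "-wal")

        # Copy the WAL into the main file now and truncate it to zero bytes.
        # The first result column is 1 if the checkpoint was blocked.
        busy = conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()[0]
        wal_after = os.path.getsize(path + "-wal")

        conn.close()  # little or nothing left to checkpoint at this point
        return busy, wal_before, wal_after
```

Whether this helps in practice depends on how much is written between the
last explicit checkpoint and the close.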

But it is scary that the database file got corrupted. Deon, do you still have 
the corrupted file(s) available for forensics?

—Jens


Re: [sqlite] iOS Watchdog and database corruption

2018-02-21 Thread Simon Slavin
On 21 Feb 2018, at 3:34pm, Deon Brewis  wrote:

> Yes, definitely the main thread - we close down the database during 
> applicationWillTerminate. It gives us 5 seconds to exit before it triggers 
> the watchdog.

Okay, that all sounds right, and the dump you pasted suggested everything 
worked right.  I know a lot about the iOS application cycle, somewhat less 
about SQLite.

>> "If the offending call really was "sqlite3LeaveMutexAndCloseZombie" then you 
>> may have some sort of mismanagement in your code"
> 
> What do you mean by that? Is it abnormal for sqlite3close to call 
> sqlite3LeaveMutexAndCloseZombie?

Sorry, I didn't mean it like that.  My concern was that it's abnormal for 
"sqlite3LeaveMutexAndCloseZombie" to take five seconds to execute.  This is 
rare, and suggested that perhaps some other part of your application (maybe 
your own code, maybe SQLite code) had abandoned the mutex.  But it seems you're 
doing everything right and the 5 second delay mystifies me.

DRH says that a crash of any sort should not be corrupting the database.  If 
you can reliably demonstrate it happening, I'm sure it's something he'd like to 
investigate.

Simon.


Re: [sqlite] iOS Watchdog and database corruption

2018-02-21 Thread Stephen Chrzanowski
That number doesn't surprise me.  At my company, one of our products is
built around iPads.  Airlines give their pilots 16-32GB iPads to bring into
the cockpit to look at maps, charts, weather info, etc.  The iPads
essentially become EFBs, or Electronic Flight Bags.  Compressed, we push
two or three gig of PDFs, images, and proprietary data in various
formats, some of which is probably SQLite, to the devices on
initial deployment, and then push update packages going forward.  Once
received, those packages are decompressed and put into place.

A single 5gig database?  Not a big deal if the device is being used for a
very specific purpose.

On Wed, Feb 21, 2018 at 10:22 AM, Simon Slavin  wrote:

>
> > SQLITE3 version is 3.20.1. Database size is around 5 GB.
>
> You have a 5 GB database on a device which may have a 16 GB capacity ?  I
> assume you know what you're doing.
>
> Simon.


Re: [sqlite] iOS Watchdog and database corruption

2018-02-21 Thread Deon Brewis
Yes, definitely the main thread - we close down the database during 
applicationWillTerminate. It gives us 5 seconds to exit before it triggers the 
watchdog.

Termination Description: SPRINGBOARD, process-exit watchdog transgression: xxx 
exhausted real (wall clock) time allowance of 5.00 seconds |  | 
ProcessVisibility: Foreground | ProcessState: Running | WatchdogEvent: 
process-exit | WatchdogVisibility: Foreground | WatchdogCPUStatistics: ( | 
"Elapsed total CPU time (seconds): 3.740 (user 3.740, system 0.000), 25% CPU", 
| "Elapsed application CPU time (seconds): 0.049, 0% CPU" | )

Triggered by Thread:  0

Thread 0 crashed:
__semwait_signal: external code (libsystem_kernel.dylib)
nanosleep: external code (libsystem_c.dylib)
+[NSThread sleepForTimeInterval:]: external code (Foundation)
Database::signalCloseAndWait()
App::~App()
-[AppDelegate applicationWillTerminate:]: appdelegate.mm @ 377
-[UIApplication _terminateWithStatus:]: external code (UIKit)


"If the offending call really was "sqlite3LeaveMutexAndCloseZombie" then you 
may have some sort of mismanagement in your code"

What do you mean by that? Is it abnormal for sqlite3close to call 
sqlite3LeaveMutexAndCloseZombie?

- Deon



Re: [sqlite] iOS Watchdog and database corruption

2018-02-21 Thread Simon Slavin
On 21 Feb 2018, at 2:35pm, Deon Brewis  wrote:

> The application got watchdog terminated by iOS because the main thread was 
> taking too long (waiting for the sqlite3close on the worker thread). The 
> resultant force close seems to have aborted SQLITE in such a way that it 
> caused the database to be corrupted.
> 
> Worker thread stack:
> Thread 4:
>   ftruncate: external code (libsystem_kernel.dylib)
>   unixTruncate: sqlite3.c @ 34036
>   sqlite3WalCheckpoint: sqlite3.c @ 56846
>   sqlite3WalClose: sqlite3.c @ 56955
>   sqlite3PagerClose: sqlite3.c @ 51556
>   sqlite3BtreeClose: sqlite3.c @ 62169
>   sqlite3LeaveMutexAndCloseZombie: sqlite3.c @ 142752
>   sqlite3Close: sqlite3.c @ 0

I'm puzzled by this.  iOS gives applications quite a long time to terminate 
before calling "kill" on them.  Had "applicationWillTerminate" been called ?  
Was it definitely your main thread (via thread 4) which was delaying the 
termination, and not another thread ?  You should find the offending thread 
identified further up in that same report, just before it starts listing the 
call-stacks of each thread.

If the offending call really was "sqlite3LeaveMutexAndCloseZombie" then you may 
have some sort of mismanagement in your code.  Or it might be just a 
once-in-a-blue-moon problem which will never occur again.

The rest I leave up to the devs.  SQLite should not be corrupting a database 
just because it was unexpectedly terminated, no matter what it was doing when 
terminated.  It was written to avoid that and no amount of testing has shown 
such a bug.

> SQLITE3 version is 3.20.1. Database size is around 5 GB.

You have a 5 GB database on a device which may have a 16 GB capacity ?  I 
assume you know what you're doing.

Simon.


Re: [sqlite] iOS Watchdog and database corruption

2018-02-21 Thread Richard Hipp
On 2/21/18, Deon Brewis  wrote:
>
> a) Is it expected that an app crash / force terminate in the middle of a
> SQLITE3 checkpoint like this can cause corruption?

No.  See, for example, https://www.sqlite.org/atomiccommit.html and
https://www.sqlite.org/wal.html.  If the filesystem is behaving
properly, and assuming no other application tries to "clean up" after
an app crash, then the database will be automatically restored to a
consistent state the next time it is opened.  This is extensively
tested.
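
A rough way to see the recovery behaviour for yourself is sketched below.
It is only a simulation: a child process dies abruptly via os._exit()
without closing its connection, which is much gentler than a watchdog kill
in the middle of a checkpoint or a power loss. The helper name, file name,
and table `t` are hypothetical, and Python's built-in sqlite3 module is
used purely for convenience.

```python
import os
import sqlite3
import subprocess
import sys
import tempfile
import textwrap

# Script run in a child process: commit one row, start a second
# transaction, then die without committing or closing the connection.
CHILD = textwrap.dedent("""
    import os, sqlite3, sys
    conn = sqlite3.connect(sys.argv[1], isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE TABLE t(x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")
    conn.execute("BEGIN")
    conn.execute("INSERT INTO t VALUES (2)")
    os._exit(1)
""")


def rows_after_simulated_crash():
    """Return (rows visible after reopening, integrity_check result)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.db")  # hypothetical database file
        subprocess.run([sys.executable, "-c", CHILD, path])

        # Reopening replays the committed part of the WAL left behind.
        conn = sqlite3.connect(path)
        try:
            rows = [r[0] for r in conn.execute("SELECT x FROM t ORDER BY x")]
            check = conn.execute("PRAGMA integrity_check").fetchone()[0]
        finally:
            conn.close()
        return rows, check
```

Calling rows_after_simulated_crash() should show the committed row but not
the uncommitted one, with an integrity check that passes.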

Usually issues like this come back to either filesystem bugs or the
watchdog, or some other component, going in and deleting the -wal file
in an effort to be helpful and "clean up" after the application crash,
thereby deleting information that SQLite needs to recover and
resulting in a corrupt database.
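
As an illustration of why the -wal file matters, the sketch below takes two
snapshots of a WAL-mode database that has committed but not yet
checkpointed rows: one snapshot keeps the -wal file, the other drops it, as
an over-helpful cleanup might. Note the assumptions: all names are
hypothetical, the copy is made from a quiescent single-process writer, and
this only shows the milder outcome (committed rows vanish). Removing the
-wal after a partially completed checkpoint can leave the main file in a
worse state than shown here.

```python
import os
import shutil
import sqlite3
import tempfile


def count_rows(path):
    """Open the database at `path` and count the rows in table t."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT count(*) FROM t").fetchone()[0]
    finally:
        conn.close()


def compare_with_and_without_wal():
    """Return (rows seen with the -wal kept, rows seen with it removed)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "app.db")  # hypothetical database file
        writer = sqlite3.connect(src, isolation_level=None)
        writer.execute("PRAGMA journal_mode=WAL")
        writer.execute("PRAGMA wal_autocheckpoint=0")  # no automatic checkpoints
        writer.execute("CREATE TABLE t(x INTEGER)")
        writer.execute("PRAGMA wal_checkpoint(TRUNCATE)")  # schema now in main file

        # These rows are committed, but they live only in the -wal file.
        writer.execute("BEGIN")
        writer.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
        writer.execute("COMMIT")

        kept = os.path.join(tmp, "kept.db")
        shutil.copy(src, kept)
        shutil.copy(src + "-wal", kept + "-wal")  # snapshot including the WAL

        cleaned = os.path.join(tmp, "cleaned.db")
        shutil.copy(src, cleaned)  # snapshot with the WAL "cleaned up"

        writer.close()
        return count_rows(kept), count_rows(cleaned)
```

The snapshot that kept its -wal file sees every committed row after
recovery; the one without it sees only what had already been checkpointed.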

Other ways in which the database file can go corrupt:
https://www.sqlite.org/howtocorrupt.html

>
> b) Is there a way I can do a close without triggering a checkpoint? (In
> order to speed up close, so that it doesn't trigger a watchdog).
>

Set the SQLITE_DBCONFIG_NO_CKPT_ON_CLOSE option.
https://www.sqlite.org/c3ref/c_dbconfig_enable_fkey.html
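
In C this option is set with sqlite3_db_config() on the open connection.
The sketch below shows the same idea through Python's built-in sqlite3
module, assuming Python 3.12 or later, where Connection.setconfig() and the
sqlite3.SQLITE_DBCONFIG_NO_CKPT_ON_CLOSE constant are available; on older
versions the sketch simply skips the option. The function name, file name,
and table `t` are hypothetical.

```python
import os
import sqlite3
import tempfile


def close_skipping_checkpoint():
    """Close a WAL-mode connection without checkpointing, where supported."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.db")  # hypothetical database file
        conn = sqlite3.connect(path, isolation_level=None)
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute("CREATE TABLE t(x INTEGER)")
        conn.execute("INSERT INTO t VALUES (42)")

        # Assumption: setconfig() and the constant exist (Python 3.12+).
        supported = hasattr(conn, "setconfig") and hasattr(
            sqlite3, "SQLITE_DBCONFIG_NO_CKPT_ON_CLOSE"
        )
        if supported:
            conn.setconfig(sqlite3.SQLITE_DBCONFIG_NO_CKPT_ON_CLOSE, True)
        conn.close()  # with the option set, no checkpoint is run here

        wal = path + "-wal"
        wal_left = os.path.exists(wal) and os.path.getsize(wal) > 0

        # The next open still sees the row, read back through the WAL.
        again = sqlite3.connect(path)
        try:
            value = again.execute("SELECT x FROM t").fetchone()[0]
        finally:
            again.close()
        return supported, wal_left, value
```

The trade-off is that the checkpoint work is not avoided, only deferred to
a later connection, so the -wal file must be left in place until then.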

-- 
D. Richard Hipp
d...@sqlite.org