Re: [sqlite] Persisting Database Lock Issue

2019-12-15 Thread Rowan Worth
On Fri, 13 Dec 2019 at 23:50, 고예찬  wrote:

> Hello, I am experiencing a `database is locked` error. I wonder if anyone
> has gone through or resolved a similar issue.
>
> To illustrate, I have a .db file with the settings below:
> ```
> PRAGMA journal_mode=WAL;
> PRAGMA wal_autocheckpoint=128;
> PRAGMA journal_size_limit=0;
> ```
> and I run two programs in Python: one writes one row of data per second,
> and one reads, uploads, then deletes the data every 30 seconds.
>
> Normally it works fine, but occasionally, after a power failure and reboot
> (the programs run on an IoT device), the reader program throws a `database
> is locked` error that never goes away, while the writer program seems to
> work fine. The lock persists even after a program/system restart.


This is an odd symptom, as fcntl/POSIX file locks (SQLite's default
locking mechanism under Linux) are attributed to processes, and the kernel
never writes them to disk. Is there some kind of network file system
involved, or does the IoT device have its own SSD?
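To see this concretely, here is a minimal Python sketch (using a throwaway
temp file, not your actual database) of how these locks behave: they are
per-process entries in the kernel's in-memory lock table, so even a forked
child holding the very same file descriptor cannot take a lock its parent
owns, and nothing about the lock ever reaches the disk.

```python
# Illustrative sketch: POSIX record locks are in-kernel, per-process
# state. They are never persisted, so no lock can survive a reboot.
import fcntl
import os
import tempfile

path = tempfile.NamedTemporaryFile(delete=False).name
fd = os.open(path, os.O_RDWR)
fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)  # parent takes the lock

pid = os.fork()
if pid == 0:
    # Child process: same inherited fd, but a *different* process, so
    # the parent's lock conflicts with this request.
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        os._exit(1)  # unexpectedly succeeded
    except OSError:
        os._exit(0)  # EAGAIN: lock held by another process, as expected
_, status = os.waitpid(pid, 0)
print("child saw conflict:", os.WEXITSTATUS(status) == 0)  # True
os.unlink(path)
```

Which is why a lock that appears to survive a reboot points at something
other than the kernel's lock table.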


> One interesting fact is that if I do `cp data.db data1.db && rm data.db &&
> mv data1.db data.db`, the error always goes away completely, though it
> happens again after several hard reboots.
>

This series of commands gets around the locks because it creates two copies
of the database contents stored in different inodes. The kernel's locking
table is based on inodes rather than file paths, so you end up with
existing processes accessing the deleted inode and newer processes
accessing the new inode, even though they're both referring to the same
path ./data.db (sidenote: this is also a great way to corrupt databases).
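You can watch the inode switch happen from Python (a temp directory here
stands in for your working directory):

```python
# Illustrative sketch: after cp/rm/mv, the same path names a brand-new
# inode; processes that already had the old file open keep the old one.
import os
import shutil
import tempfile

d = tempfile.mkdtemp()
db = os.path.join(d, "data.db")
with open(db, "w") as f:
    f.write("contents")
old_inode = os.stat(db).st_ino

tmp = os.path.join(d, "data1.db")
shutil.copy(db, tmp)  # cp data.db data1.db  -> the copy gets a new inode
os.unlink(db)         # rm data.db
os.rename(tmp, db)    # mv data1.db data.db

new_inode = os.stat(db).st_ino
print("same path, different inode:", old_inode != new_inode)  # True
```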


> It would be very helpful if anyone can give any advice.
>

There's not really any operating system/file system mechanism which
adequately explains the symptoms you're seeing, so I'd be looking into your
software components. Figure out what is holding the lock and why. Look for
sqlite3_stmt objects which are never finalized - these will retain their
locks (the sqlite3_next_stmt function can be useful for finding them).
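Since your programs are Python, note that sqlite3_next_stmt belongs to the
C API; in Python's sqlite3 module the analogous leak is usually a
connection that began a write transaction and never committed it. A
minimal sketch (throwaway paths standing in for your setup) of how one
stuck writer hands this exact error to everyone else:

```python
# Illustrative sketch: one connection that never commits holds SQLite's
# write lock; other writers fail once their busy timeout expires.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.db")

holder = sqlite3.connect(path, isolation_level=None)  # manual transactions
holder.execute("CREATE TABLE t (x)")
holder.execute("BEGIN IMMEDIATE")            # take the write lock...
holder.execute("INSERT INTO t VALUES (1)")   # ...and never COMMIT

other = sqlite3.connect(path, timeout=0.1)   # give up after 100 ms
err = None
try:
    other.execute("INSERT INTO t VALUES (2)")
except sqlite3.OperationalError as e:
    err = str(e)
print(err)  # database is locked
```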

Also, look for code which interacts with data.db directly using regular
file syscalls rather than going through SQLite. Because this:


> If I leave the problem as it is(writer keeps writing, and reader
> keeps retrying upon error), the error thrown changes to `database disk
> image is malformed`.
>

Strongly suggests there is a process somewhere which is touching data.db
without obeying sqlite's expected locking protocol (which prevents
corruption). Even something as innocent as an open()/close() on the file
can cause problems, because of the semantics of POSIX locks (see section
2.2 of https://www.sqlite.org/howtocorrupt.html).
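That close() semantic is easy to demonstrate from Python (Linux/POSIX
behaviour, throwaway temp file): a single unrelated open()/close() of the
file silently releases every lock the process holds on it.

```python
# Illustrative sketch: closing ANY descriptor for a file drops ALL of
# the process's POSIX locks on that file (howtocorrupt.html, sec. 2.2).
import fcntl
import os
import tempfile

path = tempfile.NamedTemporaryFile(delete=False).name
fd1 = os.open(path, os.O_RDWR)
fcntl.lockf(fd1, fcntl.LOCK_EX | fcntl.LOCK_NB)  # take the lock via fd1

fd2 = os.open(path, os.O_RDONLY)  # an "innocent" unrelated open...
os.close(fd2)                     # ...whose close drops the fd1 lock too

pid = os.fork()
if pid == 0:
    # Child: if the parent's lock were still held, this would fail.
    fd3 = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd3, fcntl.LOCK_EX | fcntl.LOCK_NB)
        os._exit(0)  # lock acquired: parent's lock was silently dropped
    except OSError:
        os._exit(1)
_, status = os.waitpid(pid, 0)
print("lock silently dropped:", os.WEXITSTATUS(status) == 0)  # True
os.unlink(path)
```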

-Rowan
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Persisting Database Lock Issue

2019-12-13 Thread 고예찬
Hello, I am experiencing a `database is locked` error. I wonder if anyone
has gone through or resolved a similar issue.

To illustrate, I have a .db file with the settings below:
```
PRAGMA journal_mode=WAL;
PRAGMA wal_autocheckpoint=128;
PRAGMA journal_size_limit=0;
```
and I run two programs in Python: one writes one row of data per second,
and one reads, uploads, then deletes the data every 30 seconds.

Normally it works fine, but occasionally, after a power failure and reboot
(the programs run on an IoT device), the reader program throws a `database
is locked` error that never goes away, while the writer program seems to
work fine. The lock persists even after a program/system restart. If I
leave the problem as it is (the writer keeps writing, and the reader keeps
retrying upon error), the error eventually changes to `database disk image
is malformed`.

One interesting fact is that if I do `cp data.db data1.db && rm data.db &&
mv data1.db data.db`, the error always goes away completely, though it
happens again after several hard reboots.

I would love to come up with a reproducible test case or db file, but I
couldn't simulate the power failures. The database file also works
normally when it is moved from one place to another. These factors make it
very hard to diagnose and find the root cause of the issue.

I suspected a bad-sector issue on the SSD that the programs run on, but
the SMART self-assessment test (`smartctl -H`) reports PASSED.

It would be very helpful if anyone can give any advice.

The operating environment is below:
OS: Ubuntu 16.04
Kernel: 4.13.0-38-generic
Python: 2.7.12
SQLite: 3.11.0
Filesystem: ext4

SSD info: (`smartctl -i`)

=== START OF INFORMATION SECTION ===
Device Model:     SQF-SMSM4-256G-SBC
Serial Number:    ...
LU WWN Device Id: 0 00 0
Firmware Version: SBFMA012
User Capacity:    256,060,514,304 bytes [256 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      mSATA
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   Unknown (0x0ff8) (minor revision not indicated)
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Dec 12 23:09:04 2019 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled