On Fri, 13 Dec 2019 at 23:50, 고예찬 <yeechan...@gmail.com> wrote:

> Hello, I am experiencing `database is locked` error. I wonder if anyone has
> gone through or resolved similar issue.
>
> To illustrate, I have a .db file with below settings:
> ```
> PRAGMA journal_mode=WAL;
> PRAGMA wal_autocheckpoint=128;
> PRAGMA journal_size_limit=0;
> ```
> and I run two programs in python - one writing the data 1 row per second,
> one read / upload / then delete the data every 30 seconds.
>
> Normally, it works fine but occasionally after power failure and reboot(the
> programs run on an IoT device), the reader program throws `database is
> locked` error and never goes away after since, while the writer program
> seems to work fine. This lock doesn't go away even after program/system
> restart.


This is an odd symptom as the fcntl/POSIX file locks (sqlite's default
locking mechanism under linux) are attributed to processes, and the kernel
never writes them to disk. Is there some kind of network file system
involved or does the IoT device have its own SSD?


> One interesting fact is that if I do `cp data.db data1.db && rm data.db &&
> mv data1.db data.db`. The error always goes away completely. Though it
> happens again after several hard reboot.
>

This series of commands gets around the locks because it creates two copies
of the database contents stored in different inodes. The kernel's locking
table is based on inodes rather than file paths, so you end up with
existing processes accessing the deleted inode and newer processes
accessing the new inode, even though they're both referring to the same
path ./data.db (sidenote: this is also a great way to corrupt databases).


> It would be very helpful if anyone can give any advice.
>

There's not really any operating system/file system mechanism which
adequately explains the symptoms you're seeing, so I'd be looking into your
software components. Figure out what is holding the lock and why. Look for
sqlite_stmts which are never disposed - these will retain their locks (the
sqlite3_next_stmt function can be useful for finding these).

Also, look for code which interacts with data.db directly using regular
file syscalls rather than going via sqlite. Because this:


> If I leave the problem as it is(writer keeps writing, and reader
> keeps retrying upon error), the error thrown changes to `database disk
> image is malformed`.
>

Strongly suggests there is a process somewhere which is touching data.db
without obeying sqlite's expected locking protocol (which prevents
corruption). Even something as innocent as an open()/close() on the file
can cause problems, because of the semantics of POSIX locks (section 2.2 of
https://www.sqlite.org/howtocorrupt.html)

-Rowan
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to