FWIW in the building I work in we have 20-30 users hitting around a dozen
SQLite DBs 8 hours a day 5 days a week, with all DBs served over nfs. The
number of corruptions I've seen in the last 5 years which nfs *might* be
responsible for is *very low*. The number of corruptions where nfs was
*definitely* responsible is... zero I think? Definitely single digits. And
off the top of my head I can recall four incidents where corruption was
found in a copy of the database during our backup procedure, while the main
DB was intact.

The thing is, we don't use SQLite's backup API in this procedure -- we lock
the DB and make a copy using OS syscalls. Ironically, this was the cause of
the vast majority of our corruptions: our initial implementation did not
account for POSIX fcntl locking semantics, under which closing *any* file
descriptor on a file releases all of the process's locks on it, so when we
closed the descriptor used for the copy we inadvertently dropped SQLite's
locks.
Anyway, the incidents I'm talking about occurred long after that bug was
fixed. Somehow a regular copy of a file, involving a single nfs client,
wound up corrupt.

I've looked very closely at these and the corruption affected a single 4k
region of the DB. It's been observed on DBs using both 4k and 1k page
sizes. It could be a broken nfs cache but the weird part is that the first
512 bytes of the region are absolutely fine. Exactly the same as the
non-corrupt DB. Then the next 3584 bytes are random garbage.

This is not a solved mystery, and it probably never will be, as it's been
many months since we've seen these particular symptoms. Maybe the problem
stemmed from nfs, but it could also be a faulty disk controller, a strange
kernel interaction with multiple fds, or memory corruption in our
application itself (which is what I lean towards, given the corruption is
512 bytes misaligned from the page boundary).


Erm, I got a bit carried away. My point is, it's not all doom and gloom.
I'm sure nfs implementations used to be much worse, and I'm not saying
they're perfect these days -- we get semi-regular reports from Windows
users regarding DB corruption, and there's one particular customer whose
NAS server likes to keep hold of file locks on behalf of some unknown
client, long after all client machines have been rebooted (leaving
SQLite unable to access the DBs). And nfs on OSX seems to be a
non-starter; I'm not sure what's going on with that client implementation,
but my brief experimentation with it suggested that corruption was
guaranteed.


But if your nfs solution is configured not to lie -- to honour lock and
sync requests -- things work well. Corruption has been almost a non-issue
for us since we fixed our application's bug. The bigger problem we face is
concurrency, due to the single-writer model and the writer-starvation
solution, which locks new readers out of the DB until all current readers
are done (so the writer can finish). We plan to migrate to an actual SQL
server for that reason, but please don't take this as a criticism of
SQLite - I think it does a marvelous job in a scenario it definitely
wasn't designed for.
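
For what it's worth, a copy made via SQLite's backup API (which sidesteps
the raw-copy locking pitfalls entirely) can be driven from the sqlite3
shell; the paths here are hypothetical:

```shell
# Build a throwaway DB, copy it with the backup API, verify the copy.
rm -f /tmp/demo.db /tmp/demo-backup.db
sqlite3 /tmp/demo.db "CREATE TABLE t(x); INSERT INTO t VALUES (1);"
sqlite3 /tmp/demo.db ".backup /tmp/demo-backup.db"
sqlite3 /tmp/demo-backup.db "PRAGMA integrity_check;"   # prints: ok
```

The backup API takes and releases SQLite's own locks correctly, so there
is no window where a second descriptor can clobber them.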

-Rowan

On 14 August 2018 at 21:07, Wout Mertens <wout.mert...@gmail.com> wrote:

> Idle musing again, I'm pretty bad at dropping thoughts that are not
> immediately applicable to me, sorry.
>
> I know that multi-writer sqlite and NFS don't play well with each other.
>
> However, I wonder if some constraints could be added that would make this
> situation safe.
>
> My problem space is that of a shared NixOS package store between VMs, which
> holds metadata about the available packages:
>
>    - many writers need access to the same db
>    - their only communication channel is the POSIX filesystem that holds
>    the db
>    - they only write "seldomly", every few seconds at the fastest
>    - they do read all the time
>    - it is ok if read data is a little bit stale (10s is acceptable)
>    - it is ok if write transactions fail and can be retried
>    - it is ok if writes are slow
>    - it is never ok for data to be corrupt
>
> Is there a way to safely use sqlite in this situation, perhaps by using
> extra lock files or some other additional mechanism?
>
> One solution I can think of involves sending all writes through a single
> master, via files describing changes and lots of polling, but that seems
> really outlandish.
>
> Wout.
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>