On Sat, 28 Sep 2019 at 06:59, Roman Fleysher <roman.fleys...@einstein.yu.edu>
wrote:

> ( somewhat related to Re: [sqlite] Safe to use SQLite over a sketchy
> network?)
>
> Dear SQLiters,
>
> I am using SQLite over the GPFS distributed file system. I was told it
> honestly implements file locking. I have never experienced corruption. But
> it is slow, in the sense that when many jobs from many compute nodes try
> to access the same database things slow down considerably.
>
> I suspect that, from the point of view of the file system, there is a lot
> of pressure to make grabbing a lock fast and releasing it slow. I think
> this is because the key to a fast network file system in general is making
> it as independent as possible, thus distributed. Avoid bottlenecks. But
> locking is by definition a bottleneck.
>


> From experience, it seems that because SQLite still requests file locks,
> the performance increase is not that big. I wonder if there is a way to
> disable SQLite's internal file locking mechanism.


In my experience with SQLite over network file systems, the biggest
bottleneck has nothing to do with locking and everything to do with
synchronous I/O, journalling, and the single-writer model.

Disabling locking in your scenario is _guaranteed_ to break your jobs.
SQLite on the compute nodes will at some point read a half-committed change
to the database and return SQLITE_CORRUPT (best case), or silently compute
a garbage result (worst case). Unless, that is, the database in question is
read-only and never updated. But if that were the case there would be no
scaling issue with the number of compute nodes, as read locks do not
conflict with each other.

The best thing you can do to improve concurrency for SQLite over a
networked file system is to carefully manage your transaction lifetimes.
There are several patterns to avoid:

1. Lots of small write transactions
2. Transactions which are open for a long time
3. Write transactions which do a lot of work before taking the RESERVED lock

All of these apply to SQLite on a local filesystem too, but network latency
magnifies the effects. To elaborate quickly: synchronous I/O and data being
written twice¹ impose a significant constant cost per transaction, which is
why lots of small writes are inefficient. Avoiding long-running
transactions applies to both read and write transactions, because during a
DB update there is a period where the writer needs exclusive access to the
DB. If a long-running read transaction is active at that point, the writer
must wait for it to finish, and the effect is that _every_ node wanting to
access the DB has to wait for this one read transaction.

¹once to the journal, once to the main DB
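To make the small-writes point concrete, here is a minimal sketch using
Python's stdlib sqlite3 module (the path and the "results" table are made
up for illustration). Batching the inserts means the per-transaction
overhead is paid once rather than once per row:

    import sqlite3

    # Hypothetical path and schema, for illustration only.
    conn = sqlite3.connect("/gpfs/shared/results.db", timeout=60)
    conn.isolation_level = None  # autocommit; we issue BEGIN/COMMIT ourselves
    conn.execute("CREATE TABLE IF NOT EXISTS results(job INTEGER, value REAL)")

    rows = [(i, i * 0.5) for i in range(1000)]

    # One COMMIT per row would pay the synchronous, written-twice cost a
    # thousand times over the network. Paying it once amortises the overhead:
    conn.execute("BEGIN IMMEDIATE")
    conn.executemany("INSERT INTO results(job, value) VALUES (?, ?)", rows)
    conn.execute("COMMIT")
    conn.close()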

Somewhat related is a transaction which reads a bunch of data before doing
any DB updates. The problem here is that another node may take the RESERVED
lock during the read phase. SQLite only supports a single writer at a time,
so when the transaction tries to move on to its write phase it cannot; you
end up having to abort it and redo the read phase. This one is avoided by
opening the transaction with "BEGIN IMMEDIATE", which causes SQLite to take
the RESERVED lock at the start of the transaction.
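A sketch under the same assumptions as above (Python's sqlite3, the made-up
"results" table):

    import sqlite3

    conn = sqlite3.connect("/gpfs/shared/results.db", timeout=60)
    conn.isolation_level = None

    # BEGIN IMMEDIATE takes the RESERVED lock before the read phase, so no
    # other node can become the writer between our read and our update.
    conn.execute("BEGIN IMMEDIATE")
    (n,) = conn.execute("SELECT COUNT(*) FROM results").fetchone()
    conn.execute("INSERT INTO results(job, value) VALUES (?, ?)", (n, 0.0))
    conn.execute("COMMIT")
    conn.close()

If another node already holds the write lock, the BEGIN IMMEDIATE itself
blocks (up to the busy timeout) or fails cleanly, which is much easier to
retry than a transaction that fails halfway through its write phase.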


I think WAL journal mode could improve concurrency, but it doesn't work in
a network context: WAL coordinates connections through a shared-memory
index, so they must all be on the same host. Anyway, trying to shortcut
SQLite's locking is almost certainly the wrong approach. If you don't need
locking then you don't need consistency, and you should consider whether a
DB is the right tool or whether regular files would suffice.

-Rowan
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
