Thanks to detective work by Rod at hemiola and David du Colombier,
I've been able to find & fix the long-standing deadlock in fossil's
snapshot code.
The problem is the lock c->dirtylk, which prevents any thread from
dirtying new blocks while flushThread is writing old dirty blocks to
disk. Deadlock occurs when a thread T tries to dirty a block B while
holding a lock on an already-dirty block A. T can't proceed until
c->dirtylk is unlocked, which won't happen until flushThread has written
out block A; but block A can't be written out until T unlocks it.
The deadlock is usually observed when a fossil 'sync' occurs (eg at the
beginning of a periodic temporary snapshot) while 'snap -a' is running;
in this case thread T is archThread, updating the label in block B for a
dirty block A which it has just sent to venti. But there are many other
cases where a thread locks two blocks and marks one dirty, which are
vulnerable to this deadlock if a snapshot occurs just at the wrong time.
The solution is simple (though I only realised it after trying and
rejecting several complicated ones). It turns out that c->dirtylk
serves no useful purpose and can be eliminated altogether. Its use
is explained by a comment in cacheFlush:
/*
* Lock c->dirtylk so that more blocks aren't being dirtied
* while we try to write out what's already here.
* Otherwise we might not ever finish!
*/
But whenever cacheFlush is called with wait=1, a write lock is already
held on the "epoch lock" fs->elk, which prevents any new 9p requests
from touching the file system. Any dirtying of blocks is limited to
internal fossil activities for requests which were already in progress
when cacheFlush was called, and thus strictly bounded even without
c->dirtylk.
Patch is fossil-snap-deadlock.