Lucifers,
I had a close look at Lucy's locking code and while I didn't find any glaring
problems, I think it could be improved in some places. First of all, I'd like
to use a locking mechanism provided by the OS instead of rolling our own. The
main upside is that we can reliably detect whether a process holding a lock
has crashed, without relying on a heuristic. On UNIX there are:
fcntl: This is specified in POSIX and should exist on every modern UNIX
platform. The main problem is that fcntl identifies locks by a (pid, inode)
pair, so a process can hold at most one lock per file. Worse, closing *any*
file descriptor that refers to a lock file releases all locks the process
holds on that file, including locks obtained via other descriptors. This is
especially dangerous in multi-threaded applications and AFAICS can only be
worked around with refcounted global locks protected by a mutex.
lockf: Typically just a wrapper around fcntl.
flock: This is a BSD interface that also works on Linux. Locks are tied to
the open file description rather than the process, so it can be safely used
with threads. Requires Linux 2.6.12 or later to work over NFS.
On Windows, exclusive (and mandatory) locks can be obtained when opening a
file with CreateFile, shared locks with LockFileEx.
Regarding NFS, you can find lots of information online about certain
implementations not supporting certain locking APIs. But it seems to me that
this is mostly anecdotal. My guess is that modern systems "just work" with
either fcntl or flock.
To get started, I'd propose to implement a locking backend based on flock that
is the default for local operation and optional on shared volumes.
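Roughly, such a backend could look like the sketch below. The names
(FLock_Obtain etc.), the struct, and the 100 ms poll interval are all my
invention for illustration, not existing Lucy API; the timeout parameter is
there because we already rely on lock timeouts today:

```c
/* Hypothetical flock-based lock backend. If the process crashes, the
 * OS releases the lock automatically -- the crash detection we want. */
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

typedef struct { int fd; } FLock;

/* Try to take an exclusive lock on `path`, polling for up to
 * timeout_ms. Returns 0 on success, -1 on timeout or error. */
static int
FLock_Obtain(FLock *lock, const char *path, int timeout_ms) {
    lock->fd = open(path, O_RDWR | O_CREAT, 0644);
    if (lock->fd < 0) { return -1; }
    for (int waited = 0; ; waited += 100) {
        if (flock(lock->fd, LOCK_EX | LOCK_NB) == 0) { return 0; }
        if (errno != EWOULDBLOCK || waited >= timeout_ms) {
            close(lock->fd);
            lock->fd = -1;
            return -1;
        }
        usleep(100 * 1000);  /* arbitrary 100 ms poll interval */
    }
}

static void
FLock_Release(FLock *lock) {
    if (lock->fd >= 0) {
        flock(lock->fd, LOCK_UN);
        close(lock->fd);
        lock->fd = -1;
    }
}
```

Note that because flock locks live on the open file description, a second
FLock_Obtain on the same path conflicts even within one process, which is
exactly the per-lock-object behavior we want for threads.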
Some other things I noted:
1. Lock_Is_Locked is in general prone to race conditions.
1.1. It is used to test the merge lock when indexing and pruning, which is OK
because the write lock is held. But it makes the code hard to reason about.
1.2. Testing NFS snapshot read locks with ShLock_Is_Locked is inherently racy.
If it turns out that shared locks based on flock or fcntl work reliably on
somewhat modern NFS setups, I'd love to remove Is_Locked and the current
SharedLock implementation completely.
2. The background merger first obtains the write lock, then the merge lock,
then releases the write lock and acquires it again. With multiple background
mergers and no timeouts, this could lead to a classic deadlock: merger A
releases the write lock while holding the merge lock, merger B then grabs the
write lock and blocks waiting for the merge lock, and A blocks trying to
reacquire the write lock. It isn't a problem in practice because typically
there's only a single merger and we have timeouts. But again, it makes the
code hard to reason about.
Here's a sketch of how I'd rework the code. The key idea is to separate the
purging of snapshots and dead merges. Purging of dead merges is error recovery
and should be done before modifying the index. Purging snapshots is routine
cleanup that should be done only after modifying the index.
Indexer() {
    if (Lock_Request(merge_lock)) {
        // No active background merger.
        Lock_Obtain(write_lock);
        // Purge dead merge if merge.json exists.
        Lock_Release(write_lock);
        Lock_Release(merge_lock);
    }

    Lock_Obtain(write_lock);
    // Test for and read merge.json without checking
    // the merge lock.
    // Add documents.
    // Commit.
    // Purge snapshots (but not dead merge).
    Lock_Release(write_lock);
}
BackgroundMerger() {
    Lock_Obtain(merge_lock);

    Lock_Obtain(write_lock);
    // Purge dead merge if merge.json exists.
    // Set up.
    // Write merge.json.
    Lock_Release(write_lock);

    // Merge.

    Lock_Obtain(write_lock);
    // Commit.
    // Remove merge.json.
    // Purge snapshots (but not dead merge).
    Lock_Release(write_lock);

    Lock_Release(merge_lock);
}
Benefits:
- No need for Is_Locked.
- Consistent locking order with outer merge lock and
inner write lock.
- Don't purge twice when indexing and background merging.
Nick