Lucifers,

I had a close look at Lucy's locking code, and while I didn't find any glaring problems, I think it could be improved in some places. First of all, I'd like to use a locking mechanism provided by the OS instead of rolling our own. The main upside is that we can reliably detect whether a process holding a lock has crashed, without relying on a heuristic. On UNIX there are:

fcntl: This is specified in POSIX and should exist on every modern UNIX platform. The main problem is that fcntl identifies locks by a (pid, inode) pair, so a process can hold only a single lock per file. Worse, if you close any file descriptor referring to a lock file, locks obtained via other descriptors to the same file are released, too. This is especially dangerous in multi-threaded applications and AFAICS can only be worked around with refcounted global locks protected by a mutex.
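
A minimal demonstration of the footgun ("lock.tmp" is a made-up file name, error handling omitted):

    #include <fcntl.h>
    #include <unistd.h>

    int
    main(void) {
        int fd1 = open("lock.tmp", O_RDWR | O_CREAT, 0644);
        struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
        fcntl(fd1, F_SETLK, &fl);   // Exclusive lock on the whole file.

        // Some unrelated code opens and closes the same file...
        int fd2 = open("lock.tmp", O_RDONLY);
        close(fd2);                 // ...and fd1's lock is silently gone,
                                    // because the lock belongs to the
                                    // (pid, inode) pair, not to fd1.
        close(fd1);
        return 0;
    }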

lockf: Typically just a wrapper around fcntl.

flock: This is a BSD interface that also works on Linux. Since locks are attached to the open file description rather than the process, it can be used safely with threads. Requires Linux 2.6.12 to work with NFS.
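
For contrast, the same scenario with flock (untested sketch):

    #include <sys/file.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void) {
        int fd1 = open("lock.tmp", O_RDWR | O_CREAT, 0644);
        int fd2 = open("lock.tmp", O_RDWR);

        flock(fd1, LOCK_EX | LOCK_NB);    // Succeeds.

        // A second descriptor conflicts even within the same process,
        // so threads can't trample each other's locks.
        if (flock(fd2, LOCK_EX | LOCK_NB) < 0) {
            perror("fd2");                // EWOULDBLOCK
        }

        close(fd2);   // Does NOT release fd1's lock.
        close(fd1);   // This releases the lock.
        return 0;
    }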

On Windows, exclusive (and mandatory) locks can be obtained by opening a file with CreateFile and a share mode of zero, shared locks with LockFileEx.
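
Roughly what that looks like in Win32 terms (untested sketch):

    #include <windows.h>

    // Exclusive, mandatory lock: deny all sharing at open time.
    HANDLE
    open_exclusive(const wchar_t *path) {
        return CreateFileW(path, GENERIC_READ | GENERIC_WRITE,
                           0 /* share mode */, NULL, OPEN_ALWAYS,
                           FILE_ATTRIBUTE_NORMAL, NULL);
    }

    // Shared lock on an already open handle. Omitting
    // LOCKFILE_EXCLUSIVE_LOCK makes the lock shared.
    BOOL
    lock_shared(HANDLE handle) {
        OVERLAPPED ov = {0};
        return LockFileEx(handle, LOCKFILE_FAIL_IMMEDIATELY,
                          0, MAXDWORD, MAXDWORD, &ov);
    }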

Regarding NFS, you can find lots of information online about certain implementations not supporting certain locking APIs. But it seems to me that this is mostly anecdotal. My guess is that modern systems "just work" with either fcntl or flock.

To get started, I'd propose implementing a locking backend based on flock that is the default for local operation and optional on shared volumes.
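
The crash-detection upside then falls out for free: a non-blocking attempt either fails because the holder is alive, or succeeds because the kernel already released the dead holder's lock. And since a successful attempt means we now own the lock, there's no check-then-act gap. A sketch of what the backend's core could look like (function name invented):

    #include <sys/file.h>
    #include <fcntl.h>
    #include <unistd.h>

    // Returns a descriptor that represents lock ownership, or -1 if
    // a live process holds the lock. Closing the descriptor releases
    // the lock, and so does a crash of the holding process.
    int
    lock_request(const char *path) {
        int fd = open(path, O_RDWR | O_CREAT, 0644);
        if (fd < 0) { return -1; }
        if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
            close(fd);   // EWOULDBLOCK: the holder is alive.
            return -1;
        }
        return fd;
    }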

Some other things I noted:

1. Lock_Is_Locked is in general prone to race conditions.

1.1. It is used to test the merge lock when indexing and pruning, which is OK because the write lock is held at that point. But it makes the code hard to reason about.

1.2. Testing NFS snapshot read locks with ShLock_Is_Locked is inherently racy. If it turns out that shared locks based on flock or fcntl work reliably on somewhat modern NFS setups, I'd love to remove Is_Locked and the current SharedLock implementation completely.

2. The background merger first obtains the write lock, then the merge lock, then releases the write lock and acquires it again. With multiple background mergers and no timeouts, this could lead to a classic lock-ordering deadlock, as sketched below. It isn't a problem in practice because typically there's only a single merger and we have timeouts. But again, it makes the code hard to reason about.
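
To make that concrete, here's the interleaving two timeout-less background mergers could run into:

    Merger A                          Merger B
                                      Lock_Obtain(write_lock);
                                      Lock_Obtain(merge_lock);
                                      Lock_Release(write_lock);
    Lock_Obtain(write_lock);
    Lock_Obtain(merge_lock);          // Blocks: B holds merge_lock.
                                      Lock_Obtain(write_lock);
                                      // Blocks: A holds write_lock.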

Here's a sketch of how I'd rework the code. The key idea is to separate the purging of snapshots from the purging of dead merges. Purging a dead merge is error recovery and should happen before modifying the index. Purging snapshots is routine cleanup that should happen only after modifying the index.

    Indexer() {
        if (Lock_Request(merge_lock)) {
            // No active background merger.
            Lock_Obtain(write_lock);
                // Purge dead merge if merge.json exists.
            Lock_Release(write_lock);
            Lock_Release(merge_lock);
        }

        Lock_Obtain(write_lock);
            // Test for and read merge.json without checking
            // the merge lock.
            // Add documents.
            // Commit.
            // Purge snapshots (but not dead merge).
        Lock_Release(write_lock);
    }

    BackgroundMerger() {
        Lock_Obtain(merge_lock);
            Lock_Obtain(write_lock);
                // Purge dead merge if merge.json exists.
                // Set up.
                // Write merge.json.
            Lock_Release(write_lock);
            // Merge.
            Lock_Obtain(write_lock);
                // Commit.
                // Remove merge.json.
                // Purge snapshots (but not dead merge).
            Lock_Release(write_lock);
        Lock_Release(merge_lock);
    }

Benefits:

- No need for Is_Locked.
- Consistent locking order with outer merge lock and
  inner write lock.
- Don't purge twice when indexing and background merging.

Nick
