On Tue, Feb 14, 2017 at 9:03 AM, Nick Wellnhofer <[email protected]> wrote:
> On 13/02/2017 20:44, Serkan Mulayim wrote:
>> What would the implications of simply deleting write.lock and merge.lock
>> be?
>
> In most cases, this shouldn't be necessary. Lucy stores the PID of the
> process that created a lock and tries to clear stale lock files from crashed
> processes. But this won't work if another processes reuses the PID. If
> you're absolutely sure that a lock doesn't belong to an active Indexer, you
> can delete the lock directories manually.
>
> Side note: This could be improved by supporting locking mechanisms that
> release locks automatically if a process crashes. But these are OS-dependent
> and aren't guaranteed to work reliably over NFS:
>
> - `fcntl(F_SETLK)` or `lockf` on POSIX (unsuitable for multi-threaded
>   operation).
> - `flock` on BSD, Linux.
> - `CreateFile` with a 0 sharing mode on Windows.

As another side note, there are techniques for reliable exclusive locking
when the datastore is on NFS. Instead of using the default Unix locking
mechanisms, you can use the link(2) system call (which is an atomic
operation on NFS) with an agreed-upon name for your lock. For example, if
your shared volume is "/shared", you could create a temporary file on that
volume using mkstemp, then attempt to link(2) the temporary file to the
known lockfile name, "/shared/lock". If the link succeeds, you have the
lock; if it fails, another process already holds the lock. This method does
require that your processes clean up (i.e. delete) the lock file when they
want to release the lock, however.

When it comes to rebuilding the index, we typically build the index under a
temporary directory name, then swap the directories into the production
path using a forced symlink (ln -sfn). As long as the old index is kept for
the maximum lifetime of a searcher process, there's no danger. In other
words:

1. Build to /shared/index_123/ (the number could also be the PID of the
   index-building process).
2. Delete /shared/index_old/.
3. Use readlink(2) to grab the current (real) pathname of the index
   (/shared/index_122).
4. cd /shared ; ln -sfn index_123 /shared/index (production path)
5. Rename the previous index (/shared/index_122) to /shared/index_old/.

By building the index under a temporary directory name, then swapping out
the directory when we want to put the new index into production, we avoid
the locking problems between readers and writers entirely.

--
Tilghman
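
For illustration, here is a rough Python sketch of the link(2) lock and the
symlink swap described above. It is not Lucy code, and the function names
(acquire_link_lock, release_link_lock, swap_index_symlink) are made up for
this example. Two details are assumptions beyond the description: the lock
check looks at the temporary file's link count rather than trusting the
return value of link(2) (the workaround suggested in the Linux open(2) man
page for NFS), and the swap uses symlink plus rename instead of ln, since
rename(2) replaces the old link in a single step.

```python
import os
import tempfile


def acquire_link_lock(lock_path):
    """Try to take the lock; return True if we hold it, False otherwise."""
    lock_dir = os.path.dirname(lock_path) or "."
    # The temporary file must be on the same volume as the lock name,
    # because hard links cannot cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=lock_dir, prefix=".lock-tmp-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))  # record the owner for debugging
        try:
            os.link(tmp_path, lock_path)
        except OSError:
            # Assumption: the result of link() is not trusted on its own,
            # so the link count below decides the outcome.
            pass
        # A link count of 2 means the lock name now points at our file.
        return os.stat(tmp_path).st_nlink == 2
    finally:
        # The temporary name is no longer needed either way.
        os.unlink(tmp_path)


def release_link_lock(lock_path):
    """Release the lock by deleting the agreed-upon lock file."""
    os.unlink(lock_path)


def swap_index_symlink(new_index_dir, link_path):
    """Point link_path at new_index_dir; return the previous target or None."""
    previous = os.readlink(link_path) if os.path.islink(link_path) else None
    # Hypothetical temporary link name; rename replaces link_path in one step.
    tmp_link = f"{link_path}.tmp.{os.getpid()}"
    os.symlink(new_index_dir, tmp_link)
    os.replace(tmp_link, link_path)
    return previous
```

A build process would call swap_index_symlink("index_123", "/shared/index")
after the new index is complete, then rename the returned previous
directory to index_old once no searcher can still be using it.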
