Re: [lucy-dev] Improving Lucy's locking code

Nick Wellnhofer Sat, 18 Feb 2017 07:22:52 -0800

On 18/02/2017 03:48, Peter Karman wrote:

> We already write the PID to Lucy lock files, so we can check if the process
> that created the lock is still running, yes?
>
> Or is that the very heuristic that you're wanting to move away from?

Yes, because it's unreliable. We don't detect whether another unrelated
process happens to reuse the PID. For example:


- We have an Indexer with PID 42.
- The machine crashes during indexing, leaving a lockfile with PID 42.
- The machine restarts and happens to assign PID 42 to another process
  before an Indexer runs.
- Any new Indexer will be locked out as long as this other process is
  running.
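
The failure mode above can be sketched in a few lines of Python. This is only
an illustration, not Lucy's actual code: the helper names `pid_is_alive` and
`lock_looks_held` are hypothetical, and the lock file is assumed to contain
nothing but the PID as plain text.

```python
import os

def pid_is_alive(pid: int) -> bool:
    """Return True if some process currently has this PID."""
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is delivered
    except ProcessLookupError:
        return False     # no process with this PID
    except PermissionError:
        return True      # a process exists but belongs to another user
    return True

def lock_looks_held(lock_path: str) -> bool:
    """Hypothetical heuristic: the lock counts as held if the recorded PID
    is alive. Assumes the file holds just the PID as text."""
    with open(lock_path) as fh:
        pid = int(fh.read().strip())
    # This cannot tell whether the live process is the Indexer that wrote
    # the file or an unrelated process that was assigned the same PID.
    return pid_is_alive(pid)
```

The check only answers "does any process have this PID?", so a stale lock
file naming a reused PID is indistinguishable from a lock that is really held.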

Right now, the only remedy is to manually delete the lock file. Fortunately,
this scenario is unlikely if only the indexing process terminates abnormally,
because PIDs won't get reused until they wrap around. Even if there's a system
crash, there's a good chance that an Indexer is started before the old PID is
reused.

On a shared volume like NFS, this problem is more pronounced. A single machine
that goes down or loses its network connection at the wrong moment will block
all other indexers until it gets back up and starts an indexing session.

Native locks are released by the operating system if a process crashes. This
even works on NFS after a certain timeout with modern client and server
implementations. Other than that, native shared locks should be faster on NFS.
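
As a rough sketch of what a native lock looks like, here is a Python example
using POSIX advisory locks via `fcntl.lockf`. It is not Lucy's implementation
(Lucy is written in C); the function name `try_lock` and the use of a
dedicated lock file are assumptions made for illustration.

```python
import fcntl
import os

def try_lock(path: str, shared: bool = False):
    """Try to take an advisory lock without blocking.

    Returns the open file descriptor on success, or None if another
    process holds a conflicting lock. Hypothetical helper.
    """
    # O_RDWR so that both shared (read) and exclusive (write) locks are allowed.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    mode = fcntl.LOCK_SH if shared else fcntl.LOCK_EX
    try:
        fcntl.lockf(fd, mode | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    # Keep fd open for as long as the lock is needed. The kernel drops the
    # lock when the descriptor is closed or the process exits, even on a crash.
    return fd
```

No PID needs to be stored or checked: if the holder dies, the lock goes away
with it. Note that these locks are per process, so contention only shows up
between different processes, not between two descriptors in the same one.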


Nick
