On Fri, Nov 18, 2011 at 5:18 PM, Marvin Humphrey <[email protected]> wrote:

>> The lockfile contains:
>> {
>>    "host": "host6",
>>    "name": "write",
>>    "pid": "24342"
>> }
>
> OK, all that looks correct.  Also, since the lockfile is still there and
> definitely corresponds to the process that crashed, we can assume that no
> other process has messed with the index directory since.
>
> Question: is there a seg_2 folder in the index dir?  If so, is there
> anything inside it?
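(Aside, for anyone following along at home: the host/pid stale-lock reasoning above can be sketched roughly as follows. This is a simplified Python sketch of the general technique, not Lucy's actual implementation, and `lock_is_stale` is a hypothetical helper name.)

```python
import json
import os
import socket

def lock_is_stale(lockfile_path):
    """Sketch of a stale-lock check: treat the lock as stale only if it
    was taken on this host by a process that no longer exists.
    (Hypothetical helper, not Lucy's real code.)"""
    with open(lockfile_path) as fh:
        info = json.load(fh)
    if info["host"] != socket.gethostname():
        # Lock belongs to another host; we can't probe its pid from
        # here, so conservatively treat it as live.
        return False
    try:
        os.kill(int(info["pid"]), 0)  # signal 0: existence check only
    except ProcessLookupError:
        return True                   # pid is gone -> stale lock
    except PermissionError:
        return False                  # pid exists, owned by another user
    return False
```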
Yes, seg_2 was there -- it had the same timestamp as the lockfile, implying
it was created by the same process that created the lockfile. And no, it
was empty.

> It could be NFS cache consistency: a deletion operation succeeds, and the
> item is really gone from the NFS drive, but the local cache of the NFS
> client doesn't get updated in time and a subsequent check on whether the
> item exists returns an incorrect result.
>
> http://nfs.sourceforge.net/#faq_a8
>
>     Perfect cache coherency among disparate NFS clients is very expensive
>     to achieve, so NFS settles for something weaker that satisfies the
>     requirements of most everyday types of file sharing.
>
> A tremendous amount of energy has gone into making NFS mimic local file
> system behaviors as closely as possible, both by the NFS devs and by us
> (see <http://incubator.apache.org/lucy/docs/perl/Lucy/Docs/FileLocking.html>),
> but it's a very hard problem and compromises are impossible to avoid.

Some food for thought, thanks. I'll start looking into my index-store
servers and their NFS exports.

> Best practice would be to avoid writing to Lucy indexes on NFS drives if
> possible.  Read performance is going to be lousy anyway unless you make
> the NFS mount read-only.

There's just too much data (and we needed redundancy), so the load has to be
spread across as many storage nodes as possible (we have separate
source-store and index-store servers). When all the indexing machines are
grinding away in unison, sucking from the source-store servers (via NFS) and
writing to the index-store servers (also via NFS), the load can get quite
high, so some NFS tuning has been done on the source and target servers.
Merging nodes read indexes from the index-store servers (via NFS) and write
them onto the search-server nodes (again via NFS). Searching is then always
local.
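Incidentally, the failure mode Marvin describes -- a deletion succeeds on the server but a stale client-side attribute cache still reports the file as present -- can be papered over (not fixed) by re-checking with a short backoff. A minimal sketch of that workaround, assuming nothing about Lucy itself; `really_gone` is a hypothetical helper, and on a local filesystem the first check always settles it:

```python
import os
import time

def really_gone(path, retries=5, delay=0.1):
    """Check that `path` is gone, tolerating a briefly stale NFS
    attribute cache by re-checking a few times with exponential
    backoff.  Sketch of a workaround, not something Lucy does."""
    for attempt in range(retries):
        if not os.path.exists(path):
            return True
        time.sleep(delay * (2 ** attempt))
    return False
```

Of course this only narrows the window; with `actimeo`-style attribute caching there is no client-side check that is guaranteed correct, which is the compromise the FAQ entry is getting at.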
The load at the moment is negligible, though, so I don't believe load is
what's tripping up NFS -- but with NFS you never know, so that's what I'll
be focusing on next. Thanks for your comments!
