[ http://issues.apache.org/jira/browse/LUCENE-673?page=comments#action_12443975 ]

Marvin Humphrey commented on LUCENE-673:
----------------------------------------
It seems that NFS doesn't support delete-upon-last-close semantics. That means that an IndexWriter can delete files out from underneath a cached IndexReader, and they're really gone, no? The result is a "Stale NFS Filehandle" exception.

> Exceptions when using Lucene over NFS
> -------------------------------------
>
>                 Key: LUCENE-673
>                 URL: http://issues.apache.org/jira/browse/LUCENE-673
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.0.0
>         Environment: NFS server/client
>            Reporter: Michael McCandless
>
> I'm opening this issue to track details on the known problems with
> Lucene over NFS.
>
> The summary is: if you have one machine writing to an index stored on
> an NFS mount, and other machine(s) reading (and periodically
> re-opening the index) then sometimes on re-opening the index the
> reader will hit a FileNotFound exception.
>
> This has hit many users because this is a natural way to "scale up"
> your searching (single writer, multiple readers) across machines. The
> best current workaround (I think?) is to take the approach Solr takes
> (either by actually using Solr or copying/modifying its approach) to
> take snapshots of the index and then have the readers open the
> snapshots instead of the "live" index being written to.
>
> I've been working on two patches for Lucene:
>
>   * A locking (LockFactory) implementation using native OS locks
>   * Lock-less commits
>
> (I'll open separate issues with the details for those).
>
> I have a simple stress test where one machine is constantly adding
> docs to an index over NFS, and another machine is constantly
> re-opening the index searcher over NFS.
>
> These tests have revealed new details (at least for me!) about the
> root cause of our NFS problems:
>
>   * Even when using native locks over NFS, Lucene still hits these
>     exceptions!
>
>     I was surprised by this because I had always thought (assumed?)
>     the NFS problem was because the "simple" file-based locking was
>     not correct over NFS, and that switching to native OS filesystem
>     locking would resolve it, but it doesn't.
>
>     I can reproduce the "FileNotFound" exceptions even when using NFS
>     V4 (the latest NFS protocol), so this is not just a "your NFS
>     server is too old" issue.
>
>   * Then, when running the same stress test with the lock-less
>     changes, I don't hit any exceptions. I've tested on NFS version
>     2, 3 and 4 (using the "nfsvers=N" mount option).
>
> I think this means that in fact (as Hoss at one point suggested I
> believe), the NFS problems are likely due to the cache coherence of
> the NFS file system (I think the "segments" file in particular)
> against the existence of the actual segment data files.
>
> In other words, even if you lock correctly, on the reader side it will
> sometimes see stale contents of the "segments" file which lead it to
> try to open a now deleted segment data file.
>
> So I think this is good news / bad news: the bad news is, native
> locking doesn't fix our problems with NFS (as at least I had expected
> it to). But the good news is, it looks like (still need to do more
> thorough testing of this) the changes for lock-less commits do enable
> Lucene to work fine over NFS.
>
> [One quick side note in case it helps others: to get native locks
> working over NFS on Ubuntu/Debian Linux 6.06, I had to "apt-get
> install nfs-common" on the NFS client machines. Before I did this I
> would hit "No locks available" IOExceptions on calling the "tryLock"
> method. The default nfs server install on the server machine just
> worked because it runs in kernel mode and it starts a lockd process.]

-- 
This message is automatically generated by JIRA.
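For readers who have not used the "tryLock" method mentioned in the side note of the quoted issue, here is a minimal sketch of a single native OS lock attempt using java.nio. It is only an illustration, not the LockFactory patch described in the issue: the class name NativeLockSketch and the method lockAndRelease are hypothetical. On an NFS client without a working lock daemon, the tryLock call is presumably where a "No locks available" IOException would surface, and it is left to propagate to the caller.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Lucene.
final class NativeLockSketch {

    /**
     * Makes one attempt to take an exclusive OS-level lock on lockFile,
     * then releases it. Returns true if the lock was obtained, false if
     * it is already held. Any IOException from the filesystem is not
     * caught here.
     */
    static boolean lockAndRelease(Path lockFile) throws IOException {
        try (FileChannel channel = FileChannel.open(
                lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock;
            try {
                // Non-blocking: returns null if another process holds the lock.
                lock = channel.tryLock();
            } catch (OverlappingFileLockException e) {
                // The lock is already held inside this JVM.
                return false;
            }
            if (lock == null) {
                return false;
            }
            lock.release();
            return true;
        }
    }
}
```

As the issue reports, obtaining such a lock successfully does not by itself prevent a reader from seeing stale "segments" contents over NFS; the sketch only shows what the locking call looks like.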