[ https://issues.apache.org/jira/browse/LUCENE-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12463358 ]
Michael McCandless commented on LUCENE-767:
-------------------------------------------

Carrying over from the java-dev list:

Grant Ingersoll wrote:

> Can you explain in more detail on this bug why this makes you nervous?

Well ... the only specific example I have is NFS (always my favorite
example!).

As I understand it, the NFS client typically uses a separate cache to
hold the "attributes" of the file, including file length. This cache
often has weaker or maybe just "different" guarantees than the "data
cache" that holds the file contents. So basically you can ask what
the file length is and get a wrong (stale) answer. EG see
http://nfs.sourceforge.net, which describes Linux's NFS client
approach. The NFS client on Apple's OS X seems to be even worse!

I think very likely Lucene may not trip up on this specifically since
a reader would only ask for this file's length for the first time once
the file is done being written (ie the commit of segments_N has
occurred) and so hopefully it's not in the attribute cache yet?

I think there may very well be cases of other filesystems where
"checking file length" is risky (that we all just don't know about
(yet!)), which is why I favor using explicit values instead of relying
on file system semantics, whenever possible.

Maybe I'm just too paranoid :)

But for all the places / devices Lucene has gone and will go, relying
on the bare minimum set of IO operations I think will maximize our
overall portability. Every filesystem has its quirks.
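
To make the distinction concrete, here is a minimal Java sketch of the two ways to obtain a document count. Everything in it is an assumption made for illustration: the file layout (a 4-byte count followed by one 8-byte pointer per document), the class name MaxDocSketch and its method names are hypothetical, and none of it is Lucene's actual file format or API. The derived value depends on the file system reporting an accurate length; the explicit value depends only on reading bytes that were written.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MaxDocSketch {
    // Assumption: each document occupies one fixed-width 8-byte pointer entry.
    static final int ENTRY_BYTES = Long.BYTES;

    /** Writes a hypothetical index file: an explicit int count, then one long per document. */
    static void write(Path file, int docCount) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            out.writeInt(docCount);          // explicit count stored in the file itself
            for (int i = 0; i < docCount; i++) {
                out.writeLong(i * 100L);     // placeholder pointer value
            }
        }
    }

    /** Derived count: correct only if the file system reports an up-to-date length. */
    static int maxDocFromLength(Path file) throws IOException {
        return (int) ((Files.size(file) - Integer.BYTES) / ENTRY_BYTES);
    }

    /** Explicit count: reads the stored value and never asks for the file length. */
    static int maxDocFromHeader(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            return in.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("maxdoc-sketch", ".idx");
        try {
            write(file, 5);
            System.out.println("derived from length = " + maxDocFromLength(file));
            System.out.println("read from header    = " + maxDocFromHeader(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

On a local file system the two values agree. The concern described above is a client whose cached file attributes are stale, in which case only the length-derived value could be wrong.
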
> maxDoc should be explicitly stored in the index, not derived from file length
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-767
>                 URL: https://issues.apache.org/jira/browse/LUCENE-767
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 1.9, 2.0.0, 2.0.1, 2.1
>            Reporter: Michael McCandless
>         Assigned To: Michael McCandless
>            Priority: Minor
>
> This is a spinoff of LUCENE-140
> In general we should rely on "as little as possible" from the file system.
> Right now, maxDoc is derived by checking the file length of the FieldsReader
> index file (.fdx) which makes me nervous. I think we should explicitly store
> it instead.
> Note that there are no known cases where this is actually causing a problem.
> There was some speculation in the discussion of LUCENE-140 that it could be
> one of the possible, but in digging / discussion there were no specifically
> relevant JVM bugs found (yet!). So this would be a defensive fix at this
> point.