[ 
https://issues.apache.org/jira/browse/LUCENE-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12463358
 ] 

Michael McCandless commented on LUCENE-767:
-------------------------------------------


Carrying over from the java-dev list:


Grant Ingersoll wrote:

> Can you explain in more detail on this bug why this makes you nervous?

Well ... the only specific example I have is NFS (always my favorite
example!).

As I understand it, the NFS client typically uses a separate cache to
hold the "attributes" of the file, including file length.  This cache
often has weaker or maybe just "different" guarantees than the "data
cache" that holds the file contents.  So basically you can ask what
the file length is and get a wrong (stale) answer.  EG see
http://nfs.sourceforge.net, which describes Linux's NFS client
approach.  The NFS client on Apple's OS X seems to be even worse!

I think very likely Lucene may not trip up on this specifically since
a reader would only ask for this file's length for the first time once
the file is done being written (ie the commit of segments_N has
occurred) and so hopefully it's not in the attribute cache yet?

I think there may very well be cases of other filesystems where
"checking file length" is risky (that we all just don't know about
(yet!)), which is why I favor using explicit values instead of relying
on file system semantics, whenever possible.

Maybe I'm just too paranoid :)

But for all the places / devices Lucene has gone and will go, relying
on the bare minimum set of IO operations I think will maximize our
overall portability.  Every filesystem has its quirks.


> maxDoc should be explicitly stored in the index, not derived from file length
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-767
>                 URL: https://issues.apache.org/jira/browse/LUCENE-767
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 1.9, 2.0.0, 2.0.1, 2.1
>            Reporter: Michael McCandless
>         Assigned To: Michael McCandless
>            Priority: Minor
>
> This is a spinoff of LUCENE-140
> In general we should rely on "as little as possible" from the file system.  
> Right now, maxDoc is derived by checking the file length of the FieldsReader 
> index file (.fdx) which makes me nervous.  I think we should explicitly store 
> it instead.
> Note that there are no known cases where this is actually causing a problem. 
> There was some speculation in the discussion of LUCENE-140 that it could be 
> one of the possible, but in digging / discussion there were no specifically 
> relevant JVM bugs found (yet!).  So this would be a defensive fix at this 
> point.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to