Re: [jira] Commented: (LUCENE-767) maxDoc should be explicitly stored in the index, not derived from file length

robert engels Tue, 09 Jan 2007 10:40:37 -0800

It would appear that NFS Version 2 is not suitable for Lucene. NFS Version 3 looks like it should work. See http://nfs.sourceforge.net/#section_a

I will take this opportunity to state again what I've always been told, and it seems to hold up: using NFS for shared, interactively updated files is always going to be troublesome. They have patched it over the years to help, but it just wasn't designed for this from the beginning.

Unix systems never even had file system locks. It was assumed that shared access to shared data would be accomplished via a shared server - not by sharing access to the data directly. It is far more efficient and robust to do things this way.

Modifying a shared Lucene directory via NFS directly is always going to be error prone.


Why not just implement a server/parallel index solution?

On Jan 9, 2007, at 12:25 PM, Michael McCandless (JIRA) wrote:

[ https://issues.apache.org/jira/browse/LUCENE-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12463358 ]
Michael McCandless commented on LUCENE-767:
-------------------------------------------


Carrying over from the java-dev list:


Grant Ingersoll wrote:
Can you explain in more detail on this bug why this makes you nervous?
Well ... the only specific example I have is NFS (always my favorite
example!).

As I understand it, the NFS client typically uses a separate cache to
hold the "attributes" of the file, including file length.  This cache
often has weaker or maybe just "different" guarantees than the "data
cache" that holds the file contents.  So basically you can ask what
the file length is and get a wrong (stale) answer.  EG see
http://nfs.sourceforge.net, which describes Linux's NFS client
approach.  The NFS client on Apple's OS X seems to be even worse!

I think very likely Lucene may not trip up on this specifically since
a reader would only ask for this file's length for the first time once
the file is done being written (ie the commit of segments_N has
occurred) and so hopefully it's not in the attribute cache yet?

I think there may very well be cases of other filesystems where
"checking file length" is risky (that we all just don't know about
(yet!)), which is why I favor using explicit values instead of relying
on file system semantics, whenever possible.

Maybe I'm just too paranoid :)

But for all the places / devices Lucene has gone and will go, relying
on the bare minimum set of IO operations I think will maximize our
overall portability.  Every filesystem has its quirks.
maxDoc should be explicitly stored in the index, not derived from file length
-----------------------------------------------------------------------------
                Key: LUCENE-767
                URL: https://issues.apache.org/jira/browse/LUCENE-767
            Project: Lucene - Java
         Issue Type: Improvement
   Affects Versions: 1.9, 2.0.0, 2.0.1, 2.1
           Reporter: Michael McCandless
        Assigned To: Michael McCandless
           Priority: Minor

This is a spinoff of LUCENE-140
In general we should rely on "as little as possible" from the filesystem. Right now, maxDoc is derived by checking the file length of the FieldsReader index file (.fdx), which makes me nervous. I think we should explicitly store it instead.

Note that there are no known cases where this is actually causing a problem. There was some speculation in the discussion of LUCENE-140 that it could be one of the possible causes, but in digging / discussion there were no specifically relevant JVM bugs found (yet!). So this would be a defensive fix at this point.
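
A minimal sketch of the difference described above, not Lucene's actual code or file format: it assumes a hypothetical layout with an int document count followed by one 8-byte pointer per document, and compares a count derived from the file length with the count read from the header. The class name MaxDocSketch and its methods are illustrative only.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MaxDocSketch {
    // Assumption for this sketch: one 8-byte pointer per document.
    static final int BYTES_PER_ENTRY = 8;

    // Hypothetical layout: an int count header, then one long per document.
    static void writeIndex(Path file, long[] pointers) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            out.writeInt(pointers.length);
            for (long p : pointers) {
                out.writeLong(p);
            }
        }
    }

    // Derived count: depends on the filesystem reporting the correct length.
    static int derivedMaxDoc(Path file) throws IOException {
        return (int) ((Files.size(file) - Integer.BYTES) / BYTES_PER_ENTRY);
    }

    // Stored count: read from the file contents, independent of reported length.
    static int storedMaxDoc(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            return in.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sketch", ".fdx");
        try {
            writeIndex(file, new long[] {0L, 42L, 97L});
            System.out.println("derived=" + derivedMaxDoc(file)
                    + " stored=" + storedMaxDoc(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

On a local filesystem both values agree; the stored count is the one that does not depend on the filesystem reporting an up-to-date length.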
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
