[jira] Commented: (LUCENE-1314) IndexReader.clone

Michael McCandless (JIRA) Fri, 26 Dec 2008 04:40:17 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659235#action_12659235
 ]


Michael McCandless commented on LUCENE-1314:
--------------------------------------------


OK I reviewed the patch; some comments:

  * We have clone & reopen methods on *Reader (ParallelReader,
    MultiReader) that are not synchronized; shouldn't they be
    synchronized as well?

  * You are still re-loading segments_N when cloning, which is
    incorrect (in fact I had fixed this in my patch above but it got
    lost); you should just clone segmentInfos instead.

  * In SegmentReader.java we have "if (doClone) acquireWriteLock();",
    which isn't right?  Ie, if this reader does not currently have the
    write lock (it has no "local mods") it should not acquire it?  One
    should be allowed to clone a stale reader?

  * Have you done any tests to see the cost of the copy-on-write
    cloning of deleted docs BitVector & norms?  The first new mod to
    the cloned reader pays that penalty.  Marvin's "tombstone"
    deletions would bring this penalty to near zero, but it's a big
    change (and should certainly be decoupled from this!).

  * In IndexReader.clone()'s javadocs can you state that on cloning a
    reader with pending modifications, the original reader then
    becomes readOnly (in addition to passing the write lock to the
    cloned reader)?

  * Can you rename IndexReader.cloneBitVector --> cloneDeletedDocs?

  * When you clone the deleted docs (because refCount is > 1) you are
    first decRef'ing the old one and then making the clone... can you
    change that so the decRef is done last?  I don't think any actual
    bug would result from the way it is now, but let's be defensive
    (don't decRef something until you really are done using it).

  * Class Ref does not need to delcare its private int refCount as
    volatile; you always access that member from a synchronized
    context.

  * In SegmentReader.clone, you incRef the deleteDocsCopyOnWriteRef,
    but it might be null, right?  Oh I see, it's always init'd to a
    new Ref() but then you make a new Ref() again on the first
    delete.  Can you make it null by default?  (Seems like it should
    be null if deletedDocs is).

  * Can you change Ref() so that it init's its refCount to 1?  This
    way new Ref() need not immediately call incRef, and a Ref() with
    refCount 0 is never allowed out into the wild, and you should then
    add an "assert refCount > 0" in incRef as well.

  * In SegmentReader.doClose() you are failing to call
    deletedDocsCopyOnWriteRef.decRef(), so you have a refCount leak.
    Can you create a unit test that 1) opens reader 1, 2) does deletes
    on reader 1, 3) clones reader 1 --> reader 2, 4) closes reader 1,
    5) deletes more docs with reader 1, and 6) asserts that the
    deletedDocs BitVector did not get cloned?  First verify the test
    fails, then fix the bug...

  * How should unDeleteAll() work?  EG it seems like you must decRef
    the Ref, then set it to null?  Can you add some tests for the
    various permutations of this?

  * SegmentReader.Norm now has two refCounts, and I think both are
    necessary. One tracks refs to the Norm instance itself and the
    other tracks refs to the byte[].  Can you add some comments
    explaining the difference (because it's confusing at first blush)?

  * I think the answer to your question "// is it necessary to clone
    everything?" in SegmentReader.Norm.clone() is "yes"; can you
    make sure you are cloning everything and then remove that comment?

  * In SegmentReader.doClose() we are also failing to decRef Norm
    instances (I think this is a pre-existing bug) and the newly added
    Refs to the byte[]'s as well.  Can you fix both of these?

  * In SegmentReader.Norm.cloneBytes() don't you need to create a new
    bytesRef as well, because you are making a private copy at that
    point?  Oh I see, you do this up in SegmentReader.doSetNorm; can
    you move it (= making a new bytesRef) down into cloneBytes()?
    Also, do the decRef of the old Ref last, not first, just like the
    deletedDocs case.

  * I think you might have an off-by-one error in Norm cloning.  A
    newly cloned norm seems to share its byte[] with the original put
    gets a bytesRef with refCount 1?  (Should be refCount 2?).

  * Can you rename deletedDocsCopyOnWriteRef -> deletedDocsRef?

  * Can you add "new and experimental" caveats to the package-private
    APIs you've added (cloneNormBytes, cloneDeletedDocs)?



> IndexReader.clone
> -----------------
>
>                 Key: LUCENE-1314
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1314
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.3.1
>            Reporter: Jason Rutherglen
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1314.patch, LUCENE-1314.patch, LUCENE-1314.patch, 
> LUCENE-1314.patch, LUCENE-1314.patch, LUCENE-1314.patch, LUCENE-1314.patch, 
> lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, 
> lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, 
> lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, lucene-1314.patch
>
>
> Based on discussion 
> http://www.nabble.com/IndexReader.reopen-issue-td18070256.html.  The problem 
> is reopen returns the same reader if there are no changes, so if docs are 
> deleted from the new reader, they are also reflected in the previous reader 
> which is not always desired behavior.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

[jira] Commented: (LUCENE-1314) IndexReader.clone

Reply via email to