Re: To clone or have a pluggable docidbitset for IndexReader

Jason Rutherglen Tue, 16 Dec 2008 14:23:33 -0800

> ie "snapshot" the previous reader without going through disk as
> intermediary, right?


Yes.

>  refuses any further changes to itself (freezes itself)

Should I create a new variable for "refuse updates/freeze" or use readonly?
If the variable is true, should doClose throw an exception?  If someone
tries to clone a cloned reader that has no lock, then the resulting reader
is frozen as well.  Let's define more precisely how a frozen reader behaves:
which exceptions are thrown, and from where.
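
To make the open questions concrete, here is a minimal sketch of the freeze
rule as discussed in this thread. It is not the LUCENE-1314 patch.
SketchReader, cloneReader and the boolean returned by close are hypothetical
names for illustration, IllegalStateException is only a placeholder since the
exception type is still undecided, and the deletions are copied eagerly to
keep the example short.

```java
import java.util.BitSet;

// Toy model only: SketchReader is a hypothetical stand-in, not Lucene's IndexReader.
class SketchReader {
    private final BitSet deleted;   // deletions this reader reflects
    private boolean holdsWriteLock; // true while this reader owns pending changes
    private boolean frozen;         // true once a clone has taken over the lock

    SketchReader(int maxDoc) {
        this(new BitSet(maxDoc), false, false);
    }

    private SketchReader(BitSet deleted, boolean holdsWriteLock, boolean frozen) {
        this.deleted = deleted;
        this.holdsWriteLock = holdsWriteLock;
        this.frozen = frozen;
    }

    void deleteDocument(int doc) {
        if (frozen) {
            // Assumption: the thread has not settled on an exception type.
            throw new IllegalStateException("reader is frozen; apply changes to its clone");
        }
        holdsWriteLock = true; // stand-in for acquiring the write lock
        deleted.set(doc);
    }

    boolean isDeleted(int doc) { return deleted.get(doc); }

    boolean isFrozen() { return frozen; }

    SketchReader cloneReader() {
        // Eager copy keeps the sketch short; the proposal itself is copy on write.
        // A frozen reader without the lock yields a clone that is frozen as well.
        SketchReader clone = new SketchReader((BitSet) deleted.clone(), holdsWriteLock, frozen);
        if (holdsWriteLock) {
            holdsWriteLock = false; // the lock moves to the clone
            frozen = true;          // this reader refuses further changes
        }
        return clone;
    }

    // Returns true if closing would commit pending changes to disk.
    boolean close() {
        boolean commits = holdsWriteLock;
        holdsWriteLock = false;
        return commits;
    }
}

class CloneFreezeDemo {
    public static void main(String[] args) {
        SketchReader original = new SketchReader(8);
        original.deleteDocument(3);                  // pending change, takes the lock
        SketchReader clone = original.cloneReader(); // lock transfers, original freezes
        clone.deleteDocument(5);                     // only the clone accepts updates
        System.out.println("original frozen: " + original.isFrozen());
        System.out.println("original sees doc 5 deleted: " + original.isDeleted(5));
        System.out.println("original commits on close: " + original.close());
        System.out.println("clone commits on close: " + clone.close());
    }
}
```

In this sketch a frozen reader closes silently without committing. Whether
doClose should throw instead is exactly the question above.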

Copy on write for norms and deleted docs is implemented.  Now that we've
more or less agreed on the rules, I can make sure the unit tests cover
them.
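
For readers following along, a minimal sketch of what "copy on write" means
for the deleted docs. It assumes a simple reference count on a shared BitSet.
SharedBits and CowDeletes are hypothetical names, the code is not thread-safe,
and it is not taken from the patch.

```java
import java.util.BitSet;

// Hypothetical holder: a bit set plus a count of the readers that share it.
class SharedBits {
    final BitSet bits;
    int refCount = 1;

    SharedBits(BitSet bits) {
        this.bits = bits;
    }
}

// Hypothetical per-reader view of the deleted docs. Not a Lucene class.
class CowDeletes {
    private SharedBits shared;

    CowDeletes(int maxDoc) {
        this.shared = new SharedBits(new BitSet(maxDoc));
    }

    private CowDeletes(SharedBits shared) {
        this.shared = shared;
    }

    // Called on clone/reopen: the new view shares the bits, nothing is copied.
    CowDeletes share() {
        shared.refCount++;
        return new CowDeletes(shared);
    }

    void delete(int doc) {
        if (shared.refCount > 1) {
            // Another view still reads these bits: take a private copy first.
            shared.refCount--;
            shared = new SharedBits((BitSet) shared.bits.clone());
        }
        shared.bits.set(doc);
    }

    boolean isDeleted(int doc) {
        return shared.bits.get(doc);
    }

    boolean sharesWith(CowDeletes other) {
        return shared == other.shared;
    }
}

class CowDeletesDemo {
    public static void main(String[] args) {
        CowDeletes original = new CowDeletes(8);
        original.delete(1);
        CowDeletes clone = original.share();
        System.out.println("shared before write: " + clone.sharesWith(original));
        clone.delete(2); // first write through the clone triggers the copy
        System.out.println("shared after write: " + clone.sharesWith(original));
        System.out.println("original sees doc 2 deleted: " + original.isDeleted(2));
    }
}
```

The same pattern would apply to norms, with a byte array per field in place
of the bit set.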

> If this makes sense, can you update the patch on LUCENE-1314 to enforce
> these semantics?

Yes.

> I think we should get this in for 2.9?

Should be doable!

On Tue, Dec 16, 2008 at 12:55 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

>
> So it seems like a cloned reader would share everything with the
> previous reader, but these rules would be enforced:
>
>  * If the old reader had pending changes (held the write lock) when
>    it was cloned, it 1) transfers the write lock to the clone, 2)
>    refuses any further changes to itself (freezes itself), 3)
>    continues to reflect the pending changes, and 4) will not commit
>    its changes to disk when it's closed.  Ie it freezes itself into
>    a "point in time" snapshot, just not via an on-disk index.
>
>  * If any changes (to deletions or norms) are done with the new
>    reader, it then makes a private copy ("copy on write").  This
>    would apply to reopen too, since clone & reopen share the same
>    code; so this is an "improvement" over the current reopen
>    semantics and we should fix the javadocs saying so.
>
> It seems like the only reason to clone would be if you intend to
> [further] change deletions or norms but still want to use the previous
> reader w/ the unchanged deletions and norms, ie "snapshot" the
> previous reader without going through disk as intermediary, right?
>
> I think this is a reasonable use case.  Since an IndexReader can still
> make changes (something I think we should eventually move away from,
> but cannot, yet, because of the immediacy of deletions that
> IndexReader offers), cloning is an important tool to let you make an
> efficient "point in time" snapshot (without having to go through the
> Directory).
>
> If this makes sense, can you update the patch on LUCENE-1314 to
> enforce these semantics?  I think we should get this in for 2.9?
>
> Mike
>
>
> Jason Rutherglen wrote:
>
>> Mike,
>>
>> > needing a fast way to swap in your own deleted docs?
>>
>> Yes, however it is necessary to have a new IndexReader as well from a
>> "reopened" reader.  So clone seems the best approach (unless there's a way
>> I'm not seeing).  The clone code is coming along, and the norms test seems
>> to pass, as long as rules similar to those for reopen are followed, such
>> as this one from the javadoc: "The re-opened reader
>> instance and the old instance might share the same resources. For this
>> reason no index modification operations (e. g. deleteDocument(int),
>> setNorm(int, String, byte)) should be performed using one of the readers
>> until the old reader instance is closed. Otherwise, the behavior of the
>> readers is undefined.".
>>
>> I think the clone method javadoc should read "After cloning a reader, the
>> original reader will throw exceptions on index modification operations (e.
>> g. deleteDocument(int), setNorm(int, String, byte))".  This way one may read
>> from the original, but the cloned reader (new reader) may accept updates.
>>  This happens by way of automatically releasing the lock on clone (does this
>> cause any unforeseen problems?).
>>
>> Jason
>>
>> On Tue, Dec 16, 2008 at 7:00 AM, Michael McCandless <
>> luc...@mikemccandless.com> wrote:
>>
>> Jason,
>>
>> Is your need for IndexReader.clone entirely driven by needing a fast way
>> to swap in your own deleted docs?
>>
>> Meaning, if you could plug in your own deleted docs to a reader (somehow),
>> would you not use clone anymore?
>>
>> Mike
>>
>>
>> Jason Rutherglen wrote:
>>
>> Hello,
>>
>> In trying to figure out the best way to build a realtime system in which
>> the deletedDocs do not need to be saved, I see two possible methods:
>> 1) setting the DocIdBitSet manually (which breaks saving, among other
>> things, but does not require norms cloning), or 2) implementing
>> IndexReader.clone, which requires "copy on write" for deletedDocs and
>> norms.
>>
>> The discussion about reopen
>> (https://issues.apache.org/jira/browse/LUCENE-743) was lengthy, and I can
>> see from the code and the discussion why no one wants to revisit
>> IndexReader.reopen in the form of IndexReader.clone and possibly mess
>> things up.
>>
>> Is there some easier alternative API that I'm missing?
>>
>> -J
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
>>
>>
>>