[ 
https://issues.apache.org/jira/browse/LUCENE-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856509#action_12856509
 ] 

Earwin Burrfoot commented on LUCENE-2355:
-----------------------------------------

- Norms do both backward and forward sharing of their byte array. E.g., we clone 
a reader, then load up norms on the clone, and the original sees and uses the 
same byte array.
This is really complex and likely no one ever needs it. Even if you do hit such 
a situation, and norms are loaded twice, first for a clone and then for the 
original reader, it is transient and norms are really smallish compared to 
other memory hogs, so we can live with it.
I'm nixing this sharing. Successfully loaded norms are simply inherited from 
papa-reader.
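
To make the one-way inheritance concrete, here is a minimal sketch. It is not the patch and not Lucene's SegmentReader: SketchReader, cloneReader(), loadFromDisk() and loadCount() are hypothetical names made up for illustration. A clone copies references to whatever norms the parent has loaded at clone time; anything loaded later stays private to the reader that loaded it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a segment reader; not Lucene's actual class. */
final class SketchReader {
    // field name -> norms bytes that this reader has successfully loaded
    private final Map<String, byte[]> norms = new HashMap<>();
    // counts simulated disk loads performed by this reader only
    private int loads = 0;

    /** Clone: inherit references to norms the parent has loaded so far. */
    synchronized SketchReader cloneReader() {
        SketchReader child = new SketchReader();
        child.norms.putAll(this.norms);
        return child;
    }

    /** Returns norms for a field, loading them lazily on first access. */
    synchronized byte[] norms(String field) {
        byte[] bytes = norms.get(field);
        if (bytes == null) {
            bytes = loadFromDisk(field);
            norms.put(field, bytes); // visible to this reader and its future clones only
        }
        return bytes;
    }

    // Assumption: placeholder for the real file read.
    private byte[] loadFromDisk(String field) {
        loads++;
        return new byte[] {1, 2, 3};
    }

    synchronized int loadCount() {
        return loads;
    }
}
```

With this shape, norms loaded on a clone are never seen by the original, so the original loads its own copy if it asks for the same field later. That is the transient double load described above.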

> Refactor Directory/Multi/SegmentReader creation/reopening/cloning/closing
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-2355
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2355
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Earwin Burrfoot
>
> *Reader lifecycle evolved over time to become a heavily tangled mess. It's 
> hard to understand what's going on there, and it's even harder to add some 
> fields/logic while ensuring that all possible code paths preserve these 
> fields/interact with the logic properly. While some of said mess is justified 
> by the task at hand, a big part is just badly done copypaste and can be 
> removed.
> I am currently refactoring this and intended to open an issue with a working 
> patch, but the task wound up somewhat bigger than I expected, so I'm opening 
> it earlier to track stuff encountered/changed/fixed.
> The list is by no means exhaustive.
> - an iteration to create SRs is copypasted several times, one of them (IW) 
> with wrong iteration bound
> - it is also overly complex and can be folded for create/reopen cases
> - readers sent to IndexReaderWarmer are termindexless/docstoreless on some 
> occasions
> - it is possible to clone() your way to a readwrite NRT reader
> - IndexDeletionPolicy is not always preserved through clones/reopens
> - cloned readers share CoreReaders and, consequently, updated 
> termsIndex/docStores
> - threadlocal versions of fieldsReader/termsVector are bound to SR, not 
> CoreReaders and thus are recreated on clone/reopen
> - double-initialization for some fields (someone got lost and did this to be 
> sure I guess), stupid assert checks ( qwe = new(); assert qwe != null )
> - SR is not always recreated when compound status of underlying segment 
> changes
> - deleting an already deleted doc marks deletions dirty and rewrites them
> - lots of synchronization is done around Reader, while it can be narrowed 
> down to norms/deletions/whatever
> I did some structural modifications:
> - CompositeReader extracts common code from DirectoryReader and MultiReader 
> (complete)
> - ReadonlyDirectoryReader and ReadonlySegmentReader are dead, 
> MutableD/SReaders are introduced and carry all modification logic/fields (DR 
> complete, SR in progress)
> - WriterBackedReader encapsulates NRT reader logic (complete)
> - CoreReaders split into CoreReaders, DocStores, TermInfos. All of these are 
> immutable and SR is cloned when you need to change its mode (in progress)
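
As an illustration of the "deleting already deleted doc" item in the list above, one way to avoid the needless rewrite is to mark deletions dirty only when a bit actually changes. This is a sketch under assumptions, not the patch: SketchDeletions and its methods are hypothetical names, and java.util.BitSet stands in for whatever bit vector the reader really uses.

```java
import java.util.BitSet;

/** Hypothetical deletions holder; not Lucene's actual class. */
final class SketchDeletions {
    private final BitSet deleted = new BitSet();
    private boolean dirty = false;

    /** Marks a doc deleted; returns true only if it was live before this call. */
    boolean delete(int docId) {
        if (deleted.get(docId)) {
            return false; // already deleted: nothing changed, nothing to rewrite
        }
        deleted.set(docId);
        dirty = true;
        return true;
    }

    /** True if there are deletions that have not been written out yet. */
    boolean isDirty() {
        return dirty;
    }

    /** Called after deletions have been written out. */
    void markWritten() {
        dirty = false;
    }
}
```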
