Refactor Directory/Multi/SegmentReader creation/reopening/cloning/closing
-------------------------------------------------------------------------

                 Key: LUCENE-2355
                 URL: https://issues.apache.org/jira/browse/LUCENE-2355
             Project: Lucene - Java
          Issue Type: Improvement
            Reporter: Earwin Burrfoot


*Reader lifecycle has evolved over time into a heavily tangled mess. It's 
hard to understand what's going on there, and it's even harder to add 
fields/logic while ensuring that all possible code paths preserve these 
fields/interact with the logic properly. While some of said mess is justified 
by the task at hand, a big part is just badly done copy-paste and can be removed.

I am currently refactoring this and intended to open an issue with a working 
patch, but the task wound up somewhat bigger than I expected, so I'm opening 
it earlier to track stuff encountered/changed/fixed.
The list is by no means exhaustive.
- an iteration to create SRs is copy-pasted several times, one of them (IW) with 
a wrong iteration bound
- it is also overly complex and can be folded for create/reopen cases
- readers sent to IndexReaderWarmer are termindexless/docstoreless on some 
occasions
- it is possible to clone() your way to a read-write NRT reader
- IndexDeletionPolicy is not always preserved through clones/reopens
- cloned readers share CoreReaders and, consequently, updated 
termsIndex/docStores
- threadlocal versions of fieldsReader/termsVector are bound to SR, not 
CoreReaders and thus are recreated on clone/reopen
- double-initialization for some fields (someone got lost and did this to be 
sure I guess), stupid assert checks ( qwe = new(); assert qwe != null )
- SR is not always recreated when compound status of underlying segment changes
- deleting an already deleted doc marks deletions dirty and rewrites them
- lots of synchronization is done around Reader, while it can be narrowed down 
to norms/deletions/whatever
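
To illustrate the "deleting an already deleted doc" item, here is a minimal 
sketch of the behaviour I'd expect: the dirty flag flips only when a live doc 
actually becomes deleted. DeletionTracker and its methods are hypothetical 
names made up for this example, and java.util.BitSet stands in for whatever 
the reader really uses, so treat it as an assumption, not the actual code.

```java
import java.util.BitSet;

// Hypothetical stand-in for a segment's deletion state; not Lucene's classes.
final class DeletionTracker {
    private final BitSet deleted;
    private boolean dirty; // true when deletions would have to be rewritten

    DeletionTracker(int maxDoc) {
        this.deleted = new BitSet(maxDoc);
    }

    /** Marks docId as deleted; returns true only if the doc was live before. */
    boolean delete(int docId) {
        if (deleted.get(docId)) {
            return false; // already deleted: leave the dirty flag untouched
        }
        deleted.set(docId);
        dirty = true;
        return true;
    }

    boolean isDirty() {
        return dirty;
    }

    /** Called after deletions have been written out. */
    void markClean() {
        dirty = false;
    }
}
```

With this shape, a second delete(5) after a flush returns false and leaves 
isDirty() false, so nothing gets rewritten.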

I did some structural modifications:
- CompositeReader extracts common code from DirectoryReader and MultiReader 
(complete)
- ReadonlyDirectoryReader and ReadonlySegmentReader are dead, MutableD/SReaders 
are introduced and carry all modification logic/fields (DR complete, SR in 
progress)
- WriterBackedReader encapsulates NRT reader logic (complete)
- CoreReaders split into CoreReaders, DocStores, TermInfos. All of these are 
immutable and SR is cloned when you need to change its mode (in progress)
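
A rough sketch of what "immutable parts, clone to change mode" could look 
like. All class names here (SketchCore, SketchTermInfos, SketchDocStores, 
SketchSegmentReader) and their fields are invented for illustration and are 
not the classes in the patch; the only point is that changing mode yields a 
new reader and never mutates state shared with existing clones.

```java
// Hypothetical classes: names and fields are illustrative, not the patch's code.
final class SketchCore {
    final String segmentName;
    SketchCore(String segmentName) { this.segmentName = segmentName; }
}

final class SketchTermInfos {
    final int indexDivisor;
    SketchTermInfos(int indexDivisor) { this.indexDivisor = indexDivisor; }
}

final class SketchDocStores {
    final String storeSegment;
    SketchDocStores(String storeSegment) { this.storeSegment = storeSegment; }
}

final class SketchSegmentReader {
    private final SketchCore core;           // always present, shared by clones
    private final SketchTermInfos termInfos; // null means "termindexless"
    private final SketchDocStores docStores; // null means "docstoreless"

    SketchSegmentReader(SketchCore core, SketchTermInfos termInfos,
                        SketchDocStores docStores) {
        this.core = core;
        this.termInfos = termInfos;
        this.docStores = docStores;
    }

    /** Returns a new reader with the terms index attached; this one is unchanged. */
    SketchSegmentReader withTermInfos(SketchTermInfos loaded) {
        return new SketchSegmentReader(core, loaded, docStores);
    }

    /** Returns a new reader with doc stores attached; this one is unchanged. */
    SketchSegmentReader withDocStores(SketchDocStores opened) {
        return new SketchSegmentReader(core, termInfos, opened);
    }

    SketchCore core() { return core; }
    SketchTermInfos termInfos() { return termInfos; }
    SketchDocStores docStores() { return docStores; }
}
```

Since every field is final, an older clone can never observe a terms index or 
doc store that was loaded later through some other reader.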

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


