[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Busch updated LUCENE-2324:
----------------------------------

    Attachment: lucene-2324.patch

The patch removes all *PerThread classes downstream of DocumentsWriter. This simplifies a lot of the flushing logic in the different consumers. The patch also removes FreqProxMergeState, because we no longer have to interleave posting lists from different threads. I really like these simplifications!

There is still a lot to do: The changes in DocumentsWriter and IndexWriter are currently just experimental, made only to get everything to compile. Next I will introduce DocumentsWriterPerThread and implement the sequenceID logic (which was discussed here in earlier comments) and the new RAM management. I also want to go through the indexing chain once again; there are probably a few more things to clean up or simplify.

The patch compiles, and a surprising number of tests actually pass. Only multi-threaded tests seem to fail, which is not very surprising, considering I removed all thread-handling logic from DocumentsWriter. :)

So this patch isn't working yet. I just wanted to post my current progress.

> Per thread DocumentsWriters that write their own private segments
> -----------------------------------------------------------------
>
>                 Key: LUCENE-2324
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2324
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Michael Busch
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: lucene-2324.patch, LUCENE-2324.patch
>
>
> See LUCENE-2293 for motivation and more details.
> I'm copying here Mike's summary he posted on 2293:
> Change the approach for how we buffer in RAM to a more isolated
> approach, whereby IW has N fully independent RAM segments
> in-process and when a doc needs to be indexed it's added to one of
> them. Each segment would also write its own doc stores and
> "normal" segment merging (not the inefficient merge we now do on
> flush) would merge them.
> This should be a good simplification in
> the chain (eg maybe we can remove the *PerThread classes). The
> segments can flush independently, letting us make much better
> concurrent use of IO & CPU.
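
For readers skimming the thread, the isolated-buffer idea described in the issue summary can be sketched roughly as follows. This is only a toy illustration under stated assumptions, not the code in the attached patch: PrivateSegmentIndexer, RamSegment, and every method name below are hypothetical, documents are plain strings, a "flush" just moves a buffer into a list, and the sequenceID logic and RAM accounting mentioned in the comment are left out entirely. The one point it tries to show is that each buffer is filled and flushed on its own, so nothing has to be interleaved across threads at flush time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: these class and method names are illustrative
// assumptions and are not part of Lucene or of the attached patch.
class PrivateSegmentIndexer {

    /** One in-RAM buffer; it belongs to a single slot and is never combined with others in RAM. */
    static final class RamSegment {
        private final List<String> docs = new ArrayList<>();

        void add(String doc) { docs.add(doc); }

        int size() { return docs.size(); }

        /** Copies the buffered docs into a new list (the "flushed segment") and empties the buffer. */
        List<String> drain() {
            List<String> out = new ArrayList<>(docs);
            docs.clear();
            return out;
        }
    }

    private final RamSegment[] segments;
    private final int maxBufferedDocs;
    private final List<List<String>> flushed = new ArrayList<>();

    PrivateSegmentIndexer(int numSegments, int maxBufferedDocs) {
        this.segments = new RamSegment[numSegments];
        for (int i = 0; i < numSegments; i++) {
            segments[i] = new RamSegment();
        }
        this.maxBufferedDocs = maxBufferedDocs;
    }

    /** Adds a doc to the buffer chosen for the calling thread; flushes that buffer alone when full. */
    void addDocument(String doc) {
        // Assumption: a simple thread-id modulo picks the slot; a real design may bind threads differently.
        int slot = (int) (Thread.currentThread().getId() % segments.length);
        RamSegment seg = segments[slot];
        List<String> full = null;
        synchronized (seg) { // lock only this buffer, not the whole indexer
            seg.add(doc);
            if (seg.size() >= maxBufferedDocs) {
                full = seg.drain();
            }
        }
        if (full != null) {
            publish(full);
        }
    }

    /** Flushes whatever is still buffered, one buffer at a time. */
    void flushAll() {
        for (RamSegment seg : segments) {
            List<String> rest;
            synchronized (seg) {
                rest = seg.drain();
            }
            if (!rest.isEmpty()) {
                publish(rest);
            }
        }
    }

    private void publish(List<String> segment) {
        synchronized (flushed) {
            flushed.add(segment);
        }
    }

    /** Returns a snapshot of the segments flushed so far. */
    List<List<String>> flushedSegments() {
        synchronized (flushed) {
            return new ArrayList<>(flushed);
        }
    }
}
```

In this toy version, a single thread adding five documents with maxBufferedDocs set to 2 produces two flushed segments of two documents each, and a later flushAll() call writes the remaining document as a third segment. Combining those small segments afterwards would be the job of the "normal" segment merging mentioned in the summary, which the sketch does not attempt to model.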