[
https://issues.apache.org/jira/browse/LUCENE-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12972531#action_12972531
]
Jason Rutherglen commented on LUCENE-2814:
------------------------------------------
bq. I think we've got DWPT down to about as bite sized as it can be (it's still
gonna be big!)
Indeed!
bq. I think coordinating on IRC #lucene is a good idea?
It'd be nice if there were a log of IRC #lucene, otherwise I prefer Jira.
bq. It seems like LUCENE-2573 needs to be incorporated into IW's new
FlushControl class
Right, into the DWPT branch.
> stop writing shared doc stores across segments
> ----------------------------------------------
>
> Key: LUCENE-2814
> URL: https://issues.apache.org/jira/browse/LUCENE-2814
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 3.1, 4.0
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Attachments: LUCENE-2814.patch, LUCENE-2814.patch, LUCENE-2814.patch
>
>
> Shared doc stores enable the files for stored fields and term vectors to be
> shared across multiple segments. We've had this optimization since 2.1 I
> think.
> It works best against a new index, where you open an IW, add lots of docs,
> and then close it. In that case all of the written segments will reference
> slices of a single shared doc store segment.
> This was a good optimization because it means we never need to merge these
> files. But, when you open another IW on that index, it writes a new set of
> doc stores, and then whenever merges take place across doc stores, they must
> now be merged.
> However, since we switched to shared doc stores, there have been two
> optimizations for merging the stores. First, we now bulk-copy the bytes in
> these files if the field name/number assignment is "congruent". Second, we
> now force congruent field name/number mapping in IndexWriter. This means
> the shared doc stores optimization is much less potent than it used to be.
> Furthermore, the optimization adds *a lot* of hair to
> IndexWriter/DocumentsWriter; this has been the source of sneaky bugs over
> time, and causes odd behavior like a merge possibly forcing a flush when it
> starts. Finally, with DWPT (LUCENE-2324), which gets us truly concurrent
> flushing, we can no longer share doc stores.
> So, I think we should turn off the write-side of shared doc stores to pave
> the path for DWPT to land on trunk and simplify IW/DW. We still must support
> reading them (until 5.0), but the read side is far less hairy.