[
https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12979382#action_12979382
]
Jason Rutherglen commented on LUCENE-2324:
------------------------------------------
bq. Once flush is triggered, the thread doing the flushing is free to flush any
DWPT.
OK.
bq. OK let's start there and put back re-use only if we see a real perf issue?
I think that's best. Balancing RAM isn't implemented in the branch, and we can't
predict the future usage of DWPT(s): they could languish, consuming RAM with
byte[]s well after they're flushed, because of a sudden drop in the number of
calling threads external to IW.
{quote}But it's really a "nuke the world" option which scares me. EG it could
be a looong indexing session (app doesn't call commit() until the end) and we
could be throwing away a lot of progress.{quote}
Right. Another option is to try to flush all segments on commit, meaning that
even if one DWPT/segment aborts, we continue on with the other DWPTs (ie, a best
effort). Then perhaps throw an exception with a report of which segment
flushes succeeded, or simply return a report object detailing what happened
during commit (somewhat expert usage though). Either way I think we need to
give a few options to the user, then choose a default and see if it sticks.
The default should probably be "best effort".
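
To make the "best effort" idea concrete, here is a rough sketch of what a report object could look like. Nothing below exists in Lucene or in the branch: BestEffortCommit, SegmentFlusher, CommitReport, flushAll and the flusher() demo helper are all made-up names, and a real DWPT flush involves far more than this stand-in. It only shows the control flow: try every segment, record failures, keep going.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch only; none of these types are part of Lucene. */
class BestEffortCommit {

    /** Stand-in for a DWPT that can flush its own private segment. */
    interface SegmentFlusher {
        String segmentName();
        void flush() throws Exception;
    }

    /** Records which segment flushes succeeded and which failed. */
    static final class CommitReport {
        private final List<String> succeeded = new ArrayList<>();
        private final List<String> failed = new ArrayList<>();

        List<String> succeeded() { return Collections.unmodifiableList(succeeded); }
        List<String> failed() { return Collections.unmodifiableList(failed); }
        boolean allSucceeded() { return failed.isEmpty(); }
    }

    /** Tries every flusher in order; a failure in one does not stop the others. */
    static CommitReport flushAll(List<? extends SegmentFlusher> flushers) {
        CommitReport report = new CommitReport();
        for (SegmentFlusher f : flushers) {
            try {
                f.flush();
                report.succeeded.add(f.segmentName());
            } catch (Exception e) {
                // Best effort: note the failed segment and continue with the rest.
                report.failed.add(f.segmentName());
            }
        }
        return report;
    }

    /** Demo helper (assumption): a flusher that either succeeds or throws. */
    static SegmentFlusher flusher(final String name, final boolean fails) {
        return new SegmentFlusher() {
            public String segmentName() { return name; }
            public void flush() throws Exception {
                if (fails) throw new Exception("simulated abort in " + name);
            }
        };
    }
}
```

The "throw an exception with a report" variant could wrap the same CommitReport in an exception whenever allSucceeded() is false, so choosing between the two would mostly be a question of which default we expose.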
> Per thread DocumentsWriters that write their own private segments
> -----------------------------------------------------------------
>
> Key: LUCENE-2324
> URL: https://issues.apache.org/jira/browse/LUCENE-2324
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Fix For: Realtime Branch
>
> Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch,
> LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch,
> lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out
>
>
> See LUCENE-2293 for motivation and more details.
> I'm copying here Mike's summary he posted on 2293:
> Change the approach for how we buffer in RAM to a more isolated
> approach, whereby IW has N fully independent RAM segments
> in-process and when a doc needs to be indexed it's added to one of
> them. Each segment would also write its own doc stores and
> "normal" segment merging (not the inefficient merge we now do on
> flush) would merge them. This should be a good simplification in
> the chain (eg maybe we can remove the *PerThread classes). The
> segments can flush independently, letting us make much better
> concurrent use of IO & CPU.