[ 
https://issues.apache.org/jira/browse/LUCENE-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12553925
 ] 

Michael McCandless commented on LUCENE-1097:
--------------------------------------------


{quote}
cool patch! It looks like you added the abort check to all the
different spots in SegmentMerger and CFSWriter where most of
the merge time is spent: mergeFields, mergeTerms, mergeNorms,
and mergeVectors. So an abort should never take long, no
matter which part of the merge is being done when
IndexWriter.close(false) is called. Good!
{quote}
Thanks!  Yes, all places that can take alot of time are now checking
whether the merge has been aborted.

{quote}
I'm wondering how you came up with the values you pass in
checkAbort.work()? And does this abort checking have a
performance impact?
{quote}

I did this empirically by increasing the value to .work() until the
msec delay in successive calls was not larger than ~ 3 seconds or so,
when building the Wikipedia index with full length docs.  Most of the
time it's ~50-100 msec between checks.

I wanted to keep the number of calls to the synchronized
merge.isAborted() method to a minimum so we don't impact performance.
I haven't yet tested it, but I think the performance impact will be in
the noise...I'll test to be sure.

{quote}
In the future when we add more files (e. g. for column
fields) we have to remember to add corresponding abort checks
to new code that we add to SegmentMerger. I think it would be
good if you added some comments to SegmentMerger.merge() that
reminds us and gives some hints how to determine the values
for checkAbort.work().
{quote}

Good point, will do!

{quote}
The abort logic should work with all merge policies, right?
Not only with background merges?
{quote}
You mean merge schedulers right?  (Aborting shouldn't affect the merge
policy, which just selects the merges to be done but doesn't run
them).  But, yes, this will work w/ all merge schedulers.  The new
unit test (in TestIndexWriter) specifically tests SerialMergeScheduler
(in addition to ConcurrentMergeScheduler).


> IndexWriter.close(false) does not actually stop background merge threads
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-1097
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1097
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.3
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-1097.patch
>
>
> Right now when you close(false), IndexWriter marks any running merges
> as aborted but then does not wait for these merges to finish.  This
> can cause problems because those threads still hold files open, so,
> someone might think they can call close(false) and then (say) delete
> all files from that directory, which would fail on Windows.
> Instead, close(false) should notify each running merge that it has
> been aborted, and not return until all running merges are done.  Then,
> SegmentMerger should periodically check whether it has been aborted
> and stop if so.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to