[ 
https://issues.apache.org/jira/browse/LUCENE-7700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875888#comment-15875888
 ] 

Dawid Weiss edited comment on LUCENE-7700 at 2/21/17 12:13 PM:
---------------------------------------------------------------

bq. One difference with your patch is we would now wrap the Directory for merge 
on every merge, instead of once up front, but that's fine (the cost is tiny vs. 
cost of the merge)

I admit I do have a very specific scenario at hand and you're infinitely more 
experienced with merging, so if this is a problem we can always change it! The 
"get-directory-wrapped-for-merging" part is a bit clumsy, but I didn't figure 
out how to do it better.
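
For illustration only, per-merge wrapping can be pictured as a thin decorator around whatever output the merge writes to. The sketch below is not the code in LUCENE-7700.patch and does not use Lucene's Directory or IndexOutput classes: `AccountingOutput` and the `onBytes` callback are hypothetical names built on plain `java.io`. It only shows that creating such a wrapper per merge costs one small allocation.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.function.LongConsumer;

// Hypothetical stand-in for an output opened through a directory that was
// wrapped for one merge: every write is reported to that merge's callback.
final class AccountingOutput extends FilterOutputStream {
  private final LongConsumer onBytes; // assumption: per-merge accounting hook

  AccountingOutput(OutputStream delegate, LongConsumer onBytes) {
    super(delegate);
    this.onBytes = onBytes;
  }

  @Override
  public void write(int b) throws IOException {
    onBytes.accept(1);
    out.write(b);
  }

  @Override
  public void write(byte[] buf, int off, int len) throws IOException {
    onBytes.accept(len);      // a real limiter could pause the caller here
    out.write(buf, off, len); // delegate directly, avoiding a per-byte loop
  }
}
```

A scheduler-like caller would create one wrapper per merge and pass in that merge's accounting callback.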

bq. And it's nice that we can remove IW's ThreadLocal tracking the rate 
limiters.

I think so too.

bq. I do think it's important that the IO throttling applies when building 
the CFS file? For a large merge, this is a big burst of IO in the end

That part I didn't look too closely at, I agree. It should definitely be 
consistent with the rest of the throughput-control code, but there's no 
OneMerge instance there to work with... I'll take another look, maybe I'll come 
up with something.



> Move throughput control and merge aborting out of IndexWriter's core?
> ---------------------------------------------------------------------
>
>                 Key: LUCENE-7700
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7700
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>         Attachments: LUCENE-7700.patch
>
>
> Here is a bit of background:
> - I wanted to implement a custom merging strategy that would have custom
> (global) I/O flow control,
> - currently, the CMS (ConcurrentMergeScheduler) is tightly coupled to a few
> classes: MergeRateLimiter, OneMerge, IndexWriter.
> Looking at the code, it seems to me that everything related to I/O
> control could be pulled out into the classes that explicitly control the
> merging process, that is, only MergePolicy and MergeScheduler. By default, one
> could even run without any additional I/O accounting overhead (which is
> currently in there, even if one doesn't use the CMS's throughput control).
> Such a refactoring would also give a chance to move things where they
> belong: job aborting into OneMerge (currently in RateLimiter), and the rate
> limiter's lifecycle bound to OneMerge (MergeScheduler could then use
> per-merge or global accounting, as it pleases).
> Just a thought and some initial refactorings for discussion.
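
The ownership described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the attached patch and not Lucene's API: `IoThrottle`, `CountingThrottle`, and `MergeJob` are hypothetical names. The per-merge object owns the abort flag, the throttle is handed to it by whoever schedules the merge, and the default throttle does no accounting at all.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical throttle abstraction; this is not Lucene's RateLimiter.
interface IoThrottle {
  void account(long bytes);

  // Default choice: no accounting overhead at all.
  IoThrottle NONE = bytes -> {};
}

// Counts bytes only; sharing one instance across merges gives global
// accounting, while one instance per merge gives per-merge accounting.
final class CountingThrottle implements IoThrottle {
  private final AtomicLong total = new AtomicLong();

  @Override
  public void account(long bytes) {
    total.addAndGet(bytes);
  }

  long totalBytes() {
    return total.get();
  }
}

// Hypothetical per-merge state: owns the abort flag and holds the throttle
// chosen by the scheduler, so both share the merge's lifecycle.
final class MergeJob {
  private final AtomicBoolean aborted = new AtomicBoolean();
  private final IoThrottle throttle;

  MergeJob(IoThrottle throttle) {
    this.throttle = throttle;
  }

  void abort() {
    aborted.set(true);
  }

  boolean isAborted() {
    return aborted.get();
  }

  // Called by the write path before each chunk is written.
  void beforeWrite(long bytes) {
    if (aborted.get()) {
      throw new IllegalStateException("merge aborted");
    }
    throttle.account(bytes);
  }
}
```

Passing the same `CountingThrottle` to several `MergeJob` instances models global accounting; passing `IoThrottle.NONE` models a scheduler that does not throttle.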


