[ 
https://issues.apache.org/jira/browse/LUCENE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004006#comment-13004006
 ] 

Simon Willnauer commented on LUCENE-2573:
-----------------------------------------

{quote}
Maybe rename TieredFP -> ByRAMFP? Also, I'm not sure we need the N
tiers? I suspect that may flush too heavily? Can we instead simplify
it and have only the low and high water marks? So we flush when
active RAM is over low water mark? (And we stall if active + flushing
RAM exceeds high water mark).
{quote}
This really sounds like a different FlushPolicy to me, but it is worth a try 
and should be easy to add with this patch. So you mean we always flush ALL 
DWPTs once we reach the low watermark? I don't think that is a good idea. I 
also wonder whether it is a bit too aggressive to put DW into stalled mode as 
soon as we exceed the high watermark. Anyway, we can try it and see what works 
better, right?
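
To make the two-watermark proposal concrete, here is a rough sketch of the checks it implies. This is not code from the patch: the class name, method names, and byte-based thresholds are all made up for illustration, and it leaves open which DWPT(s) actually get flushed.

```java
// Hypothetical sketch of the low/high water mark idea; none of these
// names exist in the patch.
final class WaterMarkSketch {
    private final long lowWaterMarkBytes;
    private final long highWaterMarkBytes;

    WaterMarkSketch(long lowWaterMarkBytes, long highWaterMarkBytes) {
        if (lowWaterMarkBytes > highWaterMarkBytes) {
            throw new IllegalArgumentException("low water mark must not exceed high water mark");
        }
        this.lowWaterMarkBytes = lowWaterMarkBytes;
        this.highWaterMarkBytes = highWaterMarkBytes;
    }

    /** Flush is triggered once the RAM held by active (not yet flushing) DWPTs passes the low mark. */
    boolean shouldFlush(long activeBytes) {
        return activeBytes > lowWaterMarkBytes;
    }

    /** Indexing threads stall once active plus flushing RAM passes the high mark. */
    boolean shouldStall(long activeBytes, long flushingBytes) {
        return activeBytes + flushingBytes > highWaterMarkBytes;
    }
}
```

With the 120MB / 140MB example values from the issue description, 121MB of active RAM would trigger a flush, and stalling would only start once active plus flushing RAM goes above 140MB.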

bq. Can we rename isHealthy to isStalled (ie, invert it)?
Sure, isStalled sounds fine.

bq. I'm still unsure we should even include any healthy check APIs.
For now this is internal only, so even if we decide to, I would shift that to 
a different issue.

{quote}
Maybe rename pendingBytes to flushingBytes? Or maybe
flushPendingBytes? (Just to make it clear what we are pending on...).
{quote}

Yeah, that is true. flushPendingBytes makes it consistent; my fault...

{quote}
I wonder if FP.findFlushes should be renamed to something like
FP.visit, and return void? Ie, it's called for its side effects of
marking DWPTs for flushing, right? Separately, whether or not this
thread will go and flush a DWPT is for IW to decide? (Like it could
be this thread didn't mark any new flush required, but it should go
off and pull a DWPT previously marked by another thread). So then IW
would have a private volatile boolean recording whether any active
DWPTs have flushPending.
{quote}
I was unsure about the name too, so I just made it consistent with MergePolicy. 
visit is OK, I think.
The return value is probably a relic from an earlier version, before I had 
DocWriterSession#hasPendingFlushes(). Yeah, I think we can make that void and 
simply check whether there are any pending flushes. I think I do that already 
today.
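
Roughly, the void-returning variant could look like the sketch below. Everything in it is a stand-in: the Dwpt class, the "mark the largest DWPT" rule, and the single threshold are assumptions for illustration, not what the patch does. Only the shape matters: visit() is called for its side effect, and callers ask hasPendingFlushes() separately. Real code would also need proper synchronization around the per-DWPT state.

```java
import java.util.List;

// Hypothetical sketch; names and the marking rule are illustrative only.
final class FlushVisitSketch {
    /** Minimal stand-in for a DWPT: just the fields this sketch needs. */
    static final class Dwpt {
        final long bytesUsed;
        boolean flushPending;
        Dwpt(long bytesUsed) { this.bytesUsed = bytesUsed; }
    }

    private final long lowWaterMarkBytes;
    // Records whether any DWPT has been marked, so a thread that marked
    // nothing itself can still go and pull a pending flush.
    private volatile boolean anyFlushPending;

    FlushVisitSketch(long lowWaterMarkBytes) {
        this.lowWaterMarkBytes = lowWaterMarkBytes;
    }

    /** Called for its side effect only: may mark a DWPT as flush pending. */
    void visit(List<Dwpt> dwpts) {
        long activeBytes = 0;
        Dwpt largest = null;
        for (Dwpt d : dwpts) {
            if (d.flushPending) continue; // already marked, not counted as active
            activeBytes += d.bytesUsed;
            if (largest == null || d.bytesUsed > largest.bytesUsed) largest = d;
        }
        // Assumed rule: mark the largest active DWPT once over the threshold.
        if (largest != null && activeBytes > lowWaterMarkBytes) {
            largest.flushPending = true;
            anyFlushPending = true;
        }
    }

    boolean hasPendingFlushes() {
        return anyFlushPending;
    }
}
```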

> Tiered flushing of DWPTs by RAM with low/high water marks
> ---------------------------------------------------------
>
>                 Key: LUCENE-2573
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2573
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: Realtime Branch
>
>         Attachments: LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, 
> LUCENE-2573.patch, LUCENE-2573.patch
>
>
> Now that we have DocumentsWriterPerThreads we need to track total consumed 
> RAM across all DWPTs.
> A flushing strategy idea that was discussed in LUCENE-2324 was to use a 
> tiered approach:  
> - Flush the first DWPT at a low water mark (e.g. at 90% of allowed RAM)
> - Flush all DWPTs at a high water mark (e.g. at 110%)
> - Use linear steps in between high and low watermark:  E.g. when 5 DWPTs are 
> used, flush at 90%, 95%, 100%, 105% and 110%.
> Should we allow the user to configure the low and high water mark values 
> explicitly using total values (e.g. low water mark at 120MB, high water mark 
> at 140MB)?  Or shall we keep for simplicity the single setRAMBufferSizeMB() 
> config method and use something like 90% and 110% for the water marks?
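
For reference, the linear steps in the description above work out as follows. This is only a sketch of the arithmetic; the class and method names are hypothetical, and treating a single DWPT as "flush at the low mark" is an assumption.

```java
// Hypothetical helper: evenly spaced flush thresholds between the low and
// high water marks, expressed as percentages of the allowed RAM.
final class TieredWaterMarks {
    static double[] thresholds(int numDwpts, double lowPercent, double highPercent) {
        if (numDwpts < 1) {
            throw new IllegalArgumentException("need at least one DWPT");
        }
        double[] result = new double[numDwpts];
        if (numDwpts == 1) {
            result[0] = lowPercent; // assumption: a lone DWPT flushes at the low mark
            return result;
        }
        // numDwpts thresholds span (numDwpts - 1) equal steps from low to high.
        double step = (highPercent - lowPercent) / (numDwpts - 1);
        for (int i = 0; i < numDwpts; i++) {
            result[i] = lowPercent + i * step;
        }
        return result;
    }
}
```

Calling thresholds(5, 90, 110) gives a step of (110 - 90) / 4 = 5, i.e. 90, 95, 100, 105 and 110, matching the example in the description.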
