[ 
https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032936#comment-13032936
 ] 

Earwin Burrfoot commented on LUCENE-3092:
-----------------------------------------

Chris, I don't like the idea of expanding IOContext again and again, but this 
case seems in line with its intended purpose: giving the Directory 
implementation hints as to what we're going to do with it.

I don't like events either. They look fragile and binding them to threads is a 
WTF. With all our pausing/unpausing magic there's no guarantee merge will end 
on the same thread it started on.

bq. Stuff like FlushPolicy could take information about concurrent merges and 
hold off flushes for a little while if memory allows it etc.
Coordinating access to a shared resource (IO subsystem) with events is very 
awkward. Ok, your FlushPolicy receives events from MergePolicy and holds 
flushes during merge. _Now, when a flush is in progress, should FlushPolicy 
notify MergePolicy so it can hold its merges?_
It goes downhill from there. What if FP and MP fire events simultaneously? :) 
What should other listeners do?

Try looking at a bigger picture. Merges are not your problem. Neither are 
flushes. Your problem is that several threads try to take their dump on disk 
simultaneously (for whatever reason, you don't really care). So what we need is 
an arbitration mechanism for Directory writes. A mechanism located presumably @ 
Directory level (eg, we don't need to throttle anything when writing to RAMDir).

One possible implementation is that we add a constructor parameter to 
FSDirectory specifying desired level of IO parallelism, and then it keeps track 
of its IndexOutputs and stalls writes selectively. We can also add 
'expectedWriteSize' to IOContext, so the Directory may favor shorter writes 
over bigger ones. Instead of 'expectedWriteSize' we can use 'priority'.
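
A minimal sketch of such an arbiter, assuming nothing about the real FSDirectory 
internals. WriteArbiter, acquire, tryAcquire and release are hypothetical names 
made up for illustration; only 'expectedWriteSize' and the parallelism limit 
come from the proposal above. A Directory would presumably call acquire before 
an IndexOutput starts writing and release when it finishes.

```java
import java.util.PriorityQueue;

/** Hypothetical sketch: caps concurrent writers and admits the smallest pending write first. */
final class WriteArbiter {
    private final int maxConcurrentWrites; // the desired level of IO parallelism
    private int activeWrites = 0;
    private long nextTicket = 0;

    // Waiting writers as {expectedWriteSize, ticket}: smallest write first, then arrival order.
    private final PriorityQueue<long[]> waiting = new PriorityQueue<>(
        (a, b) -> a[0] != b[0] ? Long.compare(a[0], b[0]) : Long.compare(a[1], b[1]));

    WriteArbiter(int maxConcurrentWrites) {
        if (maxConcurrentWrites < 1) {
            throw new IllegalArgumentException("maxConcurrentWrites must be >= 1");
        }
        this.maxConcurrentWrites = maxConcurrentWrites;
    }

    /** Blocks until a slot is free and this caller is the smallest pending write. */
    synchronized void acquire(long expectedWriteSize) throws InterruptedException {
        long[] me = {expectedWriteSize, nextTicket++};
        waiting.add(me);
        try {
            while (activeWrites >= maxConcurrentWrites || waiting.peek() != me) {
                wait();
            }
        } catch (InterruptedException e) {
            waiting.remove(me); // arrays compare by identity, so this removes only our entry
            notifyAll();        // the head of the queue may have changed
            throw e;
        }
        waiting.poll();         // we are the head; leave the queue
        activeWrites++;
        notifyAll();            // another slot may still be free for the next waiter
    }

    /** Non-blocking variant: succeeds only if a slot is free and nobody is queued. */
    synchronized boolean tryAcquire() {
        if (activeWrites < maxConcurrentWrites && waiting.isEmpty()) {
            activeWrites++;
            return true;
        }
        return false;
    }

    /** Frees a slot and wakes waiters so they can re-check their turn. */
    synchronized void release() {
        if (activeWrites == 0) {
            throw new IllegalStateException("release without matching acquire");
        }
        activeWrites--;
        notifyAll();
    }

    synchronized int activeWrites() {
        return activeWrites;
    }
}
```

Note that strictly favoring shorter writes can starve a large merge under a 
steady stream of small flushes; an explicit 'priority', or aging of waiting 
entries, would be one way around that.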

> NRTCachingDirectory, to buffer small segments in a RAMDir
> ---------------------------------------------------------
>
>                 Key: LUCENE-3092
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3092
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3092-listener.patch, LUCENE-3092.patch
>
>
> I created this simple Directory impl, whose goal is to reduce IO
> contention in a frequent reopen NRT use case.
> The idea is, when reopening quickly, but not indexing that much
> content, you wind up with many small files created over time, that can
> possibly stress the IO system, e.g. if merges and searching are also
> fighting for IO.
> So, NRTCachingDirectory puts these newly created files into a RAMDir,
> and only when they are merged into a too-large segment, does it then
> write-through to the real (delegate) directory.
> This lets you spend some RAM to reduce IO.

