Hi. I wrote some sample code to test the speed difference between SEQUENTIAL and O_DIRECT-style reads (for the latter I used the madvise flag MADV_DONTNEED).
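For reference, a test of this kind can be sketched as below. This is not the pastebin code itself. It is a minimal sketch that assumes the file is read through mmap() and that the hint is applied with madvise() before the scan; the helper name timed_scan is hypothetical. Note that MADV_DONTNEED is only a hint on the mapping and does not bypass the page cache the way O_DIRECT does, so it is at best an approximation.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: map `path` read-only, apply `advice` with madvise(),
 * then read every byte once. Returns the elapsed seconds of the read loop,
 * or -1.0 on error. The byte sum is stored in *sum so the compiler cannot
 * optimize the read loop away. */
double timed_scan(const char *path, int advice, uint64_t *sum)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1.0;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1.0;
    }
    size_t len = (size_t)st.st_size;

    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1.0;

    /* The advice is only a hint; report a failure but keep going. */
    if (madvise(p, len, advice) != 0)
        perror("madvise");

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    uint64_t s = 0;
    for (size_t i = 0; i < len; i++)
        s += p[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    munmap(p, len);
    *sum = s;
    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling timed_scan(path, MADV_SEQUENTIAL, &sum) and then timed_scan(path, MADV_DONTNEED, &sum) on the same large file gives the two timings to compare. Whichever call runs second will usually find the file already in the page cache, so the order of the calls can affect the result.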
This is the link to the code: http://pastebin.com/8QywKGyS

There was a speed difference when I switched between the two flags. I have not used the O_DIRECT flag because Linus had criticized it.

Is this what the flags are intended to be used for? This is just sample code with a test file.

On Wed, Apr 6, 2011 at 12:11 PM, Simon Willnauer <simon.willna...@googlemail.com> wrote:
> Hey Varun,
> On Tue, Apr 5, 2011 at 11:07 PM, Michael McCandless
> <luc...@mikemccandless.com> wrote:
>> Hi Varun,
>>
>> Those two issues would make a great GSoC! Comments below...
> +1
>>
>> On Tue, Apr 5, 2011 at 1:56 PM, Varun Thacker
>> <varunthacker1...@gmail.com> wrote:
>>
>>> I would like to combine two tasks as part of my project
>>> namely-Directory createOutput and openInput should take an IOContext
>>> (Lucene-2793) and compliment it by Generalize DirectIOLinuxDir to
>>> UnixDir (Lucene-2795).
>>>
>>> The first part of the project is aimed at significantly reducing time
>>> taken to search during indexing by adding an IOContext which would
>>> store buffer size and have options to bypass the OS’s buffer cache
>>> (This is what causes the slowdown in search ) and other hints. Once
>>> completed I would move on to Lucene-2795 and generalize the Directory
>>> implementation to make a UnixDirectory .
>>
>> So, the first part (LUCENE-2793) should cause no change at all to
>> performance, functionality, etc., because it's "merely" installing the
>> plumbing (IOContext threaded throughout the low-level store APIs in
>> Lucene) so that higher levels can send important details down to the
>> Directory. We'd fix IndexWriter/IndexReader to fill out this
>> IOContext with the details (merging, flushing, new reader, etc.).
>>
>> There's some fun/freedom here in figuring out just what details should
>> be included in IOContext... (eg: is it low level "set buffer size to 4 KB"
>> or is it high level "I am opening a new near-real-time reader").
>>
>> This first step is a rote cutover, just changing APIs but in no way
>> taking advantage of the new APIs.
>>
>> The 2nd step (LUCENE-2795) would then take advantage of this plumbing,
>> by creating a UnixDir impl that, using JNI (C code), passes advanced
>> flags when opening files, based on the incoming IOContext.
>>
>> The goal is a single UnixDir that has ifdefs so that it's usable
>> across multiple Unices, and eg would use direct IO if the context is
>> merging. If we are ambitious we could rope Windows into the mix, too,
>> and then this would be NativeDir...
>>
>> We can measure success by validating that a big merge while searching
>> does not hurt search performance? (Ie we should be able to reproduce
>> the results from
>> http://blog.mikemccandless.com/2010/06/lucene-and-fadvisemadvise.html).
>
> Thanks for the summary mike!
>>
>>> I have spoken to Micheal McCandless and Simon Willnauer about
>>> undertaking these tasks. Micheal McCandless has agreed to mentor me .
>>> I would love to be able to contribute and learn from Apache Lucene
>>> community this summer. Also I would love suggestions on how to make my
>>> application proposal stronger.
>>
>> I think either Simon or I can be the "official" mentor, and then the
>> other one of us (and other Lucene committers) will support/chime
>> in...
>
> I will take the official responsibility here once we are there!
> simon
>>
>> This is an important change for Lucene!
>>
>> Mike

--
Regards,
Varun Thacker
http://varunthacker.wordpress.com