Thanks for the explanation, Mike!

D.

On Wed, Sep 18, 2019 at 3:21 PM Michael McCandless <[email protected]> wrote:
>
> Dawid, it's confusing: direct IO is different from a direct ByteBuffer!
>
> Direct IO means you bypass all kernel "smarts", so the Linux buffer cache is
> not used, no IO scheduling, no write cache that the pdflush daemon must
> periodically move to disk, etc. This is normally a bad idea, and better to
> use fadvise/madvise to give kernel hints about what you are doing, and use
> the buffer cache for what it's good at. Linus hates that direct IO is even
> an option for us ...
>
> Back when I wrote NativeUnixDirectory, the idea was to prevent ongoing merges
> from so heavily impacting ongoing searches, when you are doing indexing and
> searching on one node. We open the newly merged segment files using direct
> IO, and do our own buffering, and then all writes go straight to disk instead
> of using up precious hot pages that are in use for searching. I think I ran
> some simple performance tests back then but I don't remember the results ...
> more testing is needed to see if it really helps.
>
> At Amazon, we are using segment based replication every 60 seconds to copy
> newly indexed segments out to all searchers, so we never have nodes doing
> both indexing and searching, it's either or ... but, copying out max sized
> newly merged segments to the searchers is causing some thrashing so we are
> exploring using direct IO for those writes, and then separately warming the
> new segments after the copy.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Sep 17, 2019 at 1:16 PM Uwe Schindler <[email protected]> wrote:
>>
>> We discussed this already at Berlin Buzzwords (Mike and Michael). Yes, it's
>> possible and may work for merges where block IO is possible. But most of us
>> said: it's fine to not use the IO cache for merging, but it won't make pages
>> hot. So merges are invisible to the OS, and you have to warm merged segments
>> if you write directly. If you read directly on merging, you won't pollute the
>> cache with one-time reads, but it also won't use the cache if the data is
>> already cached.
>>
>> We should rather make a proposal for f/madvise. The JDK people are open to
>> that, and I am a JDK committer now, so I can make a prototype.
>>
>> Uwe
>>
>> On September 17, 2019 4:48:26 PM UTC, Dawid Weiss <[email protected]> wrote:
>>>
>>> Isn't that restricted to aligned block-only access though? I can
>>> imagine this would complicate the implementation if somebody wanted to
>>> use it directly.
>>>
>>> Dawid
>>>
>>> On Tue, Sep 17, 2019 at 5:37 PM Michael McCandless
>>> <[email protected]> wrote:
>>>>
>>>> Whoa! That would be awesome -- no more JNI to use Direct I/O?
>>>> Looks like you use it like this:
>>>>
>>>>   FileChannel fc = FileChannel.open(p, StandardOpenOption.WRITE,
>>>>                                     ExtendedOpenOption.DIRECT);
>>>>
>>>> But it looks like you need to enable the jdk.unsupported module, added
>>>> with http://openjdk.java.net/jeps/260
>>>>
>>>> Mike McCandless
>>>>
>>>> http://blog.mikemccandless.com
>>>>
>>>>
>>>> On Mon, Sep 16, 2019 at 11:55 AM Michael Sokolov <[email protected]>
>>>> wrote:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8189192 makes it appear that
>>>>> Direct I/O is (or may be?) available now in JDKs since JDK 10. Should
>>>>> we try using that API in NativeUnixDirectory in order to avoid JNI
>>>>> calls?
>>>>> ________________________________
>>>>> To unsubscribe, e-mail: [email protected]
>>>>> For additional commands, e-mail: [email protected]
>>>>>
>>
>> --
>> Uwe Schindler
>> Achterdiek 19, 28357 Bremen
>> https://www.thetaphi.de
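
Editor's sketch (not part of the original messages): the thread mentions
opening a FileChannel with ExtendedOpenOption.DIRECT and the restriction to
aligned, block-sized access. The Java below is one way that could look,
assuming JDK 10 or later. The class name DirectWriteSketch and the helpers
roundUp, alignedBuffer and writeDirect are hypothetical names chosen for
illustration; this is not how NativeUnixDirectory is implemented.
ExtendedOpenOption lives in com.sun.nio.file (module jdk.unsupported), which
is normally visible to classpath applications; a modular application would
probably need "requires jdk.unsupported". Whether direct I/O is supported at
all depends on the operating system and file system.

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not taken from Lucene.
final class DirectWriteSketch {

    /** Rounds n up to the next multiple of blockSize (blockSize must be positive). */
    static long roundUp(long n, int blockSize) {
        return ((n + blockSize - 1) / blockSize) * blockSize;
    }

    /**
     * Allocates a direct buffer whose start address is aligned to blockSize and
     * whose capacity is the requested capacity rounded up to a whole block.
     * blockSize must be a power of two (a requirement of alignedSlice).
     */
    static ByteBuffer alignedBuffer(int capacity, int blockSize) {
        int padded = Math.toIntExact(roundUp(capacity, blockSize));
        // Over-allocate by blockSize - 1 bytes so that an aligned start always
        // exists; alignedSlice then yields exactly `padded` aligned bytes.
        return ByteBuffer.allocateDirect(padded + blockSize - 1).alignedSlice(blockSize);
    }

    /**
     * Writes data to path with direct I/O, zero-padded to a whole number of blocks.
     * The file is left padded; a real implementation would have to track or
     * restore the logical length separately.
     */
    static void writeDirect(Path path, byte[] data) throws IOException {
        Path dir = path.toAbsolutePath().getParent();
        // Assumption: the file store's block size is the alignment direct I/O needs.
        int blockSize = Math.toIntExact(Files.getFileStore(dir).getBlockSize());
        ByteBuffer buf = alignedBuffer(data.length, blockSize);
        buf.put(data);
        buf.rewind(); // position 0, limit stays at the padded capacity
        try (FileChannel fc = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                ExtendedOpenOption.DIRECT)) {
            while (buf.hasRemaining()) {
                fc.write(buf);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("direct-io-sketch", ".bin");
        try {
            writeDirect(p, "hello direct I/O".getBytes(StandardCharsets.UTF_8));
            System.out.println("wrote " + Files.size(p) + " bytes (padded to the block size)");
        } catch (IOException | UnsupportedOperationException e) {
            // Some file systems (or platforms) reject direct I/O entirely.
            System.out.println("direct I/O not available here: " + e);
        } finally {
            Files.deleteIfExists(p);
        }
    }
}
```

The padding and the over-allocated buffer illustrate Dawid's concern: both
the buffer address and the transfer size have to be block multiples, so any
caller that wants byte-granular writes needs its own buffering layer on top.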
