Re: Direct I/O

Dawid Weiss Wed, 18 Sep 2019 06:55:11 -0700

Thanks for the explanation, Mike!

D.


On Wed, Sep 18, 2019 at 3:21 PM Michael McCandless
<[email protected]> wrote:
>
> Dawid, it's confusing: direct IO is different from a direct ByteBuffer!
>
> Direct IO means you bypass all kernel "smarts", so the Linux buffer cache is 
> not used, no IO scheduling, no write cache that the pdflush daemon must 
> periodically move to disk, etc.  This is normally a bad idea, and it is better to 
> use fadvise/madvise to give kernel hints about what you are doing, and use 
> the buffer cache for what it's good at.  Linus hates that direct IO is even 
> an option for us ...
>
> Back when I wrote NativeUnixDirectory, the idea was to prevent ongoing merges 
> from so heavily impacting ongoing searches, when you are doing indexing and 
> searching on one node.  We open the newly merged segment files using direct 
> IO, and do our own buffering, and then all writes go straight to disk instead 
> of using up precious hot pages that are in use for searching.  I think I ran 
> some simple performance tests back then but I don't remember the results ... 
> more testing is needed to see if it really helps.
>
> At Amazon, we are using segment-based replication every 60 seconds to copy 
> newly indexed segments out to all searchers, so we never have nodes doing 
> both indexing and searching, it's either/or ... but, copying out max sized 
> newly merged segments to the searchers is causing some thrashing so we are 
> exploring using direct IO for those writes, and then separately warming the 
> new segments after the copy.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Sep 17, 2019 at 1:16 PM Uwe Schindler <[email protected]> wrote:
>>
>> We already discussed this at Berlin Buzzwords (Mike and Michael). Yes, it's 
>> possible and may work for merges where block IO is possible. But most of us 
>> said: it's fine to not use the IO cache for merging, but it won't make pages 
>> hot. Merges are then invisible to the OS, so you have to warm merged segments 
>> if you write directly. If you read directly on merging, you won't pollute the 
>> cache with one-time reads, but it also won't use the cache if already cached.
>> We had better make a proposal for f/madvise. The JDK people are open to 
>> that, and I am a JDK committer now, so I can make a prototype.
>>
>> Uwe
>>
>> On September 17, 2019 4:48:26 PM UTC, Dawid Weiss 
>> <[email protected]> wrote:
>>>
>>> Isn't that restricted to aligned block-only access though? I can
>>> imagine this would complicate the implementation if somebody wanted to
>>> use it directly.
>>>
>>> Dawid
>>>
>>> On Tue, Sep 17, 2019 at 5:37 PM Michael McCandless
>>> <[email protected]> wrote:
>>>>
>>>>
>>>>  Whoa!  That would be awesome -- no more JNI to use Direct I/O?
>>>>  Looks like you use it like this:
>>>>
>>>>  FileChannel fc = FileChannel.open(p, StandardOpenOption.WRITE,
>>>>                                    ExtendedOpenOption.DIRECT);
>>>>
>>>>  But it looks like you need to enable the jdk.unsupported module, added 
>>>> with http://openjdk.java.net/jeps/260
>>>>
>>>>  Mike McCandless
>>>>
>>>>  http://blog.mikemccandless.com
>>>>
>>>>
>>>>  On Mon, Sep 16, 2019 at 11:55 AM Michael Sokolov <[email protected]> 
>>>> wrote:
>>>>>
>>>>>
>>>>>  https://bugs.openjdk.java.net/browse/JDK-8189192 makes it appear that
>>>>>  Direct I/O is (or may be?) available now in JDKs since JDK 10. Should
>>>>>  we try using that API in NativeUnixDirectory in order to avoid JNI
>>>>>  calls?
>>>>> ________________________________
>>>>>  To unsubscribe, e-mail: [email protected]
>>>>>  For additional commands, e-mail: [email protected]
>>>>>
>>>
>>
>> --
>> Uwe Schindler
>> Achterdiek 19, 28357 Bremen
>> https://www.thetaphi.de
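
The FileChannel snippet quoted above, combined with Dawid's point about aligned block-only access, could be sketched roughly as follows. This is only an illustration under assumptions, not code from NativeUnixDirectory: the class name DirectIoSketch and the helpers alignUp and writeDirect are hypothetical, and it assumes JDK 10 or later, a file system that supports direct I/O, and a block size that is a power of two. The last block is zero-padded, so the file ends up longer than the payload.

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only.
final class DirectIoSketch {

    /** Rounds size up to the next multiple of blockSize (blockSize must be positive). */
    static long alignUp(long size, int blockSize) {
        return ((size + blockSize - 1) / blockSize) * blockSize;
    }

    /**
     * Writes data to path through a channel opened with ExtendedOpenOption.DIRECT.
     * The tail is zero-padded to a whole block; returns the bytes written, padding included.
     * May throw IOException on file systems that do not support direct I/O.
     */
    static long writeDirect(Path path, byte[] data) throws IOException {
        // Direct I/O needs the buffer address, transfer size and file position
        // all aligned to the block size of the underlying file store.
        FileStore store = Files.getFileStore(path.toAbsolutePath().getParent());
        int blockSize = Math.toIntExact(store.getBlockSize());
        int padded = Math.toIntExact(alignUp(data.length, blockSize));

        // Over-allocate by blockSize - 1 so alignedSlice can find an aligned start
        // while still leaving at least 'padded' bytes of capacity.
        ByteBuffer buf = ByteBuffer.allocateDirect(padded + blockSize - 1)
                                   .alignedSlice(blockSize);
        buf.put(data);                 // the rest of the buffer is already zero-filled
        buf.position(0).limit(padded); // expose whole blocks only

        try (FileChannel fc = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING,
                ExtendedOpenOption.DIRECT)) {
            long written = 0;
            while (buf.hasRemaining()) {
                written += fc.write(buf);
            }
            return written;
        }
    }
}
```

A caller would need to record the real payload length separately (or trim the file afterwards through a normal channel), and, as Uwe notes, the written pages are not in the OS cache, so the file would still have to be warmed before searching.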
