On Thursday, October 27, 2016, Brunoais <brunoa...@gmail.com> wrote:

> Did you read the C code?

I looked at the Linux code in the JDK.

> Have you got any idea how many functions Windows or Linux (nearly all
> flavors) have for the read operation towards a file?

I do.

> I have already done that homework myself. I may not have read JVM's source
> code but I know well that there's functions on both Windows and Linux that
> provide such interface I mentioned although they require a slightly
> different treatment (and different constants).

You should read the JDK (native) source code instead of guessing/assuming.
On Linux, it doesn't use aio facilities for files. The kernel io scheduler
may issue readahead behind the scenes, but there's no nonblocking file io
that's at the heart of your premise.

> On 27/10/2016 00:06, Vitaly Davidovich wrote:
>
>> On Wednesday, October 26, 2016, Brunoais <brunoa...@gmail.com> wrote:
>>
>>     It is actually based on the premise that:
>>
>>     1. The first call to ReadableByteChannel.read(ByteBuffer) sets the OS
>>        buffer size to fill in as the same size as ByteBuffer.
>>
>> Why do you say that? AFAICT, it issues a read syscall and that will block
>> if the data isn't in page cache.
>>
>>     2. The consecutive calls to ReadableByteChannel.read(ByteBuffer) orders
>>        the JVM to order the OS to execute memcpy() to copy from its memory
>>        to the shared memory created at ByteBuffer instantiation (in java 8)
>>        using Unsafe and then for the JVM to update the ByteBuffer fields.
>>
>> I think subsequent reads just invoke the same read syscall, passing the
>> current file offset maintained by the file channel instance.
>>
>>     3. The call will not block waiting for I/O and it won't take longer
>>        than the JNI interface if no new data exists. However, it will block
>>        waiting for the OS to execute memcpy() to the shared memory.
>>
>> So why do you think it won't block?
>>
>>     Is my premise wrong?
>>
>>     If I read correctly, if I don't use a DirectBuffer, there would be
>>     even another intermediate buffer to copy data to before giving it
>>     to the "user" which would be useless.
>>
>> If you use a HeapByteBuffer, then there's an extra copy from the native
>> buffer to the Java buffer.
>>
>>     On 26/10/2016 11:57, Pavel Rappo wrote:
>>
>>     I believe I see where you're coming from. Please correct me if
>>     I'm wrong.
>>
>>     Your implementation is based on the premise that a call to
>>     ReadableByteChannel.read() _initiates_ the operation and returns
>>     immediately. The OS then continues to fill the buffer while there's
>>     free space in the buffer and the channel hasn't encountered EOF.
>>
>>     Is that right?
>>
>>     On 25 Oct 2016, at 22:16, Brunoais <brunoa...@gmail.com> wrote:
>>
>>     Thank you for your time. I'll try to explain it. I hope I can clear
>>     it up.
>>
>>     First of all, I mixed up the meanings of asynchronous and
>>     non-blocking. This implementation uses a non-blocking algorithm
>>     internally while providing a blocking-like algorithm on the surface.
>>     It is single-threaded, not multi-threaded where one thread fetches
>>     data and blocks waiting and the other accumulates it and provides it
>>     to whichever wants it.
>>
>>     Second, I made the mistake of going after BufferedReader instead of
>>     going after BufferedInputStream. If you want me to go after
>>     BufferedReader it's ok, but when I started the poc I thought that
>>     going after BufferedInputStream would be more generically useful
>>     than BufferedReader.
>>
>>     On to my code:
>>     Short answers:
>>     • The sleep(int) exists because I don't know how to wait until more
>>       data exists in the buffer, which is part of read()'s contract.
>>     • The ByteBuffer gives a buffer that is filled by the OS (what I
>>       believe Channels do) instead of getting data only on demand (what
>>       I believe Streams do).
>>
>>     Full answers:
>>     The blockingFill(boolean) method is a busy wait for a fill and is
>>     used exclusively by the read() method. All other methods use the
>>     version that does not sleep (fill(boolean)).
>>     blockingFill(boolean) exists in that form only because the read()
>>     method must not return unless either:
>>
>>     • The stream ended.
>>     • The next byte is ready for reading.
>>
>>     Additionally, statistically, that while loop will rarely evaluate to
>>     true as reads are in chunks so readPos will be behind writePos most
>>     of the time.
>>     I have no idea if an interrupt will ever happen, to be honest. The
>>     main reasons I'm using a sleep are that I didn't want to hog the CPU
>>     with a full-thread busy wait and that I didn't find any way of
>>     putting the thread to sleep so that it wakes up later when the
>>     buffer managed by native code has more data.
>>     The non-blocking part is managed by the buffer the OS keeps filling
>>     most if not all the time. That buffer is the field
>>
>>         ByteBuffer readBuffer
>>
>>     That's the gain over the plain old Buffered classes.
>>
>>     Did that make sense to you? Feel free to ask anything else you need.
>>
>>     On 25/10/2016 20:52, Pavel Rappo wrote:
>>
>>     I've skimmed through the code and I'm not sure I can see any
>>     asynchronicity (you were pointing at the lack of it in
>>     BufferedReader).
>>     And the mechanics of this are very puzzling to me, to be honest:
>>
>>         void blockingFill(boolean forced) throws IOException {
>>             fill(forced);
>>             while (readPos == writePos) {
>>                 try {
>>                     Thread.sleep(100);
>>                 } catch (InterruptedException e) {
>>                     // An interrupt may mean more data is available
>>                 }
>>                 fill(forced);
>>             }
>>         }
>>
>>     I thought you were suggesting that we should utilize the tools which
>>     the OS provides more efficiently. Instead we have something that
>>     looks very similar to a "busy loop" and... also who and when is
>>     supposed to interrupt Thread.sleep()?
>>     Sorry, I'm not following. Could you please explain how this is
>>     supposed to work?
>>
>>     On 24 Oct 2016, at 15:59, Brunoais <brunoa...@gmail.com> wrote:
>>
>>     Attached and sending!
>>
>>     On 24/10/2016 13:48, Pavel Rappo wrote:
>>
>>     Could you please send a new email on this list with the source
>>     attached as a text file?
>>
>>     On 23 Oct 2016, at 19:14, Brunoais <brunoa...@gmail.com> wrote:
>>
>>     Here's my poc/prototype:
>>
>>     http://pastebin.com/WRpYWDJF
>>
>>     I've implemented the bare minimum of the class that follows the same
>>     contract as BufferedReader while signaling in comments all issues I
>>     think it may have or has.
>>     I also wrote some javadoc to help guide readers through the class.
>>     I could have used more fields from BufferedReader but the names were
>>     so minimalistic that they were confusing me. I intend to change them
>>     before sending this to openJDK.
>>     One of the major problems this has is long overflowing. It is major
>>     because it is hidden, it will be extremely rare and it takes a
>>     really long time to reproduce. There are different ways of dealing
>>     with it, from just documenting it to actually writing code that
>>     handles it.
>>     I built some simple test code for it to get some idea of performance
>>     and correctness.
>>
>>     http://pastebin.com/eh6LFgwT
>>
>>     This doesn't do a thorough test of whether it is actually working
>>     correctly but I see no reason for it not working correctly after
>>     fixing the 2 bugs that test found.
>>     I'll also leave here some conclusions about speed and resource
>>     consumption I found.
>>     I made tests with default buffer sizes, 5000B, 15_000B and 500_000B.
>>     I noticed that, with my hardware, with the 1 530 000 000B file, I
>>     was getting around:
>>     With all buffer sizes and fake work: 10~15s speed improvement (from
>>     90% HDD speed to 100% HDD speed)
>>     With all buffer sizes and no fake work: 1~2s speed improvement (from
>>     90% HDD speed to 100% HDD speed)
>>     Changing the buffer size was giving different reading speeds but
>>     both were about equal in how much they changed with the buffer size.
>>     Finally, I could always confirm that I/O was the slowest thing while
>>     this code was running.
>>     For those wondering about the file size: it is both to avoid the OS
>>     cache and to exercise the main use case these objects are for (large
>>     streams of bytes).
>>     @Pavel, are you open for discussion now ;)? Need anything else?
>>
>>     On 21/10/2016 19:21, Pavel Rappo wrote:
>>
>>     Just to append to my previous email. BufferedReader wraps any Reader
>>     out there. Not specifically FileReader. While you're talking about
>>     the case of effective reading from a file.
>>     I guess there's one existing possibility to provide exactly what you
>>     need (as I understand it) under this method:
>>
>>         /**
>>          * Opens a file for reading, returning a {@code BufferedReader} to
>>          * read text from the file in an efficient manner...
>>          ...
>>          */
>>         java.nio.file.Files#newBufferedReader(java.nio.file.Path)
>>
>>     It can return _anything_ as long as it is a BufferedReader. We can
>>     do it, but it needs to be investigated not only for your favorite OS
>>     but for other OSes as well. Feel free to prototype this and we can
>>     discuss it on the list later.
>>     Thanks,
>>     -Pavel
>>
>>     On 21 Oct 2016, at 18:56, Brunoais <brunoa...@gmail.com> wrote:
>>
>>     Pavel is right.
>>     In reality, I was expecting such a BufferedReader to use only a
>>     single buffer and have that buffer filled asynchronously, not in a
>>     different Thread.
>>     Additionally, I don't intend to have a larger buffer than before
>>     unless stated through the API (the constructor).
>>     In my idea, internally, it is supposed to use
>>     java.nio.channels.AsynchronousFileChannel or equivalent.
>>     It does not prevent having two buffers and I do not intend to change
>>     BufferedReader itself. I'd do a BufferedAsyncReader of sorts (any
>>     name suggestion is welcome as I'm an awful namer).
>>
>>     On 21/10/2016 18:38, Roger Riggs wrote:
>>
>>     Hi Pavel,
>>     I think Brunoais is asking for a double buffering scheme in which
>>     the implementation of BufferedReader fills (a second buffer) in
>>     parallel with the application reading from the 1st buffer and
>>     manages the swaps and async reads transparently.
>>     It would not change the API but would change the interactions
>>     between the buffered reader and the underlying stream. It would also
>>     increase memory requirements and processing by introducing or using
>>     a separate thread and the necessary synchronization.
>>     Though I think the formal interface semantics could be maintained, I
>>     have doubts about compatibility and its unintended consequences on
>>     existing subclasses, applications and libraries.
>>     $.02, Roger
>>
>>     On 10/21/16 1:22 PM, Pavel Rappo wrote:
>>
>>     Off the top of my head, I would say it's not possible to change the
>>     design of an _extensible_ type that has been out there for 20 or so
>>     years. All these I/O streams from java.io were designed for the
>>     simple synchronous use case.
>>     It's not that their design is flawed in some way, it's that they
>>     don't seem to suit your needs. Have you considered using
>>     java.nio.channels.AsynchronousFileChannel in your applications?
>>     -Pavel
>>
>>     On 21 Oct 2016, at 17:08, Brunoais <brunoa...@gmail.com> wrote:
>>
>>     Any feedback on this?
>>     I'm really interested in implementing such a
>>     BufferedReader/BufferedStreamReader to allow speeding up my
>>     applications without having to think in an asynchronous way or about
>>     multi-threading while programming with it. That's why I'm asking
>>     this here.
>>
>>     On 13/10/2016 14:45, Brunoais wrote:
>>
>>     Hi,
>>     I looked at the BufferedReader source code for java 9 along with the
>>     source code of the channels/streams used. I noticed that, like in
>>     java 7, BufferedReader does not use an Async API to load data from
>>     files; instead, the data loading is all done synchronously even when
>>     the OS allows requesting a file to be read and getting a warning
>>     later when the file is effectively read.
>>     Why is BufferedReader not async while providing a sync API?
>>
>>     <BufferedNonBlockStream.java><Tests.java>
>>
>> --
>> Sent from my phone

--
Sent from my phone
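
For readers following the thread, here is a minimal sketch of the java.nio.channels.AsynchronousFileChannel approach Pavel suggests, for comparison with the sleep-based blockingFill loop quoted above. The class name AsyncReadSketch, the helper readStart and the temp-file setup in main are illustrative assumptions and are not part of the posted prototype. Note that Future.get() still blocks until the read completes, and how the read is serviced underneath (kernel asynchronous I/O or a helper thread) depends on the platform and JDK.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical illustration class; not taken from the prototype in the thread.
final class AsyncReadSketch {

    /**
     * Starts one asynchronous read at file position 0 and returns the bytes
     * that single read delivered, decoded as ASCII. Returns "" for an empty file.
     */
    public static String readStart(Path file, int capacity)
            throws IOException, InterruptedException, ExecutionException {
        try (AsynchronousFileChannel channel =
                     AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocateDirect(capacity);

            // read(...) returns immediately with a Future; the transfer
            // proceeds in the background.
            Future<Integer> pending = channel.read(buffer, 0L);

            // The caller could do other work here instead of sleeping.

            // get() waits for completion, so no polling loop is needed.
            int bytesRead = pending.get();
            if (bytesRead <= 0) {
                return ""; // -1 signals end of file
            }

            buffer.flip();
            byte[] bytes = new byte[buffer.remaining()];
            buffer.get(bytes);
            return new String(bytes, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("async-read-sketch", ".txt");
        try {
            Files.write(tmp, "hello".getBytes(StandardCharsets.US_ASCII));
            System.out.println(readStart(tmp, 16));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The point of the sketch is only the shape of the API: the wait is expressed through the Future (or a CompletionHandler) rather than through Thread.sleep() and re-polling, which is the concern Pavel raises about blockingFill.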