On Thursday, October 27, 2016, Brunoais <brunoa...@gmail.com> wrote:
Did you read the C code?
I looked at the Linux code in the JDK.
Have you got any idea how many functions Windows or Linux
(nearly all flavors) have for the read operation towards a file?
I do.
I have already done that homework myself. I may not have read the
JVM's source code, but I know well that there are functions on both
Windows and Linux that provide the interface I mentioned, although
they require slightly different treatment (and different constants).
You should read the JDK (native) source code instead of
guessing/assuming. On Linux, it doesn't use aio facilities for
files. The kernel io scheduler may issue readahead behind the
scenes, but there's no nonblocking file io that's at the heart of
your premise.
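To make the point above concrete, here is a minimal sketch of the
synchronous read loop being described. It assumes a regular file read
through FileChannel; the class and method names (BlockingReadSketch,
countBytes) are hypothetical and are not taken from the JDK or from the
attached prototype.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class, not part of the JDK or the prototype.
final class BlockingReadSketch {

    // Reads the whole file through a FileChannel and returns the byte count.
    static long countBytes(Path file, int bufferSize) throws IOException {
        long total = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocateDirect(bufferSize);
            int n;
            // Each read() is a synchronous call: it returns only after bytes
            // were transferred into buf, or -1 once end of file is reached.
            while ((n = ch.read(buf)) != -1) {
                total += n;
                buf.clear(); // make the whole buffer writable again
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countBytes(Paths.get(args[0]), 8192));
    }
}
```

Nothing is filled in the background between calls here: data only moves
into the buffer while the calling thread is inside read().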
On 27/10/2016 00:06, Vitaly Davidovich wrote:
On Wednesday, October 26, 2016, Brunoais <brunoa...@gmail.com> wrote:
It is actually based on the premise that:
1. The first call to
ReadableByteChannel.read(ByteBuffer) sets the OS
buffer size to fill in as the same size as ByteBuffer.
Why do you say that? AFAICT, it issues a read syscall and
that will block if the data isn't in page cache.
2. The consecutive calls to ReadableByteChannel.read(ByteBuffer)
order the JVM to order the OS to execute memcpy() to copy from its
memory to the shared memory created at ByteBuffer instantiation (in
Java 8) using Unsafe, and then for the JVM to update the ByteBuffer
fields.
I think subsequent reads just invoke the same read
syscall, passing the current file offset maintained by
the file channel instance.
3. The call will not block waiting for I/O and it won't take longer
than the JNI interface if no new data exists. However, it will block
waiting for the OS to execute memcpy() to the shared memory.
So why do you think it won't block?
Is my premise wrong?
If I read correctly, if I don't use a DirectBuffer, there would be
yet another intermediate buffer to copy data to before giving it to
the "user", which would be useless.
If you use a HeapByteBuffer, then there's an extra copy
from the native buffer to the Java buffer.
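As a small illustration of the two buffer kinds mentioned here, the
sketch below only shows how to allocate each kind and tell them apart
with standard ByteBuffer methods. The class and method names are
hypothetical, and the sketch does not demonstrate the extra copy
itself, which happens inside the channel implementation.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration class.
final class BufferKindsSketch {

    // Describes whether a buffer lives on the Java heap or in native memory.
    static String describe(ByteBuffer buf) {
        String kind = buf.isDirect() ? "direct" : "heap";
        // A writable heap buffer exposes its backing byte[]; a direct
        // buffer has no accessible array.
        String backing = buf.hasArray() ? "array-backed" : "no accessible array";
        return kind + ", " + backing;
    }

    public static void main(String[] args) {
        System.out.println(describe(ByteBuffer.allocate(8192)));
        System.out.println(describe(ByteBuffer.allocateDirect(8192)));
    }
}
```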
On 26/10/2016 11:57, Pavel Rappo wrote:
I believe I see where you're coming from. Please correct me if I'm
wrong.
Your implementation is based on the premise that a call to
ReadableByteChannel.read() _initiates_ the operation and returns
immediately. The OS then continues to fill the buffer while there's
free space in the buffer and the channel hasn't encountered EOF.
Is that right?
On 25 Oct 2016, at 22:16, Brunoais <brunoa...@gmail.com> wrote:
Thank you for your time. I'll try to explain it. I hope I can clear
it up.
First off, I confused the meanings of asynchronous and non-blocking.
This implementation uses a non-blocking algorithm internally while
providing a blocking-like algorithm on the surface. It is
single-threaded, not multi-threaded where one thread fetches data
and blocks waiting while the other accumulates it and provides it to
whichever wants it.
Second, I had made the mistake of going after BufferedReader instead
of going after BufferedInputStream. If you want me to go after
BufferedReader, that's OK, but I thought that going after
BufferedInputStream would be more generically useful than
BufferedReader when I started the poc.
On to my code:
Short answers:
• The sleep(int) exists because I don't know how to wait until more
data exists in the buffer, which is part of read()'s contract.
• The ByteBuffer gives a buffer that is filled by the OS (what I
believe Channels do) instead of getting data only on demand (what I
believe Streams do).
Full answers:
The blockingFill(boolean) method busy-waits for a fill and is used
exclusively by the read() method. All other methods use the version
that does not sleep (fill(boolean)).
blockingFill(boolean) exists in that form only because the read()
method must not return unless either:
• The stream ended.
• The next byte is ready for reading.
Additionally, statistically, that while loop will rarely evaluate to
true, as reads come in chunks, so readPos will be behind writePos
most of the time.
I have no idea if an interrupt will ever happen, to be honest. The
main reasons I'm using a sleep are that I didn't want to hog the CPU
with a full-thread busy wait, and that I didn't find any way of
putting the thread to sleep so that it wakes up later when the buffer
managed by native code has more data.
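One way to wait for data without polling, sketched here under
assumptions: java.nio.channels.AsynchronousFileChannel (mentioned
elsewhere in this thread) returns a Future from read(), and
Future.get() parks the calling thread until the read completes. This is
not what the attached prototype does; the class and method names below
are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical illustration class, not the prototype's implementation.
final class FutureWaitSketch {

    // Starts one read at the given file position and waits for its result.
    // Returns the number of bytes read, or -1 if position is at or past EOF.
    static int readChunk(Path file, long position, ByteBuffer dst)
            throws IOException, InterruptedException, ExecutionException {
        try (AsynchronousFileChannel ch =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            // read() only initiates the operation and returns a Future.
            Future<Integer> pending = ch.read(dst, position);
            // The caller could do other work here before asking for the result.
            // get() blocks until the read finishes: no sleep/poll loop needed.
            return pending.get();
        }
    }
}
```

Whether the platform implementation uses native asynchronous I/O or a
thread pool behind this API is a separate question that would need to
be checked per OS.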
The non-blocking part is managed by the buffer the OS keeps filling
most if not all of the time. That buffer is the field
ByteBuffer readBuffer
That's the advantage over the plain old Buffered classes.
Did that make sense to you? Feel free to ask anything else you need.
On 25/10/2016 20:52, Pavel Rappo wrote:
I've skimmed through the code and I'm not sure I can see any
asynchronicity (you were pointing at the lack of it in
BufferedReader). And the mechanics of this are very puzzling to me,
to be honest:
    void blockingFill(boolean forced) throws IOException {
        fill(forced);
        while (readPos == writePos) {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                // An interrupt may mean more data is available
            }
            fill(forced);
        }
    }
I thought you were suggesting that we should utilize the tools which
the OS provides more efficiently. Instead we have something that
looks very similar to a "busy loop" and... also, who is supposed to
interrupt Thread.sleep(), and when?
Sorry, I'm not following. Could you please explain how this is
supposed to work?
On 24 Oct 2016, at 15:59, Brunoais <brunoa...@gmail.com> wrote:
Attached and sending!
On 24/10/2016 13:48, Pavel Rappo wrote:
Could you please send a new email on this list with the source
attached as a text file?
On 23 Oct 2016, at 19:14, Brunoais <brunoa...@gmail.com> wrote:
Here's my poc/prototype:
http://pastebin.com/WRpYWDJF
I've implemented the bare minimum of the class that follows the same
contract as BufferedReader, while flagging in comments all the issues
I think it may have or has. I also wrote some javadoc to help guide
readers through the class.
I could have used more fields from BufferedReader, but the names were
so minimalistic that they were confusing me. I intend to change them
before sending this to openJDK.
One of the major problems this has is long overflow. It is major
because it is hidden, it will be extremely rare, and it takes a
really long time to reproduce. There are different ways of dealing
with it, from just documenting it to actually writing code that
handles it.
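As a sketch of one of the "code that handles it" options, assuming the
prototype tracks ever-growing long position counters (the names readPos
and writePos appear in the prototype; their exact types and roles here
are an assumption): comparing positions by subtraction stays correct
even after a counter wraps past Long.MAX_VALUE, as long as the true
distance fits in a long.

```java
// Hypothetical illustration class; not taken from the prototype.
final class WrapAroundSketch {

    // Number of bytes written but not yet read. Two's-complement
    // subtraction gives the right distance even if writePos has wrapped.
    static long available(long readPos, long writePos) {
        return writePos - readPos;
    }

    // Unsafe variant: a plain comparison breaks once writePos wraps,
    // because a wrapped writePos is numerically smaller than readPos.
    static boolean hasDataNaive(long readPos, long writePos) {
        return readPos < writePos;
    }

    // Wrap-safe variant built on the subtraction above.
    static boolean hasData(long readPos, long writePos) {
        return available(readPos, writePos) > 0;
    }
}
```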
I built some simple test code for it to get some idea of performance
and correctness.
http://pastebin.com/eh6LFgwT
This doesn't thoroughly test whether it is actually working
correctly, but I see no reason for it not to work correctly after
fixing the 2 bugs that the test found.
I'll also leave here some conclusions about speed and resource
consumption I found.
I made tests with default buffer sizes, 5000B, 15_000B and 500_000B.
I noticed that, with my hardware, with the 1 530 000 000B file, I was
getting around:
In all buffers and fake work: 10~15s speed improvement (from 90% HDD
speed to 100% HDD speed)
In all buffers and no fake work: 1~2s speed improvement (from 90% HDD
speed to 100% HDD speed)
Changing the buffer size was giving different reading speeds, but
both were quite equal in how much they would change when changing
the buffer size.
Finally, I could always confirm that I/O was the slowest thing while
this code was running.
For the ones wondering about the file size: it is both to avoid the
OS cache and to exercise the main use case these objects are for
(large streams of bytes).
@Pavel, are you open for discussion now ;)? Need anything else?
On 21/10/2016 19:21, Pavel Rappo wrote:
Just to append to my previous email. BufferedReader wraps any Reader
out there, not specifically FileReader, while you're talking about
the case of efficient reading from a file.
I guess there's one existing possibility to provide exactly what you
need (as I understand it) under this method:
/**
 * Opens a file for reading, returning a {@code BufferedReader} to read text
 * from the file in an efficient manner...
 ...
 */
java.nio.file.Files#newBufferedReader(java.nio.file.Path)
It can return _anything_ as long as it is a BufferedReader. We can do
it, but it needs to be investigated not only for your favorite OS but
for other OSes as well. Feel free to prototype this and we can
discuss it on the list later.
Thanks,
-Pavel
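For reference, a minimal usage sketch of the factory method mentioned
above. Callers only see a BufferedReader, which is why the returned
implementation could in principle be swapped. The class and method
names (NewBufferedReaderSketch, countLines) are hypothetical, and UTF-8
is assumed for the file contents.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration class.
final class NewBufferedReaderSketch {

    // Counts the lines of a text file through whatever BufferedReader
    // the factory method returns.
    static long countLines(Path file) throws IOException {
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countLines(Paths.get(args[0])));
    }
}
```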
On 21 Oct 2016, at 18:56, Brunoais <brunoa...@gmail.com> wrote:
Pavel is right.
In reality, I was expecting such a BufferedReader to use only a
single buffer and have that buffer being filled asynchronously, not
in a different Thread.
Additionally, I don't have the intention of having a larger buffer
than before unless stated through the API (the constructor).
In my idea, internally, it is supposed to use
java.nio.channels.AsynchronousFileChannel or equivalent.
It does not prevent having two buffers and I do not intend to change
BufferedReader itself. I'd do a BufferedAsyncReader of sorts (any
name suggestion is welcome as I'm an awful namer).
On 21/10/2016 18:38, Roger Riggs wrote:
Hi Pavel,
I think Brunoais is asking for a double buffering scheme in which the
implementation of BufferedReader fills (a second buffer) in parallel
with the application reading from the 1st buffer, managing the swaps
and async reads transparently.
It would not change the API but would change the interactions
between the buffered reader and the underlying stream. It would also
increase memory requirements and processing by introducing or using
a separate thread and the necessary synchronization.
Though I think the formal interface semantics could be maintained, I
have doubts about compatibility and its unintended consequences on
existing subclasses, applications and libraries.
$.02, Roger
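A rough sketch of the double buffering scheme described above, under
stated assumptions: one helper thread fills the back buffer while the
caller consumes the front buffer, and the two swap after each chunk.
This is only an illustration of the idea and of the extra thread and
synchronization it needs, not a proposed BufferedReader change; all
names are hypothetical and chunkSize is assumed to be positive.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration class.
final class DoubleBufferSketch {

    // Submits one background read into target; the Future yields the
    // byte count, or -1 at end of stream.
    private static Future<Integer> fillAsync(ExecutorService filler,
                                             InputStream in, byte[] target) {
        Callable<Integer> task = () -> in.read(target);
        return filler.submit(task);
    }

    // Sums all bytes of the stream (as unsigned values) while the next
    // chunk is prefetched in parallel. The sum stands in for real work.
    static long sumBytes(InputStream in, int chunkSize)
            throws IOException, InterruptedException, ExecutionException {
        ExecutorService filler = Executors.newSingleThreadExecutor();
        try {
            byte[] front = new byte[chunkSize];
            byte[] back = new byte[chunkSize];
            long sum = 0;
            Future<Integer> pending = fillAsync(filler, in, back);
            while (true) {
                int n = pending.get();   // wait for the chunk being filled
                if (n == -1) {
                    break;               // end of stream
                }
                byte[] tmp = front;      // swap: the filled chunk becomes front
                front = back;
                back = tmp;
                pending = fillAsync(filler, in, back); // refill in background
                for (int i = 0; i < n; i++) {
                    sum += front[i] & 0xFF;
                }
            }
            return sum;
        } finally {
            filler.shutdown();
        }
    }
}
```

Even in this small form the cost mentioned above is visible: twice the
buffer memory, one extra thread, and a hand-off point (Future.get())
on every chunk.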
On 10/21/16 1:22 PM, Pavel Rappo wrote:
Off the top of my head, I would say it's not possible to change the
design of an _extensible_ type that has been out there for 20 or so
years. All these I/O streams from java.io were designed for a simple
synchronous use case.
It's not that their design is flawed in some way, it's that they
don't seem to suit your needs. Have you considered using
java.nio.channels.AsynchronousFileChannel in your applications?
-Pavel
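A minimal sketch of the callback style that
java.nio.channels.AsynchronousFileChannel offers, for readers who have
not used it. The read is initiated with a CompletionHandler and the
result is bridged into a CompletableFuture so the caller can decide
when to wait. The class and method names (CompletionHandlerSketch,
readAt) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration class.
final class CompletionHandlerSketch {

    // Initiates a read at the given position and returns immediately.
    // The future completes with the byte count (-1 at EOF) or the failure.
    static CompletableFuture<Integer> readAt(AsynchronousFileChannel ch,
                                             ByteBuffer dst, long position) {
        CompletableFuture<Integer> result = new CompletableFuture<>();
        ch.read(dst, position, null, new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer bytesRead, Void attachment) {
                result.complete(bytesRead);
            }

            @Override
            public void failed(Throwable error, Void attachment) {
                result.completeExceptionally(error);
            }
        });
        return result;
    }
}
```

The caller keeps ownership of the channel and should not touch dst
until the returned future has completed.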
On 21 Oct 2016, at 17:08, Brunoais <brunoa...@gmail.com> wrote:
Any feedback on this? I'm really interested in implementing such a
BufferedReader/BufferedStreamReader to allow speeding up my
applications without having to think in an asynchronous way or about
multi-threading while programming with it.
That's why I'm asking this here.
On 13/10/2016 14:45, Brunoais wrote:
Hi,
I looked at the BufferedReader source code for Java 9 along with the
source code of the channels/streams used. I noticed that, like in
Java 7, BufferedReader does not use an async API to load data from
files. Instead, the data loading is all done synchronously, even when
the OS allows requesting a file to be read and getting a notification
later when the file has effectively been read.
Why is BufferedReader not async while providing a sync API?
<BufferedNonBlockStream.java><Tests.java>
--
Sent from my phone