Hello everybody!
Would you please review an updated webrev?
http://cr.openjdk.java.net/~igerasim/8020669/3/webrev/
<http://cr.openjdk.java.net/%7Eigerasim/8020669/3/webrev/>
Thanks in advance,
Ivan
On 24.07.2013 23:36, Martin Buchholz wrote:
Use javadoc style even in private methods.
s/Read/Reads/
Use @param initialSize
/**
+ * Read all the bytes from an input stream. The {@code initialSize}
+ * parameter indicates the initial size of the byte[] to allocate.
+ */
---
This needs to be tested for an ordinary zero-length file. It looks
like for the zero case
+ int newCapacity = capacity << 1;
will infloop not actually increasing the buffer.
---
You might want to copy this technique from ArrayList et al:
/**
* The maximum size of array to allocate.
* Some VMs reserve some header words in an array.
* Attempts to allocate larger arrays may result in
* OutOfMemoryError: Requested array size exceeds VM limit
*/
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
On Tue, Jul 23, 2013 at 5:09 PM, Ivan Gerasimov
<ivan.gerasi...@oracle.com <mailto:ivan.gerasi...@oracle.com>> wrote:
Would you please take a look at the updated webrev?
http://cr.openjdk.java.net/~igerasim/8020669/2/webrev/
<http://cr.openjdk.java.net/%7Eigerasim/8020669/2/webrev/>
readAllBytes() was recently (in b93) changed by Alan Bateman to
fix 8014928.
Here's what I've done:
- reverted readAllBytes() to its implementation prior b93.
- modified it to address both 8020669 and 8014928.
http://bugs.sun.com/view_bug.do?bug_id=8020669
<http://bugs.sun.com/view_bug.do?bug_id=8014928>
http://bugs.sun.com/view_bug.do?bug_id=8014928
Sincerely yours,
Ivan
On 23.07.2013 18:09, David M. Lloyd wrote:
Here's how it should behave:
- allocate 'size' byte byte array
- if size > 0:
- use simple old I/O to read into the array
- do a one-byte read(), if not EOF then expand the array,
using a simple growth pattern like 3/2 (with a special case
for 0), and continue reading until EOF
- if the array matches the size of the file, return the array,
else use copyOf() to shrink it
This way you only ever copy the array size() was wrong.
On 07/23/2013 05:06 AM, Ivan Gerasimov wrote:
Hi Roger!
This is how my implementation behaves:
- allocate 'size' bytes in BAOS
- allocate 8k for temp buffer
- in cycle read 8k or less bytes from input stream and
copy them into BAOS
- if capacity of BAOS isn't sufficient (file had grown),
its buffer will
be reallocated
Thus, 1 reallocation and 1 copying of already read data on
each 8k piece
of additional bytes.
In normal case, i.e. when fc.size() is correct, we have
overhead of 1
allocation and copying 'size' bytes in size/8k iterations.
And this is how current implementation does
- allocate 'size' bytes
- allocate 'size' bytes of native memory for temp buffer
in IOUtil.read()
- read the whole file into temp buffer
- copy the temp buffer back into our buffer
In common when fc.size() is right, we have 1 allocation
and copying
'size' bytes from temp buffer back.
So there is a difference in allocations/copying, but in my
opinion it's
not that significant for this particular task.
Sincerely yours,
Ivan
On 22.07.2013 20:03, roger riggs wrote:
Hi Ivan,
I'm concerned about the change in behavior for the
existing working
cases.
How many times are the bytes copied in your proposed
implementation?
How many arrays are allocated and discarded?
Files.copy() uses an extra array for the copies.
BAOS should only be used for size == 0; that would
address the issue
without changing the current behavior or allocations.
Roger
On 7/20/2013 6:15 PM, Ivan Gerasimov wrote:
Roger, David thanks for suggestions!
Would you please take a look at an updated webrev?
http://cr.openjdk.java.net/~igerasim/8020669/1/webrev/
<http://cr.openjdk.java.net/%7Eigerasim/8020669/1/webrev/>
- File size is used as an initial size of BAOS's
buffer.
- BAOS avoids copying its buffer in toByteArray(),
if size is correct .
I don't want to initialize BAOS with a positive
number if size
happened to be zero.
Because zero could indicate that the file is
really empty.
Sincerely yours,
Ivan
On 19.07.2013 22:30, David M. Lloyd wrote:
My mistake, we're not talking about strings.
Still you can subclass
and determine whether the buffer size guess
was right, and if so
return the array as-is (swap in an empty array
or something as needed).
On 07/19/2013 01:28 PM, David M. Lloyd wrote:
It's trivial to subclass
ByteArrayOutputStream and add a method which
converts its contents to a string using
the two protected fields which
give you all the info you need to do so.
So no extra copy is needed
that you aren't already doing.
On 07/19/2013 01:06 PM, roger riggs wrote:
Hi Ivan,
I think this change takes too big a
hit for the cases where the
size is
correct.
No real file system can be wrong about
the size of a file so this
is a
problem
only for special file systems. If the
problem is with size
reporting zero
then maybe using the incremental read
only for size == would be a
better
fix.
At least you could pass the size to
the constructor for BAOS and
avoid
the thrashing for every re-size; but
still it will allocate and
create
an extra copy
of the every time.
$.02, Roger
On 7/19/2013 1:15 PM, Ivan Gerasimov
wrote:
Hello everybody!
Would you please review a fix for
the problem with
j.n.f.Files.readAllBytes() function?
The current implementation relies
on FileChannel.size() to
preallocate
a buffer for the whole file's content.
However, some special filesystems
can report a wrong size.
An example is procfs under Linux,
which reports many files under
/proc
to be zero sized.
Thus it is proposed not to rely on
the size() and instead
continuously
read until EOF.
The downside is reallocation and
copying file content between
buffers.
But taking into account that the
doc says: "It is not intended for
reading in large files." it should
not be a big problem.
http://bugs.sun.com/view_bug.do?bug_id=8020669
http://cr.openjdk.java.net/~igerasim/8020669/0/webrev/
<http://cr.openjdk.java.net/%7Eigerasim/8020669/0/webrev/>
The fix is for JDK8. If it is
approved, it can be applied to JDK7 as
well.
Sincerely yours,
Ivan Gerasimov