Hi Stuart,
On 3/06/2020 8:08 am, Stuart Marks wrote:
Hi Jim,
This was mentioned previously in this thread but not discussed very
much. I suggest you take a look at
jdk.internal.util.ArraysSupport.newLength(). Ivan Gerasimov and I worked
this over fairly closely last year, and I'm pretty sure it does what
Martin is saying, which I also think is the right thing.
The intent is that it be used for things that have growable arrays,
where the array might have a larger capacity than the logical number of
elements currently stored. Sometimes the array needs to be grown to
accommodate an immediate need (minGrowth) but it's desired to grow
larger (prefGrowth) in anticipation of future needs. If minGrowth can't
be accommodated, it throws OOME, but if prefGrowth can't be
accommodated, it might be acceptable to provide a smaller amount of growth.
(Of course, all this assumes that there is sufficient memory available
to allocate the actual array. ArraysSupport.newLength doesn't attempt to
ascertain that.)
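To make the shape of the API concrete, a typical caller looks roughly like
this (an illustrative sketch along the lines of what ArrayList's grow() does;
the field names are made up, and ArraysSupport is of course JDK-internal):

    private Object[] elementData;   // the growable backing array

    // assumes the caller only asks to grow, i.e. minCapacity > elementData.length
    private void grow(int minCapacity) {
        int oldCapacity = elementData.length;
        int newCapacity = ArraysSupport.newLength(oldCapacity,
                minCapacity - oldCapacity,   // minGrowth: what we must have right now
                oldCapacity >> 1);           // prefGrowth: prefer ~1.5x growth
        elementData = Arrays.copyOf(elementData, newCapacity);
    }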
One issue is integer wraparound (overflow). This is the primary value
that ArraysSupport.newLength provides. It would be good to centralize
these computations instead of having them be spread all over.
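As a tiny illustration of why the overflow-conscious arithmetic matters and
why it is worth having in exactly one place (the numbers below are only
examples):

    class WraparoundDemo {
        public static void main(String[] args) {
            int oldCapacity = 1_500_000_000;
            int minCapacity = 1_600_000_000;
            int doubled = oldCapacity << 1;   // true value 3_000_000_000 wraps to -1_294_967_296

            System.out.println(doubled < minCapacity);      // true: the naive comparison misfires
            System.out.println(doubled - minCapacity < 0);  // false: the true difference
                                                            // (1_400_000_000) fits in an int,
                                                            // so the subtraction is still exact
        }
    }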
Another issue is the one that MAX_ARRAY_LENGTH (also called
MAX_ARRAY_SIZE) is trying to address. This is sort-of a misnomer. It's
not the actual maximum array size (which in fact isn't known to the
library). It's actually the maximum array size that the library is
fairly confident the VM can provide, assuming that enough memory is
actually available. What the heck does that mean?
The theoretical maximum array size is Integer.MAX_VALUE, since the JLS
and JVMS don't allow anything larger. However, actual VM implementations
will refuse to allocate an array above a certain amount slightly smaller
than that, even if there is enough memory available. In practice, I
believe the values for current Hotspot are Integer.MAX_VALUE-3 or
Integer.MAX_VALUE-2, depending on whether compressed OOPS are in use.
Why is this significant? Consider the following case, where the capacity
of something is currently Integer.MAX_VALUE-100, and it's filled with
elements. The application performs some operation that requires 50
elements (minGrowth) be added. A new array could certainly be allocated
with size Integer.MAX_VALUE-50, but typical growth policies for these
kinds of containers want to increase the current capacity by 1.5x or 2x
(prefGrowth). Doing this multiplication would exceed Integer.MAX_VALUE,
so that won't work. Clearly, we need to clamp the capacity somewhere.
We don't want to clamp the capacity at Integer.MAX_VALUE, because this
allocation would fail on every JVM I'm aware of, even if enough memory
is available. So we don't do that. Instead, we clamp the preferred
growth at some fairly arbitrary value smaller than Integer.MAX_VALUE,
which is here called MAX_ARRAY_LENGTH, and increase the capacity to that
instead. This allows the container's requested allocation to proceed: it
satisfies minGrowth, but it doesn't satisfy prefGrowth. Instead, it
returns a capacity value that's reasonably likely to succeed, given an
unknown JVM implementation limit.
Recall that the container now has Integer.MAX_VALUE-50 elements and the
capacity is MAX_ARRAY_SIZE, which is currently defined somewhat
arbitrarily at Integer.MAX_VALUE-8. Suppose the application now wants to
add 43 elements. What should happen?
We could say, this exceeds MAX_ARRAY_LENGTH, so refuse the request and
throw OOME. This is reasonable and consistent in some sense, but perhaps
not in another. Suppose there is sufficient memory, and the JVM does
allow arrays of Integer.MAX_VALUE-7 to be created. Shouldn't we even try?
That's what hugeLength() does -- it returns a value that attempts an
allocation beyond the max preferential growth, and leaves it up to the
JVM to grant or refuse the request based on its own implementation limits.
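In code, the policy described above comes out roughly as follows (a sketch of
the behaviour for the purposes of this discussion, not the verbatim JDK
source):

    // the "soft limit" discussed above
    static final int MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    static int newLength(int oldLength, int minGrowth, int prefGrowth) {
        // preconditions: oldLength >= 0, minGrowth > 0
        int prefLength = oldLength + Math.max(minGrowth, prefGrowth); // may overflow
        if (0 < prefLength && prefLength <= MAX_ARRAY_LENGTH) {
            return prefLength;                    // preferred growth fits under the soft limit
        }
        return hugeLength(oldLength, minGrowth);  // otherwise take the "huge" path
    }

    static int hugeLength(int oldLength, int minGrowth) {
        int minLength = oldLength + minGrowth;
        if (minLength < 0) {                      // int overflow: can't satisfy even minGrowth
            throw new OutOfMemoryError("Required array length too large");
        }
        if (minLength <= MAX_ARRAY_LENGTH) {
            return MAX_ARRAY_LENGTH;              // clamp the preferred growth at the soft limit
        }
        return minLength;                         // past the soft limit: let the VM decide
    }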
IIUC what you are saying is that MAX_ARRAY_LENGTH is treated as a soft
limit. A request for prefGrowth won't be allowed to exceed it. But if
minGrowth takes the length past it, then the code tries to do the
allocation that large anyway. If it succeeds we win, and if we get OOME
that is what we would have thrown anyway if we rejected the request as
too big.
So my misunderstanding in this was that MAX_ARRAY_LENGTH is not
attempting to define the actual VM hard limit, just a large value close
to that which is expected to always be valid (actual memory permitting).
Thanks for the detailed explanation.
David
-----
Anyway, this is all quite subtle, and maybe the comments in ArraysSupport
don't describe it adequately. The code that implements this kind of policy
has been copied to several locations around the JDK; it uses somewhat
different terminology, and it might have slightly different bugs, but each
copy is essentially trying to implement the same policy.
Several questions could be asked:
1) Is this the right policy for implementing growable arrays?
2) In cases where a class needs a growable array, can and should it be
refactored to use ArraysSupport.newLength()?
3) Does ArraysSupport.newLength() need to be modified to accommodate
needs of additional call sites?
4) We might want to consider refactoring PriorityBlockingQueue and
ArrayDeque to use ArraysSupport.newLength, in order to provide a
consistent policy for collections. Other growable-array-based
collections (Vector, ArrayList, PriorityQueue) do already.
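As a concrete instance of question 2, the ByteArrayChannel grow()/hugeCapacity()
pair quoted further down in this thread could in principle collapse to something
like the following (just a sketch of the refactoring being discussed, not a
tested patch, and it assumes jdk.zipfs can see jdk.internal.util at all):

    private void grow(int minCapacity) {
        int oldCapacity = buf.length;
        int newCapacity = ArraysSupport.newLength(oldCapacity,
                minCapacity - oldCapacity,   // minGrowth: what this write actually needs
                oldCapacity);                // prefGrowth: keep the existing doubling policy
        buf = Arrays.copyOf(buf, newCapacity);
    }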
s'marks
On 6/1/20 4:47 AM, Jim Laskey wrote:
Thanks David, will run with that.
On May 31, 2020, at 8:34 PM, David Holmes <david.hol...@oracle.com>
wrote:
On 31/05/2020 12:29 am, Jim Laskey wrote:
I'm working through https://bugs.openjdk.java.net/browse/JDK-8230744
"Several classes throw OutOfMemoryError without message".
I'm wondering why hugeCapacity in
src/jdk.zipfs/share/classes/jdk/nio/zipfs/ByteArrayChannel.java is
defined as
    /**
     * The maximum size of array to allocate.
     * Some VMs reserve some header words in an array.
     * Attempts to allocate larger arrays may result in
     * OutOfMemoryError: Requested array size exceeds VM limit
     */
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
    /**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = buf.length;
        int newCapacity = oldCapacity << 1;
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        buf = Arrays.copyOf(buf, newCapacity);
    }
    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();

Not sure how we could have minCapacity < 0 at this point. It should have
been checked before the call to grow, and grow will not make it negative.

        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;

That's a bug plain and simple. It should never report a size >
MAX_ARRAY_SIZE.

    }
It just seems that it's pushing the inevitable off to
Arrays.copyOf. Shouldn't it be:
    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0 || minCapacity > MAX_ARRAY_SIZE) {
            throw new OutOfMemoryError(
                "ByteArrayChannel exceeds maximum size: " + MAX_ARRAY_SIZE);
        }
        return MAX_ARRAY_SIZE;
    }
That seems more appropriate to me - modulo the question mark over
minCapacity being negative.
Real question: is there some hidden purpose behind this kind of logic?
The basic strategy is to double the current capacity unless that will
trigger an unnecessary exception, in which case just use the
requested capacity, but again watch for the implementation limits.
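(E.g., with buf.length == 100 and minCapacity == 250, doubling gives 200,
which is still too small, so newCapacity falls back to 250; and if 250 had
exceeded MAX_ARRAY_SIZE, hugeCapacity() would decide whether to clamp or
throw.)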
Cheers,
David
-----
Cheers,
-- Jim