Stuart (and Martin),

I absolutely agree with the reasoning. ArraysSupport.newLength() is also 
probably easier for the compiler to optimize around (no exception).

I guess the only conflict here is the goal of being meaningfully informative. 
The bug assigned to me (https://bugs.openjdk.java.net/browse/JDK-8230744) was 
really about not getting clear information when the OOM is generated. Getting 
an OOM from the bowels of the VM doesn't always give a clear picture (even 
with a backtrace) when the allocation code is distant from the expression that 
calculated the new size (VM: "I can't do this. You figure it out." Developer: "?").

Needed here is something like:

        ArraysSupport.newLength(oldLength, minGrowth, prefGrowth,
                                () -> "Queue exceeds implementation limit")

There are other places in the JDK that use addExact/multiplyExact to catch the 
error closer to the action.
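For example, here is a quick sketch of that style of check (the class and constant names are just illustrative, not an existing JDK API):

```java
// Sketch: catch overflow at the point of the size calculation rather than
// letting a negative length reach the allocator. Illustrative names only.
final class GrowthMath {
    // Soft limit commonly used in the JDK; some VMs reserve header words.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Compute oldLength + growth, translating int overflow into an OOME
    // with a meaningful message instead of a bare VM error.
    static int checkedNewLength(int oldLength, int growth) {
        try {
            int newLength = Math.addExact(oldLength, growth);
            if (newLength > MAX_ARRAY_SIZE) {
                throw new OutOfMemoryError("Required length " + newLength
                        + " exceeds implementation limit " + MAX_ARRAY_SIZE);
            }
            return newLength;
        } catch (ArithmeticException overflow) {
            throw new OutOfMemoryError("Required length exceeds Integer.MAX_VALUE: "
                    + oldLength + " + " + growth);
        }
    }
}
```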

Note that there are a few cases, e.g. StringJoiner, where the growth is a 
three-way sum in which any two of the terms could go negative (overflow) while 
the third brings the total positive again.
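A small demonstration of why a single trailing sign check isn't enough for such sums (illustrative code, not taken from StringJoiner):

```java
// Illustrative only: a three-term length sum (as in prefix + contents +
// suffix) can overflow negative on the first addition and wrap back positive
// on the second, so a final "if (sum < 0)" check is not sufficient.
// Math.addExact fails fast at the first overflowing addition instead.
final class ThreeWaySum {

    // Naive sum: relies on a single sign check at the end.
    static int naiveSum(int a, int b, int c) {
        int sum = a + b + c;      // may wrap twice and land positive
        if (sum < 0) {
            throw new OutOfMemoryError("Requested length overflows int");
        }
        return sum;
    }

    // Checked sum: each addition is verified individually.
    static int checkedSum(int a, int b, int c) {
        return Math.addExact(Math.addExact(a, b), c);
    }
}
```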

Based on what I see in the JDK, the world would be well served by an 
Arrays.grow(array, minGrowth, prefGrowth, () -> "Queue exceeds implementation 
limit"). This differs from Arrays.copyOf in that it 1) does overflow checks, 
2) provides a meaningful message, and 3) eliminates a common and often poorly 
implemented code pattern. The alternative is catching the OOM at the call site 
and replacing it with a meaningful message (which of course is brimming with 
issues).
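A rough sketch of what such a helper might look like. To be clear, no Arrays.grow exists today; this simplified policy just clamps at a soft maximum and doesn't try to reproduce newLength's full behavior:

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Hypothetical sketch of the proposed helper -- no such method exists in
// java.util.Arrays. It combines an overflow-checked length policy with
// copyOf, and attaches a caller-supplied message to the OOME.
final class GrowSketch {
    // Soft limit: largest length the VM is likely to honor.
    static final int SOFT_MAX = Integer.MAX_VALUE - 8;

    static int[] grow(int[] array, int minGrowth, int prefGrowth,
                      Supplier<String> message) {
        int oldLength = array.length;
        long min = (long) oldLength + minGrowth;   // widen to detect overflow
        if (min > SOFT_MAX) {
            // Even the minimum need exceeds the limit: meaningful OOME.
            throw new OutOfMemoryError(message.get());
        }
        long pref = (long) oldLength + Math.max(minGrowth, prefGrowth);
        int newLength = (int) Math.min(Math.max(pref, min), SOFT_MAX);
        return Arrays.copyOf(array, newLength);
    }
}
```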

I will replace sites modified by this bug with ArraysSupport.newLength.
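For reference, here is a standalone, simplified re-implementation of the newLength policy as Stuart describes it below. The real method lives in the internal jdk.internal.util package; this copy is illustration only:

```java
// Simplified sketch of the ArraysSupport.newLength growth policy:
// prefer prefGrowth, clamp to a soft maximum, and fall back to a
// "huge" length that still satisfies minGrowth when possible.
final class NewLengthSketch {
    static final int SOFT_MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    static int newLength(int oldLength, int minGrowth, int prefGrowth) {
        // Preferred growth; may overflow int, which the range check detects.
        int prefLength = oldLength + Math.max(minGrowth, prefGrowth);
        if (0 < prefLength && prefLength <= SOFT_MAX_ARRAY_LENGTH) {
            return prefLength;
        }
        // Preferred growth overflowed or exceeded the soft limit.
        return hugeLength(oldLength, minGrowth);
    }

    private static int hugeLength(int oldLength, int minGrowth) {
        int minLength = oldLength + minGrowth;
        if (minLength < 0) { // overflow: even the minimum cannot be met
            throw new OutOfMemoryError("Required array length " + oldLength
                    + " + " + minGrowth + " is too large");
        }
        // Satisfy minGrowth; leave it to the VM whether a length above the
        // soft limit (but <= Integer.MAX_VALUE) can actually be allocated.
        return Math.max(minLength, SOFT_MAX_ARRAY_LENGTH);
    }
}
```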

Cheers,

-- Jim




> On Jun 2, 2020, at 7:08 PM, Stuart Marks <stuart.ma...@oracle.com> wrote:
> 
> Hi Jim,
> 
> This was mentioned previously in this thread but not discussed very much. I 
> suggest you take a look at jdk.internal.util.ArraysSupport.newLength(). Ivan 
> Gerasimov and I worked this over fairly closely last year, and I'm pretty 
> sure it does what Martin is saying, which I also think is the right thing.
> 
> The intent is that it be used for things that have growable arrays, where the 
> array might have a larger capacity than the logical number of elements 
> currently stored. Sometimes the array needs to be grown to accommodate an 
> immediate need (minGrowth) but it's desired to grow larger (prefGrowth) in 
> anticipation of future needs. If minGrowth can't be accommodated, it throws 
> OOME, but if prefGrowth can't be accommodated, it might be acceptable to 
> provide a smaller amount of growth.
> 
> (Of course, all this assumes that there is sufficient memory available to 
> allocate the actual array. ArraysSupport.newLength doesn't attempt to 
> ascertain that.)
> 
> One issue is integer wraparound (overflow). This is the primary value that 
> ArraysSupport.newLength provides. It would be good to centralize these 
> computations instead of having them be spread all over.
> 
> Another issue is the one that MAX_ARRAY_LENGTH (also called MAX_ARRAY_SIZE) 
> is trying to address. This is sort-of a misnomer. It's not the actual maximum 
> array size (which in fact isn't known to the library). It's actually the 
> maximum array size that the library is fairly confident the VM can provide, 
> assuming that enough memory is actually available. What the heck does that 
> mean?
> 
> The theoretical maximum array size is Integer.MAX_VALUE, since the JLS and 
> JVMS don't allow anything larger. However, actual VM implementations will 
> refuse to allocate an array above a certain amount slightly smaller than 
> that, even if there is enough memory available. In practice, I believe the 
> values for current Hotspot are Integer.MAX_VALUE-3 or Integer.MAX_VALUE-2, 
> depending on whether compressed OOPS are in use.
> 
> Why is this significant? Consider the following case, where the capacity of 
> something is currently Integer.MAX_VALUE-100, and it's filled with elements. 
> The application performs some operation that requires 50 elements (minGrowth) 
> be added. A new array could certainly be allocated with size 
> Integer.MAX_VALUE-50, but typical growth policies for these kinds of 
> containers want to increase the current capacity by 1.5x or 2x (prefGrowth). 
> Doing this multiplication would exceed Integer.MAX_VALUE, so that won't work. 
> Clearly, we need to clamp the capacity somewhere.
> 
> We don't want to clamp the capacity at Integer.MAX_VALUE, because this 
> allocation would fail on every JVM I'm aware of, even if enough memory is 
> available. So we don't do that. Instead, we clamp the preferred growth at 
> some fairly arbitrary value smaller than Integer.MAX_VALUE, which is here 
> called MAX_ARRAY_LENGTH, and increase the capacity to that instead. This 
> allows the container's requested allocation to proceed: it satisfies 
> minGrowth, but it doesn't satisfy prefGrowth. Instead, it returns a capacity 
> value that's reasonably likely to succeed, given an unknown JVM 
> implementation limit.
> 
> Recall that the container now has Integer.MAX_VALUE-50 elements and the 
> capacity is MAX_ARRAY_SIZE, which is currently defined somewhat arbitrarily 
> at Integer.MAX_VALUE-8. Suppose the application now wants to add 43 elements. 
> What should happen?
> 
> We could say, this exceeds MAX_ARRAY_LENGTH, so refuse the request and throw 
> OOME. This is reasonable and consistent in some sense, but perhaps not in 
> another. Suppose there is sufficient memory, and the JVM does allow arrays of 
> Integer.MAX_VALUE-7 to be created. Shouldn't we even try?
> 
> That's what hugeLength() does -- it returns a value that attempts an 
> allocation beyond the max preferential growth, and leaves it up to the JVM to 
> grant or refuse the request based on its own implementation limits.
> 
> Anyway, this is all quite subtle, and maybe comments in ArraysSupport don't 
> describe this adequately. But the code that implements this kind of policy 
> has been copied to different locations around the JDK, and it uses somewhat 
> different terminology, and it might have slightly different bugs, but they're 
> all essentially trying to implement this policy.
> 
> **
> 
> Several questions could be asked:
> 
> 1) Is this the right policy for implementing growable arrays?
> 
> 2) In cases where a class needs a growable array, can and should it be 
> refactored to use ArraysSupport.newLength()?
> 
> 3) Does ArraysSupport.newLength() need to be modified to accommodate needs of 
> additional call sites?
> 
> 4) We might want to consider refactoring PriorityBlockingQueue and ArrayDeque 
> to use ArraysSupport.newLength, in order to provide a consistent policy for 
> collections. Other growable-array-based collections (Vector, ArrayList, 
> PriorityQueue) do already.
> 
> s'marks
> 
> 
> 
> 
> 
> On 6/1/20 4:47 AM, Jim Laskey wrote:
>> Thanks David will run with that,
>>> On May 31, 2020, at 8:34 PM, David Holmes <david.hol...@oracle.com> wrote:
>>> 
>>> On 31/05/2020 12:29 am, Jim Laskey wrote:
> >>>> I'm working through https://bugs.openjdk.java.net/browse/JDK-8230744 
> >>>> "Several classes throw OutOfMemoryError without message."
>>>> I'm wondering why hugeCapacity in 
>>>> src/jdk.zipfs/share/classes/jdk/nio/zipfs/ByteArrayChannel.java is defined 
>>>> as
>>>>     /**
>>>>      * The maximum size of array to allocate.
>>>>      * Some VMs reserve some header words in an array.
>>>>      * Attempts to allocate larger arrays may result in
>>>>      * OutOfMemoryError: Requested array size exceeds VM limit
>>>>      */
>>>>     private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
>>>>     /**
>>>>      * Increases the capacity to ensure that it can hold at least the
>>>>      * number of elements specified by the minimum capacity argument.
>>>>      *
>>>>      * @param minCapacity the desired minimum capacity
>>>>      */
>>>>     private void grow(int minCapacity) {
>>>>         // overflow-conscious code
>>>>         int oldCapacity = buf.length;
>>>>         int newCapacity = oldCapacity << 1;
>>>>         if (newCapacity - minCapacity < 0)
>>>>             newCapacity = minCapacity;
>>>>         if (newCapacity - MAX_ARRAY_SIZE > 0)
>>>>             newCapacity = hugeCapacity(minCapacity);
>>>>         buf = Arrays.copyOf(buf, newCapacity);
>>>>     }
>>>>     private static int hugeCapacity(int minCapacity) {
>>>>         if (minCapacity < 0) // overflow
>>>>             throw new OutOfMemoryError();
>>> 
>>> Not sure how we could have minCapacity < 0 at this point. It should have 
>>> been checked before the call to grow, and grow will not make it negative.
>>> 
>>>>         return (minCapacity > MAX_ARRAY_SIZE) ?
>>>>             Integer.MAX_VALUE :
>>>>             MAX_ARRAY_SIZE;
>>> 
>>> That's a bug plain and simple. It should never report a size > 
>>> MAX_ARRAY_SIZE.
>>> 
>>>>     }
>>>> It just seems that it's pushing the inevitable off to Arrays.copyOf.  
>>>> Shouldn't it be:
>>>>     private static int hugeCapacity(int minCapacity) {
>>>>         if (minCapacity < 0 || minCapacity > MAX_ARRAY_SIZE) {
> >>>>             throw new OutOfMemoryError(
> >>>>                 "ByteArrayChannel exceeds maximum size: " + MAX_ARRAY_SIZE);
> >>>>         }
> >>>>         return MAX_ARRAY_SIZE;
>>>>     }
>>> 
>>> That seems more appropriate to me - modulo the question mark over 
>>> minCapacity being negative.
>>> 
>>>> Real question: is there some hidden purpose behind this kind of logic?
>>> 
>>> The basic strategy is to double the current capacity unless that will 
>>> trigger an unnecessary exception, in which case just use the requested 
>>> capacity, but again watch for the implementation limits.
>>> 
>>> Cheers,
>>> David
>>> -----
>>> 
>>>> Cheers,
>>>> -- Jim
