Sorry Martin, I appear to have missed your original request to file this bug. I have since filed the following:

 6933217: Huge arrays handled poorly in core libraries

The changes you are proposing seem reasonable to me.

-Chris.

Martin Buchholz wrote:
[Chris or Alan, please review and file a bug]

OK, guys,

Here's a patch:

http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/

Martin

On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern <kevin.l.st...@gmail.com> wrote:
Hi Martin,

Thank you for your reply.  If I may, PriorityQueue appears to employ the
simple strategy that I suggested above in its grow method:

        int newCapacity = ((oldCapacity < 64)?
                           ((oldCapacity + 1) * 2):
                           ((oldCapacity / 2) * 3));
        if (newCapacity < 0) // overflow
            newCapacity = Integer.MAX_VALUE;

It might be desirable to set a common strategy for capacity increase for all
collections.

Regards,

Kevin

On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <marti...@google.com> wrote:
Hi Kevin,

As you've noticed, creating objects within a factor of two of
their natural limits is a good way to expose lurking bugs.

I'm the one responsible for the algorithm in ArrayList.
I'm a bit embarrassed, looking at that code today.
We could set the array size to Integer.MAX_VALUE,
but then you might hit an independent buglet in hotspot:
it will not let you allocate an array with Integer.MAX_VALUE
elements, although Integer.MAX_VALUE - 5 (or so) works.

It occurs to me that increasing the size by 50% is better done by

    int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;

I agree with the plan of setting the capacity to something near
MAX_VALUE on overflow, and throwing OutOfMemoryError on the next resize.
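
For concreteness, a minimal sketch of what such a grow routine could look
like (purely illustrative, not the code in the webrev above; the names grow,
MAX_ARRAY_SIZE and elementData are placeholders, and java.util.Arrays is
assumed to be imported):

    // Some VMs refuse arrays of exactly Integer.MAX_VALUE elements,
    // so stay a few words below that limit.
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    private void grow(int minCapacity) {
        if (minCapacity < 0)        // the requested capacity itself overflowed int
            throw new OutOfMemoryError();
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;   // grow by ~50%
        if (newCapacity < 0 || newCapacity > MAX_ARRAY_SIZE) {
            // 50% growth overflowed: clamp to something near MAX_VALUE;
            // a later resize past this point will typically end in OutOfMemoryError
            newCapacity = (minCapacity > MAX_ARRAY_SIZE)
                        ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
        }
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
        elementData = Arrays.copyOf(elementData, newCapacity);
    }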

These bugs are not already known.
Chris Hegarty, could you file a bug for us?

Martin

On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.st...@gmail.com> wrote:
Greetings,

I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
java.io.ByteArrayOutputStream which arise when the capacities of the data
structures reach a particular threshold.  More below.

When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE, its size
reaches its capacity, and an add or an insert operation is invoked, the
capacity is increased by only one element.  Notice that in the following
excerpt from ArrayList.ensureCapacity the new capacity is set to
(3/2) * oldCapacity + 1 unless this value would not suffice to accommodate
the required capacity, in which case it is set to the required capacity.  If
the current capacity is at least (2/3)*Integer.MAX_VALUE, then
(oldCapacity * 3)/2 + 1 overflows and resolves to a negative number,
resulting in the new capacity being set to the required capacity.  The major
consequence of this is that each subsequent add/insert operation results in
a full resize of the ArrayList, causing performance to degrade
significantly.

        int newCapacity = (oldCapacity * 3)/2 + 1;
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
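
To make the overflow concrete, here is a small illustration (the capacity
value is just an example at the threshold):

    int oldCapacity = 1431655764;               // roughly (2/3) * Integer.MAX_VALUE
    int newCapacity = (oldCapacity * 3)/2 + 1;  // oldCapacity * 3 wraps around
    // newCapacity is now -1, so the "newCapacity < minCapacity" branch
    // fires and the list grows by only the single element requested.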

Hashtable breaks entirely when the size of its backing array reaches
(1/2) * Integer.MAX_VALUE and a rehash is necessary, as is evident from the
following excerpt from rehash.  Notice that rehash will attempt to create an
array of negative size if the size of the backing array reaches
(1/2) * Integer.MAX_VALUE, since oldCapacity * 2 + 1 overflows and resolves
to a negative number.

    int newCapacity = oldCapacity * 2 + 1;
    HashtableEntry newTable[] = new HashtableEntry[newCapacity];
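
Again, just to illustrate with a capacity at the threshold:

    int oldCapacity = 1 << 30;              // 1073741824, about (1/2) * Integer.MAX_VALUE
    int newCapacity = oldCapacity * 2 + 1;  // wraps to -2147483647
    // new HashtableEntry[newCapacity] then throws NegativeArraySizeException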

When the capacity of the backing array in a ByteArrayOutputStream reaches
(1/2) * Integer.MAX_VALUE, its size reaches its capacity, and a write
operation is invoked, the capacity of the backing array is increased only by
the required number of elements.  Notice that in the following excerpt from
ByteArrayOutputStream.write(int) the new backing array capacity is set to
2 * buf.length unless this value would not suffice to accommodate the
required capacity, in which case it is set to the required capacity.  If the
current backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1,
then buf.length << 1 overflows and resolves to a negative number, resulting
in the new capacity being set to the required capacity.  The major
consequence of this, like with ArrayList, is that each subsequent write
operation results in a full resize of the ByteArrayOutputStream, causing
performance to degrade significantly.

    int newcount = count + 1;
    if (newcount > buf.length) {
        buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
    }

It is interesting to note that any statements about the amortized time
complexity of add/insert operations, such as the one in the ArrayList
javadoc, are invalidated by these performance-related bugs.  One solution to
the above situations is to set the new capacity of the backing array to
Integer.MAX_VALUE when the initial size calculation results in a negative
number during a resize.
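
As a rough sketch of that idea against the ArrayList excerpt above (a
corresponding check would go into Hashtable.rehash and
ByteArrayOutputStream.write):

    int newCapacity = (oldCapacity * 3)/2 + 1;
    if (newCapacity < 0)             // the size calculation overflowed
        newCapacity = Integer.MAX_VALUE;
    if (newCapacity < minCapacity)
        newCapacity = minCapacity;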

Apologies if these bugs are already known.

Regards,

Kevin

