Hi Chris, I recently discovered another place to handle huge arrays better - in AbstractCollection. I've put those changes into http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize2/ I propose to qfold these into the original changes for this bug http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/ which have not yet been committed.
Martin On Tue, Mar 9, 2010 at 04:41, Christopher Hegarty -Sun Microsystems Ireland <christopher.hega...@sun.com> wrote: > Sorry Martin, I appear to have missed your original request to file this > bug. I since filed the following: > > 6933217: Huge arrays handled poorly in core libraries > > The changes you are proposing seem reasonable to me. > > -Chris. > > Martin Buchholz wrote: >> >> [Chris or Alan, please review and file a bug] >> >> OK, guys, >> >> Here's a patch: >> >> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/ >> >> Martin >> >> On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern <kevin.l.st...@gmail.com> >> wrote: >>> >>> Hi Martin, >>> >>> Thank you for your reply. If I may, PriorityQueue appears to employ the >>> simple strategy that I suggested above in its grow method: >>> >>> int newCapacity = ((oldCapacity < 64)? >>> ((oldCapacity + 1) * 2): >>> ((oldCapacity / 2) * 3)); >>> if (newCapacity < 0) // overflow >>> newCapacity = Integer.MAX_VALUE; >>> >>> It might be desirable to set a common strategy for capacity increase for >>> all >>> collections. >>> >>> Regards, >>> >>> Kevin >>> >>> On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <marti...@google.com> >>> wrote: >>>> >>>> Hi Kevin, >>>> >>>> As you've noticed, creating objects within a factor of two of >>>> their natural limits is a good way to expose lurking bugs. >>>> >>>> I'm the one responsible for the algorithm in ArrayList. >>>> I'm a bit embarrassed, looking at that code today. >>>> We could set the array size to Integer.MAX_VALUE, >>>> but then you might hit an independent buglet in hotspot >>>> that you cannot allocate an array with Integer.MAX_VALUE >>>> elements, but Integer.MAX_VALUE - 5 (or so) works. >>>> >>>> It occurs to me that increasing the size by 50% is better done by >>>> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1; >>>> >>>> I agree with the plan of setting the capacity to something near >>>> MAX_VALUE on overflow, and throw OutOfMemoryError on next resize. >>>> >>>> These bugs are not known. >>>> Chris Hegarty, could you file a bug for us? >>>> >>>> Martin >>>> >>>> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.st...@gmail.com> >>>> wrote: >>>>> >>>>> Greetings, >>>>> >>>>> I've noticed bugs in java.util.ArrayList, java.util.Hashtable and >>>>> java.io.ByteArrayOutputStream which arise when the capacities of the >>>>> data >>>>> structures reach a particular threshold. More below. >>>>> >>>>> When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE its >>>>> size >>>>> reaches its capacity and an add or an insert operation is invoked, the >>>>> capacity is increased by only one element. Notice that in the >>>>> following >>>>> excerpt from ArrayList.ensureCapacity the new capacity is set to (3/2) >>>>> * >>>>> oldCapacity + 1 unless this value would not suffice to accommodate the >>>>> required capacity in which case it is set to the required capacity. If >>>>> the >>>>> current capacity is at least (2/3)*Integer.MAX_VALUE, then (oldCapacity >>>>> * >>>>> 3)/2 + 1 overflows and resolves to a negative number resulting in the >>>>> new >>>>> capacity being set to the required capacity. The major consequence of >>>>> this >>>>> is that each subsequent add/insert operation results in a full resize >>>>> of >>>>> the >>>>> ArrayList causing performance to degrade significantly. >>>>> >>>>> int newCapacity = (oldCapacity * 3)/2 + 1; >>>>> if (newCapacity < minCapacity) >>>>> newCapacity = minCapacity; >>>>> >>>>> Hashtable breaks entirely when the size of its backing array reaches >>>>> (1/2) * >>>>> Integer.MAX_VALUE and a rehash is necessary as is evident from the >>>>> following >>>>> excerpt from rehash. Notice that rehash will attempt to create an >>>>> array >>>>> of >>>>> negative size if the size of the backing array reaches (1/2) * >>>>> Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to a >>>>> negative number. >>>>> >>>>> int newCapacity = oldCapacity * 2 + 1; >>>>> HashtableEntry newTable[] = new HashtableEntry[newCapacity]; >>>>> >>>>> When the capacity of the backing array in a ByteArrayOutputStream >>>>> reaches >>>>> (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write >>>>> operation is invoked, the capacity of the backing array is increased >>>>> only by >>>>> the required number of elements. Notice that in the following excerpt >>>>> from >>>>> ByteArrayOutputStream.write(int) the new backing array capacity is set >>>>> to 2 >>>>> * buf.length unless this value would not suffice to accommodate the >>>>> required >>>>> capacity in which case it is set to the required capacity. If the >>>>> current >>>>> backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1, then >>>>> buf.length << 1 overflows and resolves to a negative number resulting >>>>> in >>>>> the >>>>> new capacity being set to the required capacity. The major consequence >>>>> of >>>>> this, like with ArrayList, is that each subsequent write operation >>>>> results >>>>> in a full resize of the ByteArrayOutputStream causing performance to >>>>> degrade >>>>> significantly. >>>>> >>>>> int newcount = count + 1; >>>>> if (newcount > buf.length) { >>>>> buf = Arrays.copyOf(buf, Math.max(buf.length << 1, >>>>> newcount)); >>>>> } >>>>> >>>>> It is interesting to note that any statements about the amortized time >>>>> complexity of add/insert operations, such as the one in the ArrayList >>>>> javadoc, are invalidated by the performance related bugs. One solution >>>>> to >>>>> the above situations is to set the new capacity of the backing array to >>>>> Integer.MAX_VALUE when the initial size calculation results in a >>>>> negative >>>>> number during a resize. >>>>> >>>>> Apologies if these bugs are already known. >>>>> >>>>> Regards, >>>>> >>>>> Kevin >>>>> >>> >