Hi Martin,

Thank you for your reply. If I may, PriorityQueue appears to employ the simple strategy that I suggested above in its grow method:
    int newCapacity = ((oldCapacity < 64) ?
                       ((oldCapacity + 1) * 2) :
                       ((oldCapacity / 2) * 3));
    if (newCapacity < 0) // overflow
        newCapacity = Integer.MAX_VALUE;

It might be desirable to set a common strategy for capacity increase for all collections.

Regards,

Kevin

On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <marti...@google.com> wrote:
> Hi Kevin,
>
> As you've noticed, creating objects within a factor of two of
> their natural limits is a good way to expose lurking bugs.
>
> I'm the one responsible for the algorithm in ArrayList.
> I'm a bit embarrassed, looking at that code today.
> We could set the array size to Integer.MAX_VALUE,
> but then you might hit an independent buglet in hotspot:
> you cannot allocate an array with Integer.MAX_VALUE
> elements, but Integer.MAX_VALUE - 5 (or so) works.
>
> It occurs to me that increasing the size by 50% is better done by
>
>     int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
>
> I agree with the plan of setting the capacity to something near
> MAX_VALUE on overflow, and throwing OutOfMemoryError on the next resize.
>
> These bugs were not previously known.
> Chris Hegarty, could you file a bug for us?
>
> Martin
>
> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.st...@gmail.com> wrote:
> > Greetings,
> >
> > I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
> > java.io.ByteArrayOutputStream which arise when the capacities of these
> > data structures reach particular thresholds. More below.
> >
> > When the capacity of an ArrayList exceeds Integer.MAX_VALUE / 3, its
> > size reaches its capacity, and an add or insert operation is invoked,
> > the capacity is increased by only one element. Notice that in the
> > following excerpt from ArrayList.ensureCapacity the new capacity is set
> > to (3/2) * oldCapacity + 1 unless this value would not suffice to
> > accommodate the required capacity, in which case it is set to the
> > required capacity.
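The difference between the existing ArrayList formula and the shift-based one Martin suggests above can be checked with a few lines of plain int arithmetic. A minimal standalone sketch (the class name and sample value are mine, not from the thread): the multiply-first form wraps negative as soon as the intermediate product oldCapacity * 3 exceeds Integer.MAX_VALUE, while the shift-based form stays positive until the result itself would overflow.

```java
public class GrowthOverflowDemo {
    public static void main(String[] args) {
        // Smallest capacity at which oldCapacity * 3 overflows a 32-bit int.
        int oldCapacity = Integer.MAX_VALUE / 3 + 1; // 715827883

        // Existing ArrayList formula: the intermediate product wraps negative.
        int multiplyFirst = (oldCapacity * 3) / 2 + 1;

        // Martin's suggestion: no intermediate product, so no early overflow.
        int shiftBased = oldCapacity + (oldCapacity >> 1) + 1;

        System.out.println(multiplyFirst); // prints -1073741822
        System.out.println(shiftBased);    // prints 1073741825
    }
}
```

The shift form computes the same "grow by 50%" result but avoids the tripled intermediate value, pushing the overflow point from roughly MAX/3 out to roughly (2/3) * MAX.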
> > Once the current capacity exceeds Integer.MAX_VALUE / 3, the
> > intermediate product oldCapacity * 3 overflows and the computed value
> > wraps around to a negative (or far too small) number, resulting in the
> > new capacity being set to the required capacity. The major consequence
> > of this is that each subsequent add/insert operation results in a full
> > resize of the ArrayList, causing performance to degrade significantly.
> >
> >     int newCapacity = (oldCapacity * 3)/2 + 1;
> >     if (newCapacity < minCapacity)
> >         newCapacity = minCapacity;
> >
> > Hashtable breaks entirely when the size of its backing array exceeds
> > Integer.MAX_VALUE / 2 and a rehash is necessary, as is evident from the
> > following excerpt from rehash. Notice that rehash will attempt to
> > create an array of negative size once the backing array reaches 1 << 30
> > elements, since oldCapacity * 2 + 1 then overflows and resolves to a
> > negative number.
> >
> >     int newCapacity = oldCapacity * 2 + 1;
> >     HashtableEntry newTable[] = new HashtableEntry[newCapacity];
> >
> > When the capacity of the backing array in a ByteArrayOutputStream
> > exceeds Integer.MAX_VALUE / 2, its size reaches its capacity, and a
> > write operation is invoked, the capacity of the backing array is
> > increased only by the required number of elements. Notice that in the
> > following excerpt from ByteArrayOutputStream.write(int) the new backing
> > array capacity is set to 2 * buf.length unless this value would not
> > suffice to accommodate the required capacity, in which case it is set
> > to the required capacity. If the current backing array capacity is at
> > least 1 << 30 (just over half of Integer.MAX_VALUE), then
> > buf.length << 1 overflows and resolves to a negative number, resulting
> > in the new capacity being set to the required capacity.
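Both doubling-based overflows described above come down to the same arithmetic, and can be reproduced directly. A small sketch (class name and values are mine): at a capacity of 1 << 30, Hashtable's oldCapacity * 2 + 1 wraps negative (hence the negative array size in rehash), and ByteArrayOutputStream's buf.length << 1 wraps negative, so Math.max always falls back to the required count.

```java
public class DoublingOverflowDemo {
    public static void main(String[] args) {
        int oldCapacity = 1 << 30; // 1073741824, just past Integer.MAX_VALUE / 2

        // Hashtable.rehash: new HashtableEntry[oldCapacity * 2 + 1] would
        // throw NegativeArraySizeException, since the computed size is negative.
        System.out.println(oldCapacity * 2 + 1); // prints -2147483647

        // ByteArrayOutputStream.write: buf.length << 1 wraps negative, so
        // Math.max(buf.length << 1, newcount) always selects newcount,
        // growing the buffer by only the required amount each time.
        int newcount = oldCapacity + 1;
        System.out.println(Math.max(oldCapacity << 1, newcount)); // prints 1073741825
    }
}
```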
> > The major consequence of this, as with ArrayList, is that each
> > subsequent write operation results in a full resize of the
> > ByteArrayOutputStream, causing performance to degrade significantly.
> >
> >     int newcount = count + 1;
> >     if (newcount > buf.length) {
> >         buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
> >     }
> >
> > It is interesting to note that any statements about the amortized time
> > complexity of add/insert operations, such as the one in the ArrayList
> > javadoc, are invalidated by these performance-related bugs. One
> > solution to the above situations is to set the new capacity of the
> > backing array to Integer.MAX_VALUE when the initial size calculation
> > results in a negative number during a resize.
> >
> > Apologies if these bugs are already known.
> >
> > Regards,
> >
> > Kevin
> >
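Kevin's proposed solution — saturate at Integer.MAX_VALUE when the size calculation goes negative — can be sketched as a small growth helper. This is an illustrative sketch only (the class and method names are mine, and it uses Martin's shift-based formula; it is not the actual JDK patch):

```java
public class SaturatingGrowth {
    // Hypothetical helper illustrating the proposed fix: grow by ~50%, and
    // if the computed size overflows to a negative number, saturate at
    // Integer.MAX_VALUE instead of falling back to minCapacity.
    static int grownCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
        if (newCapacity < 0)           // overflow
            newCapacity = Integer.MAX_VALUE;
        if (newCapacity < minCapacity) // still not enough room
            newCapacity = minCapacity;
        return newCapacity;
    }

    public static void main(String[] args) {
        // Normal regime: ~50% growth.
        System.out.println(grownCapacity(10, 11));                   // prints 16
        // Near the limit: saturates instead of degrading to +1 growth.
        System.out.println(grownCapacity(1500000000, 1500000001));   // prints 2147483647
    }
}
```

With this shape, the next resize attempt past Integer.MAX_VALUE fails with an allocation error rather than silently degrading to one-element growth, which is the behavior Martin agrees to above.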