Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-04-02 Thread Laurent Bourgès
 ---

  604 Arrays.fill(elementData, newSize, size, null);

 In performance-critical code I would avoid Arrays.fill because it adds a
 bit of overhead (unless it's intrinsified, which I don't think it is).


Last week, I sent few benchmarks I did on array cleaning (zero fill)
comparing Arrays.fill, System.arraycopy, Unsafe.setMemory ...
Arrays.fill is the winner (always faster than arraycopy which use native
code) by 10 - 20% !
I suspect aggressive hotspot optimizations (native code ?) because I agree
Arrays.fill looks like a stupid for-loop !

Does somebody have clues explaining the Arrays.fill performance ?

Is there an alternative way to clear an array giving the best possible
performance (like c memset) ?
it could be useful to have in JDK a native array fill implementation
(System.fill ?) if it gives better performance.

Laurent


Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-04-02 Thread Patrick Wright
On Tue, Apr 2, 2013 at 9:27 AM, Laurent Bourgès
bourges.laur...@gmail.comwrote:

  ---
 
   604 Arrays.fill(elementData, newSize, size, null);
 
  In performance-critical code I would avoid Arrays.fill because it adds a
  bit of overhead (unless it's intrinsified, which I don't think it is).
 

 Last week, I sent few benchmarks I did on array cleaning (zero fill)
 comparing Arrays.fill, System.arraycopy, Unsafe.setMemory ...
 Arrays.fill is the winner (always faster than arraycopy which use native
 code) by 10 - 20% !
 I suspect aggressive hotspot optimizations (native code ?) because I agree
 Arrays.fill looks like a stupid for-loop !

 Does somebody have clues explaining the Arrays.fill performance ?


There was at least one round of optimization done by the HotSpot team in
mid-2010 -
This adds new logic to recognize fill idioms and convert them into a
call to an optimized fill routine.  Loop predication creates easily
matched loops that are simply replaced with calls to the new assembly
stubs.  Currently only 1,2 and 4 byte primitive types are supported.
Objects and longs/double will be supported in a later putback.  Tested
with runthese, nsk and ctw plus jbb2005. 

see
http://openjdk.5641.n7.nabble.com/review-M-for-4809552-Optimize-Arrays-fill-td10322.html

Looks like the change was part of 6u23
http://download.java.net/jdk6/6u23/promoted/b03/changes/JDK6u23.b03.list.html

Could not find anything more recent than that (on a quick mail search)

Cheers
Patrick


Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-04-02 Thread Martin Buchholz
Thanks for the research.
It seems like hotspot is recognizing and optimizing fill loops, rather than
intrinsifying calls to Arrays.fill itself (good!).

Anyways, I'd still like the simple fill loops in ArrayList to stay
unchanged.  Using Arrays.fill is only slightly more readable.  There would
be more of a case for fill if it was an actual array class instance method.



On Tue, Apr 2, 2013 at 2:12 AM, Patrick Wright pdoubl...@gmail.com wrote:

 On Tue, Apr 2, 2013 at 9:27 AM, Laurent Bourgès
 bourges.laur...@gmail.comwrote:

   ---
  
604 Arrays.fill(elementData, newSize, size, null);
  
   In performance-critical code I would avoid Arrays.fill because it adds
 a
   bit of overhead (unless it's intrinsified, which I don't think it is).
  
 
  Last week, I sent few benchmarks I did on array cleaning (zero fill)
  comparing Arrays.fill, System.arraycopy, Unsafe.setMemory ...
  Arrays.fill is the winner (always faster than arraycopy which use native
  code) by 10 - 20% !
  I suspect aggressive hotspot optimizations (native code ?) because I
 agree
  Arrays.fill looks like a stupid for-loop !
 
  Does somebody have clues explaining the Arrays.fill performance ?
 

 There was at least one round of optimization done by the HotSpot team in
 mid-2010 -
 This adds new logic to recognize fill idioms and convert them into a
 call to an optimized fill routine.  Loop predication creates easily
 matched loops that are simply replaced with calls to the new assembly
 stubs.  Currently only 1,2 and 4 byte primitive types are supported.
 Objects and longs/double will be supported in a later putback.  Tested
 with runthese, nsk and ctw plus jbb2005. 

 see

 http://openjdk.5641.n7.nabble.com/review-M-for-4809552-Optimize-Arrays-fill-td10322.html

 Looks like the change was part of 6u23

 http://download.java.net/jdk6/6u23/promoted/b03/changes/JDK6u23.b03.list.html

 Could not find anything more recent than that (on a quick mail search)

 Cheers
 Patrick



Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-04-02 Thread Mike Duigou

On Apr 2 2013, at 10:55 , Martin Buchholz wrote:

 Thanks for the research.
 It seems like hotspot is recognizing and optimizing fill loops, rather than
 intrinsifying calls to Arrays.fill itself (good!).

Why wouldn't doing both be better?

 Anyways, I'd still like the simple fill loops in ArrayList to stay
 unchanged.  Using Arrays.fill is only slightly more readable.

Part of the goal of the change was to make the intent clearer. I'll improve the 
comments instead.

   There would
 be more of a case for fill if it was an actual array class instance method.

Maybe for Java 9

Mike

 
 
 
 On Tue, Apr 2, 2013 at 2:12 AM, Patrick Wright pdoubl...@gmail.com wrote:
 
 On Tue, Apr 2, 2013 at 9:27 AM, Laurent Bourgès
 bourges.laur...@gmail.comwrote:
 
 ---
 
 604 Arrays.fill(elementData, newSize, size, null);
 
 In performance-critical code I would avoid Arrays.fill because it adds
 a
 bit of overhead (unless it's intrinsified, which I don't think it is).
 
 
 Last week, I sent few benchmarks I did on array cleaning (zero fill)
 comparing Arrays.fill, System.arraycopy, Unsafe.setMemory ...
 Arrays.fill is the winner (always faster than arraycopy which use native
 code) by 10 - 20% !
 I suspect aggressive hotspot optimizations (native code ?) because I
 agree
 Arrays.fill looks like a stupid for-loop !
 
 Does somebody have clues explaining the Arrays.fill performance ?
 
 
 There was at least one round of optimization done by the HotSpot team in
 mid-2010 -
 This adds new logic to recognize fill idioms and convert them into a
 call to an optimized fill routine.  Loop predication creates easily
 matched loops that are simply replaced with calls to the new assembly
 stubs.  Currently only 1,2 and 4 byte primitive types are supported.
 Objects and longs/double will be supported in a later putback.  Tested
 with runthese, nsk and ctw plus jbb2005. 
 
 see
 
 http://openjdk.5641.n7.nabble.com/review-M-for-4809552-Optimize-Arrays-fill-td10322.html
 
 Looks like the change was part of 6u23
 
 http://download.java.net/jdk6/6u23/promoted/b03/changes/JDK6u23.b03.list.html
 
 Could not find anything more recent than that (on a quick mail search)
 
 Cheers
 Patrick
 



Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-04-02 Thread Martin Buchholz
On Tue, Apr 2, 2013 at 11:24 AM, Mike Duigou mike.dui...@oracle.com wrote:


 On Apr 2 2013, at 10:55 , Martin Buchholz wrote:

  Thanks for the research.
  It seems like hotspot is recognizing and optimizing fill loops, rather
 than
  intrinsifying calls to Arrays.fill itself (good!).

 Why wouldn't doing both be better?


If hotspot recognizes and optimizes fill loops, Arrays.fill is optimized
for free.


  Anyways, I'd still like the simple fill loops in ArrayList to stay
  unchanged.  Using Arrays.fill is only slightly more readable.

 Part of the goal of the change was to make the intent clearer. I'll
 improve the comments instead.


ArrayList is one of those classes that are important for educational
reasons.  Studying it will be part of many peoples' university education.
 So I applaud efforts to improve clarity.  But I think the fill loops are
sufficiently clear as they stand.


Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-04-02 Thread Mike Duigou

On Apr 1 2013, at 22:24 , Martin Buchholz wrote:

(Other changes accepted as suggested)

 If we are willing to make small incompatible changes, how about when 
 serializing, storing capacity as the size, so that reconstituted ArrayLists 
 are pre-trimmed to size?

Yes, I found it strange that clone() returns a trimmed copy and serialization 
preserved capacity. Recording size as capacity would be a difference but not an 
incompatibility (objects serialized by both old and new implementations could 
be deserialized correctly by old and new).

 ---
 
 I still believe that with the help of the gc team, one can cook up a 
 trim-to-size when gc-ing fix, but that's going to be very hard to implement.

I'm saving my VM RFEs for split arrays, huge arrays, sparse arrays and array 
pinning. :-)

Mike

 
 ---
 
 
 
 
 On Mon, Apr 1, 2013 at 7:00 PM, Mike Duigou mike.dui...@oracle.com wrote:
 Hello all;
 
 I have posted an updated version of the empty ArrayList and HashMap patch.
 
 http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/
 
 This revised implementation introduces *no new fields* to either class. For 
 ArrayList the lazy allocation of the backing array occurs only if the list is 
 created at default size. According to our performance analysis team, 
 approximately 85% of ArrayList instances are created at default size so this 
 optimization will be valid for an overwhelming majority of cases.
 
 For HashMap, creative use is made of the threshold field to track the 
 requested initial size until the bucket array is needed. On the read side the 
 empty map case is tested with isEmpty(). On the write size a comparison of 
 (table == EMPTY_TABLE) is used to detect the need to inflate the bucket 
 array. In readObject there's a little more work to try to choose an efficient 
 initial capacity.
 
 Mike
 
 On Mar 26 2013, at 17:25 , Mike Duigou wrote:
 
  Hello all;
 
  This is a review for optimization work that came out of internal analysis 
  of Oracle's Java applications. It's based upon analysis that shows that in 
  large applications as much as 10% of maps and lists are initialized but 
  never receive any entries. A smaller number spend a large proportion of 
  their lifetime empty. We've found similar results across other workloads as 
  well. This patch is not a substitute for pre-sizing your collections and 
  maps--doing so will *always* have better results.
 
  This patch extends HashMap and ArrayList to provide special handling for 
  newly created instances that avoids creating the backing array until 
  needed. There is a very small additional cost for detecting when to inflate 
  the map or list that is measurable in interpreted tests but disappears in 
  JITed code.
 
  http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/
 
  We expect that should this code prove successful in Java 8 it will be 
  backported to Java 7 updates.
 
  The unit test may appear to be somewhat unrelated. It was created after 
  resolving a bug in an early version of this patch to detect the issue 
  encountered (LinkedHashMap.init() was not being called in readObject() when 
  the map was empty).
 
  Mike
 
 



Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-04-02 Thread Mike Duigou
Thanks for the pointer. I had remembered reading this changeset and it has 
motivated to use Arrays.fill but I could not have found it.

Mike

On Apr 2 2013, at 02:12 , Patrick Wright wrote:

 On Tue, Apr 2, 2013 at 9:27 AM, Laurent Bourgès
 bourges.laur...@gmail.comwrote:
 
 ---
 
 604 Arrays.fill(elementData, newSize, size, null);
 
 In performance-critical code I would avoid Arrays.fill because it adds a
 bit of overhead (unless it's intrinsified, which I don't think it is).
 
 
 Last week, I sent few benchmarks I did on array cleaning (zero fill)
 comparing Arrays.fill, System.arraycopy, Unsafe.setMemory ...
 Arrays.fill is the winner (always faster than arraycopy which use native
 code) by 10 - 20% !
 I suspect aggressive hotspot optimizations (native code ?) because I agree
 Arrays.fill looks like a stupid for-loop !
 
 Does somebody have clues explaining the Arrays.fill performance ?
 
 
 There was at least one round of optimization done by the HotSpot team in
 mid-2010 -
 This adds new logic to recognize fill idioms and convert them into a
 call to an optimized fill routine.  Loop predication creates easily
 matched loops that are simply replaced with calls to the new assembly
 stubs.  Currently only 1,2 and 4 byte primitive types are supported.
 Objects and longs/double will be supported in a later putback.  Tested
 with runthese, nsk and ctw plus jbb2005. 
 
 see
 http://openjdk.5641.n7.nabble.com/review-M-for-4809552-Optimize-Arrays-fill-td10322.html
 
 Looks like the change was part of 6u23
 http://download.java.net/jdk6/6u23/promoted/b03/changes/JDK6u23.b03.list.html
 
 Could not find anything more recent than that (on a quick mail search)
 
 Cheers
 Patrick



Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-04-01 Thread Mike Duigou
Hello all;

I have posted an updated version of the empty ArrayList and HashMap patch.

http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/

This revised implementation introduces *no new fields* to either class. For 
ArrayList the lazy allocation of the backing array occurs only if the list is 
created at default size. According to our performance analysis team, 
approximately 85% of ArrayList instances are created at default size so this 
optimization will be valid for an overwhelming majority of cases. 

For HashMap, creative use is made of the threshold field to track the requested 
initial size until the bucket array is needed. On the read side the empty map 
case is tested with isEmpty(). On the write size a comparison of (table == 
EMPTY_TABLE) is used to detect the need to inflate the bucket array. In 
readObject there's a little more work to try to choose an efficient initial 
capacity.

Mike

On Mar 26 2013, at 17:25 , Mike Duigou wrote:

 Hello all;
 
 This is a review for optimization work that came out of internal analysis of 
 Oracle's Java applications. It's based upon analysis that shows that in large 
 applications as much as 10% of maps and lists are initialized but never 
 receive any entries. A smaller number spend a large proportion of their 
 lifetime empty. We've found similar results across other workloads as well. 
 This patch is not a substitute for pre-sizing your collections and 
 maps--doing so will *always* have better results.
 
 This patch extends HashMap and ArrayList to provide special handling for 
 newly created instances that avoids creating the backing array until needed. 
 There is a very small additional cost for detecting when to inflate the map 
 or list that is measurable in interpreted tests but disappears in JITed code. 
 
 http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/
 
 We expect that should this code prove successful in Java 8 it will be 
 backported to Java 7 updates.
 
 The unit test may appear to be somewhat unrelated. It was created after 
 resolving a bug in an early version of this patch to detect the issue 
 encountered (LinkedHashMap.init() was not being called in readObject() when 
 the map was empty).
 
 Mike



Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-04-01 Thread Martin Buchholz
Overall, this kind of change seems barely worth doing.

---

This change:

 // Let gc do its work
-for (int i = 0; i  size; i++)
-elementData[i] = null;
+Arrays.fill(elementData, null);

changes the performance of clear from O(size) to O(capacity), which is bad,
and unrelated to the empty collection optimization.

---

+if(elementData == EMPTY_ELEMENTDATA) {

Please use standard jdk style - space after keywords such as if.

---

 183 public void ensureCapacity(int minCapacity) {
 184 if (minCapacity  0)
 185 ensureExplicitCapacity(minCapacity);
 186 }
 187
 188 private void ensureCapacityInternal(int minCapacity) {
 189 if(elementData == EMPTY_ELEMENTDATA) {
 190 minCapacity = Math.max(10, minCapacity);
 191 }
 192
 193 ensureExplicitCapacity(minCapacity);
 194 }

It looks to me like, if the user does

x = new ArrayList();
x.ensureCapacity(2);

then the capacity will be 2, not 10, an observable change from the previous
implementation, and sort-of contradicting the spec of ArrayList()

---

 604 Arrays.fill(elementData, newSize, size, null);

In performance-critical code I would avoid Arrays.fill because it adds a
bit of overhead (unless it's intrinsified, which I don't think it is).

---

If we are willing to make small incompatible changes, how about when
serializing, storing capacity as the size, so that reconstituted ArrayLists
are pre-trimmed to size?

---

I still believe that with the help of the gc team, one can cook up a
trim-to-size when gc-ing fix, but that's going to be very hard to implement.

---




On Mon, Apr 1, 2013 at 7:00 PM, Mike Duigou mike.dui...@oracle.com wrote:

 Hello all;

 I have posted an updated version of the empty ArrayList and HashMap patch.

 http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/

 This revised implementation introduces *no new fields* to either class.
 For ArrayList the lazy allocation of the backing array occurs only if the
 list is created at default size. According to our performance analysis
 team, approximately 85% of ArrayList instances are created at default size
 so this optimization will be valid for an overwhelming majority of cases.

 For HashMap, creative use is made of the threshold field to track the
 requested initial size until the bucket array is needed. On the read side
 the empty map case is tested with isEmpty(). On the write size a comparison
 of (table == EMPTY_TABLE) is used to detect the need to inflate the bucket
 array. In readObject there's a little more work to try to choose an
 efficient initial capacity.

 Mike

 On Mar 26 2013, at 17:25 , Mike Duigou wrote:

  Hello all;
 
  This is a review for optimization work that came out of internal
 analysis of Oracle's Java applications. It's based upon analysis that shows
 that in large applications as much as 10% of maps and lists are initialized
 but never receive any entries. A smaller number spend a large proportion of
 their lifetime empty. We've found similar results across other workloads as
 well. This patch is not a substitute for pre-sizing your collections and
 maps--doing so will *always* have better results.
 
  This patch extends HashMap and ArrayList to provide special handling for
 newly created instances that avoids creating the backing array until
 needed. There is a very small additional cost for detecting when to inflate
 the map or list that is measurable in interpreted tests but disappears in
 JITed code.
 
  http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/
 
  We expect that should this code prove successful in Java 8 it will be
 backported to Java 7 updates.
 
  The unit test may appear to be somewhat unrelated. It was created after
 resolving a bug in an early version of this patch to detect the issue
 encountered (LinkedHashMap.init() was not being called in readObject() when
 the map was empty).
 
  Mike




Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-03-29 Thread Peter Levart


On 03/28/2013 06:38 PM, Mike Duigou wrote:

We have heard back from the performance folks that 85% of empty lists are 
created at default size. The proposed patch is going to be revised to do the 
inflation trick only for default sized maps which will eliminate the need for a 
new field. Something creative involving the use of the existing size field can 
certainly be considered for a future optimization.


This is actually a better approach, since it leaves a gate to non-lazy 
allocation. As with all lazy things, delaying failure can have 
undesirable effects. Catching OOME at HashMap construction vs. 1st put...


Regards, Peter



On Mar 28 2013, at 04:59 , Doug Lea wrote:


On 03/27/13 12:17, Martin Buchholz wrote:

But you can support any requested initial size if stored in the size field
when list is empty.

In other words: Mike, please do not add a field if your goal is
to save space!

Part of the analysis was that with current object layouts the added fields did 
not change the memory footprint of either class. This, of course only applies 
to current versions of Hotspot and it's object layout.


Also, I hope you or the performance testing folks
have extensive and accurate enough tests to show that the change
not only helps the empty case but does not hurt the vastly more
common non-empty case.

They do. I believe that the object layout hides any increase in size.


Considering that this is the most
commonly used java.util class, there should be empirical
evidence that this is a net improvement.

I will try to find how much of the analysis I can share.


My guess is that it can be done (we did something similar for
ConcurrentHashMap) but it takes  a lot of care.

-Doug



On Wed, Mar 27, 2013 at 8:02 AM, Mike Duigou mike.dui...@oracle.com wrote:


This seems like a good idea. I will follow up with the performance people
to see if their findings include the requested initial size.

Mike

On Mar 26 2013, at 22:53 , Brian Goetz wrote:


What percentage of the empty lists are default-sized?  I suspect it is

large, in which case we could apply this trick only for the default-sized
lists, and eliminate the extra field.

On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote:


Hello all;

This is a review for optimization work that came out of internal

analysis of Oracle's Java applications. It's based upon analysis that shows
that in large applications as much as 10% of maps and lists are initialized
but never receive any entries. A smaller number spend a large proportion of
their lifetime empty. We've found similar results across other workloads as
well. This patch is not a substitute for pre-sizing your collections and
maps--doing so will *always* have better results.

This patch extends HashMap and ArrayList to provide special handling

for newly created instances that avoids creating the backing array until
needed. There is a very small additional cost for detecting when to inflate
the map or list that is measurable in interpreted tests but disappears in
JITed code.

http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/

We expect that should this code prove successful in Java 8 it will be

backported to Java 7 updates.

The unit test may appear to be somewhat unrelated. It was created after

resolving a bug in an early version of this patch to detect the issue
encountered (LinkedHashMap.init() was not being called in readObject() when
the map was empty).

Mike




Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-03-28 Thread Peter Levart

Hi Mike,

Have you considered the possibility to re-constitute the initial 
capacity from threshold and loadFactor?


Regards, Peter

On 03/27/2013 05:28 PM, Mike Duigou wrote:

I started looking at crafty reuse of size but found too many direct references 
to size to attempt getting this right in the current iteration. Reusing size is 
definitely still available to someone who wants to dive in and prepare an 
implementation.

Mike

On Mar 27 2013, at 09:17 , Martin Buchholz wrote:


But you can support any requested initial size if stored in the size field when 
list is empty.


On Wed, Mar 27, 2013 at 8:02 AM, Mike Duigou mike.dui...@oracle.com wrote:
This seems like a good idea. I will follow up with the performance people to 
see if their findings include the requested initial size.

Mike

On Mar 26 2013, at 22:53 , Brian Goetz wrote:


What percentage of the empty lists are default-sized?  I suspect it is large, 
in which case we could apply this trick only for the default-sized lists, and 
eliminate the extra field.

On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote:


Hello all;

This is a review for optimization work that came out of internal analysis of 
Oracle's Java applications. It's based upon analysis that shows that in large 
applications as much as 10% of maps and lists are initialized but never receive 
any entries. A smaller number spend a large proportion of their lifetime empty. 
We've found similar results across other workloads as well. This patch is not a 
substitute for pre-sizing your collections and maps--doing so will *always* 
have better results.

This patch extends HashMap and ArrayList to provide special handling for newly 
created instances that avoids creating the backing array until needed. There is 
a very small additional cost for detecting when to inflate the map or list that 
is measurable in interpreted tests but disappears in JITed code.

http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/

We expect that should this code prove successful in Java 8 it will be 
backported to Java 7 updates.

The unit test may appear to be somewhat unrelated. It was created after 
resolving a bug in an early version of this patch to detect the issue 
encountered (LinkedHashMap.init() was not being called in readObject() when the 
map was empty).

Mike






Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-03-28 Thread Mike Duigou
We have heard back from the performance folks that 85% of empty lists are 
created at default size. The proposed patch is going to be revised to do the 
inflation trick only for default sized maps which will eliminate the need for a 
new field. Something creative involving the use of the existing size field can 
certainly be considered for a future optimization.


On Mar 28 2013, at 04:59 , Doug Lea wrote:

 On 03/27/13 12:17, Martin Buchholz wrote:
 But you can support any requested initial size if stored in the size field
 when list is empty.
 
 In other words: Mike, please do not add a field if your goal is
 to save space!

Part of the analysis was that with current object layouts the added fields did 
not change the memory footprint of either class. This, of course only applies 
to current versions of Hotspot and it's object layout.

 Also, I hope you or the performance testing folks
 have extensive and accurate enough tests to show that the change
 not only helps the empty case but does not hurt the vastly more
 common non-empty case.

They do. I believe that the object layout hides any increase in size.

 Considering that this is the most
 commonly used java.util class, there should be empirical
 evidence that this is a net improvement.

I will try to find how much of the analysis I can share.

 My guess is that it can be done (we did something similar for
 ConcurrentHashMap) but it takes  a lot of care.
 
 -Doug
 
 
 
 
 On Wed, Mar 27, 2013 at 8:02 AM, Mike Duigou mike.dui...@oracle.com wrote:
 
 This seems like a good idea. I will follow up with the performance people
 to see if their findings include the requested initial size.
 
 Mike
 
 On Mar 26 2013, at 22:53 , Brian Goetz wrote:
 
 What percentage of the empty lists are default-sized?  I suspect it is
 large, in which case we could apply this trick only for the default-sized
 lists, and eliminate the extra field.
 
 On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote:
 
 Hello all;
 
 This is a review for optimization work that came out of internal
 analysis of Oracle's Java applications. It's based upon analysis that shows
 that in large applications as much as 10% of maps and lists are initialized
 but never receive any entries. A smaller number spend a large proportion of
 their lifetime empty. We've found similar results across other workloads as
 well. This patch is not a substitute for pre-sizing your collections and
 maps--doing so will *always* have better results.
 
 This patch extends HashMap and ArrayList to provide special handling
 for newly created instances that avoids creating the backing array until
 needed. There is a very small additional cost for detecting when to inflate
 the map or list that is measurable in interpreted tests but disappears in
 JITed code.
 
 http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/
 
 We expect that should this code prove successful in Java 8 it will be
 backported to Java 7 updates.
 
 The unit test may appear to be somewhat unrelated. It was created after
 resolving a bug in an early version of this patch to detect the issue
 encountered (LinkedHashMap.init() was not being called in readObject() when
 the map was empty).
 
 Mike
 
 
 
 
 



Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-03-27 Thread Remi Forax

On 03/27/2013 06:53 AM, Brian Goetz wrote:

What percentage of the empty lists are default-sized?  I suspect it is large, 
in which case we could apply this trick only for the default-sized lists, and 
eliminate the extra field.


yes, very good idea indeed.

Rémi

  


On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote:


Hello all;

This is a review for optimization work that came out of internal analysis of 
Oracle's Java applications. It's based upon analysis that shows that in large 
applications as much as 10% of maps and lists are initialized but never receive 
any entries. A smaller number spend a large proportion of their lifetime empty. 
We've found similar results across other workloads as well. This patch is not a 
substitute for pre-sizing your collections and maps--doing so will *always* 
have better results.

This patch extends HashMap and ArrayList to provide special handling for newly 
created instances that avoids creating the backing array until needed. There is 
a very small additional cost for detecting when to inflate the map or list that 
is measurable in interpreted tests but disappears in JITed code.

http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/

We expect that should this code prove successful in Java 8 it will be 
backported to Java 7 updates.

The unit test may appear to be somewhat unrelated. It was created after 
resolving a bug in an early version of this patch to detect the issue 
encountered (LinkedHashMap.init() was not being called in readObject() when the 
map was empty).

Mike




Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-03-27 Thread Mike Duigou
This seems like a good idea. I will follow up with the performance people to 
see if their findings include the requested initial size.

Mike

On Mar 26 2013, at 22:53 , Brian Goetz wrote:

 What percentage of the empty lists are default-sized?  I suspect it is large, 
 in which case we could apply this trick only for the default-sized lists, and 
 eliminate the extra field.  
 
 On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote:
 
 Hello all;
 
 This is a review for optimization work that came out of internal analysis of 
 Oracle's Java applications. It's based upon analysis that shows that in 
 large applications as much as 10% of maps and lists are initialized but 
 never receive any entries. A smaller number spend a large proportion of 
 their lifetime empty. We've found similar results across other workloads as 
 well. This patch is not a substitute for pre-sizing your collections and 
 maps--doing so will *always* have better results.
 
 This patch extends HashMap and ArrayList to provide special handling for 
 newly created instances that avoids creating the backing array until needed. 
 There is a very small additional cost for detecting when to inflate the map 
 or list that is measurable in interpreted tests but disappears in JITed 
 code. 
 
 http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/
 
 We expect that should this code prove successful in Java 8 it will be 
 backported to Java 7 updates.
 
 The unit test may appear to be somewhat unrelated. It was created after 
 resolving a bug in an early version of this patch to detect the issue 
 encountered (LinkedHashMap.init() was not being called in readObject() when 
 the map was empty).
 
 Mike
 



Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-03-27 Thread Martin Buchholz
But you can support any requested initial size if stored in the size field
when list is empty.


On Wed, Mar 27, 2013 at 8:02 AM, Mike Duigou mike.dui...@oracle.com wrote:

 This seems like a good idea. I will follow up with the performance people
 to see if their findings include the requested initial size.

 Mike

 On Mar 26 2013, at 22:53 , Brian Goetz wrote:

  What percentage of the empty lists are default-sized?  I suspect it is
 large, in which case we could apply this trick only for the default-sized
 lists, and eliminate the extra field.
 
  On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote:
 
  Hello all;
 
  This is a review for optimization work that came out of internal
 analysis of Oracle's Java applications. It's based upon analysis that shows
 that in large applications as much as 10% of maps and lists are initialized
 but never receive any entries. A smaller number spend a large proportion of
 their lifetime empty. We've found similar results across other workloads as
 well. This patch is not a substitute for pre-sizing your collections and
 maps--doing so will *always* have better results.
 
  This patch extends HashMap and ArrayList to provide special handling
 for newly created instances that avoids creating the backing array until
 needed. There is a very small additional cost for detecting when to inflate
 the map or list that is measurable in interpreted tests but disappears in
 JITed code.
 
  http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/
 
  We expect that should this code prove successful in Java 8 it will be
 backported to Java 7 updates.
 
  The unit test may appear to be somewhat unrelated. It was created after
 resolving a bug in an early version of this patch to detect the issue
 encountered (LinkedHashMap.init() was not being called in readObject() when
 the map was empty).
 
  Mike
 




Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-03-27 Thread Mike Duigou
I started looking at crafty reuse of size but found too many direct references 
to size to attempt getting this right in the current iteration. Reusing size is 
definitely still available to someone who wants to dive in and prepare an 
implementation.

Mike

On Mar 27 2013, at 09:17 , Martin Buchholz wrote:

 But you can support any requested initial size if stored in the size field 
 when list is empty.
 
 
 On Wed, Mar 27, 2013 at 8:02 AM, Mike Duigou mike.dui...@oracle.com wrote:
 This seems like a good idea. I will follow up with the performance people to 
 see if their findings include the requested initial size.
 
 Mike
 
 On Mar 26 2013, at 22:53 , Brian Goetz wrote:
 
  What percentage of the empty lists are default-sized?  I suspect it is 
  large, in which case we could apply this trick only for the default-sized 
  lists, and eliminate the extra field.
 
  On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote:
 
  Hello all;
 
  This is a review for optimization work that came out of internal analysis 
  of Oracle's Java applications. It's based upon analysis that shows that in 
  large applications as much as 10% of maps and lists are initialized but 
  never receive any entries. A smaller number spend a large proportion of 
  their lifetime empty. We've found similar results across other workloads 
  as well. This patch is not a substitute for pre-sizing your collections 
  and maps--doing so will *always* have better results.
 
  This patch extends HashMap and ArrayList to provide special handling for 
  newly created instances that avoids creating the backing array until 
  needed. There is a very small additional cost for detecting when to 
  inflate the map or list that is measurable in interpreted tests but 
  disappears in JITed code.
 
  http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/
 
  We expect that should this code prove successful in Java 8 it will be 
  backported to Java 7 updates.
 
  The unit test may appear to be somewhat unrelated. It was created after 
  resolving a bug in an early version of this patch to detect the issue 
  encountered (LinkedHashMap.init() was not being called in readObject() 
  when the map was empty).
 
  Mike
 
 
 



Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap

2013-03-26 Thread Brian Goetz
What percentage of the empty lists are default-sized?  I suspect it is large, 
in which case we could apply this trick only for the default-sized lists, and 
eliminate the extra field.  

On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote:

 Hello all;
 
 This is a review for optimization work that came out of internal analysis of 
 Oracle's Java applications. It's based upon analysis that shows that in large 
 applications as much as 10% of maps and lists are initialized but never 
 receive any entries. A smaller number spend a large proportion of their 
 lifetime empty. We've found similar results across other workloads as well. 
 This patch is not a substitute for pre-sizing your collections and 
 maps--doing so will *always* have better results.
 
 This patch extends HashMap and ArrayList to provide special handling for 
 newly created instances that avoids creating the backing array until needed. 
 There is a very small additional cost for detecting when to inflate the map 
 or list that is measurable in interpreted tests but disappears in JITed code. 
 
 http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/
 
 We expect that should this code prove successful in Java 8 it will be 
 backported to Java 7 updates.
 
 The unit test may appear to be somewhat unrelated. It was created after 
 resolving a bug in an early version of this patch to detect the issue 
 encountered (LinkedHashMap.init() was not being called in readObject() when 
 the map was empty).
 
 Mike