Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
--- 604 Arrays.fill(elementData, newSize, size, null); In performance-critical code I would avoid Arrays.fill because it adds a bit of overhead (unless it's intrinsified, which I don't think it is). Last week, I sent few benchmarks I did on array cleaning (zero fill) comparing Arrays.fill, System.arraycopy, Unsafe.setMemory ... Arrays.fill is the winner (always faster than arraycopy which use native code) by 10 - 20% ! I suspect aggressive hotspot optimizations (native code ?) because I agree Arrays.fill looks like a stupid for-loop ! Does somebody have clues explaining the Arrays.fill performance ? Is there an alternative way to clear an array giving the best possible performance (like c memset) ? it could be useful to have in JDK a native array fill implementation (System.fill ?) if it gives better performance. Laurent
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
On Tue, Apr 2, 2013 at 9:27 AM, Laurent Bourgès bourges.laur...@gmail.comwrote: --- 604 Arrays.fill(elementData, newSize, size, null); In performance-critical code I would avoid Arrays.fill because it adds a bit of overhead (unless it's intrinsified, which I don't think it is). Last week, I sent few benchmarks I did on array cleaning (zero fill) comparing Arrays.fill, System.arraycopy, Unsafe.setMemory ... Arrays.fill is the winner (always faster than arraycopy which use native code) by 10 - 20% ! I suspect aggressive hotspot optimizations (native code ?) because I agree Arrays.fill looks like a stupid for-loop ! Does somebody have clues explaining the Arrays.fill performance ? There was at least one round of optimization done by the HotSpot team in mid-2010 - This adds new logic to recognize fill idioms and convert them into a call to an optimized fill routine. Loop predication creates easily matched loops that are simply replaced with calls to the new assembly stubs. Currently only 1,2 and 4 byte primitive types are supported. Objects and longs/double will be supported in a later putback. Tested with runthese, nsk and ctw plus jbb2005. see http://openjdk.5641.n7.nabble.com/review-M-for-4809552-Optimize-Arrays-fill-td10322.html Looks like the change was part of 6u23 http://download.java.net/jdk6/6u23/promoted/b03/changes/JDK6u23.b03.list.html Could not find anything more recent than that (on a quick mail search) Cheers Patrick
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
Thanks for the research. It seems like hotspot is recognizing and optimizing fill loops, rather than intrinsifying calls to Arrays.fill itself (good!). Anyways, I'd still like the simple fill loops in ArrayList to stay unchanged. Using Arrays.fill is only slightly more readable. There would be more of a case for fill if it was an actual array class instance method. On Tue, Apr 2, 2013 at 2:12 AM, Patrick Wright pdoubl...@gmail.com wrote: On Tue, Apr 2, 2013 at 9:27 AM, Laurent Bourgès bourges.laur...@gmail.comwrote: --- 604 Arrays.fill(elementData, newSize, size, null); In performance-critical code I would avoid Arrays.fill because it adds a bit of overhead (unless it's intrinsified, which I don't think it is). Last week, I sent few benchmarks I did on array cleaning (zero fill) comparing Arrays.fill, System.arraycopy, Unsafe.setMemory ... Arrays.fill is the winner (always faster than arraycopy which use native code) by 10 - 20% ! I suspect aggressive hotspot optimizations (native code ?) because I agree Arrays.fill looks like a stupid for-loop ! Does somebody have clues explaining the Arrays.fill performance ? There was at least one round of optimization done by the HotSpot team in mid-2010 - This adds new logic to recognize fill idioms and convert them into a call to an optimized fill routine. Loop predication creates easily matched loops that are simply replaced with calls to the new assembly stubs. Currently only 1,2 and 4 byte primitive types are supported. Objects and longs/double will be supported in a later putback. Tested with runthese, nsk and ctw plus jbb2005. see http://openjdk.5641.n7.nabble.com/review-M-for-4809552-Optimize-Arrays-fill-td10322.html Looks like the change was part of 6u23 http://download.java.net/jdk6/6u23/promoted/b03/changes/JDK6u23.b03.list.html Could not find anything more recent than that (on a quick mail search) Cheers Patrick
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
On Apr 2 2013, at 10:55 , Martin Buchholz wrote: Thanks for the research. It seems like hotspot is recognizing and optimizing fill loops, rather than intrinsifying calls to Arrays.fill itself (good!). Why wouldn't doing both be better? Anyways, I'd still like the simple fill loops in ArrayList to stay unchanged. Using Arrays.fill is only slightly more readable. Part of the goal of the change was to make the intent clearer. I'll improve the comments instead. There would be more of a case for fill if it was an actual array class instance method. Maybe for Java 9 Mike On Tue, Apr 2, 2013 at 2:12 AM, Patrick Wright pdoubl...@gmail.com wrote: On Tue, Apr 2, 2013 at 9:27 AM, Laurent Bourgès bourges.laur...@gmail.comwrote: --- 604 Arrays.fill(elementData, newSize, size, null); In performance-critical code I would avoid Arrays.fill because it adds a bit of overhead (unless it's intrinsified, which I don't think it is). Last week, I sent few benchmarks I did on array cleaning (zero fill) comparing Arrays.fill, System.arraycopy, Unsafe.setMemory ... Arrays.fill is the winner (always faster than arraycopy which use native code) by 10 - 20% ! I suspect aggressive hotspot optimizations (native code ?) because I agree Arrays.fill looks like a stupid for-loop ! Does somebody have clues explaining the Arrays.fill performance ? There was at least one round of optimization done by the HotSpot team in mid-2010 - This adds new logic to recognize fill idioms and convert them into a call to an optimized fill routine. Loop predication creates easily matched loops that are simply replaced with calls to the new assembly stubs. Currently only 1,2 and 4 byte primitive types are supported. Objects and longs/double will be supported in a later putback. Tested with runthese, nsk and ctw plus jbb2005. see http://openjdk.5641.n7.nabble.com/review-M-for-4809552-Optimize-Arrays-fill-td10322.html Looks like the change was part of 6u23 http://download.java.net/jdk6/6u23/promoted/b03/changes/JDK6u23.b03.list.html Could not find anything more recent than that (on a quick mail search) Cheers Patrick
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
On Tue, Apr 2, 2013 at 11:24 AM, Mike Duigou mike.dui...@oracle.com wrote: On Apr 2 2013, at 10:55 , Martin Buchholz wrote: Thanks for the research. It seems like hotspot is recognizing and optimizing fill loops, rather than intrinsifying calls to Arrays.fill itself (good!). Why wouldn't doing both be better? If hotspot recognizes and optimizes fill loops, Arrays.fill is optimized for free. Anyways, I'd still like the simple fill loops in ArrayList to stay unchanged. Using Arrays.fill is only slightly more readable. Part of the goal of the change was to make the intent clearer. I'll improve the comments instead. ArrayList is one of those classes that are important for educational reasons. Studying it will be part of many peoples' university education. So I applaud efforts to improve clarity. But I think the fill loops are sufficiently clear as they stand.
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
On Apr 1 2013, at 22:24 , Martin Buchholz wrote: (Other changes accepted as suggested) If we are willing to make small incompatible changes, how about when serializing, storing capacity as the size, so that reconstituted ArrayLists are pre-trimmed to size? Yes, I found it strange that clone() returns a trimmed copy and serialization preserved capacity. Recording size as capacity would be a difference but not an incompatibility (objects serialized by both old and new implementations could be deserialized correctly by old and new). --- I still believe that with the help of the gc team, one can cook up a trim-to-size when gc-ing fix, but that's going to be very hard to implement. I'm saving my VM RFEs for split arrays, huge arrays, sparse arrays and array pinning. :-) Mike --- On Mon, Apr 1, 2013 at 7:00 PM, Mike Duigou mike.dui...@oracle.com wrote: Hello all; I have posted an updated version of the empty ArrayList and HashMap patch. http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/ This revised implementation introduces *no new fields* to either class. For ArrayList the lazy allocation of the backing array occurs only if the list is created at default size. According to our performance analysis team, approximately 85% of ArrayList instances are created at default size so this optimization will be valid for an overwhelming majority of cases. For HashMap, creative use is made of the threshold field to track the requested initial size until the bucket array is needed. On the read side the empty map case is tested with isEmpty(). On the write size a comparison of (table == EMPTY_TABLE) is used to detect the need to inflate the bucket array. In readObject there's a little more work to try to choose an efficient initial capacity. Mike On Mar 26 2013, at 17:25 , Mike Duigou wrote: Hello all; This is a review for optimization work that came out of internal analysis of Oracle's Java applications. It's based upon analysis that shows that in large applications as much as 10% of maps and lists are initialized but never receive any entries. A smaller number spend a large proportion of their lifetime empty. We've found similar results across other workloads as well. This patch is not a substitute for pre-sizing your collections and maps--doing so will *always* have better results. This patch extends HashMap and ArrayList to provide special handling for newly created instances that avoids creating the backing array until needed. There is a very small additional cost for detecting when to inflate the map or list that is measurable in interpreted tests but disappears in JITed code. http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/ We expect that should this code prove successful in Java 8 it will be backported to Java 7 updates. The unit test may appear to be somewhat unrelated. It was created after resolving a bug in an early version of this patch to detect the issue encountered (LinkedHashMap.init() was not being called in readObject() when the map was empty). Mike
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
Thanks for the pointer. I had remembered reading this changeset and it has motivated to use Arrays.fill but I could not have found it. Mike On Apr 2 2013, at 02:12 , Patrick Wright wrote: On Tue, Apr 2, 2013 at 9:27 AM, Laurent Bourgès bourges.laur...@gmail.comwrote: --- 604 Arrays.fill(elementData, newSize, size, null); In performance-critical code I would avoid Arrays.fill because it adds a bit of overhead (unless it's intrinsified, which I don't think it is). Last week, I sent few benchmarks I did on array cleaning (zero fill) comparing Arrays.fill, System.arraycopy, Unsafe.setMemory ... Arrays.fill is the winner (always faster than arraycopy which use native code) by 10 - 20% ! I suspect aggressive hotspot optimizations (native code ?) because I agree Arrays.fill looks like a stupid for-loop ! Does somebody have clues explaining the Arrays.fill performance ? There was at least one round of optimization done by the HotSpot team in mid-2010 - This adds new logic to recognize fill idioms and convert them into a call to an optimized fill routine. Loop predication creates easily matched loops that are simply replaced with calls to the new assembly stubs. Currently only 1,2 and 4 byte primitive types are supported. Objects and longs/double will be supported in a later putback. Tested with runthese, nsk and ctw plus jbb2005. see http://openjdk.5641.n7.nabble.com/review-M-for-4809552-Optimize-Arrays-fill-td10322.html Looks like the change was part of 6u23 http://download.java.net/jdk6/6u23/promoted/b03/changes/JDK6u23.b03.list.html Could not find anything more recent than that (on a quick mail search) Cheers Patrick
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
Hello all; I have posted an updated version of the empty ArrayList and HashMap patch. http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/ This revised implementation introduces *no new fields* to either class. For ArrayList the lazy allocation of the backing array occurs only if the list is created at default size. According to our performance analysis team, approximately 85% of ArrayList instances are created at default size so this optimization will be valid for an overwhelming majority of cases. For HashMap, creative use is made of the threshold field to track the requested initial size until the bucket array is needed. On the read side the empty map case is tested with isEmpty(). On the write size a comparison of (table == EMPTY_TABLE) is used to detect the need to inflate the bucket array. In readObject there's a little more work to try to choose an efficient initial capacity. Mike On Mar 26 2013, at 17:25 , Mike Duigou wrote: Hello all; This is a review for optimization work that came out of internal analysis of Oracle's Java applications. It's based upon analysis that shows that in large applications as much as 10% of maps and lists are initialized but never receive any entries. A smaller number spend a large proportion of their lifetime empty. We've found similar results across other workloads as well. This patch is not a substitute for pre-sizing your collections and maps--doing so will *always* have better results. This patch extends HashMap and ArrayList to provide special handling for newly created instances that avoids creating the backing array until needed. There is a very small additional cost for detecting when to inflate the map or list that is measurable in interpreted tests but disappears in JITed code. http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/ We expect that should this code prove successful in Java 8 it will be backported to Java 7 updates. The unit test may appear to be somewhat unrelated. It was created after resolving a bug in an early version of this patch to detect the issue encountered (LinkedHashMap.init() was not being called in readObject() when the map was empty). Mike
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
Overall, this kind of change seems barely worth doing. --- This change: // Let gc do its work -for (int i = 0; i size; i++) -elementData[i] = null; +Arrays.fill(elementData, null); changes the performance of clear from O(size) to O(capacity), which is bad, and unrelated to the empty collection optimization. --- +if(elementData == EMPTY_ELEMENTDATA) { Please use standard jdk style - space after keywords such as if. --- 183 public void ensureCapacity(int minCapacity) { 184 if (minCapacity 0) 185 ensureExplicitCapacity(minCapacity); 186 } 187 188 private void ensureCapacityInternal(int minCapacity) { 189 if(elementData == EMPTY_ELEMENTDATA) { 190 minCapacity = Math.max(10, minCapacity); 191 } 192 193 ensureExplicitCapacity(minCapacity); 194 } It looks to me like, if the user does x = new ArrayList(); x.ensureCapacity(2); then the capacity will be 2, not 10, an observable change from the previous implementation, and sort-of contradicting the spec of ArrayList() --- 604 Arrays.fill(elementData, newSize, size, null); In performance-critical code I would avoid Arrays.fill because it adds a bit of overhead (unless it's intrinsified, which I don't think it is). --- If we are willing to make small incompatible changes, how about when serializing, storing capacity as the size, so that reconstituted ArrayLists are pre-trimmed to size? --- I still believe that with the help of the gc team, one can cook up a trim-to-size when gc-ing fix, but that's going to be very hard to implement. --- On Mon, Apr 1, 2013 at 7:00 PM, Mike Duigou mike.dui...@oracle.com wrote: Hello all; I have posted an updated version of the empty ArrayList and HashMap patch. http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/ This revised implementation introduces *no new fields* to either class. For ArrayList the lazy allocation of the backing array occurs only if the list is created at default size. According to our performance analysis team, approximately 85% of ArrayList instances are created at default size so this optimization will be valid for an overwhelming majority of cases. For HashMap, creative use is made of the threshold field to track the requested initial size until the bucket array is needed. On the read side the empty map case is tested with isEmpty(). On the write size a comparison of (table == EMPTY_TABLE) is used to detect the need to inflate the bucket array. In readObject there's a little more work to try to choose an efficient initial capacity. Mike On Mar 26 2013, at 17:25 , Mike Duigou wrote: Hello all; This is a review for optimization work that came out of internal analysis of Oracle's Java applications. It's based upon analysis that shows that in large applications as much as 10% of maps and lists are initialized but never receive any entries. A smaller number spend a large proportion of their lifetime empty. We've found similar results across other workloads as well. This patch is not a substitute for pre-sizing your collections and maps--doing so will *always* have better results. This patch extends HashMap and ArrayList to provide special handling for newly created instances that avoids creating the backing array until needed. There is a very small additional cost for detecting when to inflate the map or list that is measurable in interpreted tests but disappears in JITed code. http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/ We expect that should this code prove successful in Java 8 it will be backported to Java 7 updates. The unit test may appear to be somewhat unrelated. It was created after resolving a bug in an early version of this patch to detect the issue encountered (LinkedHashMap.init() was not being called in readObject() when the map was empty). Mike
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
On 03/28/2013 06:38 PM, Mike Duigou wrote: We have heard back from the performance folks that 85% of empty lists are created at default size. The proposed patch is going to be revised to do the inflation trick only for default sized maps which will eliminate the need for a new field. Something creative involving the use of the existing size field can certainly be considered for a future optimization. This is actually a better approach, since it leaves a gate to non-lazy allocation. As with all lazy things, delaying failure can have undesirable effects. Catching OOME at HashMap construction vs. 1st put... Regards, Peter On Mar 28 2013, at 04:59 , Doug Lea wrote: On 03/27/13 12:17, Martin Buchholz wrote: But you can support any requested initial size if stored in the size field when list is empty. In other words: Mike, please do not add a field if your goal is to save space! Part of the analysis was that with current object layouts the added fields did not change the memory footprint of either class. This, of course only applies to current versions of Hotspot and it's object layout. Also, I hope you or the performance testing folks have extensive and accurate enough tests to show that the change not only helps the empty case but does not hurt the vastly more common non-empty case. They do. I believe that the object layout hides any increase in size. Considering that this is the most commonly used java.util class, there should be empirical evidence that this is a net improvement. I will try to find how much of the analysis I can share. My guess is that it can be done (we did something similar for ConcurrentHashMap) but it takes a lot of care. -Doug On Wed, Mar 27, 2013 at 8:02 AM, Mike Duigou mike.dui...@oracle.com wrote: This seems like a good idea. I will follow up with the performance people to see if their findings include the requested initial size. Mike On Mar 26 2013, at 22:53 , Brian Goetz wrote: What percentage of the empty lists are default-sized? I suspect it is large, in which case we could apply this trick only for the default-sized lists, and eliminate the extra field. On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote: Hello all; This is a review for optimization work that came out of internal analysis of Oracle's Java applications. It's based upon analysis that shows that in large applications as much as 10% of maps and lists are initialized but never receive any entries. A smaller number spend a large proportion of their lifetime empty. We've found similar results across other workloads as well. This patch is not a substitute for pre-sizing your collections and maps--doing so will *always* have better results. This patch extends HashMap and ArrayList to provide special handling for newly created instances that avoids creating the backing array until needed. There is a very small additional cost for detecting when to inflate the map or list that is measurable in interpreted tests but disappears in JITed code. http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/ We expect that should this code prove successful in Java 8 it will be backported to Java 7 updates. The unit test may appear to be somewhat unrelated. It was created after resolving a bug in an early version of this patch to detect the issue encountered (LinkedHashMap.init() was not being called in readObject() when the map was empty). Mike
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
Hi Mike, Have you considered the possibility to re-constitute the initial capacity from threshold and loadFactor? Regards, Peter On 03/27/2013 05:28 PM, Mike Duigou wrote: I started looking at crafty reuse of size but found too many direct references to size to attempt getting this right in the current iteration. Reusing size is definitely still available to someone who wants to dive in and prepare an implementation. Mike On Mar 27 2013, at 09:17 , Martin Buchholz wrote: But you can support any requested initial size if stored in the size field when list is empty. On Wed, Mar 27, 2013 at 8:02 AM, Mike Duigou mike.dui...@oracle.com wrote: This seems like a good idea. I will follow up with the performance people to see if their findings include the requested initial size. Mike On Mar 26 2013, at 22:53 , Brian Goetz wrote: What percentage of the empty lists are default-sized? I suspect it is large, in which case we could apply this trick only for the default-sized lists, and eliminate the extra field. On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote: Hello all; This is a review for optimization work that came out of internal analysis of Oracle's Java applications. It's based upon analysis that shows that in large applications as much as 10% of maps and lists are initialized but never receive any entries. A smaller number spend a large proportion of their lifetime empty. We've found similar results across other workloads as well. This patch is not a substitute for pre-sizing your collections and maps--doing so will *always* have better results. This patch extends HashMap and ArrayList to provide special handling for newly created instances that avoids creating the backing array until needed. There is a very small additional cost for detecting when to inflate the map or list that is measurable in interpreted tests but disappears in JITed code. http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/ We expect that should this code prove successful in Java 8 it will be backported to Java 7 updates. The unit test may appear to be somewhat unrelated. It was created after resolving a bug in an early version of this patch to detect the issue encountered (LinkedHashMap.init() was not being called in readObject() when the map was empty). Mike
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
We have heard back from the performance folks that 85% of empty lists are created at default size. The proposed patch is going to be revised to do the inflation trick only for default sized maps which will eliminate the need for a new field. Something creative involving the use of the existing size field can certainly be considered for a future optimization. On Mar 28 2013, at 04:59 , Doug Lea wrote: On 03/27/13 12:17, Martin Buchholz wrote: But you can support any requested initial size if stored in the size field when list is empty. In other words: Mike, please do not add a field if your goal is to save space! Part of the analysis was that with current object layouts the added fields did not change the memory footprint of either class. This, of course only applies to current versions of Hotspot and it's object layout. Also, I hope you or the performance testing folks have extensive and accurate enough tests to show that the change not only helps the empty case but does not hurt the vastly more common non-empty case. They do. I believe that the object layout hides any increase in size. Considering that this is the most commonly used java.util class, there should be empirical evidence that this is a net improvement. I will try to find how much of the analysis I can share. My guess is that it can be done (we did something similar for ConcurrentHashMap) but it takes a lot of care. -Doug On Wed, Mar 27, 2013 at 8:02 AM, Mike Duigou mike.dui...@oracle.com wrote: This seems like a good idea. I will follow up with the performance people to see if their findings include the requested initial size. Mike On Mar 26 2013, at 22:53 , Brian Goetz wrote: What percentage of the empty lists are default-sized? I suspect it is large, in which case we could apply this trick only for the default-sized lists, and eliminate the extra field. On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote: Hello all; This is a review for optimization work that came out of internal analysis of Oracle's Java applications. It's based upon analysis that shows that in large applications as much as 10% of maps and lists are initialized but never receive any entries. A smaller number spend a large proportion of their lifetime empty. We've found similar results across other workloads as well. This patch is not a substitute for pre-sizing your collections and maps--doing so will *always* have better results. This patch extends HashMap and ArrayList to provide special handling for newly created instances that avoids creating the backing array until needed. There is a very small additional cost for detecting when to inflate the map or list that is measurable in interpreted tests but disappears in JITed code. http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/ We expect that should this code prove successful in Java 8 it will be backported to Java 7 updates. The unit test may appear to be somewhat unrelated. It was created after resolving a bug in an early version of this patch to detect the issue encountered (LinkedHashMap.init() was not being called in readObject() when the map was empty). Mike
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
On 03/27/2013 06:53 AM, Brian Goetz wrote: What percentage of the empty lists are default-sized? I suspect it is large, in which case we could apply this trick only for the default-sized lists, and eliminate the extra field. yes, very good idea indeed. Rémi On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote: Hello all; This is a review for optimization work that came out of internal analysis of Oracle's Java applications. It's based upon analysis that shows that in large applications as much as 10% of maps and lists are initialized but never receive any entries. A smaller number spend a large proportion of their lifetime empty. We've found similar results across other workloads as well. This patch is not a substitute for pre-sizing your collections and maps--doing so will *always* have better results. This patch extends HashMap and ArrayList to provide special handling for newly created instances that avoids creating the backing array until needed. There is a very small additional cost for detecting when to inflate the map or list that is measurable in interpreted tests but disappears in JITed code. http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/ We expect that should this code prove successful in Java 8 it will be backported to Java 7 updates. The unit test may appear to be somewhat unrelated. It was created after resolving a bug in an early version of this patch to detect the issue encountered (LinkedHashMap.init() was not being called in readObject() when the map was empty). Mike
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
This seems like a good idea. I will follow up with the performance people to see if their findings include the requested initial size. Mike On Mar 26 2013, at 22:53 , Brian Goetz wrote: What percentage of the empty lists are default-sized? I suspect it is large, in which case we could apply this trick only for the default-sized lists, and eliminate the extra field. On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote: Hello all; This is a review for optimization work that came out of internal analysis of Oracle's Java applications. It's based upon analysis that shows that in large applications as much as 10% of maps and lists are initialized but never receive any entries. A smaller number spend a large proportion of their lifetime empty. We've found similar results across other workloads as well. This patch is not a substitute for pre-sizing your collections and maps--doing so will *always* have better results. This patch extends HashMap and ArrayList to provide special handling for newly created instances that avoids creating the backing array until needed. There is a very small additional cost for detecting when to inflate the map or list that is measurable in interpreted tests but disappears in JITed code. http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/ We expect that should this code prove successful in Java 8 it will be backported to Java 7 updates. The unit test may appear to be somewhat unrelated. It was created after resolving a bug in an early version of this patch to detect the issue encountered (LinkedHashMap.init() was not being called in readObject() when the map was empty). Mike
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
But you can support any requested initial size if stored in the size field when list is empty. On Wed, Mar 27, 2013 at 8:02 AM, Mike Duigou mike.dui...@oracle.com wrote: This seems like a good idea. I will follow up with the performance people to see if their findings include the requested initial size. Mike On Mar 26 2013, at 22:53 , Brian Goetz wrote: What percentage of the empty lists are default-sized? I suspect it is large, in which case we could apply this trick only for the default-sized lists, and eliminate the extra field. On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote: Hello all; This is a review for optimization work that came out of internal analysis of Oracle's Java applications. It's based upon analysis that shows that in large applications as much as 10% of maps and lists are initialized but never receive any entries. A smaller number spend a large proportion of their lifetime empty. We've found similar results across other workloads as well. This patch is not a substitute for pre-sizing your collections and maps--doing so will *always* have better results. This patch extends HashMap and ArrayList to provide special handling for newly created instances that avoids creating the backing array until needed. There is a very small additional cost for detecting when to inflate the map or list that is measurable in interpreted tests but disappears in JITed code. http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/ We expect that should this code prove successful in Java 8 it will be backported to Java 7 updates. The unit test may appear to be somewhat unrelated. It was created after resolving a bug in an early version of this patch to detect the issue encountered (LinkedHashMap.init() was not being called in readObject() when the map was empty). Mike
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
I started looking at crafty reuse of size but found too many direct references to size to attempt getting this right in the current iteration. Reusing size is definitely still available to someone who wants to dive in and prepare an implementation. Mike On Mar 27 2013, at 09:17 , Martin Buchholz wrote: But you can support any requested initial size if stored in the size field when list is empty. On Wed, Mar 27, 2013 at 8:02 AM, Mike Duigou mike.dui...@oracle.com wrote: This seems like a good idea. I will follow up with the performance people to see if their findings include the requested initial size. Mike On Mar 26 2013, at 22:53 , Brian Goetz wrote: What percentage of the empty lists are default-sized? I suspect it is large, in which case we could apply this trick only for the default-sized lists, and eliminate the extra field. On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote: Hello all; This is a review for optimization work that came out of internal analysis of Oracle's Java applications. It's based upon analysis that shows that in large applications as much as 10% of maps and lists are initialized but never receive any entries. A smaller number spend a large proportion of their lifetime empty. We've found similar results across other workloads as well. This patch is not a substitute for pre-sizing your collections and maps--doing so will *always* have better results. This patch extends HashMap and ArrayList to provide special handling for newly created instances that avoids creating the backing array until needed. There is a very small additional cost for detecting when to inflate the map or list that is measurable in interpreted tests but disappears in JITed code. http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/ We expect that should this code prove successful in Java 8 it will be backported to Java 7 updates. The unit test may appear to be somewhat unrelated. It was created after resolving a bug in an early version of this patch to detect the issue encountered (LinkedHashMap.init() was not being called in readObject() when the map was empty). Mike
Re: RFR JDK-7143928 : (coll) Optimize for Empty ArrayList and HashMap
What percentage of the empty lists are default-sized? I suspect it is large, in which case we could apply this trick only for the default-sized lists, and eliminate the extra field. On Mar 26, 2013, at 5:25 PM, Mike Duigou wrote: Hello all; This is a review for optimization work that came out of internal analysis of Oracle's Java applications. It's based upon analysis that shows that in large applications as much as 10% of maps and lists are initialized but never receive any entries. A smaller number spend a large proportion of their lifetime empty. We've found similar results across other workloads as well. This patch is not a substitute for pre-sizing your collections and maps--doing so will *always* have better results. This patch extends HashMap and ArrayList to provide special handling for newly created instances that avoids creating the backing array until needed. There is a very small additional cost for detecting when to inflate the map or list that is measurable in interpreted tests but disappears in JITed code. http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/ We expect that should this code prove successful in Java 8 it will be backported to Java 7 updates. The unit test may appear to be somewhat unrelated. It was created after resolving a bug in an early version of this patch to detect the issue encountered (LinkedHashMap.init() was not being called in readObject() when the map was empty). Mike