HashMap sizing was a terrible design mistake, but too much software depends on it, e.g. https://google.github.io/guava/releases/23.0/api/docs/com/google/common/collect/Maps.html#newHashMapWithExpectedSize-int-
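
For reference, a helper like Guava's newHashMapWithExpectedSize exists
precisely because of that sizing rule: it has to inflate the requested
size before handing it to the HashMap(int) constructor. Below is a minimal
sketch of that adjustment under the default 0.75 load factor (a simplified
illustration of the idea, not Guava's actual implementation; the class and
method names are made up):

    import java.util.HashMap;
    import java.util.Map;

    class SizedMaps {
        // Pick an initialCapacity large enough that inserting expectedSize
        // mappings never pushes the map past its resize threshold
        // (table capacity * 0.75 with the default load factor).
        static <K, V> Map<K, V> newHashMapWithExpectedSize(int expectedSize) {
            int initialCapacity = (int) ((float) expectedSize / 0.75f + 1.0f);
            return new HashMap<>(initialCapacity);
        }
    }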
On Tue, Dec 11, 2018 at 1:26 AM Claes Redestad <claes.redes...@oracle.com> wrote:
> Hi Peter,
>
> On 2018-12-11 10:02, Peter Levart wrote:
> > Hi Claes,
> >
> > I haven't checked all the changes yet, although it looks like a
> > straightforward swap of Properties for HashMap for the intermediary
> > result, but I noticed the following in SystemProps:
> >
> >    265     var cmdProps = new HashMap<String, String>((vmProps.length / 2) + Raw.FIXED_LENGTH);
> >
> > The HashMap(int) constructor differs from Properties(int) in that
> > for the former, the argument represents a lower bound on the initial
> > size of the table (it is simply rounded up to the nearest power of 2).
> > The threshold at which the table is resized is calculated as
> > (initialCapacity rounded up to the nearest power of 2) * loadFactor.
> > The default load factor is 0.75, which means that in the worst case
> > the table will be resized after 3/4 * initialCapacity elements have
> > been inserted into it. To guarantee that the table is never resized,
> > you have to pass (size * 4 + 2) / 3 to the HashMap constructor, where
> > size is the number of elements added...
> >
> > I hope I'm not misleading you; I just think this is how HashMap has
> > behaved from the beginning. Peeking at the HashMap code (line 693), it
> > seems that it is doing exactly that:
> >
> >     float ft = (float)newCap * loadFactor;
> >     newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
> >               (int)ft : Integer.MAX_VALUE);
> >
> > newCap above holds the initialCapacity constructor parameter rounded
> > up to the nearest power of 2, and newThr is the threshold at which the
> > resize occurs.
> >
> > The Properties(int) constructor behaves differently, as it passes the
> > parameter directly to the underlying ConcurrentHashMap, whose
> > constructor documents:
> >
> >     * @param initialCapacity The implementation performs internal
> >     * sizing to accommodate this many elements.
>
> Fun story, but I noticed this unfortunate difference when I was working
> on this very patch and brought it up in the team. I think most agree
> that the CHM behavior is the more intuitive one and we would have loved
> to consolidate on it, but the message I've gotten is that it's probably
> too late to fix: whichever way you go, you get lots of little subtle
> changes to sizes that may lead to incompatibilities in applications and
> tests that inadvertently depend on the iteration order of HashMap etc.
>
> That said: Raw typically contains quite a few empty/null values that
> will never be put in the map. Enough so that, for the applications I've
> looked at, the initialCapacity is already a fair bit higher than it
> needs to be to avoid resizing. Thus it made little sense to take the
> maximum possible size and adjust it up even further by factoring in the
> default load factor.
>
> (Unless you have a *lot* of properties coming in via the command line,
> but I think we should optimize for the common cases...)
>
> Thanks!
>
> /Claes
>
> >
> > Regards, Peter
> >
> > On 12/10/18 10:17 PM, Claes Redestad wrote:
> >> Hi,
> >>
> >> by inverting the order in which the internal property maps are
> >> created, we avoid some classloading and get a slightly more efficient
> >> code execution profile in System.initPhase1.
> >>
> >> Webrev: http://cr.openjdk.java.net/~redestad/8215159/jdk.00/
> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8215159
> >>
> >> This reduces bytecode executed in initPhase1 from ~48k to ~36k. Not
> >> much by any measure, but minimizing System.initPhase1 is important
> >> since certain parts of the VM (JIT etc.) are blocked from initializing
> >> until it's done.
> >>
> >> Thanks!
> >>
> >> /Claes
> >
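
To make the arithmetic in the thread concrete, here is a small standalone
sketch that re-implements the power-of-two rounding and the threshold
computation quoted from HashMap (it does not call into HashMap's internals,
and it omits the MAXIMUM_CAPACITY clamp; treat the names and constants as
illustrative only). It shows why passing the plain element count can still
trigger a resize, while (size * 4 + 2) / 3 does not:

    public class HashMapSizingDemo {
        // Smallest power of two >= cap, mirroring the quoted rounding step.
        static int tableSizeFor(int cap) {
            int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
            return (n < 0) ? 1 : n + 1;
        }

        // Resize threshold for a map created with new HashMap<>(initialCapacity)
        // under the default load factor of 0.75.
        static int resizeThreshold(int initialCapacity) {
            return (int) (tableSizeFor(initialCapacity) * 0.75f);
        }

        public static void main(String[] args) {
            int size = 100; // number of entries we intend to insert

            // Passing the element count directly: threshold 96 < 100, so the
            // table is resized while the map is being populated.
            System.out.println(resizeThreshold(size));

            // Peter's formula: threshold 192 >= 100, so no resize occurs.
            System.out.println(resizeThreshold((size * 4 + 2) / 3));
        }
    }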