Webrev: http://webrev.invokedynamic.info/xiaobin.lu/6622432/

6622432: RFE: Performance improvements to java.math.BigDecimal Details:

`This work is targeted to improve the performance of BigDecimal and`

`related classes. Along with the performance improvement, the`

`implementation of many methods has been simplified or complicated for`

`reasons described below. Before delving into details of the webrev, here`

`is the summary of the improvement, for derby benchmark inside`

`SPECjvm2008, we saw about 84.65% improvement measured on a 2 socket`

`2.8GHz, 12G RAM Nehalem machine. Similarly, we saw about 6.57%`

`improvement on SPECjbb2005. The usage of BigDecimal is quite different`

`for derby and SPECjbb2005, the former benchmark emphasizes more on`

`inflated cases and the later is mostly using small numbers (i.e.`

`non-inflated case), so we are trying our best to take care of both in`

`the whole optimization, as a result, the optimization putting in the`

`webrev is made as general as possible. The change is quite significant`

`and I can only try to highlight some general ideas of the optimization.`

`First is some background of the current implementation of BigDecimal.`

`Among all the fields of BigDecimal classes, the two most interesting`

`fields are intCompact which is long type and intVal which is BigInteger`

`type. When the unscaled value of BigDecimal falls into (Long.MIN_VALUE,`

`Long.MAX_VALUE], its intCompact field is set and intVal is set to null`

`most of time until it is used to perform operations with another`

`BigInteger. This is a good thing to do since it will reduce the object`

`allocation significantly and make the operation faster since obviously`

`adding two primitive numbers for example just needs one instruction`

`while adding two BigInteger object requires a method call, get the`

`magnitude array out of the object, a for loop and so on. So one of the`

`big idea of optimization made in the webrev is to take a further step`

`from this idea and try to make some of the hot methods aware of the`

`non-inflated and inflated cases. For example, the constructor of`

`BigDecimal(char[] in, int offset, int len), when the number is not`

`inflated, we came up with some code to make the constructor as efficient`

`as possible. Another example is the multiply(BigDecimal multiplicand)`

`method, when one of the objects is not inflated, we came up with a`

`internal multiply method in BigInteger to accept long parameter.`

`The second big change to BigDecimal is its divide method. In the`

`inflated case, the current implementation calls divide method in`

`BigInteger which makes MutableBigInteger objects for both operators,`

`then perform the division. Our change made to divide method directly`

`makes MutableBigInteger object out of the magnitude array of the intVal`

`and calls divide method directly. When we get rid of the middle man, we`

`again can reduce the object allocation significantly when divide is a`

`hot method.`

`The third change to BigDecimal is the improvement of calculating the`

`precision of a given BigDecimal object. The current implementation`

`actually uses a static tens power array for incoming number to compare`

`with. And when the number is inflated, it keeps dividing itself by 1`

`billion until it becomes very small, every division adds 9 digits to the`

`precision. As you know, the division operation is expensive and the`

`algorithm to compare with the ten's power array can be made more`

`efficiently by correlating the bit length of the number with the index`

`to the array so that the time to compute the precision is O(1) rather`

`than being O(n) (where n is the length of the array). And that is`

`exactly what we are doing in the webrev. First, we get the bit length of`

`the number and then multiply it with log2 (10 base), use the result to`

`index to a dynamically expanded ten's power array so that division`

`operation can be avoided completely.`

`Last but not least, the optimization we've made in layoutChars`

`implementation significantly reduces the array allocation rate by making`

`a thread local StringBuilder object. As a result, the buffer to hold the`

`temporary characters is the same for every thread to call the method.`

`Further using the idea I mentioned, we also make it intCompact aware.`

`When the number is not inflated, we use a thread local character array`

`to hold that number in char(s). This reduces the burden for every method`

`call to allocate that character array.`

Reviewed by: Verified by: JCK regression tests in JDK workspace Also contributed by:

`Doug Lea (thanks so much for a lot of tedious code clean up work you've`

`done, and of course, many fantastic ideas. )`

Joe Darcy contributed most of the tests Regards, -Xiaobin