Re: PING! Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-04-15 Thread Paul Sandoz
Hi Brian,

My inclination is if it ain't broke... and AFAICT nothing indicates toString 
is particularly broken [*], so perhaps just focus on the cleanup aspects and 
revisit the other when the JMM is updated (and maybe enhanced volatiles is 
ready)?

Paul.

[*] although i still question the use of thread locals, but that is a separate 
issue that requires more investigation and i don't know if it can be improved.

On Apr 10, 2014, at 10:45 PM, Brian Burkhalter brian.burkhal...@oracle.com 
wrote:

 Second day back from vacation so I guess it’s time to beat this horse again …
 
 As there was no response to the message included below I am simply re-posting 
 it.
 
 Thanks,
 
 Brian
 
 On Mar 25, 2014, at 7:19 AM, Brian Burkhalter brian.burkhal...@oracle.com 
 wrote:
 
 On Mar 25, 2014, at 1:58 AM, Paul Sandoz paul.san...@oracle.com wrote:
 
 This is another example of a stable variable.
 
 I would like to re-iterate my scepticism that such changes are necessary in 
 this case (i am not sure if it is possible to create a benchmark that could 
 better exacerbate the concurrent overlap of calls to layoutChars). But, i 
 do agree the discussion has been useful and interesting.
 
 I am happy either to leave the toString() code as it is or to change it to 
 the variant with toStringSlow(). There is however other cleanup in the patch 
 to consider. So it would be good to get consensus on the two points:
 
 1) Change toString() to variant using toStringSlow() or leave it as-is.
 2) Change non-toString() code as indicated in the patch or leave it as-is.
 
 If “as-is” is the answer in both cases, then it’s simply a matter of 
 resolving the enhancement as “not an issue.”
 
 Thanks,
 
 Brian
 



Re: PING! Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-04-15 Thread Brian Burkhalter
Hi Paul,

On Apr 15, 2014, at 12:23 AM, Paul Sandoz paul.san...@oracle.com wrote:

 Hi Brian,
 
 My inclination is if it ain't broke... and AFAICT nothing indicates 
 toString is particularly broken [*], so perhaps just focus on the cleanup 
 aspects and revisit the other when the JMM is updated (and maybe enhanced 
 volatiles is ready)?
 
 Paul.

Fine by me. I’ll update the webrev. The other aspects could come under this 
issue already on file: https://bugs.openjdk.java.net/browse/JDK-8033491.

Thanks,

Brian

 [*] although i still question the use of thread locals, but that is a 
 separate issue that requires more investigation and i don't know if it can be 
 improved.



Re: PING! Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-04-15 Thread Brian Burkhalter

On Apr 15, 2014, at 9:24 AM, Brian Burkhalter brian.burkhal...@oracle.com 
wrote:

 On Apr 15, 2014, at 12:23 AM, Paul Sandoz paul.san...@oracle.com wrote:
 
 My inclination is if it ain't broke... and AFAICT nothing indicates 
 toString is particularly broken [*], so perhaps just focus on the cleanup 
 aspects and revisit the other when the JMM is updated (and maybe enhanced 
 volatiles is ready)?

I updated the patch to remove the toString() changes:

http://cr.openjdk.java.net/~bpb/6375303/webrev.02/

Unless there are objections within the next 24 hours or so I will push this 
as-is.

 Fine by me. I’ll update the webrev. The other aspects could come under this 
 issue already on file:https://bugs.openjdk.java.net/browse/JDK-8033491.

I linked this issue to 6375303 and included links to the archive discussion 
thread.

Thanks,

Brian

PING! Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-04-10 Thread Brian Burkhalter
Second day back from vacation so I guess it’s time to beat this horse again …

As there was no response to the message included below I am simply re-posting 
it.

Thanks,

Brian

On Mar 25, 2014, at 7:19 AM, Brian Burkhalter brian.burkhal...@oracle.com 
wrote:

 On Mar 25, 2014, at 1:58 AM, Paul Sandoz paul.san...@oracle.com wrote:
 
 This is another example of a stable variable.
 
 I would like to re-iterate my scepticism that such changes are necessary in 
 this case (i am not sure if it is possible to create a benchmark that could 
 better exacerbate the concurrent overlap of calls to layoutChars). But, i do 
 agree the discussion has been useful and interesting.
 
 I am happy either to leave the toString() code as it is or to change it to 
 the variant with toStringSlow(). There is however other cleanup in the patch 
 to consider. So it would be good to get consensus on the two points:
 
 1) Change toString() to variant using toStringSlow() or leave it as-is.
 2) Change non-toString() code as indicated in the patch or leave it as-is.
 
 If “as-is” is the answer in both cases, then it’s simply a matter of 
 resolving the enhancement as “not an issue.”
 
 Thanks,
 
 Brian



Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-25 Thread Peter Levart

On 03/24/2014 06:48 PM, Brian Burkhalter wrote:

Hi Peter,

On Mar 24, 2014, at 2:21 AM, Peter Levart peter.lev...@gmail.com wrote:


Thanks to Aleksey for re-establishing the access, I bring you results 
of the microbenchmark from his quad-core Cortex-A9:


Thanks to you and Aleksey for taking the initiative to test on that 
platform.


...as can be seen, the double-checked read-then-volatile-read+CAS 
trick is about 15% faster than classic volatile-read+CAS in this case.


Just to be sure, by “double-checked read-then-volatile-read+CAS trick” 
do you mean the variant with the two methods toString/toStringSlow?


That's right. I split the code into two methods in both cases to make 
sure the fast path gets inlined. The variant in your webrev with a 
single method has been demonstrated to exceed the maximum inline 
bytecode size.


Regards, Peter



Thanks,

Brian




Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-25 Thread Paul Sandoz
On Mar 24, 2014, at 8:30 PM, Mike Duigou mike.dui...@oracle.com wrote:
 The other issue here, and why this thread has gone on so long, is that we've 
 been trying to establish what the best idiom is (for 2014-) for this lazy 
 initialization situation so that we can reuse it with less thought and 
 review. In this case the chances of race are fairly low and the cost of extra 
 layoutChars is bearable. In other cases where we are likely to use the same 
 idiom this is less likely true. Peter's current implementation has the 
 advantage that it is the best-right answer for lazy initialization even in the 
 contended/expensive case. I look forward to revisiting this again when the 
 new JMM primitives are available. :-)
 

This is another example of a stable variable.

I would like to re-iterate my scepticism that such changes are necessary in 
this case (i am not sure if it is possible to create a benchmark that could 
better exacerbate the concurrent overlap of calls to layoutChars). But, i do 
agree the discussion has been useful and interesting.

Paul.


Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-25 Thread Brian Burkhalter

On Mar 25, 2014, at 1:58 AM, Paul Sandoz paul.san...@oracle.com wrote:

 This is another example of a stable variable.
 
 I would like to re-iterate my scepticism that such changes are necessary in 
 this case (i am not sure if it is possible to create a benchmark that could 
 better exacerbate the concurrent overlap of calls to layoutChars). But, i do 
 agree the discussion has been useful and interesting.

I am happy either to leave the toString() code as it is or to change it to the 
variant with toStringSlow(). There is however other cleanup in the patch to 
consider. So it would be good to get consensus on the two points:

1) Change toString() to variant using toStringSlow() or leave it as-is.
2) Change non-toString() code as indicated in the patch or leave it as-is.

If “as-is” is the answer in both cases, then it’s simply a matter of resolving 
the enhancement as “not an issue.”

Thanks,

Brian

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-24 Thread Peter Levart

On 03/20/2014 08:49 AM, Aleksey Shipilev wrote:

On 03/20/2014 11:06 AM, Peter Levart wrote:

I was thinking last night about the question: "Why is this
double-checked non-volatile-then-volatile trick not any faster than the pure
volatile variant even on an ARM platform, where a volatile read should have
some penalty compared to a normal read?" The answer might lie in the fact that
the Raspberry Pi is a single-core/single-thread machine. Would anyone with
JVM JIT compiler expertise care to share some insight? I suspect that on
such a platform, the compiler optimizes volatile accesses so that they are
performed without the otherwise-necessary memory fences...

Yes, at least C2 is known to not emit memory fences on uniprocessor
machines. You need to have a multicore ARM. If you are still interested,
contact me privately and I can arrange the access to my personal
quad-core Cortex-A9.

-Aleksey.


Hi,

Thanks to Aleksey for re-establishing the access, I bring you results of 
the microbenchmark from his quad-core Cortex-A9:


JDK 8 options: -client, org.openjdk.jmh.Main parameters: .* -i 10 -r 5 -wi 5 
-w 1 -f 1 [-t 1|max]

--- Baseline, 1-thread ---

Benchmark                           Mode  Samples        Mean  Mean error  Units
o.t.Bench6375303.testFirstToString  avgt       10   69292.305     299.516  ns/op
o.t.Bench6375303.testToString       avgt       10      20.003       0.433  ns/op

--- Baseline, 4-threads ---

Benchmark                           Mode  Samples        Mean  Mean error  Units
o.t.Bench6375303.testFirstToString  avgt       10  100390.024    2158.132  ns/op
o.t.Bench6375303.testToString       avgt       10      20.151       0.677  ns/op

--- double-checked nonvolatile-then-volatile-read+CAS, 1-thread ---

Benchmark                           Mode  Samples        Mean  Mean error  Units
o.t.Bench6375303.testFirstToString  avgt       10   69951.406     221.516  ns/op
o.t.Bench6375303.testToString       avgt       10      19.681       0.025  ns/op

--- double-checked nonvolatile-then-volatile-read+CAS, 4-threads ---

Benchmark                           Mode  Samples        Mean  Mean error  Units
o.t.Bench6375303.testFirstToString  avgt       10  104231.335    3842.095  ns/op
o.t.Bench6375303.testToString       avgt       10      20.030       0.595  ns/op

--- classic volatile read+CAS, 1-thread ---

Benchmark                           Mode  Samples        Mean  Mean error  Units
o.t.Bench6375303.testFirstToString  avgt       10   69753.542     180.110  ns/op
o.t.Bench6375303.testToString       avgt       10      23.285       0.267  ns/op

--- classic volatile read+CAS, 4-threads ---

Benchmark                           Mode  Samples        Mean  Mean error  Units
o.t.Bench6375303.testFirstToString  avgt       10   99664.256    1814.090  ns/op
o.t.Bench6375303.testToString       avgt       10      23.491       0.606  ns/op


...as can be seen, the double-checked read-then-volatile-read+CAS trick 
is about 15% faster than classic volatile-read+CAS in this case.



Regards, Peter



Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-24 Thread Brian Burkhalter
Hi Peter,

On Mar 22, 2014, at 2:01 AM, Peter Levart peter.lev...@gmail.com wrote:

 Looks good. Just a nit. In the following method:
 
 3726 private static void matchScale(BigDecimal[] val) {
I concur.

 One of the 3 ifs is superfluous. Either the 1st one:
 
 private static void matchScale(BigDecimal[] val) {
     /* if (val[0].scale == val[1].scale) {
         // return
     } else */ if (val[0].scale < val[1].scale) {
         val[0] = val[0].setScale(val[1].scale, ROUND_UNNECESSARY);
     } else if (val[1].scale < val[0].scale) {
         val[1] = val[1].setScale(val[0].scale, ROUND_UNNECESSARY);
     }
 }
I think this one. I’ll update the patch accordingly.

Thanks,

Brian

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-24 Thread Brian Burkhalter
Hi Peter,

On Mar 24, 2014, at 2:21 AM, Peter Levart peter.lev...@gmail.com wrote:

 Thanks to Aleksey for re-establishing the access, I bring you results of the 
 microbenchmark from his quad-core Cortex-A9:

Thanks to you and Aleksey for taking the initiative to test on that platform.

 ...as can be seen, the double-checked read-then-volatile-read+CAS trick is 
 about 15% faster than classic volatile-read+CAS in this case.

Just to be sure, by “double-checked read-then-volatile-read+CAS trick” do you 
mean the variant with the two methods toString/toStringSlow?

Thanks,

Brian

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-24 Thread Peter Levart


On 03/24/2014 06:52 PM, Martin Buchholz wrote:




On Wed, Mar 12, 2014 at 1:45 AM, Peter Levart peter.lev...@gmail.com 
mailto:peter.lev...@gmail.com wrote:



What would the following cost?


private transient String stringCache;

public String toString() {
    String sc = stringCache;
    if (sc == null) {
        sc = (String) U.getObjectVolatile(this, STRING_CACHE_OFFSET);
        if (sc == null) {
            sc = layoutChars(true);
            if (!U.compareAndSwapObject(this, STRING_CACHE_OFFSET, null, sc)) {
                sc = (String) U.getObjectVolatile(this, STRING_CACHE_OFFSET);
            }
        }
    }
    return sc;
}


I feel I'm missing something.  If read -> volatile read -> CAS works, 
then why wouldn't read -> CAS work and be slightly preferable, because 
races are unlikely?


public String toString() {
    String sc = stringCache;
    if (sc == null) {
        sc = layoutChars(true);
        if (!U.compareAndSwapObject(this, STRING_CACHE_OFFSET, null, sc)) {
            sc = (String) U.getObjectVolatile(this, STRING_CACHE_OFFSET);
        }
    }
    return sc;
}


...yeah, I thought about that too. In any case, the overhead of the volatile 
re-read is negligible in this case, since it happens on the slow path, and it 
might reduce the chance of superfluous calls to layoutChars.


Regards, Peter



Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-24 Thread Mike Duigou

On Mar 24 2014, at 12:25 , Martin Buchholz marti...@google.com wrote:

 
 
 
 On Mon, Mar 24, 2014 at 11:25 AM, Peter Levart peter.lev...@gmail.com wrote:
 
 On 03/24/2014 06:52 PM, Martin Buchholz wrote:
 
 
 
 On Wed, Mar 12, 2014 at 1:45 AM, Peter Levart peter.lev...@gmail.com wrote:
 
 What would the following cost?
 
 
 private transient String stringCache;
 
 public String toString() {
     String sc = stringCache;
     if (sc == null) {
         sc = (String) U.getObjectVolatile(this, STRING_CACHE_OFFSET);
         if (sc == null) {
             sc = layoutChars(true);
             if (!U.compareAndSwapObject(this, STRING_CACHE_OFFSET, null, sc)) {
                 sc = (String) U.getObjectVolatile(this, STRING_CACHE_OFFSET);
             }
         }
     }
     return sc;
 }
 
 I feel I'm missing something.  If read -> volatile read -> CAS works, then 
 why wouldn't read -> CAS work and be slightly preferable, because races are 
 unlikely?
 
 public String toString() {
     String sc = stringCache;
     if (sc == null) {
         sc = layoutChars(true);
         if (!U.compareAndSwapObject(this, STRING_CACHE_OFFSET, null, sc)) {
             sc = (String) U.getObjectVolatile(this, STRING_CACHE_OFFSET);
         }
     }
     return sc;
 }
 
 ...yeah, I thought about that too. In any case, the overhead of the volatile 
 re-read is negligible in this case, since it happens on the slow path, and it 
 might reduce the chance of superfluous calls to layoutChars.
 
 Hmmm OK.  I still slightly prefer my version, but I can see there is a 
 tradeoff, and the difference is very small.

The other issue here, and why this thread has gone on so long, is that we've 
been trying to establish what the best idiom is (for 2014-) for this lazy 
initialization situation so that we can reuse it with less thought and review. 
In this case the chances of race are fairly low and the cost of extra 
layoutChars is bearable. In other cases where we are likely to use the same 
idiom this is less likely true. Peter's current implementation has the 
advantage that it is the best-right answer for lazy initialization even in the 
contended/expensive case. I look forward to revisiting this again when the new 
JMM primitives are available. :-)
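
A minimal sketch of how such a lazily published cache might look with the VarHandles planned for JDK 9; the class name, the computeString placeholder, and the compareAndExchange-based publication are illustrative assumptions, not BigDecimal's actual implementation:

import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class LazyToStringCache {

    // Plain (non-volatile) field; cross-thread publication goes through the VarHandle.
    private transient String stringCache;

    private static final VarHandle STRING_CACHE;
    static {
        try {
            STRING_CACHE = MethodHandles.lookup()
                    .findVarHandle(LazyToStringCache.class, "stringCache", String.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    @Override
    public String toString() {
        String sc = stringCache;                  // fast path: plain read, small enough to inline
        return (sc != null) ? sc : toStringSlow();
    }

    private String toStringSlow() {
        String sc = (String) STRING_CACHE.getVolatile(this);
        if (sc == null) {
            sc = computeString();                 // stands in for layoutChars(true)
            // Publish the first computed value; if another thread won the race, adopt its value.
            String witness = (String) STRING_CACHE.compareAndExchange(this, null, sc);
            if (witness != null) {
                sc = witness;
            }
        }
        return sc;
    }

    private String computeString() {
        return "expensive-to-compute representation";  // placeholder for the real layout work
    }
}

The plain read on the fast path keeps the method small enough to inline, while the compareAndExchange guarantees that only the first computed String ever escapes.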

Mike




Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-22 Thread Peter Levart


On 03/22/2014 02:01 AM, Brian Burkhalter wrote:
On Mar 20, 2014, at 12:49 AM, Aleksey Shipilev 
aleksey.shipi...@oracle.com wrote:



On 03/20/2014 11:06 AM, Peter Levart wrote:

I was thinking last night about the question: "Why is this
double-checked non-volatile-then-volatile trick not any faster than the pure
volatile variant even on an ARM platform, where a volatile read should have
some penalty compared to a normal read?" The answer might lie in the fact that
the Raspberry Pi is a single-core/single-thread machine. Would anyone with
JVM JIT compiler expertise care to share some insight? I suspect that on
such a platform, the compiler optimizes volatile accesses so that they are
performed without the otherwise-necessary memory fences...


Yes, at least C2 is known to not emit memory fences on uniprocessor
machines. You need to have a multicore ARM. If you are still interested,
contact me privately and I can arrange the access to my personal
quad-core Cortex-A9.


This is all very interesting but I would like to have closure so to 
speak on the patches that I posted.


Hi Brian,


I think the questions still unresolved are:

1) Is a change of toString() necessary?


I can't speak to that since I don't have empirical data on what 
proportion of code would benefit. I can imagine situations 
where returned strings are used as keys in a HashMap and String's equals 
is optimized for same-instance comparison, but I don't know if such 
situations happen in practice...


2) If #1 is “yes” then which version of the patch? (I think the most 
recent)


Aleksey Shipilev kindly let me access his quad-core ARM Big Iron and, 
before the connection to it broke yesterday, I managed to get the 
results of a single-threaded run of the microbenchmark. While the 
differences were not spectacular, they were measurable (sorry, I haven't 
saved the results). The variant with a volatile stringCache field and CAS 
was a little slower than the double-checked 
nonvolatile-then-volatile-read-and-CAS trick.
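
For reference, a minimal sketch of the "volatile stringCache field and CAS" variant being compared here, in the same Unsafe style as the other listings in this thread; the class name and the computeString placeholder are illustrative, and Unsafe.getUnsafe() is only accessible to JDK-internal code such as BigDecimal:

import sun.misc.Unsafe;

public class VolatileReadCasVariant {

    private transient volatile String stringCache;   // every read of the cache is a volatile read

    private static final Unsafe U = Unsafe.getUnsafe();
    private static final long STRING_CACHE_OFFSET;
    static {
        try {
            STRING_CACHE_OFFSET =
                U.objectFieldOffset(VolatileReadCasVariant.class.getDeclaredField("stringCache"));
        } catch (NoSuchFieldException e) {
            throw new Error(e);
        }
    }

    @Override
    public String toString() {
        String sc = stringCache;                     // volatile read on every call
        if (sc == null) {
            sc = computeString();                    // stands in for layoutChars(true)
            if (!U.compareAndSwapObject(this, STRING_CACHE_OFFSET, null, sc)) {
                sc = stringCache;                    // another thread published first; reuse its value
            }
        }
        return sc;
    }

    private String computeString() {
        return "expensive-to-compute representation";  // placeholder
    }
}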


It's also important that the fast-path method does not exceed the 
maximum bytecode size for inlining (35 bytes by default, I think), so the 
variant with two methods toString/toStringSlow is necessary to avoid a 
performance regression.



3) Are the other changes OK which are unrelated to toString()?


Looks good. Just a nit. In the following method:

3726 private static void matchScale(BigDecimal[] val) {
3727     if (val[0].scale == val[1].scale) {
3728         // return
3729     } else if (val[0].scale < val[1].scale) {
3730         val[0] = val[0].setScale(val[1].scale, ROUND_UNNECESSARY);
3731     } else if (val[1].scale < val[0].scale) {
3732         val[1] = val[1].setScale(val[0].scale, ROUND_UNNECESSARY);
3733     }
3734 }


One of the 3 ifs is superfluous. Either the 1st one:

private static void matchScale(BigDecimal[] val) {
    /* if (val[0].scale == val[1].scale) {
        // return
    } else */ if (val[0].scale < val[1].scale) {
        val[0] = val[0].setScale(val[1].scale, ROUND_UNNECESSARY);
    } else if (val[1].scale < val[0].scale) {
        val[1] = val[1].setScale(val[0].scale, ROUND_UNNECESSARY);
    }
}


...or the last one:

private static void matchScale(BigDecimal[] val) {
    if (val[0].scale == val[1].scale) {
        // return
    } else if (val[0].scale < val[1].scale) {
        val[0] = val[0].setScale(val[1].scale, ROUND_UNNECESSARY);
    } else /* if (val[1].scale < val[0].scale) */ {
        val[1] = val[1].setScale(val[0].scale, ROUND_UNNECESSARY);
    }
}



Regards, Peter



Thanks,

Brian




Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-21 Thread Brian Burkhalter
On Mar 20, 2014, at 12:49 AM, Aleksey Shipilev aleksey.shipi...@oracle.com 
wrote:

 On 03/20/2014 11:06 AM, Peter Levart wrote:
 I was thinking about last night, for question: Why is this
 double-checked non-volatile-then-volatile trick not any faster than pure
 volatile variant even on ARM platform where volatile read should have
 some penalty compared to normal read?, might be in the fact that
 Raspberry Pi is a single-core/single-thread machine. Would anyone with
 JVM JIT compiler expertise care to share some insight? I suspect that on
 such platform, the compiler optimizes volatile accesses so that they are
 performed without otherwise necessary memory fences...
 
 Yes, at least C2 is known to not emit memory fences on uniprocessor
 machines. You need to have a multicore ARM. If you are still interested,
 contact me privately and I can arrange the access to my personal
 quad-core Cortex-A9.

This is all very interesting but I would like to have closure so to speak on 
the patches that I posted. I think the questions still unresolved are:

1) Is a change of toString() necessary?
2) If #1 is “yes” then which version of the patch? (I think the most recent)
3) Are the other changes OK which are unrelated to toString()?

Thanks,

Brian

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-20 Thread Aleksey Shipilev
On 03/20/2014 11:06 AM, Peter Levart wrote:
 I was thinking last night about the question: "Why is this
 double-checked non-volatile-then-volatile trick not any faster than the pure
 volatile variant even on an ARM platform, where a volatile read should have
 some penalty compared to a normal read?" The answer might lie in the fact that
 the Raspberry Pi is a single-core/single-thread machine. Would anyone with
 JVM JIT compiler expertise care to share some insight? I suspect that on
 such a platform, the compiler optimizes volatile accesses so that they are
 performed without the otherwise-necessary memory fences...

Yes, at least C2 is known to not emit memory fences on uniprocessor
machines. You need to have a multicore ARM. If you are still interested,
contact me privately and I can arrange the access to my personal
quad-core Cortex-A9.

-Aleksey.


Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-19 Thread Brian Burkhalter

On Mar 14, 2014, at 7:17 AM, Brian Burkhalter brian.burkhal...@oracle.com 
wrote:

 On Mar 14, 2014, at 3:39 AM, Peter Levart wrote:
 
 But in general it would be better to just use ThreadLocalRandom.current() 
 everywhere you use rnd variable. This is precisely it's purpose - a random 
 number generator that is never contended. The overhead of 
 ThreadLocalRandom.current() call is hardly measurable by itself.
 
 I'll update that and re-run some of the benchmarks later.

Following up on the content above and this earlier message in the thread:

http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-March/025676.html

I have posted a revised patch (NB: I know lines 2897-2906 should be elsewhere)

http://cr.openjdk.java.net/~bpb/6375303/webrev.01/

and updated benchmark source (using only ThreadLocalRandom.current())

http://cr.openjdk.java.net/~bpb/6375303/Bench6375303.java

and updated benchmark  results for three different variations

http://cr.openjdk.java.net/~bpb/6375303/6375303-bench-2.html

This version of toString() is from Peter and dispenses with the volatile 
qualifier on stringCache. At least on my system, there is no statistically 
significant micro-performance difference among the three versions tested, viz., 
baseline, toString() change only, toString() change plus other cleanup.

Any comments appreciated.

Thanks,

Brian

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-19 Thread Peter Levart


On 03/19/2014 11:01 PM, Brian Burkhalter wrote:


On Mar 14, 2014, at 7:17 AM, Brian Burkhalter 
brian.burkhal...@oracle.com wrote:



On Mar 14, 2014, at 3:39 AM, Peter Levart wrote:

But in general it would be better to just use 
ThreadLocalRandom.current() everywhere you use rnd variable. 
This is precisely its purpose - a random number generator that is 
never contended. The overhead of ThreadLocalRandom.current() call is 
hardly measurable by itself.


I'll update that and re-run some of the benchmarks later.


Following up on the content above and this earlier message in the thread:

http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-March/025676.html

I have posted a revised patch (NB: I know lines 2897-2906 should be 
elsewhere)


http://cr.openjdk.java.net/~bpb/6375303/webrev.01/


and updated benchmark source (using only ThreadLocalRandom.current())

http://cr.openjdk.java.net/~bpb/6375303/Bench6375303.java


and updated benchmark  results for three different variations

http://cr.openjdk.java.net/~bpb/6375303/6375303-bench-2.html


This version of toString() is from Peter and dispenses with the 
volatile qualifier on stringCache. At least on my system, there is no 
statistically significant micro-performance difference among the three 
versions tested, viz., baseline, toString() change only, toString() 
change plus other cleanup.


Any comments appreciated.

Thanks,

Brian


Hi Brian,

Here's my promised run of your latest webrev and microbenchmark on the ARM 
platform (Raspberry Pi) with the just-released JDK 8 for ARM (-client 
compiler, since -server does not work on the Raspberry Pi):


org.openjdk.jmh.Main parameters: .* -i 10 -r 5 -wi 5 -w 1 -f 1 -t 1

--- Baseline, 1-thread ---

Benchmark                           Mode  Samples        Mean  Mean error  Units
o.s.Bench6375303.testFirstToString  avgt       10  330618.266    2211.637  ns/op
o.s.Bench6375303.testToString       avgt       10      80.546       0.134  ns/op

--- Proposed webrev, 1-thread ---

Benchmark                           Mode  Samples        Mean  Mean error  Units
o.s.Bench6375303.testFirstToString  avgt       10  326588.284    1714.892  ns/op
o.s.Bench6375303.testToString       avgt       10     102.582       0.295  ns/op

--- Previous variant with volatile stringCache field, 1-thread ---

Benchmark                           Mode  Samples        Mean  Mean error  Units
o.s.Bench6375303.testFirstToString  avgt       10  328795.783    2508.173  ns/op
o.s.Bench6375303.testToString       avgt       10     105.741       0.316  ns/op


So both variants seem to be more or less the same, but slower than the baseline.

Why would they be slower? Answer: they have more bytecodes.

If I run with the following JVM options: -XX:+UnlockDiagnosticVMOptions -XX:MaxInlineSize=100 
(and only the testToString benchmark), I get:


--- Baseline, 1-thread ---

Benchmark                      Mode  Samples    Mean  Mean error  Units
o.s.Bench6375303.testToString  avgt       10  80.839       0.742  ns/op

--- Proposed webrev, 1-thread ---

Benchmark                      Mode  Samples    Mean  Mean error  Units
o.s.Bench6375303.testToString  avgt       10  80.851       0.771  ns/op

--- Previous variant with volatile stringCache field, 1-thread ---

Benchmark                      Mode  Samples    Mean  Mean error  Units
o.s.Bench6375303.testToString  avgt       10  80.834       0.749  ns/op



So the solution is to reduce the number of bytecodes in toString(). For example, 
the following:


public String toString() {
    String sc = stringCache;
    if (sc == null) {
        sc = toStringSlow();
    }
    return sc;
}

private String toStringSlow() {
    String sc = (String) U.getObjectVolatile(this, STRING_CACHE_OFFSET);
    if (sc == null) {
        sc = layoutChars(true);
        if (!U.compareAndSwapObject(this, STRING_CACHE_OFFSET, null, sc)) {
            sc = (String) U.getObjectVolatile(this, STRING_CACHE_OFFSET);
        }
    }
    return sc;
}


...gives good results even without special JVM options:

Benchmark                      Mode  Samples    Mean  Mean error  Units
o.s.Bench6375303.testToString  avgt       10  80.925       0.313  ns/op



Regards, Peter



Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-19 Thread Mike Duigou
On Mar 19 2014, at 15:01 , Brian Burkhalter brian.burkhal...@oracle.com wrote:

 
 On Mar 14, 2014, at 7:17 AM, Brian Burkhalter brian.burkhal...@oracle.com 
 wrote:
 
 On Mar 14, 2014, at 3:39 AM, Peter Levart wrote:
 
 But in general it would be better to just use ThreadLocalRandom.current() 
 everywhere you use the rnd variable. This is precisely its purpose - a 
 random number generator that is never contended. The overhead of 
 ThreadLocalRandom.current() call is hardly measurable by itself.
 
 I'll update that and re-run some of the benchmarks later.
 
 Following up on the content above and this earlier message in the thread:
 
 http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-March/025676.html
 
 I have posted a revised patch (NB: I know lines 2897-2906 should be elsewhere)
 
 http://cr.openjdk.java.net/~bpb/6375303/webrev.01/
 
 and updated benchmark source (using only ThreadLocalRandom.current())
 
 http://cr.openjdk.java.net/~bpb/6375303/Bench6375303.java
 
 and updated benchmark  results for three different variations
 
 http://cr.openjdk.java.net/~bpb/6375303/6375303-bench-2.html
 
 This version of toString() is from Peter and dispenses with the volatile 
 qualifier on stringCache. At least on my system, there is no statistically 
 significant micro-performance difference among the three versions tested, 
 viz., baseline, toString() change only, toString() change plus other cleanup.

Since Unsafe.getObjectVolatile() allows use of volatile semantics without 
having to declare the field volatile, I think this is a better idiom than what I 
had previously suggested. Hopefully this is a pattern we can use elsewhere.

Are the java.util.concurrent.atomic imports still needed?

I have not reviewed the other changes.

Mike

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-19 Thread Brian Burkhalter

On Mar 19, 2014, at 4:32 PM, Mike Duigou mike.dui...@oracle.com wrote:

 Since the Unsafe.getObjectVolatile() allows use of volatile semantics without 
 having to declare the field volatile I think this is a better idiom than what 
 I had previously suggested. Hopefully this is a pattern we can use elsewhere.
 
 Are the java.util.concurrent.atomic imports still needed?

No, they are not. I can remove them (and move the code at lines 2897-2906 so it 
follows coding conventions) if this is eventually approved.

 I have not reviewed the other changes.

Aside from toString() they are mostly straightforward cleanup.

Thanks,

Brian



Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-19 Thread Brian Burkhalter
Hi Peter,

On Mar 19, 2014, at 4:32 PM, Peter Levart peter.lev...@gmail.com wrote:

 So the solution is to reduce number of bytecodes in toString(). For 
 example, the following:
 
 
  public String toString() {
      String sc = stringCache;
      if (sc == null) {
          sc = toStringSlow();
      }
      return sc;
  }
  
  private String toStringSlow() {
      String sc = (String) U.getObjectVolatile(this, STRING_CACHE_OFFSET);
      if (sc == null) {
          sc = layoutChars(true);
          if (!U.compareAndSwapObject(this, STRING_CACHE_OFFSET, null, sc)) {
              sc = (String) U.getObjectVolatile(this, STRING_CACHE_OFFSET);
          }
      }
      return sc;
  }
 
 
 ...gives the good results even without special JVM options:
 
 Benchmark                      Mode  Samples    Mean  Mean error  Units
 o.s.Bench6375303.testToString  avgt       10  80.925       0.313  ns/op

That’s great! I can re-try that version on my system for comparison.

Thanks,

Brian

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-19 Thread Brian Burkhalter

On Mar 19, 2014, at 4:41 PM, Brian Burkhalter brian.burkhal...@oracle.com 
wrote:

 Benchmark                      Mode  Samples    Mean  Mean error  Units
 o.s.Bench6375303.testToString  avgt       10  80.925       0.313  ns/op
 
 That’s great! I can re-try that version on my system for comparison.

Here are the results on my system:

http://cr.openjdk.java.net/~bpb/6375303/6375303-bench-3.html

Thanks,

Brian

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-14 Thread Peter Levart

On 03/14/2014 01:29 AM, Brian Burkhalter wrote:


On Mar 12, 2014, at 2:08 AM, Peter Levart wrote:

Huh! This is a ThreadLocalRandom anti-pattern. Thou should not use a 
ThreadLocalRandom instance in a thread that did not obtain it via a 
call to ThreadLocalRandom.current()…


Good point.

You could create a new BigInteger(512, rnd) in the static 
initializer and use it to create a new BigDecimal(bigInteger) in 
testFirstToString.


I simply replaced ThreadLocalRandom with Random which probably creates 
a bottleneck but there is no significant difference in the results.


That's probably because the time spent computing the toString 
representation initially (the concurrent part of the testFirstToString 
benchmark) takes much more time than computing the random number (the 
synchronized/serialized part). The difference would probably show with a 
bigger number of threads, on a suitable machine of course.


But in general it would be better to just use 
ThreadLocalRandom.current() everywhere you use the rnd variable. This is 
precisely its purpose - a random number generator that is never 
contended. The overhead of the ThreadLocalRandom.current() call is hardly 
measurable by itself.


Regards, Peter



http://cr.openjdk.java.net/~bpb/6375303/Bench6375303.java
http://cr.openjdk.java.net/~bpb/6375303/6375303-bench.html


Thanks,

Brian




Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-14 Thread Peter Levart

On 03/14/2014 01:32 AM, Brian Burkhalter wrote:


On Mar 12, 2014, at 1:45 AM, Peter Levart wrote:


What would the following cost?


   private transient String stringCache;

   public String toString() {
  […]


This version did not show any significant difference in the benchmark 
results (updated)


http://cr.openjdk.java.net/~bpb/6375303/6375303-bench.html


versus the other approaches. This is not to say that the benchmark is 
valid for any of them. This approach does have the advantage of not 
using volatile.


I forgot to run this on my Raspberry Pi. Stay tuned...

Peter



Thanks,

Brian




Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-14 Thread Brian Burkhalter

On Mar 14, 2014, at 3:39 AM, Peter Levart wrote:

 But in general it would be better to just use ThreadLocalRandom.current() 
 everywhere you use the rnd variable. This is precisely its purpose - a random 
 number generator that is never contended. The overhead of 
 ThreadLocalRandom.current() call is hardly measurable by itself.

I'll update that and re-run some of the benchmarks later.

Thanks,

Brian

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-14 Thread Brian Burkhalter

On Mar 14, 2014, at 3:42 AM, Peter Levart wrote:

 This approach does have the advantage of not using volatile.
 
 I forgot to run this on my Raspberry Pi. Stay tuned...

I used to have a BeagleBoard on a previous project but do not have any of these 
sorts of devices at my disposal at this time.

Thanks,

Brian

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-13 Thread Brian Burkhalter

On Mar 12, 2014, at 2:08 AM, Peter Levart wrote:

 Huh! This is a ThreadLocalRandom anti-pattern. Thou should not use a 
 ThreadLocalRandom instance in a thread that did not obtain it via a call to 
 ThreadLocalRandom.current()…

Good point.

 You could create a new BigInteger(512, rnd) in the static initializer and use 
 it to create a new BigDecimal(bigInteger) in testFirstToString.

I simply replaced ThreadLocalRandom with Random which probably creates a 
bottleneck but there is no significant difference in the results.

http://cr.openjdk.java.net/~bpb/6375303/Bench6375303.java
http://cr.openjdk.java.net/~bpb/6375303/6375303-bench.html

Thanks,

Brian

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-13 Thread Brian Burkhalter

On Mar 12, 2014, at 1:45 AM, Peter Levart wrote:

 What would the following cost?
 
 
private transient String stringCache;
 
public String toString() {
   […]

This version did not show any significant difference in the benchmark results 
(updated)

http://cr.openjdk.java.net/~bpb/6375303/6375303-bench.html

versus the other approaches. This is not to say that the benchmark is valid for 
any of them. This approach does have the advantage of not using volatile.

Thanks,

Brian

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-12 Thread Peter Levart

On 03/11/2014 06:07 PM, Mike Duigou wrote:

You are making stringCache volatile - performance penalty on ARM & PPC.

It is understood that the volatile is not free. For this purpose it was 
believed that the performance cost was a fair trade off to avoid the heap pollution. 
Regardless of the decision in this case the method used would seem to be the best caching 
pattern available. We do need to establish when it is appropriate to use this solution 
and what the tradeoffs are that would make other solutions more appropriate.



Hi,

What would the following cost?


private transient String stringCache;

public String toString() {
    String sc = stringCache;
    if (sc == null) {
        sc = (String) U.getObjectVolatile(this, STRING_CACHE_OFFSET);
        if (sc == null) {
            sc = layoutChars(true);
            if (!U.compareAndSwapObject(this, STRING_CACHE_OFFSET, null, sc)) {
                sc = (String) U.getObjectVolatile(this, STRING_CACHE_OFFSET);
            }
        }
    }
    return sc;
}

private static final Unsafe U = Unsafe.getUnsafe();
private static final long STRING_CACHE_OFFSET;
static {
    try {
        STRING_CACHE_OFFSET =
            U.objectFieldOffset(BigDecimal.class.getDeclaredField("stringCache"));
    } catch (NoSuchFieldException e) {
        throw new Error(e);
    }
}



Regards, Peter



Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-12 Thread Peter Levart

On 03/11/2014 06:10 PM, Brian Burkhalter wrote:

Sergey,

On Mar 11, 2014, at 1:18 AM, Sergey Kuksenko wrote:


Could you share your benchmarks?

Of course. Please see:

benchmark source:   
http://cr.openjdk.java.net/~bpb/6375303/Bench6375303.java
benchmark results:  
http://cr.openjdk.java.net/~bpb/6375303/6375303-bench.html

Let us know should you find anything amiss.


Do you have access to an ARM system (like a Raspberry Pi)? On Intel, a 
volatile read is practically free - especially in micro benchmarks, when 
compared to a normal read. The normal read can't be hoisted out of loops 
(the microbenchmark harness tries hard to prevent that)...


I can try this on my Raspberry Pi when I get home. ;-)

It would also be helpful if the JIT-ed assembly for the toString method 
could be dumped in various scenarios.


Regards, Peter



Thanks,

Brian




Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-12 Thread Peter Levart

On 03/11/2014 06:10 PM, Brian Burkhalter wrote:

Sergey,

On Mar 11, 2014, at 1:18 AM, Sergey Kuksenko wrote:


Could you share your benchmarks?

Of course. Please see:

benchmark source:   
http://cr.openjdk.java.net/~bpb/6375303/Bench6375303.java


public class Bench6375303 {
    static final Random rnd = ThreadLocalRandom.current();
    static final BigDecimal BIG_D = new BigDecimal(new BigInteger(512, rnd));
    static final String BIG_D_STRING = BIG_D.toString();

    @GenerateMicroBenchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public void testFirstToString(BlackHole bh) {
        BigDecimal bigD = new BigDecimal(new BigInteger(512, rnd));
        bh.consume(bigD.toString());
    }

    @GenerateMicroBenchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public void testToString(BlackHole bh) {
        bh.consume(BIG_D.toString());
    }
}


Huh! This is a ThreadLocalRandom anti-pattern. Thou should not use a 
ThreadLocalRandom instance in a thread that did not obtain it via a call 
to ThreadLocalRandom.current()...


You could create a new BigInteger(512, rnd) in the static initializer 
and use it to create a new BigDecimal(bigInteger) in testFirstToString.
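
A sketch of the benchmark with that advice applied, obtaining the generator via ThreadLocalRandom.current() at each use site instead of storing an instance in a static field. It is written against the current JMH API (@Benchmark, Blackhole) rather than the older annotations shown above, and the class name is illustrative:

import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.infra.Blackhole;

public class Bench6375303Fixed {

    // Built once by the class-initializing thread, which obtains its own generator.
    static final BigDecimal BIG_D =
            new BigDecimal(new BigInteger(512, ThreadLocalRandom.current()));

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public void testFirstToString(Blackhole bh) {
        // The measuring thread obtains the generator itself, so it is never contended.
        BigDecimal bigD = new BigDecimal(new BigInteger(512, ThreadLocalRandom.current()));
        bh.consume(bigD.toString());
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public void testToString(Blackhole bh) {
        bh.consume(BIG_D.toString());
    }
}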


Regards, Peter


benchmark results:  
http://cr.openjdk.java.net/~bpb/6375303/6375303-bench.html

Let us know should you find anything amiss.

Thanks,

Brian




Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-12 Thread Paul Sandoz

On Mar 11, 2014, at 9:05 AM, Sergey Kuksenko sergey.kukse...@oracle.com wrote:

 Brian, Mike.
 
 Could you explain what is the problem with the old caching?
 String is immutable, BigDecimal is immutable. So we have a data race at that 
 place, but it is a safe data race. What is the problem if we create several 
 identical strings?

I was wondering the same. Not convinced it is so very important for toString() 
to guarantee one and only one reference to the same value. I would be inclined 
to leave this alone unless some data indicates otherwise.

I do question the use of a ThreadLocal to, in effect, cache a StringBuilder and 
a char array of 19 characters for memory reuse (that is heap pollution per 
thread! :-) ). This code was introduced ~5 years ago; perhaps it is time to 
revisit and measure using SPEC* benchmarks?
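
The general shape of the thread-local scratch-buffer idiom in question looks roughly like this (a generic sketch with made-up names, not BigDecimal's actual helper):

final class ScratchBuffers {

    // One instance retained per thread for the lifetime of that thread.
    private static final ThreadLocal<ScratchBuffers> CACHE =
            ThreadLocal.withInitial(ScratchBuffers::new);

    final StringBuilder sb = new StringBuilder(32);
    final char[] digits = new char[19];   // 19 chars is enough for any long value

    static ScratchBuffers current() {
        ScratchBuffers s = CACHE.get();
        s.sb.setLength(0);                // reset the builder before each reuse
        return s;
    }
}

Each thread pays with retained memory for its whole lifetime in exchange for fewer allocations per call, which is exactly the trade-off being questioned.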

Paul.


 You are making stringCache volatile - performance penalty on ARM & PPC.
 Setting cache via CAS - performance penalty everywhere.
 
 If you insist on using AtomicFieldUpdater - the better way is to use lazySet, 
 not CAS.


Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-11 Thread Sergey Kuksenko

Brian, Mike.

Could you explain what is the problem with the old caching?
String is immutable, BigDecimal is immutable. So we have a data race at 
that place, but it is a safe data race. What is the problem if we create 
several identical strings?

You are making stringCache volatile - performance penalty on ARM & PPC.
Setting the cache via CAS - performance penalty everywhere.

If you insist on using AtomicFieldUpdater - the better way is to use 
lazySet, not CAS.
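
A minimal sketch of the lazySet variant being suggested (class name and computeString placeholder are illustrative): the release-only store avoids the CAS, but racing threads may then each publish, and return, their own String instance.

import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

public class LazySetCache {

    private transient volatile String stringCache;

    private static final AtomicReferenceFieldUpdater<LazySetCache, String> UPDATER =
            AtomicReferenceFieldUpdater.newUpdater(LazySetCache.class, String.class, "stringCache");

    @Override
    public String toString() {
        String sc = stringCache;          // volatile read
        if (sc == null) {
            sc = computeString();         // stands in for layoutChars(true)
            UPDATER.lazySet(this, sc);    // ordered store, no compare: cheaper, but not a unique publish
        }
        return sc;
    }

    private String computeString() {
        return "expensive-to-compute representation";  // placeholder
    }
}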



On 03/04/2014 09:21 PM, Mike Duigou wrote:


On Mar 4 2014, at 07:13 , Peter Levart peter.lev...@gmail.com wrote:


On 03/04/2014 01:14 AM, Brian Burkhalter wrote:

- add AtomicReferenceFieldUpdater-type constant for stringCache initialization


Hi Brian,

By using volatile read and CAS, there's still a chance that multiple concurrent 
threads will be invoking the layoutChars(true) concurrently, but you guarantee 
that only a single instance of String will ever be returned as a result of the 
toString() method. Is that what you were pursuing?


Yes. (I provided the AtomicReferenceFieldUpdater code). The previous 
implementation had a benign data race that could result in a later layoutChars 
result replacing an earlier result and multiple string instances being 
returned. The new implementation, at small cost, prevents multiple different 
instances from being returned.

Mike



--
Best regards,
Sergey Kuksenko


Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-11 Thread Sergey Kuksenko

Brian,
Could you share your benchmarks?

On 03/04/2014 04:14 AM, Brian Burkhalter wrote:

Hello all,

This review request concerns this issue and proposed patch:

issue   https://bugs.openjdk.java.net/browse/JDK-6375303
patch   http://cr.openjdk.java.net/~bpb/6375303/webrev.00/

A review of the current code did not reveal anything clearly amiss in the 
nature of the problems which instigated the original filing of this issue. 
Examination of the code did however suggest some code style and aesthetic 
improvements as well as a change to how one lazy initialization was implemented 
(here we go again ...).

Summary:

- clean up imports section
- add AtomicReferenceFieldUpdater-type constant for stringCache initialization
- change some existing static final constants to conform to Java coding style
- change name of some local variables so as not to hide instance variables of 
the same name
- add missing @Override annotations
- modify lazy initialization of stringCache in toString() to use 
AtomicReferenceFieldUpdater constant
- various other minor cleanup

Note that the performance of initialization of stringCache was measured for the 
extant double-checked locking approach, an AtomicReference<String>, and the 
AtomicReferenceFieldUpdater, and no significant degradation was observed for either of 
the latter versus the baseline for 1 and 4 threads. The updater has the advantage of 
adding only one object per class as opposed to one per instance.

Thanks,

Brian



--
Best regards,
Sergey Kuksenko


Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-11 Thread Mike Duigou

On Mar 11 2014, at 01:05 , Sergey Kuksenko sergey.kukse...@oracle.com wrote:

 Brian, Mike.
 
 Could you explain what is the problem with the old caching?

Concern over heap pollution with extra string copies was the motivation to 
ensure that only a single cached instance was ever returned.

 String is immutable, BigDecimal is immutable. So we have a data race at that 
 place, but it is a safe data race.

Agreed that it is safe.

 What is the problem if we create several identical strings?

We can still create several identical strings. We will now only return the 
first assigned to stringCache. Other string instances will not escape.

 You are making stringCache volatile - performance penalty on ARM & PPC.

It is understood that the volatile is not free. For this purpose it was 
believed that the performance cost was a fair trade off to avoid the heap 
pollution. Regardless of the decision in this case the method used would seem 
to be the best caching pattern available. We do need to establish when it is 
appropriate to use this solution and what the tradeoffs are that would make 
other solutions more appropriate.  

 Setting cache via CAS - performance penalty everywhere.

This happens only once per BigDecimal (or not at all if the string format is 
never generated).

 If you insist on using AtomicFieldUpdater - the better way is to use lazySet, 
 not CAS.

That would seem to be inappropriate for this usage as lazySet would effectively 
remove that volatile.

Brian did some benchmarking of various alternatives which he can probably 
provide if we think it is relevant to the performance issue you're concerned 
with.

Mike

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-11 Thread Brian Burkhalter
Sergey,

On Mar 11, 2014, at 1:18 AM, Sergey Kuksenko wrote:

 Could you share your benchmarks?

Of course. Please see:

benchmark source:   
http://cr.openjdk.java.net/~bpb/6375303/Bench6375303.java
benchmark results:  
http://cr.openjdk.java.net/~bpb/6375303/6375303-bench.html

Let us know should you find anything amiss.

Thanks,

Brian


Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-07 Thread Brian Burkhalter
On Mar 3, 2014, at 4:14 PM, Brian Burkhalter wrote:

PING-INg-Ing-ing!

 This review request concerns this issue and proposed patch:
 
 issue https://bugs.openjdk.java.net/browse/JDK-6375303
 patch http://cr.openjdk.java.net/~bpb/6375303/webrev.00/

On Mar 4, 2014, at 9:21 AM, Mike Duigou wrote:

 On Mar 4 2014, at 07:13 , Peter Levart peter.lev...@gmail.com wrote:
 
 On 03/04/2014 01:14 AM, Brian Burkhalter wrote:
 - add AtomicReferenceFieldUpdater-type constant for stringCache 
 initialization
 
 Hi Brian,
 
 By using volatile read and CAS, there's still a chance that multiple 
 concurrent threads will be invoking the layoutChars(true) concurrently, but 
 you guarantee that only a single instance of String will ever be returned as 
 a result of the toString() method. Is that what you were pursuing?
 
 Yes. (I provided the AtomicReferenceFieldUpdater code). The previous 
 implementation had a benign data race that could result in a later 
 layoutChars result replacing an earlier result and multiple string instances 
 being returned. The new implementation, at small cost, prevents multiple 
 different instances from being returned.
 
 Mike

If this looks copacetic, would a JDK 9 Reviewer kindly proffer their Imprimatur?

Thanks,

Brian

Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-04 Thread Peter Levart

On 03/04/2014 01:14 AM, Brian Burkhalter wrote:

- add AtomicReferenceFieldUpdater-type constant for stringCache initialization


Hi Brian,

By using volatile read and CAS, there's still a chance that multiple 
concurrent threads will be invoking the layoutChars(true) concurrently, 
but you guarantee that only a single instance of String will ever be 
returned as a result of the toString() method. Is that what you were 
pursuing?


Regards, Peter



Re: JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-04 Thread Mike Duigou

On Mar 4 2014, at 07:13 , Peter Levart peter.lev...@gmail.com wrote:

 On 03/04/2014 01:14 AM, Brian Burkhalter wrote:
 - add AtomicReferenceFieldUpdater-type constant for stringCache 
 initialization
 
 Hi Brian,
 
 By using volatile read and CAS, there's still a chance that multiple 
 concurrent threads will be invoking the layoutChars(true) concurrently, but 
 you guarantee that only a single instance of String will ever be returned as 
 a result of the toString() method. Is that what you were pursuing?

Yes. (I provided the AtomicReferenceFieldUpdater code). The previous 
implementation had a benign data race that could result in a later layoutChars 
result replacing an earlier result and multiple string instances being 
returned. The new implementation, at small cost, prevents multiple different 
instances from being returned.

Mike

JDK 9 RFR of 6375303: Review use of caching in BigDecimal

2014-03-03 Thread Brian Burkhalter
Hello all,

This review request concerns this issue and proposed patch:

issue   https://bugs.openjdk.java.net/browse/JDK-6375303
patch   http://cr.openjdk.java.net/~bpb/6375303/webrev.00/

A review of the current code did not reveal anything clearly amiss in the 
nature of the problems which instigated the original filing of this issue. 
Examination of the code did however suggest some code style and aesthetic 
improvements as well as a change to how one lazy initialization was implemented 
(here we go again ...).

Summary:

- clean up imports section
- add AtomicReferenceFieldUpdater-type constant for stringCache initialization
- change some existing static final constants to conform to Java coding style
- change name of some local variables so as not to hide instance variables of 
the same name
- add missing @Override annotations
- modify lazy initialization of stringCache in toString() to use 
AtomicReferenceFieldUpdater constant
- various other minor cleanup

Note that the performance of initialization of stringCache was measured for the 
extant double-checked locking approach, an AtomicReference<String>, and the 
AtomicReferenceFieldUpdater, and no significant degradation was observed for 
either of the latter versus the baseline for 1 and 4 threads. The updater has 
the advantage of adding only one object per class as opposed to one per 
instance.
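
To illustrate that last point, a sketch of the footprint difference between the two alternatives (class names illustrative):

import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

// Per-instance holder: every object carries an extra AtomicReference allocation.
class WithPerInstanceReference {
    final AtomicReference<String> stringCache = new AtomicReference<>();
}

// Shared updater: the cache stays a plain volatile String field on each instance,
// and a single static updater object performs the CAS for the whole class.
class WithSharedUpdater {
    private volatile String stringCache;

    private static final AtomicReferenceFieldUpdater<WithSharedUpdater, String> STRING_CACHE_UPDATER =
            AtomicReferenceFieldUpdater.newUpdater(WithSharedUpdater.class, String.class, "stringCache");

    String publishOnce(String computed) {
        // compareAndSet ensures only the first computed String ever escapes.
        return STRING_CACHE_UPDATER.compareAndSet(this, null, computed) ? computed : stringCache;
    }
}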

Thanks,

Brian