Hi Aleksey,
What is the overall change in memory use for this set of changes, i.e. what
did we use before the TLR merge and what do we use now?
Thanks,
David
On 17/06/2013 7:00 PM, Aleksey Shipilev wrote:
Hi,
This is the respin of the RFE filed a month ago:
http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-May/016754.html
The webrev is here:
http://cr.openjdk.java.net/~shade/8014233/webrev.02/
Testing:
- JPRT build passes
- Linux x86_64/release passes jdk/java/lang jtreg
- vm.quick.testlist, vm.quick-gc.testlist on selected platforms
- microbenchmarks, see below
The rationale follows.
After we merged the ThreadLocalRandom state into Thread, we lost the
padding that prevented false sharing on those heavily-updated fields.
While a Thread object is already large enough to keep the TLR states of
two distinct threads apart, we can still get false sharing with other
Thread fields.
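For illustration only (this is not the code in the webrev), padding a
heavily-updated field by hand could look roughly like the sketch below. It
assumes 64-byte cache lines and relies on HotSpot typically laying out
superclass fields before subclass fields; all class and field names are
hypothetical.

```java
// Hypothetical sketch of manual padding; not taken from the webrev.
// Assumes 64-byte cache lines (8 longs = 64 bytes on each side).
class PaddedStateSketch {
    // Padding placed before the hot field.
    static class PadBefore { long p0, p1, p2, p3, p4, p5, p6, p7; }

    // The heavily-updated field, standing in for the TLR state.
    static class Hot extends PadBefore { long seed; }

    // Padding placed after the hot field.
    static class PaddedState extends Hot { long q0, q1, q2, q3, q4, q5, q6, q7; }

    // Arbitrary odd increment, used only to simulate frequent updates.
    static final long STEP = 0x9E3779B97F4A7C15L;

    // One "update" of the hot field; returns the new value.
    static long step(PaddedState s) {
        s.seed += STEP;
        return s.seed;
    }

    public static void main(String[] args) {
        PaddedState s = new PaddedState();
        for (int i = 0; i < 1_000_000; i++) {
            step(s);
        }
        System.out.println("seed = " + s.seed);
    }
}
```

The class hierarchy is there because the JVM is free to reorder fields
declared within a single class, which could move the padding away from the
field it is meant to protect.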
There is a benchmark showcasing this:
http://cr.openjdk.java.net/~shade/8014233/threadbench.zip
There are two test cases: the first calls nextInt() on its own TLR and
then reads the current thread's ID; the second reads *another* thread's
ID instead, thus inducing false sharing against that other thread's TLR
state.
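A rough, simplified sketch of the two cases follows. This is not the code
from threadbench.zip: the class and method names are hypothetical, and the
naive timing loop has no warmup, so the numbers it prints are only
indicative.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of the "same" and "other" cases; not the real benchmark.
class TlrSharingSketch {
    // Each iteration updates the caller's own TLR state, then reads the
    // ID of `target` (either the caller itself or another thread).
    static long work(Thread target, int iters) {
        long sink = 0;
        for (int i = 0; i < iters; i++) {
            sink += ThreadLocalRandom.current().nextInt();
            sink += target.getId();
        }
        return sink;
    }

    // Runs two threads and returns the combined throughput in ops/usec.
    // readOther == false: each thread reads its own ID ("same").
    // readOther == true:  each thread reads the other thread's ID ("other").
    static double opsPerUsec(boolean readOther, int iters) {
        final Thread[] threads = new Thread[2];
        final long[] sinks = new long[2]; // keeps results alive
        for (int t = 0; t < 2; t++) {
            final int self = t;
            threads[t] = new Thread(() -> {
                Thread target = readOther ? threads[1 - self] : Thread.currentThread();
                sinks[self] = work(target, iters);
            });
        }
        long start = System.nanoTime();
        for (Thread th : threads) th.start();
        try {
            for (Thread th : threads) th.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        return (2.0 * iters) / (elapsedNanos / 1000.0);
    }

    public static void main(String[] args) {
        int iters = 50_000_000;
        System.out.println("same:  " + opsPerUsec(false, iters) + " ops/usec");
        System.out.println("other: " + opsPerUsec(true, iters) + " ops/usec");
    }
}
```

Whether the "other" case actually slows down depends on where the JVM
places the thread ID relative to the TLR fields, and on the hardware.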
On my 2x2 i5 laptop, running Linux x86_64:
same: 355 +- 1 ops/usec
other: 100 +- 5 ops/usec
Note the decrease in throughput because of the false sharing.
With the patch:
same: 359 +- 1 ops/usec
other: 356 +- 1 ops/usec
Note the performance is back. We want to avoid these spurious performance
drops, caused either by an unlucky memory layout or by user code
(un)intentionally ruining cache line locality for the updater thread.
Thanks,
-Aleksey.