Hi Paul,
On 05/26/2016 01:20 PM, Paul Sandoz wrote:
Hi Peter,
Opportunistically if your LinearProbeHashtable works out then i am wondering if
we could replace the use of CHM within MethodType.ConcurrentWeakInternSet,
which only uses get/putIfAbsent/remove.
Thereby CHM can use VarHandles without inducing a circular dependency.
Paul.
It could be used, yes. LinearProbeHashtable is not scalable to multiple
threads like CHM is for modifications as it is using a single lock for
all modification operations including rehashing, but it is lock-free for
lookups, so for usecases such as caching, where lookups dominate and
modifications are mostly performed in batches from single thread (when
some subsystem initializes), it could be a viable alternative to CHM.
If it is moved to some jdk.internal subpackage and made public, I could
add missing Map methods, mostly to be able to include it in MOAT tests.
What do you think?
Regards, Peter
On 26 May 2016, at 10:59, Peter Levart <peter.lev...@gmail.com> wrote:
Hi Michael,
On 05/23/2016 03:56 PM, Michael Haupt wrote:
I've ran the unpatched version and Peter's two patches once more. The results
are attached (results.txt). They confirm Aleksey's observation.
Regarding the 03 patch (plevart3 column in the results), perfasm output (see
http://cr.openjdk.java.net/~mhaupt/8031043/perfasm.zip) suggests the cost is
mainly accrued in ConcurrentHashMap. The same is the case for the 02 patch
(plevart2 column).
As things stand, I think we can even focus on Peter's 02 patch, as this is the
faster of his two proposals (plevart2 column in the results), reduces the
footprint, and reduces the implementation complexity. Can anything be done to
improve on its performance? (It has slight performance slowdowns for the
single-value case as well.)
I can't think of anything else short of improving performance of CHM itself.
Or replacing CHM with a "better" implementation:
http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.04/
This webrev is similar to webrev.02. It's only difference is in ClassValueMap
which extends LinearProbeHashtable instead of ConcurrentHashMap.
LinearProbeHashtable is a simple implementation of a linear-probe hash table.
It's not a full Map implementation. It only implements methods needed in
ClassValue. With this implementation I get a slight boost compared to JDK 9
ClassValue implementation for all sizes and counts:
Benchmark (classCount) (classValueCount) (impl) Mode
Cnt Score Error Units
ClassValueBench.randomAccess 128 1 jdk9 avgt
10 9.079 ± 0.092 ns/op
ClassValueBench.randomAccess 128 4 jdk9 avgt
10 10.615 ± 0.102 ns/op
ClassValueBench.randomAccess 128 16 jdk9 avgt
10 11.665 ± 0.012 ns/op
ClassValueBench.randomAccess 128 256 jdk9 avgt
10 19.151 ± 0.219 ns/op
ClassValueBench.randomAccess 1024 1 jdk9 avgt
10 14.642 ± 0.425 ns/op
ClassValueBench.randomAccess 1024 4 jdk9 avgt
10 22.577 ± 0.093 ns/op
ClassValueBench.randomAccess 1024 16 jdk9 avgt
10 19.864 ± 0.736 ns/op
ClassValueBench.randomAccess 1024 256 jdk9 avgt
10 60.470 ± 0.285 ns/op
ClassValueBench.sequentialAccess 128 1 jdk9 avgt
10 9.741 ± 0.033 ns/op
ClassValueBench.sequentialAccess 128 4 jdk9 avgt
10 8.252 ± 0.029 ns/op
ClassValueBench.sequentialAccess 128 16 jdk9 avgt
10 7.888 ± 1.249 ns/op
ClassValueBench.sequentialAccess 128 256 jdk9 avgt
10 16.493 ± 0.415 ns/op
ClassValueBench.sequentialAccess 1024 1 jdk9 avgt
10 13.376 ± 0.452 ns/op
ClassValueBench.sequentialAccess 1024 4 jdk9 avgt
10 10.023 ± 0.020 ns/op
ClassValueBench.sequentialAccess 1024 16 jdk9 avgt
10 8.029 ± 0.178 ns/op
ClassValueBench.sequentialAccess 1024 256 jdk9 avgt
10 33.472 ± 0.058 ns/op
Benchmark (classCount) (classValueCount) (impl) Mode
Cnt Score Error Units
ClassValueBench.randomAccess 128 1 pl04 avgt
10 8.955 ± 0.055 ns/op
ClassValueBench.randomAccess 128 4 pl04 avgt
10 9.999 ± 0.017 ns/op
ClassValueBench.randomAccess 128 16 pl04 avgt
10 11.615 ± 1.928 ns/op
ClassValueBench.randomAccess 128 256 pl04 avgt
10 17.063 ± 0.460 ns/op
ClassValueBench.randomAccess 1024 1 pl04 avgt
10 12.553 ± 0.086 ns/op
ClassValueBench.randomAccess 1024 4 pl04 avgt
10 16.766 ± 0.221 ns/op
ClassValueBench.randomAccess 1024 16 pl04 avgt
10 18.496 ± 0.051 ns/op
ClassValueBench.randomAccess 1024 256 pl04 avgt
10 41.390 ± 0.321 ns/op
ClassValueBench.sequentialAccess 128 1 pl04 avgt
10 7.854 ± 0.381 ns/op
ClassValueBench.sequentialAccess 128 4 pl04 avgt
10 7.498 ± 0.055 ns/op
ClassValueBench.sequentialAccess 128 16 pl04 avgt
10 9.218 ± 1.000 ns/op
ClassValueBench.sequentialAccess 128 256 pl04 avgt
10 13.593 ± 0.275 ns/op
ClassValueBench.sequentialAccess 1024 1 pl04 avgt
10 8.774 ± 0.037 ns/op
ClassValueBench.sequentialAccess 1024 4 pl04 avgt
10 8.562 ± 0.014 ns/op
ClassValueBench.sequentialAccess 1024 16 pl04 avgt
10 7.596 ± 0.027 ns/op
ClassValueBench.sequentialAccess 1024 256 pl04 avgt
10 24.605 ± 0.143 ns/op
The footprint is even better than with CHM as LPHT does not maintain internal
Map.Entry(s).
How does this implementation compare on your hardware, Michael?
Regards, Peter
_______________________________________________
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
_______________________________________________
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
_______________________________________________
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev