Re: ClassValue perf?

2016-05-31 Thread Michael Haupt
Hi Peter,

thanks! I'll start another test run.

Best,

Michael

> On 31.05.2016 at 13:05, Peter Levart wrote:
> 
> 
> 
> On 05/30/2016 11:09 PM, Peter Levart wrote:
>> Will revert that to volatile semantics...
> 
> So, here it is. The latest incarnation - the same as webrev.04.1 but with 
> volatile semantics:
> 
> http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.04.2/
> 
> Benchmarks show no change in performance:
> 
> http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.04.2.bench_results.txt
> 
> 
> Regards, Peter

-- 

 
Dr. Michael Haupt | Principal Member of Technical Staff
Phone: +49 331 200 7277 | Fax: +49 331 200 7561
Oracle Java Platform Group | LangTools Team | Nashorn
Oracle Deutschland B.V. & Co. KG | Schiffbauergasse 14 | 14467 Potsdam, Germany

ORACLE Deutschland B.V. & Co. KG | Hauptverwaltung: Riesstraße 25, D-80992 
München
Registergericht: Amtsgericht München, HRA 95603

Komplementärin: ORACLE Deutschland Verwaltung B.V. | Hertogswetering 163/167, 
3543 AS Utrecht, Niederlande
Handelsregister der Handelskammer Midden-Nederland, Nr. 30143697
Geschäftsführer: Alexander van der Ven, Jan Schultheiss, Val Maher
  Oracle is committed to developing 
practices and products that help protect the environment

___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


Re: ClassValue perf?

2016-05-31 Thread Peter Levart



On 05/30/2016 11:09 PM, Peter Levart wrote:
> Will revert that to volatile semantics...


So, here it is. The latest incarnation - the same as webrev.04.1 but 
with volatile semantics:


http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.04.2/

Benchmarks show no change in performance:

http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.04.2.bench_results.txt


Regards, Peter



Re: ClassValue perf?

2016-05-30 Thread Peter Levart



On 05/30/2016 07:47 PM, Aleksey Shipilev wrote:
> On 05/30/2016 06:59 PM, Peter Levart wrote:
>> I also employed get-acquire/put-release memory ordering semantics
>> instead of SC (volatile) in hope that it might improve a bit the
>> performance on platforms such as PowerPC or ARM, but this can be changed
>> back to SC if anyone gets scared of it :-)
>
> Revert, you're playing with fire here. Your _default_ modus operandi
> should be "scared" when dealing with concurrency. The correctness with
> acq/rel should be proven separately, and not by empirical testing. There
> is a reason why putOrdered is not used everywhere.
>
> Thanks,
> -Aleksey

Hi Aleksey,

You are right. Users tend to expect a general construct like 
ConcurrentMap to behave in an SC manner. Although my use of 
get-acquire/put-release in a Map-like construct could be correct by 
itself, it might be surprising for users to observe things like:


final static LinearProbeHashtable<Integer, Integer> t =
    new LinearProbeHashtable<>();


Thread A:
t.put(1, 10);

Thread B:
t.put(2, t.get(1));

Thread C:
Integer v2 = t.get(2);
Integer v1 = t.get(1);

with a possible outcome of (v1, v2) being (null, 10);
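
The three-thread scenario above can be written down as a small harness. This is only a sketch: LinearProbeHashtable is not available outside the webrev, so java.util.concurrent.ConcurrentHashMap stands in for it (an assumption), and the class name VisibilityScenarioSketch is made up for illustration. Because ConcurrentHashMap gives volatile-like visibility, the harness can only check that the surprising outcome does not appear; it cannot reproduce it.

```java
import java.util.concurrent.ConcurrentHashMap;

public class VisibilityScenarioSketch {

    /** Runs threads A, B and C once and returns {v1, v2} as observed by thread C. */
    public static Integer[] runOnce() {
        // Stand-in for LinearProbeHashtable (assumption, see lead-in).
        ConcurrentHashMap<Integer, Integer> t = new ConcurrentHashMap<>();
        Integer[] seen = new Integer[2];

        Thread a = new Thread(() -> t.put(1, 10));
        Thread b = new Thread(() -> {
            Integer v = t.get(1);
            // ConcurrentHashMap rejects null values, so only copy once key 1 is visible.
            if (v != null) {
                t.put(2, v);
            }
        });
        Thread c = new Thread(() -> {
            seen[1] = t.get(2); // v2 is read first
            seen[0] = t.get(1); // v1 is read second
        });

        a.start();
        b.start();
        c.start();
        try {
            a.join();
            b.join();
            c.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        for (int run = 0; run < 1000; run++) {
            Integer[] r = runOnce();
            if (r[1] != null && r[0] == null) {
                System.out.println("observed (v1, v2) = (null, " + r[1] + ")");
                return;
            }
        }
        System.out.println("the outcome (null, 10) was never observed");
    }
}
```

With a map that has volatile semantics, seeing key 2 implies that the earlier put of key 1 is visible as well, which is the behaviour users would expect from a ConcurrentMap-like construct.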

Will revert that to volatile semantics...

Regards, Peter







Re: ClassValue perf?

2016-05-30 Thread Aleksey Shipilev
On 05/30/2016 06:59 PM, Peter Levart wrote:
> I also employed get-acquire/put-release memory ordering semantics
> instead of SC (volatile) in hope that it might improve a bit the
> performance on platforms such as PowerPC or ARM, but this can be changed
> back to SC if anyone gets scared of it :-)

Revert, you're playing with fire here. Your _default_ modus operandi
should be "scared" when dealing with concurrency. The correctness with
acq/rel should be proven separately, and not by empirical testing. There
is a reason why putOrdered is not used everywhere.

Thanks,
-Aleksey






Re: ClassValue perf?

2016-05-30 Thread Peter Levart

Hi Michael,

In the meantime I improved LinearProbeHashtable as I identified a 
weakness in its implementation. In general, if an entry for the same key 
was inserted and removed repeatedly, then each such repetition planted a 
"tombstone" at the position of the inserted key, which means that lookup 
performance for that key deteriorated with each such 
removal/re-insertion until the table was rehashed.


The main trick that lets LinearProbeHashtable have lock-free lookups is 
that entries (key/value pairs) don't move when other entries are added 
to or removed from the table. Normally (for example in IdentityHashMap, 
which is a similar structure), when an entry is removed from a 
linear-probe hash table, the entries following it whose hash code points 
to a position preceding their position in the table are moved up to fill 
the gap. But such movement cannot be done in multithreaded operation 
without breaking the lock-free lookup. So instead of moving the 
following entries up, my implementation plants a "tombstone" key and 
clears the associated value. Such tombstone keys are skipped in lookups, 
but consider the following situation:


hashtable.put(keyX, value1); hashtable.remove(keyX);
hashtable.put(keyX, value2); hashtable.remove(keyX);
hashtable.put(keyX, value3); hashtable.remove(keyX);
hashtable.put(keyX, value4);

// after above modifications and in case they didn't trigger rehashing,
// hashtable.get(keyX) encounters the following state:

i = firstIndex(keyX, table.length);
table[i+0] == Tombstone
table[i+1] == null
table[i+2] == Tombstone
table[i+3] == null
table[i+4] == Tombstone
table[i+5] == null
table[i+6] == keyX
table[i+7] == value4


...so the lookup for keyX must probe 3 tombstones before reaching the 
matching key. ClassValue usage does not encounter such situations, as 
ClassValue.remove actually maps to LinearProbeHashtable.replace(key, 
oldValue, new Removed()) because ClassValue must also track "versions" 
of removals. ClassValue calls LinearProbeHashtable.remove only when 
expunging stale keys, and each such removed key is never inserted again.


But if LinearProbeHashtable is to be used elsewhere too, we must not 
allow periodic removal/reinsertion of the same key to affect performance 
of lookups.


The solution I discovered was for each tombstone to record the number of 
consecutive tombstones in the run that starts with it, so probing can 
skip an entire run of consecutive tombstones in one go. The above 
example of repeated modifications results in the following state:


i = firstIndex(keyX, table.length);
table[i+0] == Tombstone(skip=3)
table[i+1] == null
table[i+2] == Tombstone(skip=2)
table[i+3] == null
table[i+4] == Tombstone(skip=1)
table[i+5] == null
table[i+6] == keyX
table[i+7] == value4

When lookup encounters a tombstone it can read from it the number of key 
slots to skip. Simple.
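
To make the skipping concrete, here is a minimal single-threaded sketch of such a lookup. It is not the code from the webrev: the class name TombstoneSkipSketch, the firstIndex hash function and the probes counter are assumptions made for illustration, and only the table layout (keys at even indices, values at odd indices, tombstones carrying a skip count) follows the state shown above.

```java
import java.util.Objects;

public class TombstoneSkipSketch {

    /** Marks a removed key; skip is the number of key slots (this one included) to jump over. */
    public static final class Tombstone {
        final int skip;
        public Tombstone(int skip) { this.skip = skip; }
    }

    /** Number of key slots examined by the most recent get (illustration only). */
    public static int probes;

    /** Hypothetical hash function: the even index of the first key slot for a key. */
    public static int firstIndex(Object key, int length) {
        return (Objects.hashCode(key) & 0x7fffffff) % (length / 2) * 2;
    }

    /** Looks up key in a flat [key0, value0, key1, value1, ...] table. */
    public static Object get(Object[] table, Object key) {
        int len = table.length;
        int i = firstIndex(key, len);
        probes = 0;
        int visited = 0; // key slots passed so far, bounds the loop
        while (visited < len / 2) {
            probes++;
            Object k = table[i];
            if (k == null) {
                return null;            // empty slot: key is absent
            }
            int step = 1;
            if (k instanceof Tombstone) {
                step = ((Tombstone) k).skip; // jump over the whole tombstone run
            } else if (k.equals(key)) {
                return table[i + 1];    // value sits right after its key
            }
            visited += step;
            i = (i + 2 * step) % len;
        }
        return null;
    }

    public static void main(String[] args) {
        Object[] table = new Object[16];
        int i = firstIndex("keyX", table.length);
        table[i] = new Tombstone(3);
        table[(i + 2) % 16] = new Tombstone(2);
        table[(i + 4) % 16] = new Tombstone(1);
        table[(i + 6) % 16] = "keyX";
        table[(i + 7) % 16] = "value4";
        System.out.println(get(table, "keyX") + " found after " + probes + " probes");
    }
}
```

In this sketch the lookup for keyX examines two key slots (the first tombstone and the key itself) instead of four.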


Another improvement is in the rehashing policy. Rehashing is triggered 
just before inserting a new entry that would violate the maximum load 
factor of 2/3. Tombstones also count as occupying slots, of course, but 
are removed in rehashing. Rehashing can therefore also shrink the table, 
depending on how many tombstones it contains. But shrinking it to the 
minimum length that still satisfies the maximum load factor of 2/3 is 
not the best policy, as it can result in frequent oscillations around 
the tipping point that cause repeated rehashing. So the current strategy 
reallocates the table in a way that guarantees, in the worst case, that 
at least size()/2 entries can be inserted into the table before it needs 
another rehashing, but at the same time does not force the table to grow 
any faster when entries are only inserted.
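
One way to read that sizing rule is sketched below. This is not taken from the webrev: it assumes power-of-two capacities counted in key slots, a made-up minimum capacity of 2, and the names RehashCapacitySketch, maxOccupied and newCapacity are hypothetical.

```java
public class RehashCapacitySketch {

    /** Hypothetical smallest table capacity, counted in key slots. */
    static final int MIN_CAPACITY = 2;

    /** Most key slots (live entries plus tombstones) allowed at load factor 2/3. */
    public static int maxOccupied(int capacity) {
        return (int) (2L * capacity / 3);
    }

    /**
     * Smallest power-of-two capacity that leaves room for at least
     * liveEntries / 2 further insertions before the next rehash.
     * Overflow for very large sizes is ignored in this sketch.
     */
    public static int newCapacity(int liveEntries) {
        int target = liveEntries + liveEntries / 2;
        int capacity = MIN_CAPACITY;
        while (maxOccupied(capacity) < target) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        // Insert-only: a table of capacity 32 is full at 21 entries and doubles.
        System.out.println(newCapacity(21));
        // Mostly tombstones: only 4 live entries remain, so the table shrinks.
        System.out.println(newCapacity(4));
    }
}
```

Under these assumptions an insert-only workload still just doubles the table (32 to 64 for 21 live entries), while a table that held mostly tombstones shrinks, but not all the way down to the tightest fit.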


I also employed get-acquire/put-release memory ordering semantics 
instead of SC (volatile) in hope that it might improve a bit the 
performance on platforms such as PowerPC or ARM, but this can be changed 
back to SC if anyone gets scared of it :-)
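
For readers unfamiliar with the two modes, the difference can be spelled out with the JDK 9 VarHandle API. This is only an illustration of the access modes themselves: whether the webrev uses VarHandle or some other primitive is not stated here, and the class name AccessModeSketch and its helper methods are made up.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class AccessModeSketch {

    // VarHandle that exposes per-element access modes on an Object[] (Java 9+).
    static final VarHandle SLOT = MethodHandles.arrayElementVarHandle(Object[].class);

    /** Release store: earlier accesses of this thread are not reordered after it. */
    public static void putRelease(Object[] table, int i, Object v) {
        SLOT.setRelease(table, i, v);
    }

    /** Acquire load: later accesses of this thread are not reordered before it. */
    public static Object getAcquire(Object[] table, int i) {
        return SLOT.getAcquire(table, i);
    }

    /** Volatile store: as if the array element were declared volatile. */
    public static void putVolatile(Object[] table, int i, Object v) {
        SLOT.setVolatile(table, i, v);
    }

    /** Volatile load: as if the array element were declared volatile. */
    public static Object getVolatile(Object[] table, int i) {
        return SLOT.getVolatile(table, i);
    }

    public static void main(String[] args) {
        Object[] table = new Object[2];
        putRelease(table, 0, "key");
        putVolatile(table, 1, "value");
        System.out.println(getAcquire(table, 0) + " -> " + getVolatile(table, 1));
    }
}
```

Single-threaded, both pairs behave identically; the difference only shows up in which cross-thread reorderings are permitted, which is why correctness under the weaker pair has to be argued rather than tested.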


Here's the improved code (only LinearProbeHashtable changed compared to 
webrev.04):


http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.04.1/

Here are the benchmark results comparing original JDK9 ClassValue with 
this patch:


http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.04.1.bench_results.txt

...which show that it still has a slight (performance of lookup) to big 
(footprint and expunging) advantage over present ClassValue in this 
incarnation too.




Michael, could you re-submit this version into internal testing too? I 
think this modification is essential.



Regards, Peter


On 05/30/2016 10:00 AM, Michael Haupt wrote:
> Hi Peter,
>
> the internal tests look good. I'll assign the issue over to you. Thanks!
>
> Best,
>
> Michael
>
>> On 26.05.2016 at 14:42, Michael Haupt wrote:
>>
>> Hi Peter,
>>
>> thank you for this wonderful piece of work.
>>
>>> On 26.05.2016 at 10:59, Peter Levart wrote:

Re: ClassValue perf?

2016-05-30 Thread Michael Haupt
Hi Peter,

the internal tests look good. I'll assign the issue over to you. Thanks!

Best,

Michael

> On 26.05.2016 at 14:42, Michael Haupt wrote:
> 
> Hi Peter,
> 
> thank you for this wonderful piece of work.
> 
>> On 26.05.2016 at 10:59, Peter Levart wrote:
>> How does this implementation compare on your hardware, Michael?
> 
> Results attached. It improves on the unpatched version in all cases, is in 
> most cases even faster than the "simple solution" (reduce initial size to 1), 
> and reduces complexity of ClassValue. It passes all open and closed 
> jli-related tests as well as the Nashorn tests. Looking really good.
> 
> Let me run the full internal test suite across platforms.
> 
> Best,
> 
> Michael




Re: LinearProbeHashtable Re: ClassValue perf?

2016-05-27 Thread Paul Sandoz
Hi Peter,

> On 27 May 2016, at 12:41, Peter Levart  wrote:
> 
> Hi Paul,
> 
> On 05/26/2016 01:20 PM, Paul Sandoz wrote:
>> Hi Peter,
>> 
>> Opportunistically if your LinearProbeHashtable works out then i am wondering 
>> if we could replace the use of CHM within 
>> MethodType.ConcurrentWeakInternSet, which only uses get/putIfAbsent/remove.
>> 
>> Thereby CHM can use VarHandles without inducing a circular dependency.
>> 
>> Paul.
> 
> It could be used, yes. LinearProbeHashtable is not scalable to multiple 
> threads like CHM is for modifications as it is using a single lock for all 
> modification operations including rehashing, but it is lock-free for lookups, 
> so for usecases such as caching, where lookups dominate and modifications are 
> mostly performed in batches from single thread (when some subsystem 
> initializes), it could be a viable alternative to CHM.
> 

Or say expunging of stale entries?


> If it is moved to some jdk.internal subpackage and made public, I could add 
> missing Map methods, mostly to be able to include it in MOAT tests.
> 
> What do you think?
> 

I think moving it to, say, jdk.internal.util.concurrent makes sense. Keeping it 
lean and focused on the purpose it currently serves seems appropriate, so 
implementing Map might be an unwanted embellishment. I would be inclined to 
write some focused tests and also leverage the contextual testing via its use 
within ClassValue (and maybe MethodType).

I should investigate updating MethodType and running the Octane benchmark...

Paul.




Re: LinearProbeHashtable Re: ClassValue perf?

2016-05-27 Thread Peter Levart

Hi Paul,


On 05/26/2016 01:20 PM, Paul Sandoz wrote:
> Hi Peter,
>
> Opportunistically if your LinearProbeHashtable works out then i am wondering if 
> we could replace the use of CHM within MethodType.ConcurrentWeakInternSet, 
> which only uses get/putIfAbsent/remove.
>
> Thereby CHM can use VarHandles without inducing a circular dependency.
>
> Paul.


It could be used, yes. LinearProbeHashtable does not scale to multiple 
threads for modifications the way CHM does, as it uses a single lock for 
all modification operations including rehashing, but it is lock-free for 
lookups. So for use cases such as caching, where lookups dominate and 
modifications are mostly performed in batches from a single thread (when 
some subsystem initializes), it could be a viable alternative to CHM.
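
The trade-off described above (one lock for all writers, no lock for readers) can be sketched in a heavily reduced form. This is not LinearProbeHashtable: the sketch is insert-only, has a fixed capacity, never rehashes and has no tombstones, and the class name SingleWriterLockSketch and its layout are assumptions for illustration.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

public class SingleWriterLockSketch<K, V> {

    // Flat layout [key0, value0, key1, value1, ...]; elements are read and written with volatile semantics.
    private final AtomicReferenceArray<Object> table;
    private final Object lock = new Object();
    private int size; // guarded by lock

    public SingleWriterLockSketch(int capacity) {
        if (capacity < 2) {
            throw new IllegalArgumentException("capacity must be at least 2");
        }
        table = new AtomicReferenceArray<>(2 * capacity);
    }

    private int firstIndex(Object key) {
        return (key.hashCode() & 0x7fffffff) % (table.length() / 2) * 2;
    }

    /** Lock-free lookup: never blocks, even while a writer holds the lock. */
    @SuppressWarnings("unchecked")
    public V get(Object key) {
        int len = table.length();
        int i = firstIndex(key);
        for (int n = 0; n < len / 2; n++) {
            Object k = table.get(i);
            if (k == null) {
                return null;
            }
            if (k.equals(key)) {
                return (V) table.get(i + 1);
            }
            i = (i + 2) % len;
        }
        return null;
    }

    /** All modifications serialize on a single lock. */
    @SuppressWarnings("unchecked")
    public V putIfAbsent(K key, V value) {
        synchronized (lock) {
            int len = table.length();
            if (size + 1 > len / 2 * 2 / 3) {
                throw new IllegalStateException("full; a real table would rehash here");
            }
            int i = firstIndex(key);
            while (true) { // terminates: the load factor keeps some slot empty
                Object k = table.get(i);
                if (k == null) {
                    table.set(i + 1, value); // publish the value first...
                    table.set(i, key);       // ...then the key that makes it reachable
                    size++;
                    return null;
                }
                if (k.equals(key)) {
                    return (V) table.get(i + 1);
                }
                i = (i + 2) % len;
            }
        }
    }
}
```

Writing the value slot before the key slot means a reader that sees the key also sees its value; handling removal and rehashing without breaking readers is the hard part and is left out of this sketch.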


If it is moved to some jdk.internal subpackage and made public, I could 
add missing Map methods, mostly to be able to include it in MOAT tests.


What do you think?

Regards, Peter





Re: ClassValue perf?

2016-05-26 Thread Michael Haupt
Hi Peter,

thank you for this wonderful piece of work.

> On 26.05.2016 at 10:59, Peter Levart wrote:
> How does this implementation compare on your hardware, Michael?

Results attached. It improves on the unpatched version in all cases, is in 
most cases even faster than the "simple solution" (reduce initial size to 1), 
and reduces complexity of ClassValue. It passes all open and closed 
jli-related tests as well as the Nashorn tests. Looking really good.

Let me run the full internal test suite across platforms.

Best,

Michael

Benchmark               (CC)  (CVC)  plain     twisti    plevart2  plevart4
CVB.randomAccess        128   1      10.277    9.905     11.574    9.788
CVB.randomAccess        128   4      12.081    11.445    13.758    11.476
CVB.randomAccess        128   16     16.352    16.461    15.201    12.588
CVB.randomAccess        128   256    24.486    24.365    26.177    21.532
CVB.randomAccess        1024  1      18.951    16.691    19.439    14.674
CVB.randomAccess        1024  4      27.497    24.634    27.348    22.818
CVB.randomAccess        1024  16     26.988    26.522    32.034    25.353
CVB.randomAccess        1024  256    54.643    51.415    45.496    35.947
CVB.sequentialAccess    128   1      11.276    9.370     10.724    8.290
CVB.sequentialAccess    128   4      9.302     9.434     10.343    8.577
CVB.sequentialAccess    128   16     10.723    10.734    9.576     8.427
CVB.sequentialAccess    128   256    17.721    17.947    17.351    15.646
CVB.sequentialAccess    1024  1      15.313    16.217    12.763    9.835
CVB.sequentialAccess    1024  4      11.737    11.779    10.992    9.752
CVB.sequentialAccess    1024  16     8.820     8.983     10.062    8.776
CVB.sequentialAccess    1024  256    44.024    43.792    39.478    32.867
CVEB.redeployPartition  N/A   N/A    144.797   151.230   118.095   104.374
CVEB.redeployPartition  N/A   N/A    392.969   445.776   370.319   345.316
CVEB.redeployPartition  N/A   N/A    464.723   419.487   252.764   146.739
CVEB.redeployPartition  N/A   N/A    1646.825  1553.961  773.508   428.923



LinearProbeHashtable Re: ClassValue perf?

2016-05-26 Thread Paul Sandoz
Hi Peter,

Opportunistically if your LinearProbeHashtable works out then i am wondering if 
we could replace the use of CHM within MethodType.ConcurrentWeakInternSet, 
which only uses get/putIfAbsent/remove.

Thereby CHM can use VarHandles without inducing a circular dependency.

Paul.


Re: ClassValue perf?

2016-05-26 Thread Peter Levart

Hi Michael,


On 05/23/2016 03:56 PM, Michael Haupt wrote:
> I've run the unpatched version and Peter's two patches once more. The 
> results are attached (results.txt). They confirm Aleksey's observation.
>
> Regarding the 03 patch (plevart3 column in the results), perfasm 
> output (see http://cr.openjdk.java.net/~mhaupt/8031043/perfasm.zip) 
> suggests the cost is mainly accrued in ConcurrentHashMap. The same is 
> the case for the 02 patch (plevart2 column).
>
> As things stand, I think we can even focus on Peter's 02 patch, as 
> this is the faster of his two proposals (plevart2 column in the 
> results), reduces the footprint, and reduces the implementation 
> complexity. Can anything be done to improve on its performance? (It 
> has slight performance slowdowns for the single-value case as well.)


I can't think of anything else short of improving performance of CHM itself.

Or replacing CHM with a "better" implementation:

http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.04/

This webrev is similar to webrev.02. Its only difference is in 
ClassValueMap, which extends LinearProbeHashtable instead of 
ConcurrentHashMap. LinearProbeHashtable is a simple implementation of a 
linear-probe hash table. It's not a full Map implementation. It only 
implements the methods needed in ClassValue. With this implementation I 
get a slight boost compared to the JDK 9 ClassValue implementation for 
all sizes and counts:


Benchmark                         (classCount)  (classValueCount)  (impl)  Mode  Cnt   Score   Error  Units
ClassValueBench.randomAccess               128                  1    jdk9  avgt   10   9.079 ± 0.092  ns/op
ClassValueBench.randomAccess               128                  4    jdk9  avgt   10  10.615 ± 0.102  ns/op
ClassValueBench.randomAccess               128                 16    jdk9  avgt   10  11.665 ± 0.012  ns/op
ClassValueBench.randomAccess               128                256    jdk9  avgt   10  19.151 ± 0.219  ns/op
ClassValueBench.randomAccess              1024                  1    jdk9  avgt   10  14.642 ± 0.425  ns/op
ClassValueBench.randomAccess              1024                  4    jdk9  avgt   10  22.577 ± 0.093  ns/op
ClassValueBench.randomAccess              1024                 16    jdk9  avgt   10  19.864 ± 0.736  ns/op
ClassValueBench.randomAccess              1024                256    jdk9  avgt   10  60.470 ± 0.285  ns/op
ClassValueBench.sequentialAccess           128                  1    jdk9  avgt   10   9.741 ± 0.033  ns/op
ClassValueBench.sequentialAccess           128                  4    jdk9  avgt   10   8.252 ± 0.029  ns/op
ClassValueBench.sequentialAccess           128                 16    jdk9  avgt   10   7.888 ± 1.249  ns/op
ClassValueBench.sequentialAccess           128                256    jdk9  avgt   10  16.493 ± 0.415  ns/op
ClassValueBench.sequentialAccess          1024                  1    jdk9  avgt   10  13.376 ± 0.452  ns/op
ClassValueBench.sequentialAccess          1024                  4    jdk9  avgt   10  10.023 ± 0.020  ns/op
ClassValueBench.sequentialAccess          1024                 16    jdk9  avgt   10   8.029 ± 0.178  ns/op
ClassValueBench.sequentialAccess          1024                256    jdk9  avgt   10  33.472 ± 0.058  ns/op


Benchmark                         (classCount)  (classValueCount)  (impl)  Mode  Cnt   Score   Error  Units
ClassValueBench.randomAccess               128                  1    pl04  avgt   10   8.955 ± 0.055  ns/op
ClassValueBench.randomAccess               128                  4    pl04  avgt   10   9.999 ± 0.017  ns/op
ClassValueBench.randomAccess               128                 16    pl04  avgt   10  11.615 ± 1.928  ns/op
ClassValueBench.randomAccess               128                256    pl04  avgt   10  17.063 ± 0.460  ns/op
ClassValueBench.randomAccess              1024                  1    pl04  avgt   10  12.553 ± 0.086  ns/op
ClassValueBench.randomAccess              1024                  4    pl04  avgt   10  16.766 ± 0.221  ns/op
ClassValueBench.randomAccess              1024                 16    pl04  avgt   10  18.496 ± 0.051  ns/op
ClassValueBench.randomAccess              1024                256    pl04  avgt   10  41.390 ± 0.321  ns/op
ClassValueBench.sequentialAccess           128                  1    pl04  avgt   10   7.854 ± 0.381  ns/op
ClassValueBench.sequentialAccess           128                  4    pl04  avgt   10   7.498 ± 0.055  ns/op
ClassValueBench.sequentialAccess           128                 16    pl04  avgt   10   9.218 ± 1.000  ns/op
ClassValueBench.sequentialAccess           128                256    pl04  avgt   10  13.593 ± 0.275  ns/op
ClassValueBench.sequentialAccess          1024                  1    pl04  avgt   10   8.774 ± 0.037  ns/op
ClassValueBench.sequentialAccess          1024                  4    pl04  avgt   10   8.562 ± 0.014  ns/op
ClassValueBench.sequentialAccess          1024                 16    pl04  avgt   10   7.596 ± 0.027  ns/op
Clas

Re: ClassValue perf?

2016-05-23 Thread Michael Haupt
Hi Aleksey,

thanks; comments inlined.

> On 19.05.2016 at 15:57, Aleksey Shipilev wrote:
> On 05/19/2016 03:32 PM, Michael Haupt wrote:
>> It may well be that running the benchmark so few times does not deliver a
>> stable enough result. I'd like Aleksey to comment on this: is adopting
>> Peter's code worthwhile given it improves on footprint and reduces code
>> size and complexity?
>
> Eh, if you pose the question like that, the answer is obviously "yes". I
> like how Peter's version strips down the ClassValue impl.

Of course; I forgot to add "... in spite of the observed performance 
regressions", and you've answered that.

> But, looking at the data, it would seem we are regressing randomAccess
> with low classValueCount?
>
> Benchmark     (cCount)  (cvCount)  Mode  Cnt   Score    Error  Units
>
> # result-plain.txt
> randomAccess      1024          1  avgt   10  18.375 ±  0.046  ns/op
> randomAccess      1024          4  avgt   10  26.755 ±  0.018  ns/op
> randomAccess      1024         16  avgt   10  26.263 ±  0.024  ns/op
> randomAccess      1024        256  avgt   10  53.543 ±  0.419  ns/op
>
> # result-plevart-03.txt
> randomAccess      1024          1  avgt   10  23.315 ±  0.053  ns/op
> randomAccess      1024          4  avgt   10  28.323 ±  0.053  ns/op
> randomAccess      1024         16  avgt   10  29.514 ±  0.070  ns/op
> randomAccess      1024        256  avgt   10  45.339 ±  0.035  ns/op
>
> This seems to go the other direction Michael was pursuing: optimizing
> the single-value case. Seems even more pronounced on low classCount.
>
> I'd be more happy if we can at least not regress the performance. If
> there is a cleaner implementation with the same perf characteristics,
> I'd be inclined to accept it, of course.

I've run the unpatched version and Peter's two patches once more. The results 
are attached (results.txt). They confirm Aleksey's observation.

Regarding the 03 patch (plevart3 column in the results), perfasm output (see 
http://cr.openjdk.java.net/~mhaupt/8031043/perfasm.zip) suggests the cost is 
mainly accrued in ConcurrentHashMap. The same is the case for the 02 patch 
(plevart2 column).

As things stand, I think we can even focus on Peter's 02 patch, as this is the 
faster of his two proposals (plevart2 column in the results), reduces the 
footprint, and reduces the implementation complexity. Can anything be done to 
improve on its performance? (It has slight performance slowdowns for the 
single-value case as well.)

>> I agree regarding whether there's a point in optimising for single-value
>> storage whilst maintaining full flexibility. In a scenario where it is
>> known that only one value will be associated with a class, it's better
>> to use static fields.
>
> Specialized solutions that can use the knowledge about the external
> condition would always win, given enough effort. The improvements in
> shared infrastructure are still very welcome, because they break the
> chicken-and-egg problem: you would not use a shared API if it is slow,
> and you would not optimize shared API because nobody uses it.

Nothing to disagree with,

Michael

Benchmark               (CC)  (CVC)  plain     plevart2  plevart3
CVB.randomAccess        128   1      10.292    11.667    12.686
CVB.randomAccess        128   4      12.413    13.790    13.896
CVB.randomAccess        128   16     14.754    15.963    15.137
CVB.randomAccess        128   256    24.424    25.972    26.411
CVB.randomAccess        1024  1      18.631    19.517    24.339
CVB.randomAccess        1024  4      27.529    27.567    28.890
CVB.randomAccess        1024  16     26.762    31.532    30.088
CVB.randomAccess        1024  256    58.452    45.419    46.016
CVB.sequentialAccess    128   1      11.214    10.739    12.645
CVB.sequentialAccess    128   4      9.317     10.269    10.563
CVB.sequentialAccess    128   16     10.815    9.708     9.787
CVB.sequentialAccess    128   256    18.030    17.278    18.690
CVB.sequentialAccess    1024  1      15.190    12.570    14.429
CVB.sequentialAccess    1024  4      12.529    11.063    13.015
CVB.sequentialAccess    1024  16     9.037     10.024    10.889
CVB.sequentialAccess    1024  256    41.950    38.416    42.341
CVEB.redeployPartition  N/A   N/A    148.185   120.086   118.180
CVEB.redeployPartition  N/A   N/A    386.839   363.152   380.038
CVEB.redeployPartition  N/A   N/A    404.139   264.276   259.995
CVEB.redeployPartition  N/A   N/A    1542.053  742.261   757.637



Re: ClassValue perf?

2016-05-19 Thread Aleksey Shipilev
On 05/19/2016 03:32 PM, Michael Haupt wrote:
> It may well be that running the benchmark so few times does not deliver a
> stable enough result. I'd like Aleksey to comment on this: is adopting
> Peter's code worthwhile given it improves on footprint and reduces code
> size and complexity?

Eh, if you pose the question like that, the answer is obviously "yes". I
like how Peter's version strips down the ClassValue impl.

But, looking at the data, it would seem we are regressing randomAccess
with low classValueCount?

Benchmark     (cCount)  (cvCount)  Mode  Cnt   Score    Error  Units

# result-plain.txt
randomAccess      1024          1  avgt   10  18.375 ±  0.046  ns/op
randomAccess      1024          4  avgt   10  26.755 ±  0.018  ns/op
randomAccess      1024         16  avgt   10  26.263 ±  0.024  ns/op
randomAccess      1024        256  avgt   10  53.543 ±  0.419  ns/op

# result-plevart-03.txt
randomAccess      1024          1  avgt   10  23.315 ±  0.053  ns/op
randomAccess      1024          4  avgt   10  28.323 ±  0.053  ns/op
randomAccess      1024         16  avgt   10  29.514 ±  0.070  ns/op
randomAccess      1024        256  avgt   10  45.339 ±  0.035  ns/op

This seems to go in the opposite direction to the one Michael was pursuing:
optimizing the single-value case. It seems even more pronounced at low
classCount.

I'd be happier if we could at least not regress the performance. If
there is a cleaner implementation with the same perf characteristics,
I'd be inclined to accept it, of course.
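
To put a number on the regression discussed above, the relative change
between two scores can be computed directly. The following is only a sketch:
the class and method names (RegressionCheck, percentChange) are made up for
illustration, and the scores are copied from the two result tables in this
message.

```java
// Hypothetical helper, not part of the benchmark sources.
final class RegressionCheck {
    /** Relative change in percent; positive means the patched run is slower (more ns/op). */
    static double percentChange(double baseline, double patched) {
        return (patched - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // {cvCount, plain score, plevart-03 score} for randomAccess, cCount = 1024 (ns/op).
        double[][] rows = {
            {1, 18.375, 23.315},
            {4, 26.755, 28.323},
            {16, 26.263, 29.514},
            {256, 53.543, 45.339},
        };
        for (double[] r : rows) {
            System.out.printf("cvCount=%3.0f  %+6.1f%%%n", r[0], percentChange(r[1], r[2]));
        }
    }
}
```

A positive result marks a slowdown against the unpatched build, a negative
one a speedup; the JMH error column is ignored here, so small differences
should not be over-interpreted.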

> I agree regarding whether there's a point in optimising for single-value
> storage whilst maintaining full flexibility. In a scenario where it is
> known that only one value will be associated with a class, it's better
> to use static fields.

Specialized solutions that can use the knowledge about the external
condition would always win, given enough effort. The improvements in
shared infrastructure are still very welcome, because they break the
chicken-and-egg problem: you would not use a shared API if it is slow,
and you would not optimize shared API because nobody uses it.

Thanks,
-Aleksey





Re: ClassValue perf?

2016-05-19 Thread Michael Haupt
Hi Peter,

thank you. As some background info, the machine I'm running the benchmarks on
is a 32-core Xeon.

Results of running your latest benchmarks with unmodified 9-dev and your two
patches (02, 03) are attached. Overall, it seems your solution 03 is a bit
slower than 02, and 02 shines especially in the expunging benchmark, but also
for random access with large numbers of classes and class values. It appears
to be somewhat slower than the unmodified case, though.

It may well be that running the benchmark so few times does not deliver a
stable enough result. I'd like Aleksey to comment on this: is adopting Peter's
code worthwhile given it improves on footprint and reduces code size and
complexity?

I agree regarding whether there's a point in optimising for single-value
storage whilst maintaining full flexibility. In a scenario where it is known
that only one value will be associated with a class, it's better to use static
fields.

Best,

Michael

Benchmark                                 (classCount)  (classValueCount)  (classValuesPerPart)  (classesPerPart)   (impl)  (partitions)  Mode  Cnt     Score     Error  Units
ClassValueBench.randomAccess                       128                  1                   N/A               N/A  unknown           N/A  avgt   10    10.190 ±   0.014  ns/op
ClassValueBench.randomAccess                       128                  4                   N/A               N/A  unknown           N/A  avgt   10    12.000 ±   0.164  ns/op
ClassValueBench.randomAccess                       128                 16                   N/A               N/A  unknown           N/A  avgt   10    16.131 ±   0.026  ns/op
ClassValueBench.randomAccess                       128                256                   N/A               N/A  unknown           N/A  avgt   10    24.267 ±   0.065  ns/op
ClassValueBench.randomAccess                      1024                  1                   N/A               N/A  unknown           N/A  avgt   10    18.375 ±   0.046  ns/op
ClassValueBench.randomAccess                      1024                  4                   N/A               N/A  unknown           N/A  avgt   10    26.755 ±   0.018  ns/op
ClassValueBench.randomAccess                      1024                 16                   N/A               N/A  unknown           N/A  avgt   10    26.263 ±   0.024  ns/op
ClassValueBench.randomAccess                      1024                256                   N/A               N/A  unknown           N/A  avgt   10    53.543 ±   0.419  ns/op
ClassValueBench.sequentialAccess                   128                  1                   N/A               N/A  unknown           N/A  avgt   10    11.063 ±   0.077  ns/op
ClassValueBench.sequentialAccess                   128                  4                   N/A               N/A  unknown           N/A  avgt   10     9.384 ±   0.033  ns/op
ClassValueBench.sequentialAccess                   128                 16                   N/A               N/A  unknown           N/A  avgt   10    10.534 ±   0.036  ns/op
ClassValueBench.sequentialAccess                   128                256                   N/A               N/A  unknown           N/A  avgt   10    18.038 ±   0.119  ns/op
ClassValueBench.sequentialAccess                  1024                  1                   N/A               N/A  unknown           N/A  avgt   10    14.862 ±   0.013  ns/op
ClassValueBench.sequentialAccess                  1024                  4                   N/A               N/A  unknown           N/A  avgt   10    11.586 ±   0.027  ns/op
ClassValueBench.sequentialAccess                  1024                 16                   N/A               N/A  unknown           N/A  avgt   10     8.949 ±   0.116  ns/op
ClassValueBench.sequentialAccess                  1024                256                   N/A               N/A  unknown           N/A  avgt   10    43.170 ±   0.074  ns/op
ClassValueExpungeBench.redeployPartition           N/A                N/A                     8              1024  unknown            16    ss   16   130.911 ±  10.815  ms/op
ClassValueExpungeBench.redeployPartition           N/A                N/A                     8              4096  unknown            16    ss   16   435.190 ±  32.679  ms/op
ClassValueExpungeBench.redeployPartition           N/A                N/A                    64              1024  unknown            16    ss   16   569.942 ±  68.902  ms/op
ClassValueExpungeBench.redeployPartition           N/A                N/A                    64              4096  unknown            16    ss   16  1485.027 ±  91.200  ms/op

Benchmark                                 (classCount)  (classValueCount)  (classValuesPerPart)  (classesPerPart)   (impl)  (partitions)  Mode  Cnt     Score     Error  Units
ClassValueBench.randomAccess                       128                  1                   N/A               N/A  unknown           N/A  avgt   10    11.488

Re: ClassValue perf?

2016-05-19 Thread Michael Haupt
Hi Christian,

> On 06.05.2016 at 22:35, Christian Thalinger wrote:
>> Given that one concern with this issue, next to reducing footprint, was to 
>> optimise for the single-value case, I'm still a bit hesitant even though the 
>> sheer amount of code reduction is impressive. I'll evaluate further.
> 
> The main motivation to optimize for the single-value use case is Graal but 
> it’s not super important.  Graal solved this issue in a different way and 
> it’s questionable Graal would go back using ClassValue so don’t worry too 
> much about it.

IIRC Graal has adopted static field storage for its single-value case. Beating 
that with a solution that has to have the flexibility of multiple values as 
well? Fat chance. :-)
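
For readers less familiar with the two options being compared, here is a
minimal sketch of both. It is not Graal's code: PerClassInfo, NAME_LENGTH, and
Node are hypothetical names chosen for illustration. Only java.lang.ClassValue
and its computeValue/get methods are real API.

```java
// Illustrative sketch; all class and field names here are assumptions.
final class PerClassInfo {
    // General case: any number of ClassValue instances can attach data to any class.
    static final ClassValue<Integer> NAME_LENGTH = new ClassValue<Integer>() {
        @Override
        protected Integer computeValue(Class<?> type) {
            // Computed lazily on the first get() for a class, then cached per class.
            return type.getSimpleName().length();
        }
    };

    // Special case: one known value for a class you own can simply be a static field.
    static final class Node {
        static final int NAME_LENGTH = Node.class.getSimpleName().length();
    }

    public static void main(String[] args) {
        System.out.println(NAME_LENGTH.get(String.class)); // lookup through ClassValue
        System.out.println(Node.NAME_LENGTH);              // plain static field read
    }
}
```

The static field needs no lookup at all, which is why a general-purpose
mechanism is unlikely to beat it; the ClassValue variant works for classes
the caller does not control.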

Thanks for confirming,

Michael



Re: ClassValue perf?

2016-05-08 Thread Peter Levart

Hi Michael,


On 05/06/2016 04:48 PM, Michael Haupt wrote:

> Hi Peter,
>
> thank you. I've run the full benchmark in my setup and uploaded the
> updated cumulative results to
> http://cr.openjdk.java.net/~mhaupt/8031043/.
>
> The benchmark indeed shows that this latest addition to the group
> slows down random and sequential access, especially for small numbers
> of values and classes. The OpenJDK tests are fine; I'm running a batch
> of internal tests as well.
>
> Given that one concern with this issue, next to reducing footprint,
> was to optimise for the single-value case, I'm still a bit hesitant
> even though the sheer amount of code reduction is impressive. I'll
> evaluate further.


Interesting. I observed quite the opposite on my machine (i7-4771, 8 MiB
cache). For the sequential access pattern, or for random access with a small
number of CV(s) and Class(es), the results are comparable. Only for 256
CV(s) x 1024 Class(es) with the random access pattern did I observe about a
20% drop in performance, which I attributed to the difference in design
between CHM and the 'cache' of JDK 9 ClassValue (worse CPU cache locality for
CHM):


http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/ClassValueBench.java
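
The cache-locality argument above can be illustrated with a generic
open-addressing table. This is a textbook sketch, not the JDK 9 ClassValue
cache and not CHM: the name LinearProbeTable is made up, and the table assumes
a power-of-two capacity that is never filled completely (there is no resize).

```java
// Hypothetical illustration of linear probing over a flat array.
final class LinearProbeTable {
    private final Object[] slots; // key at index 2*i, value at index 2*i + 1
    private final int mask;

    LinearProbeTable(int capacityPowerOfTwo) {
        slots = new Object[capacityPowerOfTwo * 2];
        mask = capacityPowerOfTwo - 1;
    }

    void put(Object key, Object value) {
        int i = System.identityHashCode(key) & mask;
        // Walk to the next free slot or to the slot already holding this key.
        while (slots[2 * i] != null && slots[2 * i] != key) {
            i = (i + 1) & mask;
        }
        slots[2 * i] = key;
        slots[2 * i + 1] = value;
    }

    Object get(Object key) {
        int i = System.identityHashCode(key) & mask;
        Object k;
        // Neighbouring slots sit next to each other in memory, so a collision
        // usually costs an adjacent array read rather than a pointer chase.
        while ((k = slots[2 * i]) != null) {
            if (k == key) {
                return slots[2 * i + 1];
            }
            i = (i + 1) & mask;
        }
        return null;
    }
}
```

In a chained table, by contrast, every collision follows a reference to a
separately allocated node, which is the effect suspected for the 256 x 1024
random-access case.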

I doubt that single value per Class instance is something that is 
beneficial to optimize. Such optimization would be very fragile so it 
would not be something to rely on. Typical or even worst case 
performance is more important in my opinion.


The fast-path lookup performance is the most important performance
aspect of ClassValue, but it is not the only one that can be observed.
Footprint and the consequential GC / expunging overhead are also something to
consider. The implementation presented in my webrev.02 maintains a
linked list of weakly-referenced ClassValueMap(s). For each stale
dequeued key, it probes each map and removes that key from any live
map(s) containing it. This works optimally when the matrix of
(ClassValue, Class) pairs is not sparse. I did an experiment with an
alternative expunging design where I maintain, on each key, an array of the
weakly-referenced ClassValueMap(s) that the key is inserted in.
This has approx. 10% additional footprint overhead compared to the original
expunging design (but still just half the footprint overhead of the jdk 9
ClassValue design):


http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.03/

The situation I envisioned was when a single JVM hosts multiple (say N) 
isolated applications (in an app server for example) and when one such 
application is re-deployed.


In original design (webrev.02) each dequeued ClassValue.key is probed 
against all class maps that remain and belong to the other N-1 
applications. In the alternative expunging design (webrev.03) the 
dequeued key just scans the array of weakly-referenced maps that the key 
was inserted into.
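
The webrev.02-style expunging described above could look roughly like the
following. It is a simplified sketch and not the code from the webrev: the
names (ExpungeSketch, WeakKey, newMap, expungeStaleKeys) are invented, and the
maps are held strongly in a list here, whereas the design described above
references them weakly.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of "probe every map for each stale key".
final class ExpungeSketch {
    /** Weak key; equality is identity of the reference object itself. */
    static final class WeakKey extends WeakReference<Object> {
        WeakKey(Object referent, ReferenceQueue<Object> queue) {
            super(referent, queue);
        }
    }

    final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Simplification: strong references to all maps (one per class in the real design).
    final List<Map<WeakKey, Object>> maps = new CopyOnWriteArrayList<>();

    Map<WeakKey, Object> newMap() {
        Map<WeakKey, Object> m = new ConcurrentHashMap<>();
        maps.add(m);
        return m;
    }

    /** Drains the single queue and probes every map for each stale key. */
    int expungeStaleKeys() {
        int removed = 0;
        Reference<?> stale;
        while ((stale = queue.poll()) != null) {
            for (Map<WeakKey, Object> m : maps) {
                if (m.remove(stale) != null) {
                    removed++;
                }
            }
        }
        return removed;
    }
}
```

The cost per stale key is proportional to the number of live maps. The
webrev.03 variant instead remembers on each key which maps it was inserted
into, trading footprint for a shorter scan.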


I created a benchmark to exercise such situation(s):

http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/ClassValueExpungeBench.java

It measures the time of a hypothetical redeployment of one application 
in an app server where there are 16 such running applications. The 
measurement includes class-loading, GC time and initialization of 
ClassValue(s). Results show that alternative expunging design 
(webrev.03) doesn't bring any improvements (or that original supposedly 
sub-optimal expunging design (webrev.02) doesn't show any weaknesses) 
for the range of parameters exercised in the benchmark.


What this benchmark shows too is that original jdk 9 ClassValue has at 
least 2x overhead with cleanup compared to my designs (note that 
benchmark includes classloading time too).


 Regards, Peter




Re: ClassValue perf?

2016-05-06 Thread Christian Thalinger

> On May 6, 2016, at 4:48 AM, Michael Haupt  wrote:
> 
> Hi Peter,
> 
> thank you. I've run the full benchmark in my setup and uploaded the updated 
> cumulative results to http://cr.openjdk.java.net/~mhaupt/8031043/.
> 
> The benchmark indeed shows that this latest addition to the group slows down 
> random and sequential access, especially for small numbers of values and 
> classes. The OpenJDK tests are fine; I'm running a batch of internal tests as 
> well.
> 
> Given that one concern with this issue, next to reducing footprint, was to 
> optimise for the single-value case, I'm still a bit hesitant even though the 
> sheer amount of code reduction is impressive. I'll evaluate further.

The main motivation to optimize for the single-value use case is Graal but it’s 
not super important.  Graal solved this issue in a different way and it’s 
questionable Graal would go back using ClassValue so don’t worry too much about 
it.

> 
> Best,
> 
> Michael
> 
>> On 05.05.2016 at 17:21, Peter Levart wrote:
>> 
>> Hi Michael,
>> 
>> 
>> On 05/04/2016 06:02 PM, Michael Haupt wrote:
>>> Hi Peter,
>>> 
>>> thank you for chiming in again! :-) I'll look at this in depth on Friday.
>> 
>> Good. Because I found bugs in expunging logic and a discrepancy of behavior 
>> when a value is installed concurrently by some other thread and then later 
>> removed while the 1st thread is still calculating the value. Current 
>> ClassValue re-tries the computation until it can make sure there were no 
>> concurrent changes to the entry during its computation. I fixed both things 
>> and verified that the behavior is now the same:
>> 
>> 
>> http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.02/ 
>> 
>> 
>> Regards, Peter
> 
> 


Re: ClassValue perf?

2016-05-06 Thread Michael Haupt
Hi Peter,

thank you. I've run the full benchmark in my setup and uploaded the updated 
cumulative results to http://cr.openjdk.java.net/~mhaupt/8031043/.

The benchmark indeed shows that this latest addition to the group slows down 
random and sequential access, especially for small numbers of values and 
classes. The OpenJDK tests are fine; I'm running a batch of internal tests as 
well.

Given that one concern with this issue, next to reducing footprint, was to 
optimise for the single-value case, I'm still a bit hesitant even though the 
sheer amount of code reduction is impressive. I'll evaluate further.

Best,

Michael

> On 05.05.2016 at 17:21, Peter Levart wrote:
> 
> Hi Michael,
> 
> 
> On 05/04/2016 06:02 PM, Michael Haupt wrote:
>> Hi Peter,
>> 
>> thank you for chiming in again! :-) I'll look at this in depth on Friday.
> 
> Good. Because I found bugs in expunging logic and a discrepancy of behavior 
> when a value is installed concurrently by some other thread and then later 
> removed while the 1st thread is still calculating the value. Current 
> ClassValue re-tries the computation until it can make sure there were no 
> concurrent changes to the entry during its computation. I fixed both things 
> and verified that the behavior is now the same:
> 
> 
> http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.02/ 
> 
> 
> Regards, Peter




Re: ClassValue perf?

2016-05-05 Thread Peter Levart

Hi Michael,


On 05/04/2016 06:02 PM, Michael Haupt wrote:

> Hi Peter,
>
> thank you for chiming in again! :-) I'll look at this in depth on Friday.


Good. Because I found bugs in expunging logic and a discrepancy of 
behavior when a value is installed concurrently by some other thread and 
then later removed while the 1st thread is still calculating the value. 
Current ClassValue re-tries the computation until it can make sure there 
were no concurrent changes to the entry during its computation. I fixed 
both things and verified that the behavior is now the same:


http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.02/
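
The "re-try the computation until no concurrent change was seen" behaviour
mentioned above can be sketched as follows. This is not the code from the
webrev and not the JDK implementation: RetryingCache is a made-up name, and
the sketch uses one global removal counter for simplicity, which causes more
retries than a per-entry scheme would. Null values are not supported.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

// Hypothetical sketch: recompute when a remove() overlapped the computation.
final class RetryingCache<K, V> {
    private final ConcurrentHashMap<K, V> values = new ConcurrentHashMap<>();
    private final AtomicLong removals = new AtomicLong(); // bumped by every remove()
    private final Function<K, V> compute;

    RetryingCache(Function<K, V> compute) {
        this.compute = compute;
    }

    V get(K key) {
        for (;;) {
            V existing = values.get(key);
            if (existing != null) {
                return existing;
            }
            long before = removals.get();
            V computed = compute.apply(key);             // may run in several threads at once
            V raced = values.putIfAbsent(key, computed); // first installer wins
            if (removals.get() != before) {
                // A remove() happened while we were computing: the result may be
                // stale, so undo our own install (if any) and start over.
                values.remove(key, computed);
                continue;
            }
            return raced != null ? raced : computed;
        }
    }

    void remove(K key) {
        removals.incrementAndGet();
        values.remove(key);
    }
}
```

The point of the loop is that a value computed before a concurrent removal is
never left installed as if it were fresh.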

Regards, Peter





Re: ClassValue perf?

2016-05-04 Thread Michael Haupt
Hi Peter,

thank you for chiming in again! :-) I'll look at this in depth on Friday.

Best,

Michael

> On 04.05.2016 at 17:50, Peter Levart wrote:
> 
> Hi,
> 
> On 04/29/2016 10:28 AM, Michael Haupt wrote:
>> All,
>> 
>> see http://cr.openjdk.java.net/~mhaupt/8031043/ for a snapshot of what is 
>> currently available.
>> 
>> We have three patches:
>> * Christian's, which simply reduces the HashMap size,
>> * Peter's, which refactors ClassValueMap into a WeakHashMap,
>> * mine, which attempts to introduce the single-value storage optimisation 
>> John had suggested (I worked on performance with Aleksey - thanks!).
>> 
>> All of these are collected in the patches subdirectory for convenience. 
>> (Peter, I adapted your patch to the new Unsafe location.)
>> 
>> I extended Peter's benchmark (thanks!) to cover single-value storage; the 
>> source code is in the benchmark subdirectory, together with raw results from 
>> running the benchmark with each of the three patches applied. A results-only 
>> overview is in benchmark-results.txt.
>> 
>> The three are roughly on par. I'm not sure the single-value storage 
>> optimisation improves much on footprint given the additional data that must 
>> be kept around to make transition to map storage safe.
>> 
>> Opinions?
> 
> I must admit that my old patch is very complex, so I doubt anyone will take 
> time to review it. It is almost a clean-room re-implementation of ClassValue 
> API. My main motivation was footprint optimization for all sizes - not just 
> one value per class as I doubt this will be very common situation anyway. 
> Current ClassValue maintains 2 parallel hash-tables per class. A WeakHashMap 
> which is accessed with proper synchronization and an optimized "cache" of 
> entries for quick access. This makes it consume almost 100 bytes per (Class, 
> ClassValue) pair. I managed to almost halve the overhead for a typical
> situation (1024 classes x 16 ClassValue(s)), but at the price of complexity.
> 
> Reviving this thread made me think about ClassValue again and I got another 
> idea. This is an experiment to see if ConcurrentHashMap could be leveraged to 
> implement ClassValue API with little added complexity:
> 
> 
> http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.01/ 
> 
> 
> And here are the results of a benchmark comparing JDK 9 original with this 
> alternative:
> 
> 
> http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/ClassValueBench.java
>  
> 
> 
> It is a little slower for random access of bigger sizes and #s of classes. 
> Most probably a consequence of reduced cache hit ratio as CHM is a classical 
> hash table with buckets implemented as linked list of entries whereas jdk 9 
> ClassValue cache is a linear-scan hash table which has better cache locality. 
> This is particularly obvious in sequential access where CHM behaves on-par. 
> It's a pity that CHM has a non-changeable load factor of 0.75 as changing 
> this to 0.5 would most certainly improve benchmark results for a little more 
> memory.
> 
> Where this version excels is in footprint. I managed to more than halve the 
> overhead. There's only a single ReferenceQueue needed and consequently 
> expunging of stale data is more prompt and thorough. The code of ClassValue 
> has been more than halved too.
> 
> What do you think?
> 
> Regards, Peter




Re: ClassValue perf?

2016-05-04 Thread Michael Haupt
Hi again,

I've uploaded a reformatted results file (and an Excel sheet in case that's 
interesting) that show the results side by side.

Best,

Michael

> On 02.05.2016 at 10:38, Michael Haupt wrote:
> 
> Hi Jochen,
> 
> thanks for clarifying. I've added results from running the benchmarks on the 
> unpatched JDK 9 base (see the CR link). The twisti and plevart patches 
> perform better for large numbers of classes and class values; the mhaupt 
> patch is weaker than the baseline in those settings.
> 
> As pointed out in my reply to Rémi, I'm very much in favour of using 
> Christian's patch too.
> 
> Best,
> 
> Michael
> 
>> On 29.04.2016 at 19:14, Jochen Theodorou wrote:
>> 
>> Hi,
>> 
>> there was not really any misunderstanding, just that some optimizations have 
>> to be considered carefully. I am no JDK reviewer, so I can give only my 
>> opinion as a user. But I am missing a comparison with the unpatched version. 
>> Comparing the given results and considering the size of the patches and that 
>> they might have to be reconsidered later on would lead me to prefer the 
>> twisti version actually. But since I am missing the comparison with the 
>> unpatched version I cannot really judge the performance penalty a resize of 
>> the map will without doubt introduce. 
>> http://cr.openjdk.java.net/~mhaupt/8031043/benchmark/ClassValueBench.java 
>> contains some numbers, but I cannot tell if they compare or not. At least it 
>> does not contain the numbers I would expect
>> 
>> bye Jochen
>> 
>> On 29.04.2016 15:21, Michael Haupt wrote:
>>> Hi Jochen,
>>> 
>>>> On 29.04.2016 at 14:42, Jochen Theodorou wrote:
>>>> On 29.04.2016 13:19, Michael Haupt wrote:
>>>>> Hi Jochen,
>>>>> 
>>>>>> On 29.04.2016 at 12:17, Jochen Theodorou wrote:
>>>>>> my fear is a bit that having only a single value, will not be enough
>>>>>> if you have for example multiple dynamic languages running... let's
>>>>>> say Nashorn, JRuby, and Groovy. Also, if I ever get there, using
>>>>>> multiple values would become a normal case in Groovy.
>>>>>> 
>>>>>> So any size depending optimization looks problematic for me
>>>>> 
>>>>> I may misunderstand you here - note the patch does not introduce a
>>>>> single-value *only* ClassValue. The patch is meant to introduce a
>>>>> special case for as long as there is only one value associated with a
>>>>> class. As soon as a second value comes in, the ClassValue will
>>>>> transition to the usual map storage.
>>>>> 
>>>>> Please let me know if this is a response to your concern.
>>>> 
>>>> how does performance compare to cases of 2-12 values?
>>> 
>>> OK, I'm still not sure if there was a misunderstanding or not, and
>>> whether my response has clarified that. Please let me know.
>>> 
>>> To answer your question, see the numbers reported in
>>> http://cr.openjdk.java.net/~mhaupt/8031043/bench-results.txt - I'm not
>>> going to quote them in full detail here, but overall the numbers for 2,
>>> 4, and 16 values are on par for the randomGenerator and
>>> sequentialGenerator benchmarks, and show slightly better performance (on
>>> the order of 1ns) for the twisti and plevart patches.
>>> 
>>> Best,
>>> 
>>> Michael
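
The special case described in the quoted exchange (one inline value for as
long as a class has a single association, then a transition to map storage)
can be sketched as below. This is not the mhaupt patch: SingleOrMapStore is a
hypothetical name, null keys are not supported, and plain synchronized methods
stand in for whatever bookkeeping the real patch needs to make the transition
safe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of single-value storage that inflates to a map.
final class SingleOrMapStore<K, V> {
    private K singleKey;   // in use while at most one entry exists
    private V singleValue;
    private Map<K, V> map; // non-null once a second distinct key was stored

    synchronized V get(K key) {
        if (map != null) {
            return map.get(key);
        }
        return (singleKey != null && singleKey.equals(key)) ? singleValue : null;
    }

    synchronized void put(K key, V value) {
        if (map != null) {
            map.put(key, value);
            return;
        }
        if (singleKey == null || singleKey.equals(key)) {
            singleKey = key;
            singleValue = value;
            return;
        }
        // Second distinct key: move the inline entry into a map and stay there.
        map = new HashMap<>();
        map.put(singleKey, singleValue);
        map.put(key, value);
        singleKey = null;
        singleValue = null;
    }

    synchronized boolean isInflated() {
        return map != null;
    }
}
```

Once inflated, the store behaves like the ordinary map-backed case, so code
that associates 2-12 values with a class only pays for the extra check on
each access.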




Re: ClassValue perf?

2016-05-04 Thread Peter Levart

Hi,


On 04/29/2016 10:28 AM, Michael Haupt wrote:
> All,
>
> see http://cr.openjdk.java.net/~mhaupt/8031043/ for a snapshot of what
> is currently available.
>
> We have three patches:
> * Christian's, which simply reduces the HashMap size,
> * Peter's, which refactors ClassValueMap into a WeakHashMap,
> * mine, which attempts to introduce the single-value storage
> optimisation John had suggested (I worked on performance with Aleksey
> - thanks!).
>
> All of these are collected in the patches subdirectory for
> convenience. (Peter, I adapted your patch to the new Unsafe location.)
>
> I extended Peter's benchmark (thanks!) to cover single-value storage;
> the source code is in the benchmark subdirectory, together with raw
> results from running the benchmark with each of the three patches
> applied. A results-only overview is in benchmark-results.txt.
>
> The three are roughly on par. I'm not sure the single-value storage
> optimisation improves much on footprint given the additional data that
> must be kept around to make transition to map storage safe.
>
> Opinions?


I must admit that my old patch is very complex, so I doubt anyone will 
take the time to review it. It is almost a clean-room re-implementation 
of the ClassValue API. My main motivation was footprint optimization for 
all sizes, not just one value per class, as I doubt this will be a very 
common situation anyway. The current ClassValue maintains 2 parallel hash 
tables per class: a WeakHashMap which is accessed with proper 
synchronization, and an optimized "cache" of entries for quick access. 
This makes it consume almost 100 bytes per (Class, ClassValue) pair. I 
managed to almost halve the overhead for the typical situation (1024 
classes x 16 ClassValue(s)), but at the price of complexity.


Reviving this thread made me think about ClassValue again and I got 
another idea. This is an experiment to see if ConcurrentHashMap could be 
leveraged to implement ClassValue API with little added complexity:


http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/webrev.01/
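
To make the idea concrete, here is a minimal sketch of a ClassValue-like 
lookup built on ConcurrentHashMap. It is not the code from the webrev: the 
class name ChmClassValue is made up, it keeps one map per value object keyed 
strongly by Class (so it would pin classes in memory, which the real 
ClassValue avoids with weak references), and it assumes the computed value 
is never null.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration class; NOT the code from the webrev.
final class ChmClassValue<T> {
    // Assumption: one map per ChmClassValue, keyed strongly by Class.
    // The real ClassValue uses weak references so classes can be unloaded.
    private final ConcurrentHashMap<Class<?>, T> values = new ConcurrentHashMap<>();
    private final Function<Class<?>, T> computer;

    ChmClassValue(Function<Class<?>, T> computer) {
        this.computer = computer;
    }

    T get(Class<?> type) {
        T value = values.get(type);              // lock-free fast path
        if (value == null) {
            // Compute outside of any map lock, then publish with putIfAbsent.
            // Racing threads may compute redundantly; only one result is kept.
            T computed = computer.apply(type);   // assumed to be non-null
            T raced = values.putIfAbsent(type, computed);
            value = (raced != null) ? raced : computed;
        }
        return value;
    }

    void remove(Class<?> type) {
        values.remove(type);
    }

    public static void main(String[] args) {
        ChmClassValue<String> names = new ChmClassValue<>(Class::getSimpleName);
        System.out.println(names.get(String.class));
        System.out.println(names.get(Integer.class));
    }
}
```

The sketch uses get() followed by putIfAbsent() rather than 
computeIfAbsent(), so that a computation which itself looks up other 
classes does not run while the map holds a bin lock.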

And here are the results of a benchmark comparing JDK 9 original with 
this alternative:


http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative2/ClassValueBench.java

It is a little slower for random access with bigger sizes and numbers of 
classes. This is most probably a consequence of a reduced CPU cache hit 
ratio, as CHM is a classical hash table with buckets implemented as linked 
lists of entries, whereas the JDK 9 ClassValue cache is a linear-scan hash 
table, which has better cache locality. This is particularly obvious in 
sequential access, where CHM performs on par. It's a pity that CHM has a 
non-changeable load factor of 0.75, as changing this to 0.5 would most 
certainly improve benchmark results at the cost of a little more memory.


Where this version excels is in footprint. I managed to more than halve 
the overhead. Only a single ReferenceQueue is needed and, consequently, 
expunging of stale data is more prompt and thorough. The code of 
ClassValue has been more than halved too.


What do you think?

Regards, Peter





Re: ClassValue perf?

2016-05-02 Thread Charles Oliver Nutter
On Wed, May 6, 2015 at 6:36 PM, Jochen Theodorou  wrote:

> Charlie, did you ever get to writing some benchmarks?
>

Unfortunately not but we are getting into a performance phase over the next
couple months. I'll see what I can come up with.

- Charlie


Re: ClassValue perf?

2016-05-02 Thread Michael Haupt
Hi Jochen,

thanks for clarifying. I've added results from running the benchmarks on the 
unpatched JDK 9 base (see the CR link). The twisti and plevart patches perform 
better for large numbers of classes and class values; the mhaupt patch is 
weaker than the baseline in those settings.

As pointed out in my reply to Rémi, I'm very much in favour of using 
Christian's patch too.

Best,

Michael

> On 29.04.2016 at 19:14, Jochen Theodorou wrote:
> 
> Hi,
> 
> there was not really any misunderstanding, just that some optimizations have 
> to be considered carefully. I am no JDK reviewer, so I can give only my 
> opinion as a user. But I am missing a comparison with the unpatched version. 
> Comparing the given results and considering the size of the patches and that 
> they might have to be reconsidered later on would lead me to prefer the 
> twisti version actually. But since I am missing the compare with the 
> unpatched version I cannot really judge the performance penalty a resize of 
> the map will without doubt introduce. 
> http://cr.openjdk.java.net/~mhaupt/8031043/benchmark/ClassValueBench.java 
> contains some numbers, but I cannot tell if they compare or not. At least it 
> does not contain the numbers I would expect
> 
> bye Jochen
> 
> On 29.04.2016 15:21, Michael Haupt wrote:
>> Hi Jochen,
>> 
>>> On 29.04.2016 at 14:42, Jochen Theodorou wrote:
>>> On 29.04.2016 13:19, Michael Haupt wrote:
>>>> Hi Jochen,
>>>> 
>>>>> On 29.04.2016 at 12:17, Jochen Theodorou wrote:
>>>>> my fear is a bit that having only a single value, will not be enough
>>>>> if you have for example multiple dynamic languages running... let's
>>>>> say Nashorn, JRuby, and Groovy. Also, if I ever get there, using
>>>>> multiple values would become a normal case in Groovy.
>>>>> 
>>>>> So any size depending optimization looks problematic for me
>>>> 
>>>> I may misunderstand you here - note the patch does not introduce a
>>>> single-value *only* ClassValue. The patch is meant to introduce a
>>>> special case for as long as there is only one value associated with a
>>>> class. As soon as a second value comes in, the ClassValue will
>>>> transition to the usual map storage.
>>>> 
>>>> Please let me know if this is a response to your concern.
>>> 
>>> how does performance compare to cases of 2-12 values?
>> 
>> OK, I'm still not sure if there was a misunderstanding or not, and
>> whether my response has clarified that. Please let me know.
>> 
>> To answer your question, see the numbers reported in
>> http://cr.openjdk.java.net/~mhaupt/8031043/bench-results.txt - I'm not
>> going to quote them in full detail here, but overall the numbers for 2,
>> 4, and 16 values are on par for the randomGenerator and
>> sequentialGenerator benchmarks, and show slightly better performance (on
>> the order of 1ns) for the twisti and plevart patches.
>> 
>> Best,
>> 
>> Michael



Re: ClassValue perf?

2016-05-02 Thread Michael Haupt
Hi Rémi,

thanks for your reply - note that my patch is merely the result of an effort to 
meet the requirement of making the single-value storage suggested in the 
discussion on https://bugs.openjdk.java.net/browse/JDK-8031043 fast. This 
appears to be difficult. I agree that the performance isn't good enough to 
warrant integrating this code, especially given the complexity.

In fact, I think the obvious solution (reducing the initial hash table size to 
2) should be chosen. It introduces no performance penalty, reduces footprint, 
and otherwise leaves the complex code untouched.

Thanks for pointing out the oddity in removeSingleEntry - this I overlooked in 
one of the refactorings. :-)

BTW the VarHandles implementation is on the way to not needing ClassValue any 
more.

Best,

Michael

> On 29.04.2016 at 17:32, Remi Forax wrote:
> 
> Hi Michael,
> experience has proven that the code of ClassValue is hard to get right; 
> adding any optimization to such code will make it more complex, less 
> readable, and more error-prone, so any optimization introduced has to be 
> really worth it.
> I may not have read the perf numbers correctly, but IMO the gain is not 
> enough. 
> 
> The other problem is that ClassValue is currently used by the implementations 
> of method handles and var handles and may be used by several other classes in 
> the future, so the single-value optimization is/will not be very reliable.
> 
> I've not seriously reviewed your patch, but the first two lines of 
> removeSingleEntry seem rather mysterious.
> 
> cheers,
> Rémi
> 
> From: "Michael Haupt" 
> To: "Da Vinci Machine Project" 
> Sent: Friday, 29 April 2016 13:19:32
> Subject: Re: ClassValue perf?
> 
> Hi Jochen,
> 
>> On 29.04.2016 at 12:17, Jochen Theodorou <blackd...@gmx.org> wrote:
>> my fear is a bit that having only a single value, will not be enough if you 
>> have for example multiple dynamic languages running... let's say Nashorn, 
>> JRuby, and Groovy. Also, if I ever get there, using multiple values would 
>> become a normal case in Groovy.
>> 
>> So any size depending optimization looks problematic for me
> 
> I may misunderstand you here - note the patch does not introduce a 
> single-value *only* ClassValue. The patch is meant to introduce a special 
> case for as long as there is only one value associated with a class. As soon 
> as a second value comes in, the ClassValue will transition to the usual map 
> storage.
> 
> Please let me know if this is a response to your concern.
> 
> Best,
> 
> Michael



Re: ClassValue perf?

2016-04-29 Thread Jochen Theodorou

Hi,

there was not really any misunderstanding, just that some optimizations 
have to be considered carefully. I am no JDK reviewer, so I can give 
only my opinion as a user. But I am missing a comparison with the 
unpatched version. Comparing the given results, and considering the size 
of the patches and that they might have to be reconsidered later on, 
would lead me to prefer the twisti version actually. But since I am 
missing the comparison with the unpatched version, I cannot really judge 
the performance penalty that a resize of the map will without doubt 
introduce. 
http://cr.openjdk.java.net/~mhaupt/8031043/benchmark/ClassValueBench.java contains 
some numbers, but I cannot tell whether they are comparable or not. At 
least it does not contain the numbers I would expect.


bye Jochen

On 29.04.2016 15:21, Michael Haupt wrote:

> Hi Jochen,
>
>> On 29.04.2016 at 14:42, Jochen Theodorou <blackd...@gmx.org> wrote:
>> On 29.04.2016 13:19, Michael Haupt wrote:
>>> Hi Jochen,
>>>
>>>> On 29.04.2016 at 12:17, Jochen Theodorou <blackd...@gmx.org> wrote:
>>>> my fear is a bit that having only a single value, will not be enough
>>>> if you have for example multiple dynamic languages running... let's
>>>> say Nashorn, JRuby, and Groovy. Also, if I ever get there, using
>>>> multiple values would become a normal case in Groovy.
>>>>
>>>> So any size depending optimization looks problematic for me
>>>
>>> I may misunderstand you here - note the patch does not introduce a
>>> single-value *only* ClassValue. The patch is meant to introduce a
>>> special case for as long as there is only one value associated with a
>>> class. As soon as a second value comes in, the ClassValue will
>>> transition to the usual map storage.
>>>
>>> Please let me know if this is a response to your concern.
>>
>> how does performance compare to cases of 2-12 values?
>
> OK, I'm still not sure if there was a misunderstanding or not, and
> whether my response has clarified that. Please let me know.
>
> To answer your question, see the numbers reported in
> http://cr.openjdk.java.net/~mhaupt/8031043/bench-results.txt - I'm not
> going to quote them in full detail here, but overall the numbers for 2,
> 4, and 16 values are on par for the randomGenerator and
> sequentialGenerator benchmarks, and show slightly better performance (on
> the order of 1ns) for the twisti and plevart patches.
>
> Best,
>
> Michael



Re: ClassValue perf?

2016-04-29 Thread Remi Forax
Hi Michael, 
experience has proven that the code of ClassValue is hard to get right; 
adding any optimization to such code will make it more complex, less 
readable, and more error-prone, so any optimization introduced has to be 
really worth it. 
I may not have read the perf numbers correctly, but IMO the gain is not 
enough. 

The other problem is that ClassValue is currently used by the implementations 
of method handles and var handles and may be used by several other classes in 
the future, so the single-value optimization is/will not be very reliable. 

I've not seriously reviewed your patch, but the first two lines of 
removeSingleEntry seem rather mysterious. 

cheers, 
Rémi 

- Original message -

> From: "Michael Haupt" 
> To: "Da Vinci Machine Project" 
> Sent: Friday, 29 April 2016 13:19:32
> Subject: Re: ClassValue perf?

> Hi Jochen,

> > On 29.04.2016 at 12:17, Jochen Theodorou < blackd...@gmx.org > wrote:
> 
> > my fear is a bit that having only a single value, will not be enough if you
> > have for example multiple dynamic languages running... let's say Nashorn,
> > JRuby, and Groovy. Also, if I ever get there, using multiple values would
> > become a normal case in Groovy.
> 

> > So any size depending optimization looks problematic for me
> 

> I may misunderstand you here - note the patch does not introduce a
> single-value *only* ClassValue. The patch is meant to introduce a special
> case for as long as there is only one value associated with a class. As soon
> as a second value comes in, the ClassValue will transition to the usual map
> storage.

> Please let me know if this is a response to your concern.

> Best,

> Michael



Re: ClassValue perf?

2016-04-29 Thread Michael Haupt
Hi Jochen,

> On 29.04.2016 at 14:42, Jochen Theodorou wrote:
> On 29.04.2016 13:19, Michael Haupt wrote:
>> Hi Jochen,
>> 
>>> On 29.04.2016 at 12:17, Jochen Theodorou wrote:
>>> my fear is a bit that having only a single value, will not be enough
>>> if you have for example multiple dynamic languages running... let's
>>> say Nashorn, JRuby, and Groovy. Also, if I ever get there, using
>>> multiple values would become a normal case in Groovy.
>>> 
>>> So any size depending optimization looks problematic for me
>> 
>> I may misunderstand you here - note the patch does not introduce a
>> single-value *only* ClassValue. The patch is meant to introduce a
>> special case for as long as there is only one value associated with a
>> class. As soon as a second value comes in, the ClassValue will
>> transition to the usual map storage.
>> 
>> Please let me know if this is a response to your concern.
> 
> how does performance compare to cases of 2-12 values?

OK, I'm still not sure if there was a misunderstanding or not, and whether my 
response has clarified that. Please let me know.

To answer your question, see the numbers reported in 
http://cr.openjdk.java.net/~mhaupt/8031043/bench-results.txt - I'm not going to 
quote them in full detail here, but overall the numbers for 2, 4, and 16 values 
are on par for the randomGenerator and sequentialGenerator benchmarks, and show 
slightly better performance (on the order of 1ns) for the twisti and plevart 
patches.

Best,

Michael



Re: ClassValue perf?

2016-04-29 Thread Jochen Theodorou



On 29.04.2016 13:19, Michael Haupt wrote:

> Hi Jochen,
>
>> On 29.04.2016 at 12:17, Jochen Theodorou <blackd...@gmx.org> wrote:
>> my fear is a bit that having only a single value, will not be enough
>> if you have for example multiple dynamic languages running... let's
>> say Nashorn, JRuby, and Groovy. Also, if I ever get there, using
>> multiple values would become a normal case in Groovy.
>>
>> So any size depending optimization looks problematic for me
>
> I may misunderstand you here - note the patch does not introduce a
> single-value *only* ClassValue. The patch is meant to introduce a
> special case for as long as there is only one value associated with a
> class. As soon as a second value comes in, the ClassValue will
> transition to the usual map storage.
>
> Please let me know if this is a response to your concern.


how does performance compare to cases of 2-12 values?

bye Jochen




Re: ClassValue perf?

2016-04-29 Thread Michael Haupt
Hi Jochen,

> On 29.04.2016 at 12:17, Jochen Theodorou wrote:
> my fear is a bit that having only a single value, will not be enough if you 
> have for example multiple dynamic languages running... let's say Nashorn, 
> JRuby, and Groovy. Also, if I ever get there, using multiple values would 
> become a normal case in Groovy.
> 
> So any size depending optimization looks problematic for me

I may misunderstand you here - note the patch does not introduce a single-value 
*only* ClassValue. The patch is meant to introduce a special case for as long 
as there is only one value associated with a class. As soon as a second value 
comes in, the ClassValue will transition to the usual map storage.
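
As a rough illustration of that transition, here is a single-threaded 
sketch. It is not the code from my patch: the class name SingleOrMapStore 
and its methods are made up for this mail, and it leaves out the extra 
bookkeeping that a concurrent transition needs to be safe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, single-threaded illustration; not the code from the patch.
final class SingleOrMapStore<K, V> {
    private K singleKey;      // used while at most one entry exists
    private V singleValue;
    private Map<K, V> map;    // non-null once a second key has arrived

    V get(K key) {
        if (map != null) {
            return map.get(key);
        }
        return key.equals(singleKey) ? singleValue : null;
    }

    void put(K key, V value) {
        if (map != null) {
            map.put(key, value);
        } else if (singleKey == null || singleKey.equals(key)) {
            singleKey = key;  // still the single-value case
            singleValue = value;
        } else {
            // A second distinct key arrived: switch to map storage for good.
            map = new HashMap<>();
            map.put(singleKey, singleValue);
            map.put(key, value);
            singleKey = null;
            singleValue = null;
        }
    }

    boolean usesMap() {
        return map != null;
    }

    public static void main(String[] args) {
        SingleOrMapStore<String, Integer> store = new SingleOrMapStore<>();
        store.put("a", 1);
        System.out.println(store.usesMap());
        store.put("b", 2);
        System.out.println(store.usesMap());
    }
}
```

Running main shows the store staying in single-value mode after the first 
put and switching to the map after the second distinct key.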

Please let me know if this is a response to your concern.

Best,

Michael



Re: ClassValue perf?

2016-04-29 Thread Jochen Theodorou



On 29.04.2016 10:28, Michael Haupt wrote:

> All,
>
> see http://cr.openjdk.java.net/~mhaupt/8031043/ for a snapshot of what
> is currently available.
>
> We have three patches:
> * Christian's, which simply reduces the HashMap size,
> * Peter's, which refactors ClassValueMap into a WeakHashMap,
> * mine, which attempts to introduce the single-value storage
>   optimization John had suggested (I worked on performance with Aleksey -
>   thanks!).
>
> All of these are collected in the patches subdirectory for convenience.
> (Peter, I adapted your patch to the new Unsafe location.)
>
> I extended Peter's benchmark (thanks!) to cover single-value storage;
> the source code is in the benchmark subdirectory, together with raw
> results from running the benchmark with each of the three patches
> applied. A results-only overview is in benchmark-results.txt.
>
> The three are roughly on par. I'm not sure the single-value storage
> optimization improves much on footprint given the additional data that
> must be kept around to make transition to map storage safe.
>
> Opinions?


My fear is that optimizing for only a single value will not be enough if 
you have, for example, multiple dynamic languages running, let's say 
Nashorn, JRuby, and Groovy. Also, if I ever get there, using multiple 
values would become a normal case in Groovy.


So any size-dependent optimization looks problematic to me.

bye Jochen



Re: ClassValue perf?

2016-04-29 Thread Michael Haupt
All,

see http://cr.openjdk.java.net/~mhaupt/8031043/ for a snapshot of what is 
currently available.

We have three patches:
* Christian's, which simply reduces the HashMap size,
* Peter's, which refactors ClassValueMap into a WeakHashMap,
* mine, which attempts to introduce the single-value storage optimisation John 
had suggested (I worked on performance with Aleksey - thanks!).

All of these are collected in the patches subdirectory for convenience. (Peter, 
I adapted your patch to the new Unsafe location.)

I extended Peter's benchmark (thanks!) to cover single-value storage; the 
source code is in the benchmark subdirectory, together with raw results from 
running the benchmark with each of the three patches applied. A results-only 
overview is in benchmark-results.txt.

The three are roughly on par. I'm not sure the single-value storage 
optimisation improves much on footprint given the additional data that must be 
kept around to make transition to map storage safe.

Opinions?

Best,

Michael



Re: ClassValue perf?

2015-05-06 Thread Jochen Theodorou

On 30.04.2015 15:43, Charles Oliver Nutter wrote:

> On Mon, Apr 27, 2015 at 12:50 PM, Jochen Theodorou wrote:
>
>> On 27.04.2015 19:17, Charles Oliver Nutter wrote:
>>
>>> Jochen: Is your class-to-metaclass map usable apart from the Groovy
>>> codebase?
>>
>> Yes. Look for org.codehaus.groovy.reflection.GroovyClassValuePreJava7 which
>> is normally wrapped by a factory.
>
> Excellent, thank you!


Charlie, did you ever get to writing some benchmarks?

bye Jochen


--
Jochen "blackdrag" Theodorou
blog: http://blackdragsview.blogspot.com/



Re: ClassValue perf?

2015-05-03 Thread Peter Levart



On 05/03/2015 11:10 AM, Remi Forax wrote:

> Hi Peter,
> computeValue() may recursively call get(), for example to crawl the
> inheritance hierarchy, so I am not sure a lock is a good idea here,
> because in that case it usually takes several milliseconds to complete
> the top-level computeValue().
>
> regards,
> Rémi


In that case computeValue() must be called without the lock held and the 
result then CAS-ed in. As I said, this option is simple to implement, and 
the change is localized to the ClassValue$Entry methods:


http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative/webrev.02/
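
The general shape of "compute without a lock, then CAS" can be sketched as 
follows. This is only an illustration of the technique and not the code in 
the webrev: CasEntry and getOrCompute are made-up names, the value is 
assumed to be non-null, and removal is left out.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical illustration; the names are not taken from the webrev.
final class CasEntry<T> {
    private final AtomicReference<T> value = new AtomicReference<>();

    T getOrCompute(Supplier<T> computeValue) {
        T current = value.get();
        if (current != null) {
            return current;                  // already installed
        }
        // No lock is held here, so computeValue may itself trigger
        // computations for other entries without risking a deadlock.
        T computed = computeValue.get();     // assumed to be non-null
        // Install the result only if no other thread got there first;
        // a losing thread discards its own result and uses the winner's.
        return value.compareAndSet(null, computed) ? computed : value.get();
    }

    public static void main(String[] args) {
        CasEntry<String> entry = new CasEntry<>();
        System.out.println(entry.getOrCompute(() -> "first"));
        System.out.println(entry.getOrCompute(() -> "second"));
    }
}
```

The trade-off is the one described for the original implementation: racing 
threads may compute more than one value, and all but one are thrown away.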

Regards, Peter

P.S.

Are you also experiencing problems accessing links on 
cr.openjdk.java.net? I have been getting 404 - Not Found on any URL for 
2 days now... SFTP works though, so I'm making these http URLs up from 
the sftp URLs. I hope this gets fixed soon.




On 05/03/2015 12:32 AM, Peter Levart wrote:

Hi,

I have considered using ClassValue in the past because it is quite 
fast (as fast as a hash table can be), but I was afraid of the footprint 
overhead: in the debugger I saw a structure growing before my eyes 
that was quite complicated and that I could not entirely understand. 
Now I took some time to actually try to understand and measure it. In 
a typical scenario that ClassValue was designed for (initial capacity 
== 32), initializing for example 16 ClassValues x 1024 Classes, jmap 
shows the following interesting entries, all of which amount to 
overhead (that's on a 64-bit JVM with compressed OOPs):



 num     #instances         #bytes  class name
----------------------------------------------
   1:         16384         655360  java.util.WeakHashMap$Entry
   2:         16402         524864  java.lang.ClassValue$Entry
   8:          1024         147456  [Ljava.util.WeakHashMap$Entry;
   9:          1025         147480  [Ljava.lang.ClassValue$Entry;
  13:          1024          65536  java.lang.ClassValue$ClassValueMap
  17:          1024          32768  java.lang.ref.ReferenceQueue
  21:          1024          16384  java.lang.ref.ReferenceQueue$Lock
----------------------------------------------
Total:                     1589848  (97 bytes/entry)


ClassValueMap is a WeakHashMap subclass which contains an array of 
WeakHashMap$Entry objects. In addition it maintains a parallel 
"cache" array of ClassValue$Entry objects. Both of those entry 
objects are WeakReferences. It means that each (Class,ClassValue) 
pair needs 2 WeakReferences (with additional fields) and 2 array 
slots to hold associated value.


So I wondered, would it be possible to simplify CV and make it more 
straight-forward by taking away almost half of overhead to get this:


 num     #instances         #bytes  class name
----------------------------------------------
   1:         16384         655360  java.lang.ClassValue$Entry
   7:          1024         147456  [Ljava.lang.ClassValue$Entry;
  13:          1024          40960  java.lang.ClassValue$ClassValueMap
  15:          1024          32768  java.lang.ref.ReferenceQueue
  19:          1024          16384  java.lang.ref.ReferenceQueue$Lock
----------------------------------------------
Total:                      892928  (54 bytes/entry)
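
For anyone who wants to re-check the per-entry figures: they are the byte 
totals of the two listings divided by the number of (Class, ClassValue) 
pairs, 1024 x 16 = 16384, rounded down. The small helper below is only a 
sketch for that arithmetic; FootprintMath and its method names are made up.

```java
// Hypothetical helper, only meant to re-check the arithmetic of the listings.
final class FootprintMath {
    // Bytes of overhead per (Class, ClassValue) pair, rounded down.
    static long bytesPerEntry(long totalBytes, int classes, int classValuesPerClass) {
        return totalBytes / ((long) classes * classValuesPerClass);
    }

    // Sum of the #bytes column of the first listing (original ClassValue).
    static long originalTotal() {
        return 655360L + 524864 + 147456 + 147480 + 65536 + 32768 + 16384;
    }

    // Sum of the #bytes column of the second listing (alternative).
    static long alternativeTotal() {
        return 655360L + 147456 + 40960 + 32768 + 16384;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerEntry(originalTotal(), 1024, 16));     // 97
        System.out.println(bytesPerEntry(alternativeTotal(), 1024, 16));  // 54
    }
}
```

The sums come out at 1589848 and 892928 bytes, so the second layout needs 
roughly 56% of the memory of the first.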


I tried and came up with the following:

http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative/webrev.01/

It was not easy to keep the performance at approximately the same 
level while re-designing the implementation, but I think I managed to 
get it to perform mostly the same for the fast-path case. This 
alternative implementation also guarantees that, unless remove() is 
used, computeValue() is called exactly once per (Class, ClassValue) 
pair. The original implementation documents that it can redundantly 
compute more than one value and then throw away all but one. This 
alternative implementation could easily be modified to do the same 
(using CAS instead of a lock) if anyone is afraid of deadlocks.


Here's a micro benchmark with results measuring original vs. 
alternative implementation. Attached results are for JDK9 on Intel i7 
/ Linux box using 4 concurrent threads for tests:


http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative/ClassValueBench.java


It would be interesting to see if and how it works for you too (just 
compile and prepend to bootclasspath).


Regards, Peter

On 04/30/2015 03:57 PM, Michael Haupt wrote:

Hi,

I'm looking at JDK-8031043 and would appreciate if you guys could 
send any code you think might benefit from a smaller initial CV 
memory footprint my way. Given what I've read, it could have some 
impact during startup (Groovy?) if the value is reduced to 1.


Best,

Michael

On 30.04.2015 at 15:43, Charles Oliver Nutter <head...@headius.com> wrote:


On Mon, Apr 27, 2015 at 12:50 PM, Jochen Theodorou <blackd...@gmx.org> wrote:

On 27.04.2015 19:17, Charles Oliver Nutter wrote:

Jochen: Is your class-to-metaclass map usable apart from the Groovy
codebase?



Yes. Look for 
org.codehaus.groovy.reflection.GroovyClassValuePreJava7 which

is n

Re: ClassValue perf?

2015-05-03 Thread Remi Forax

Hi Peter,
computeValue() may recursively call get(), for example to crawl the 
inheritance hierarchy, so I am not sure a lock is a good idea here, 
because in that case it usually takes several milliseconds to complete 
the top-level computeValue().
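
A small example of the kind of recursion meant here, using only the public 
java.lang.ClassValue API. The class InheritanceDepth is made up for 
illustration: computing the value for a class first asks for the value of 
its superclass, so get() is re-entered from inside computeValue().

```java
// Hypothetical example class; only java.lang.ClassValue is a real JDK API here.
final class InheritanceDepth {
    static final ClassValue<Integer> DEPTH = new ClassValue<Integer>() {
        @Override
        protected Integer computeValue(Class<?> type) {
            Class<?> parent = type.getSuperclass();
            // Recursive get(): the value of a class depends on the value
            // of its superclass. Object, interfaces and primitives have
            // no superclass and get depth 0.
            return (parent == null) ? 0 : get(parent) + 1;
        }
    };

    public static void main(String[] args) {
        System.out.println(DEPTH.get(Object.class));
        System.out.println(DEPTH.get(String.class));
        System.out.println(DEPTH.get(java.util.ArrayList.class));
    }
}
```

With an implementation that holds a lock around computeValue(), the whole 
chain of nested computations would run while the outermost lock is held.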


regards,
Rémi

On 05/03/2015 12:32 AM, Peter Levart wrote:

Hi,

I have considered using ClassValue in the past because it is quite 
fast (as fast as a hash table can be) but was afraid of footprint 
overhead, because I saw with debugger a structure being grown before 
my eyes that was quite complicated and that I could not understand 
entirely. Now I took some time to actually try to understand and 
measure it. In a typical scenario ClassValue was designed for (initial 
capacity == 32), initializing for example 16 ClassValues x 1024 
Classes, jmap shows the following interesting entries which all amount 
to overhead (that's on 64bit JVM with compressed OOPS):



 num     #instances         #bytes  class name
----------------------------------------------
   1:         16384         655360  java.util.WeakHashMap$Entry
   2:         16402         524864  java.lang.ClassValue$Entry
   8:          1024         147456  [Ljava.util.WeakHashMap$Entry;
   9:          1025         147480  [Ljava.lang.ClassValue$Entry;
  13:          1024          65536  java.lang.ClassValue$ClassValueMap
  17:          1024          32768  java.lang.ref.ReferenceQueue
  21:          1024          16384  java.lang.ref.ReferenceQueue$Lock
----------------------------------------------
Total:                     1589848  (97 bytes/entry)


ClassValueMap is a WeakHashMap subclass which contains an array of 
WeakHashMap$Entry objects. In addition it maintains a parallel "cache" 
array of ClassValue$Entry objects. Both of those entry objects are 
WeakReferences. It means that each (Class,ClassValue) pair needs 2 
WeakReferences (with additional fields) and 2 array slots to hold 
associated value.


So I wondered, would it be possible to simplify CV and make it more 
straight-forward by taking away almost half of overhead to get this:


 num   #instances    #bytes  class name
---------------------------------------------------------
   1:       16384    655360  java.lang.ClassValue$Entry
   7:        1024    147456  [Ljava.lang.ClassValue$Entry;
  13:        1024     40960  java.lang.ClassValue$ClassValueMap
  15:        1024     32768  java.lang.ref.ReferenceQueue
  19:        1024     16384  java.lang.ref.ReferenceQueue$Lock
---------------------------------------------------------
Total:               892928  (54 bytes/entry)


I tried and came up with the following:

http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative/webrev.01/

It was not easy to keep the performance approximately on the same 
level while re-designing the implementation. But I think I managed to 
get it to perform mostly the same for the fast-path case. This 
alternative implementation also guarantees that, unless remove() is 
used, computeValue() is called exactly once per (Class, ClassValue) 
pair. Original implementation explains that it can redundantly compute 
more than one value and then throw away all but one. This alternative 
implementation could easily be modified to do the same (using CAS 
instead of lock) if anyone is afraid of deadlocks.


Here's a micro benchmark with results measuring original vs. 
alternative implementation. Attached results are for JDK9 on Intel i7 
/ Linux box using 4 concurrent threads for tests:


http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative/ClassValueBench.java


It would be interesting to see if and how it works for you too (just 
compile and prepend to bootclasspath).


Regards, Peter

On 04/30/2015 03:57 PM, Michael Haupt wrote:

Hi,

I'm looking at JDK-8031043 and would appreciate if you guys could 
send any code you think might benefit from a smaller initial CV 
memory footprint my way. Given what I've read, it could have some 
impact during startup (Groovy?) if the value is reduced to 1.


Best,

Michael

On 30.04.2015 at 15:43, Charles Oliver Nutter <head...@headius.com> wrote:


On Mon, Apr 27, 2015 at 12:50 PM, Jochen Theodorou <blackd...@gmx.org> wrote:

On 27.04.2015 at 19:17, Charles Oliver Nutter wrote:

Jochen: Is your class-to-metaclass map usable apart from the Groovy
codebase?



Yes. Look for 
org.codehaus.groovy.reflection.GroovyClassValuePreJava7 which

is normally wrapped by a factory.


Excellent, thank you!

- Charlie



--

Dr. Michael Haupt | Principal Member of Technical Staff
Phone: +49 331 200 7277 | Fax: +49 331 200 7561
Oracle Java Platform Group | HotSpot Compiler Team
Oracle Deutschland B.V. & Co. KG, Schiffbauergasse 14 | 14467 Potsdam, Germany
Oracle is committed to developing practices and products that help protect the environment





___
mlvm-dev mailing list
mlvm-dev@

Re: ClassValue perf?

2015-05-02 Thread Peter Levart

Hi,

I have considered using ClassValue in the past because it is quite fast 
(as fast as a hash table can be), but I was afraid of the footprint 
overhead: in the debugger I watched a structure grow before my eyes that 
was quite complicated and that I could not entirely understand. Now I 
took some time to actually understand and measure it. In the typical 
scenario ClassValue was designed for (initial capacity == 32), 
initializing for example 16 ClassValues x 1024 Classes, jmap shows the 
following interesting entries, all of which amount to overhead (that's on 
a 64-bit JVM with compressed OOPs):



 num   #instances    #bytes  class name
---------------------------------------------------------
   1:       16384    655360  java.util.WeakHashMap$Entry
   2:       16402    524864  java.lang.ClassValue$Entry
   8:        1024    147456  [Ljava.util.WeakHashMap$Entry;
   9:        1025    147480  [Ljava.lang.ClassValue$Entry;
  13:        1024     65536  java.lang.ClassValue$ClassValueMap
  17:        1024     32768  java.lang.ref.ReferenceQueue
  21:        1024     16384  java.lang.ref.ReferenceQueue$Lock
---------------------------------------------------------
Total:              1589848  (97 bytes/entry)


ClassValueMap is a WeakHashMap subclass which contains an array of 
WeakHashMap$Entry objects. In addition it maintains a parallel "cache" 
array of ClassValue$Entry objects. Both of those entry objects are 
WeakReferences. It means that each (Class,ClassValue) pair needs 2 
WeakReferences (with additional fields) and 2 array slots to hold 
associated value.


So I wondered whether it would be possible to simplify CV and make it more 
straightforward by removing almost half of the overhead, to get this:


 num   #instances    #bytes  class name
---------------------------------------------------------
   1:       16384    655360  java.lang.ClassValue$Entry
   7:        1024    147456  [Ljava.lang.ClassValue$Entry;
  13:        1024     40960  java.lang.ClassValue$ClassValueMap
  15:        1024     32768  java.lang.ref.ReferenceQueue
  19:        1024     16384  java.lang.ref.ReferenceQueue$Lock
---------------------------------------------------------
Total:               892928  (54 bytes/entry)
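
The bytes/entry figures in the two listings follow from dividing each total by the number of (Class, ClassValue) pairs, 16 x 1024 = 16384. A small sketch of that arithmetic, with a made-up helper name (bytesPerEntry) and truncating integer division as the listings appear to use:

```java
final class FootprintPerEntry {
    // Hypothetical helper: overhead in bytes per (Class, ClassValue) pair.
    static long bytesPerEntry(long totalBytes, int classValues, int classes) {
        return totalBytes / ((long) classValues * classes);
    }

    public static void main(String[] args) {
        // Totals taken from the two jmap listings above.
        System.out.println(bytesPerEntry(1_589_848L, 16, 1024)); // original layout
        System.out.println(bytesPerEntry(892_928L, 16, 1024));   // alternative layout
    }
}
```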


I tried and came up with the following:

http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative/webrev.01/

It was not easy to keep the performance at approximately the same level 
while redesigning the implementation, but I think I managed to get it 
to perform mostly the same in the fast-path case. This alternative 
implementation also guarantees that, unless remove() is used, 
computeValue() is called exactly once per (Class, ClassValue) pair. 
The original implementation documents that it may redundantly compute more 
than one value and then throw away all but one. This alternative 
implementation could easily be modified to do the same (using CAS 
instead of a lock) if anyone is afraid of deadlocks.
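
The difference between the two strategies can be sketched roughly as follows. This is not the code from the webrev; it is a simplified illustration built on a ConcurrentHashMap with strong keys (so it ignores the weak-reference part entirely), and the class and method names are invented for the example. It also assumes the compute function never returns null.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class ComputeOnceSketch<V> {
    private final ConcurrentHashMap<Class<?>, V> map = new ConcurrentHashMap<>();
    private final Function<Class<?>, V> compute;

    ComputeOnceSketch(Function<Class<?>, V> compute) {
        this.compute = compute;
    }

    // Lock style: the computation runs under a lock, so it happens at most once
    // per key, but other threads wait while it runs.
    V getLocked(Class<?> type) {
        V v = map.get(type);
        if (v != null) return v;
        synchronized (this) {
            v = map.get(type);
            if (v == null) {
                v = compute.apply(type);
                map.put(type, v);
            }
            return v;
        }
    }

    // CAS style: compute without a lock, then publish with putIfAbsent.
    // Racing threads may each compute a value; all but the first are discarded.
    V getRacy(Class<?> type) {
        V v = map.get(type);
        if (v != null) return v;
        V candidate = compute.apply(type);
        V winner = map.putIfAbsent(type, candidate);
        return winner != null ? winner : candidate;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        ComputeOnceSketch<String> sketch = new ComputeOnceSketch<>(c -> {
            calls.incrementAndGet();
            return c.getName();
        });
        sketch.getLocked(String.class);
        sketch.getLocked(String.class);
        System.out.println(calls.get()); // one computation for one key
    }
}
```

The lock variant gives the exactly-once guarantee at the cost of holding a lock while user code runs; the CAS variant never blocks but gives up that guarantee.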


Here's a micro benchmark with results measuring original vs. alternative 
implementation. Attached results are for JDK9 on Intel i7 / Linux box 
using 4 concurrent threads for tests:


http://cr.openjdk.java.net/~plevart/misc/ClassValue.Alternative/ClassValueBench.java


It would be interesting to see if and how it works for you too (just 
compile and prepend to bootclasspath).


Regards, Peter

On 04/30/2015 03:57 PM, Michael Haupt wrote:

Hi,

I'm looking at JDK-8031043 and would appreciate if you guys could send 
any code you think might benefit from a smaller initial CV memory 
footprint my way. Given what I've read, it could have some impact 
during startup (Groovy?) if the value is reduced to 1.


Best,

Michael

On 30.04.2015 at 15:43, Charles Oliver Nutter <head...@headius.com> wrote:


On Mon, Apr 27, 2015 at 12:50 PM, Jochen Theodorou wrote:

On 27.04.2015 at 19:17, Charles Oliver Nutter wrote:

Jochen: Is your class-to-metaclass map usable apart from the Groovy
codebase?



Yes. Look for 
org.codehaus.groovy.reflection.GroovyClassValuePreJava7 which

is normally wrapped by a factory.


Excellent, thank you!

- Charlie



--

Dr. Michael Haupt | Principal Member of Technical Staff
Phone: +49 331 200 7277 | Fax: +49 331 200 7561
Oracle Java Platform Group | HotSpot Compiler Team
Oracle Deutschland B.V. & Co. KG, Schiffbauergasse 14 | 14467 Potsdam, Germany
Oracle is committed to developing practices and products that help protect the environment







Re: ClassValue perf?

2015-04-30 Thread Michael Haupt
Hi,

I'm looking at JDK-8031043 and would appreciate if you guys could send any code 
you think might benefit from a smaller initial CV memory footprint my way. 
Given what I've read, it could have some impact during startup (Groovy?) if the 
value is reduced to 1.

Best,

Michael

> On 30.04.2015 at 15:43, Charles Oliver Nutter wrote:
> 
> On Mon, Apr 27, 2015 at 12:50 PM, Jochen Theodorou wrote:
>> On 27.04.2015 at 19:17, Charles Oliver Nutter wrote:
>>> Jochen: Is your class-to-metaclass map usable apart from the Groovy
>>> codebase?
>> 
>> 
>> Yes. Look for org.codehaus.groovy.reflection.GroovyClassValuePreJava7 which
>> is normally wrapped by a factory.
> 
> Excellent, thank you!
> 
> - Charlie


-- 

 
Dr. Michael Haupt | Principal Member of Technical Staff
Phone: +49 331 200 7277 | Fax: +49 331 200 7561
Oracle Java Platform Group | HotSpot Compiler Team 
Oracle Deutschland B.V. & Co. KG, Schiffbauergasse 14 | 14467 Potsdam, Germany
Oracle is committed to developing practices and products that help protect the environment



Re: ClassValue perf?

2015-04-30 Thread Charles Oliver Nutter
On Mon, Apr 27, 2015 at 12:50 PM, Jochen Theodorou  wrote:
> On 27.04.2015 at 19:17, Charles Oliver Nutter wrote:
>> Jochen: Is your class-to-metaclass map usable apart from the Groovy
>> codebase?
>
>
> Yes. Look for org.codehaus.groovy.reflection.GroovyClassValuePreJava7 which
> is normally wrapped by a factory.

Excellent, thank you!

- Charlie


Re: ClassValue perf?

2015-04-30 Thread Charles Oliver Nutter
On Wed, Apr 29, 2015 at 4:02 AM, Doug Simon  wrote:
> We considered using ClassValue in Graal for associating each Node with its 
> NodeClass. Accessing the NodeClass is a very common operation in Graal (e.g., 
> it’s used to iterate over a Node’s inputs). However, brief experimentation 
> showed implementing this with ClassValue performed significantly worse than a 
> direct field access[1]. We currently use ClassValue to link Class values with 
> their Graal mirrors. Accessing this link is infrequent enough that the 
> performance trade off against injecting a field to java.lang.Class[2] is 
> acceptable.

That's what I'm banking on too. My case is similar to Groovy's: I need
a way to *initially* get the metaclass for a given JVM class. Unlike
Groovy, however, we still have to wrap Java objects in a JRuby-aware
wrapper, so subsequent accesses of the class via that object are via a
plain field. So the impact of ClassValue will mostly be at the border
between Ruby and Java, when we need to initially build that wrapper
and put some metaclass in it.

Of course the disadvantage of the wrapper is the wrapper itself. If we
could inject our IRubyObject interface into java.lang.Object my life
would be much better. But I digress.

> The memory footprint improvement suggested in JDK-8031043 would still help.

I'll have to take a look at that. We're pretty memory-sensitive since
Ruby's already fairly heap-intensive.

- Charlie


Re: ClassValue perf?

2015-04-29 Thread Doug Simon
We considered using ClassValue in Graal for associating each Node with its 
NodeClass. Accessing the NodeClass is a very common operation in Graal (e.g., 
it’s used to iterate over a Node’s inputs). However, brief experimentation 
showed implementing this with ClassValue performed significantly worse than a 
direct field access[1]. We currently use ClassValue to link Class values with 
their Graal mirrors. Accessing this link is infrequent enough that the 
performance trade off against injecting a field to java.lang.Class[2] is 
acceptable. The memory footprint improvement suggested in JDK-8031043 would 
still help.
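
For readers unfamiliar with the trade-off, here is a rough sketch of the two access paths being compared. It is not Graal's code: SketchNode, AddNode and NodeClassInfo are hypothetical stand-ins, and the per-class metadata is reduced to a name.

```java
// Hypothetical stand-in for per-class metadata such as Graal's NodeClass.
final class NodeClassInfo {
    final String name;

    NodeClassInfo(Class<?> type) {
        this.name = type.getSimpleName();
    }
}

abstract class SketchNode {
    static final ClassValue<NodeClassInfo> INFO = new ClassValue<NodeClassInfo>() {
        @Override
        protected NodeClassInfo computeValue(Class<?> type) {
            return new NodeClassInfo(type);
        }
    };

    // Direct field: the metadata is looked up once, when the node is created.
    private final NodeClassInfo nodeClass = INFO.get(getClass());

    // Fast path: a plain field load.
    NodeClassInfo viaField() {
        return nodeClass;
    }

    // Alternative: a ClassValue lookup on every access.
    NodeClassInfo viaClassValue() {
        return INFO.get(getClass());
    }
}

final class AddNode extends SketchNode {
    public static void main(String[] args) {
        SketchNode n = new AddNode();
        System.out.println(n.viaField() == n.viaClassValue());
    }
}
```

Both paths return the same object; the field costs one reference per instance, while the ClassValue path costs a lookup on each call.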

-Doug

[1] 
http://hg.openjdk.java.net/graal/graal/file/0b221b4ad707/graal/com.oracle.graal.graph/src/com/oracle/graal/graph/Node.java#l206
[2] http://hg.openjdk.java.net/graal/graal/rev/606959535fd4

> On Apr 27, 2015, at 6:40 PM, Christian Thalinger 
>  wrote:
> 
> 
>> On Apr 24, 2015, at 2:17 PM, John Rose  wrote:
>> 
>> On Apr 24, 2015, at 5:38 AM, Charles Oliver Nutter  
>> wrote:
>>> 
>>> Hey folks!
>>> 
>>> I'm wondering how the performance of ClassValue looks on recent
>>> OpenJDK 7 and 8 builds. JRuby 9000 will be Java 7+ only, so this is
>>> one place I'd like to simplify our code a bit.
>>> 
>>> I could measure myself, but I'm guessing some of you have already done
>>> a lot of exploration or have benchmarks handy. So, what say you?
>> 
>> I'm listening too.  We don't have any special optimizations for CVs,
>> and I'm hoping the generic code is a good-enough start.
> 
> A while ago (wow; it’s more than a year already) I was working on:
> 
> [#JDK-8031043] ClassValue's backing map should have a smaller initial size - 
> Java Bug System
> 
> and we had a conversation about it:
> 
> http://mail.openjdk.java.net/pipermail/mlvm-dev/2014-January/005597.html
> 
> It’s not about performance directly but it’s about memory usage and maybe the 
> one-value-per-class optimization John suggests is in fact a performance 
> improvement.  Someone should pick this one up.
> 
>> — John


Re: ClassValue perf?

2015-04-27 Thread Jochen Theodorou

On 27.04.2015 at 19:17, Charles Oliver Nutter wrote:

It seems I may have to write some benchmarks for this then. Just so I
understand, the equivalent non-ClassValue-based store would need to:

* Be atomic; value may calculate more than once but only be set once.
* Be weak; classes given class values must not be rooted as a result
(an external impl like in JRuby or Groovy would have to use weak maps
for this).

Jochen: Is your class-to-metaclass map usable apart from the Groovy codebase?


Yes. Look for org.codehaus.groovy.reflection.GroovyClassValuePreJava7 
which is normally wrapped by a factory. You can do for example



final AtomicInteger counter = new AtomicInteger();
GroovyClassValue classValue = new GroovyClassValuePreJava7(new ComputeValue() {
    // called when no value is cached yet for the given class
    String computeValue(Class type) {
        counter.incrementAndGet();
        return type.name;
    }
});


Normally, of course, we don't store strings; we store ClassInfo 
objects, which then break down to the meta classes and per-instance meta 
classes ;)
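
For comparison, the same counting example can be written against the JDK's own java.lang.ClassValue. This is only a sketch; the class and field names are invented here and are not part of Groovy.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CountingClassValueDemo {
    // Counts how often computeValue() actually runs.
    static final AtomicInteger COUNTER = new AtomicInteger();

    static final ClassValue<String> NAMES = new ClassValue<String>() {
        @Override
        protected String computeValue(Class<?> type) {
            COUNTER.incrementAndGet();
            return type.getName();
        }
    };

    public static void main(String[] args) {
        NAMES.get(String.class);
        NAMES.get(String.class); // served from the cache, no second computation
        System.out.println(COUNTER.get());
    }
}
```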


bye blackdrag

--
Jochen "blackdrag" Theodorou
blog: http://blackdragsview.blogspot.com/



Re: ClassValue perf?

2015-04-27 Thread Charles Oliver Nutter
It seems I may have to write some benchmarks for this then. Just so I
understand, the equivalent non-ClassValue-based store would need to:

* Be atomic; value may calculate more than once but only be set once.
* Be weak; classes given class values must not be rooted as a result
(an external impl like in JRuby or Groovy would have to use weak maps
for this).
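
A minimal sketch of a store meeting those two requirements, assuming a synchronized WeakHashMap keyed by Class. WeakClassStore is a made-up name, and this is not how JRuby or Groovy implement it; it also assumes the compute function never returns null.

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

final class WeakClassStore<V> {
    // Weak keys: an entry does not keep its Class reachable. Caveat: a value
    // that strongly references its key class would still pin that class.
    private final Map<Class<?>, V> map = new WeakHashMap<>();
    private final Function<Class<?>, V> compute;

    WeakClassStore(Function<Class<?>, V> compute) {
        this.compute = compute;
    }

    V get(Class<?> type) {
        synchronized (map) {
            V cached = map.get(type);
            if (cached != null) return cached;
        }
        // Computed outside the lock: may run more than once under contention.
        V candidate = compute.apply(type);
        synchronized (map) {
            // Set only once: the first stored value wins, later ones are dropped.
            V existing = map.putIfAbsent(type, candidate);
            return existing != null ? existing : candidate;
        }
    }

    public static void main(String[] args) {
        WeakClassStore<Object> store = new WeakClassStore<>(c -> new Object());
        System.out.println(store.get(String.class) == store.get(String.class));
    }
}
```

Every access takes a lock here, which is one reason to benchmark it against ClassValue rather than assume either is faster.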

Jochen: Is your class-to-metaclass map usable apart from the Groovy codebase?

- Charlie

On Mon, Apr 27, 2015 at 11:40 AM, Christian Thalinger
 wrote:
>
> On Apr 24, 2015, at 2:17 PM, John Rose  wrote:
>
> On Apr 24, 2015, at 5:38 AM, Charles Oliver Nutter 
> wrote:
>
>
> Hey folks!
>
> I'm wondering how the performance of ClassValue looks on recent
> OpenJDK 7 and 8 builds. JRuby 9000 will be Java 7+ only, so this is
> one place I'd like to simplify our code a bit.
>
> I could measure myself, but I'm guessing some of you have already done
> a lot of exploration or have benchmarks handy. So, what say you?
>
>
> I'm listening too.  We don't have any special optimizations for CVs,
> and I'm hoping the generic code is a good-enough start.
>
>
> A while ago (wow; it’s more than a year already) I was working on:
>
> [#JDK-8031043] ClassValue's backing map should have a smaller initial size -
> Java Bug System
>
> and we had a conversation about it:
>
> http://mail.openjdk.java.net/pipermail/mlvm-dev/2014-January/005597.html
>
> It’s not about performance directly but it’s about memory usage and maybe
> the one-value-per-class optimization John suggests is in fact a performance
> improvement.  Someone should pick this one up.
>
> — John


Re: ClassValue perf?

2015-04-27 Thread Christian Thalinger

> On Apr 24, 2015, at 2:17 PM, John Rose  wrote:
> 
> On Apr 24, 2015, at 5:38 AM, Charles Oliver Nutter  
> wrote:
>> 
>> Hey folks!
>> 
>> I'm wondering how the performance of ClassValue looks on recent
>> OpenJDK 7 and 8 builds. JRuby 9000 will be Java 7+ only, so this is
>> one place I'd like to simplify our code a bit.
>> 
>> I could measure myself, but I'm guessing some of you have already done
>> a lot of exploration or have benchmarks handy. So, what say you?
> 
> I'm listening too.  We don't have any special optimizations for CVs,
> and I'm hoping the generic code is a good-enough start.

A while ago (wow; it’s more than a year already) I was working on:

[#JDK-8031043] ClassValue's backing map should have a smaller initial size - 
Java Bug System 

and we had a conversation about it:

http://mail.openjdk.java.net/pipermail/mlvm-dev/2014-January/005597.html 


It’s not about performance directly but it’s about memory usage and maybe the 
one-value-per-class optimization John suggests is in fact a performance 
improvement.  Someone should pick this one up.

> — John


Re: ClassValue perf?

2015-04-27 Thread MacGregor, Duncan (GE Energy Management)
On 25/04/2015 13:44, "Remi Forax"  wrote:
>On 04/24/2015 11:17 PM, John Rose wrote:
>> On Apr 24, 2015, at 5:38 AM, Charles Oliver Nutter
>> wrote:
>>> Hey folks!
>>>
>>> I'm wondering how the performance of ClassValue looks on recent
>>> OpenJDK 7 and 8 builds. JRuby 9000 will be Java 7+ only, so this is
>>> one place I'd like to simplify our code a bit.
>>>
>>> I could measure myself, but I'm guessing some of you have already done
>>> a lot of exploration or have benchmarks handy. So, what say you?
>> I'm listening too.  We don't have any special optimizations for CVs,
>> and I'm hoping the generic code is a good-enough start.
>> - John
>
>I don't think I have any code that uses ClassValue in a fast path.
>I have several pieces of code that use a ClassValue when a callsite becomes
>megamorphic, but in that case I have found the perf of
>ClassValue.get() hard to separate from the fact that I also lose
>inlining.

I did have CV code in my fast paths for dispatch that I could do useful
class hierarchy analysis on, but couldn't get it to run quite fast enough
to be worthwhile. Whether that was due to the CV part or other factors I
didn't fully analyse, as CHA was not applicable to as many callsites as I'd
hoped.

Duncan.




Re: ClassValue perf?

2015-04-25 Thread Remi Forax


On 04/24/2015 11:17 PM, John Rose wrote:

On Apr 24, 2015, at 5:38 AM, Charles Oliver Nutter  wrote:

Hey folks!

I'm wondering how the performance of ClassValue looks on recent
OpenJDK 7 and 8 builds. JRuby 9000 will be Java 7+ only, so this is
one place I'd like to simplify our code a bit.

I could measure myself, but I'm guessing some of you have already done
a lot of exploration or have benchmarks handy. So, what say you?

I'm listening too.  We don't have any special optimizations for CVs,
and I'm hoping the generic code is a good-enough start.
— John


I don't think I have any code that uses ClassValue in a fast path.
I have several pieces of code that use a ClassValue when a callsite becomes 
megamorphic, but in that case I have found the perf of 
ClassValue.get() hard to separate from the fact that I also lose inlining.


cheers,
Rémi



Re: ClassValue perf?

2015-04-24 Thread John Rose
On Apr 24, 2015, at 5:38 AM, Charles Oliver Nutter  wrote:
> 
> Hey folks!
> 
> I'm wondering how the performance of ClassValue looks on recent
> OpenJDK 7 and 8 builds. JRuby 9000 will be Java 7+ only, so this is
> one place I'd like to simplify our code a bit.
> 
> I could measure myself, but I'm guessing some of you have already done
> a lot of exploration or have benchmarks handy. So, what say you?

I'm listening too.  We don't have any special optimizations for CVs,
and I'm hoping the generic code is a good-enough start.
— John


ClassValue perf?

2015-04-24 Thread Charles Oliver Nutter
Hey folks!

I'm wondering how the performance of ClassValue looks on recent
OpenJDK 7 and 8 builds. JRuby 9000 will be Java 7+ only, so this is
one place I'd like to simplify our code a bit.

I could measure myself, but I'm guessing some of you have already done
a lot of exploration or have benchmarks handy. So, what say you?

- Charlie