Hi Mandy,
I prepared a preview variant of j.l.r.Proxy that uses WeakCache, just to
compare performance. WeakCache is now an interface with a special
FlattenedWeakCache implementation, in anticipation of another variant
that uses two levels of ConcurrentHashMaps for backing storage behind
the same API:
https://dl.dropboxusercontent.com/u/101777488/jdk8-tl/proxy-wc/webrev.01/index.html
As the values (Class objects of proxy classes) must be wrapped in a
WeakReference, the same WeakReference instance can be re-used as a key
in another ConcurrentHashMap to implement a quick look-up for the
Proxy.isProxyClass() method. This eliminates the need to use ClassValue,
which is quite space-hungry.
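
A minimal sketch of that idea, not the code from the webrev: the class
and method names (ReverseLookupSketch, WeakValue, register, contains)
are hypothetical, a plain String stands in for the real cache key, and
expunging of cleared references is left out. It assumes the weak wrapper
defines equality by the identity of its referent, so that a temporary
wrapper around a Class can find the stored one.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: one WeakReference instance serves both as the cache
// value and as the key of a reverse-lookup map.
final class ReverseLookupSketch {

    // Weak wrapper whose hashCode/equals follow the identity of the referent.
    static final class WeakValue<T> extends WeakReference<T> {
        private final int hash;

        WeakValue(T referent) {
            super(referent);
            this.hash = System.identityHashCode(referent);
        }

        @Override public int hashCode() { return hash; }

        @Override public boolean equals(Object o) {
            if (this == o) return true;               // a cleared ref still equals itself
            if (!(o instanceof WeakValue)) return false;
            Object mine = get();
            return mine != null && mine == ((WeakValue<?>) o).get();
        }
    }

    // Forward cache: (stand-in) key -> weakly referenced class.
    private final ConcurrentMap<String, WeakValue<Class<?>>> cache =
            new ConcurrentHashMap<>();
    // Reverse map: the same WeakValue instance is used as the key here.
    private final ConcurrentMap<WeakValue<Class<?>>, Boolean> reverse =
            new ConcurrentHashMap<>();

    void register(String key, Class<?> cls) {
        WeakValue<Class<?>> ref = new WeakValue<Class<?>>(cls);
        if (cache.putIfAbsent(key, ref) == null) {
            reverse.put(ref, Boolean.TRUE);           // re-use the same instance
        }
    }

    Class<?> lookup(String key) {
        WeakValue<Class<?>> ref = cache.get(key);
        return ref == null ? null : ref.get();
    }

    // Analogue of an isProxyClass()-style membership test.
    boolean contains(Class<?> cls) {
        return reverse.containsKey(new WeakValue<Class<?>>(cls));
    }
}
```

The membership test allocates a temporary WeakReference per call, which
a tuned implementation would probably avoid with a cheaper look-up key.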
Comparing the performance, here's a summary of all 3 variants (original,
patched using a field in ClassLoader and this variant):
Summary (4 Cores x 2 Threads i7 CPU):
                                  ns/op
Test                     Threads  Original    Patched (CL field)  Patched (WeakCache)
=======================  =======  ==========  ==================  ===================
Proxy_getProxyClass            1    2,403.27              163.70               206.88
                               4    3,039.01              202.77               303.38
                               8    5,193.58              314.47               442.58
Proxy_isProxyClassTrue         1       95.02               10.78                41.85
                               4    2,266.29               10.80                42.32
                               8    4,782.29               20.53                72.29
Proxy_isProxyClassFalse        1       95.02                1.36                 1.36
                               4    2,186.59                1.36                 1.37
                               8    4,891.15                2.72                 2.94
Annotation_equals              1      240.10              152.29               193.27
                               4    1,864.06              153.81               195.60
                               8    8,639.20              262.09               384.72
The improvement is still quite satisfactory, although this variant is a
little slower than the direct-field variant. The scalability is the same
as with the direct-field variant.
Space consumption of the cache structure (in bytes), calculated as the
deep size of the structure, ignoring interned Strings, Class and
ClassLoader objects, using a single non-bootstrap ClassLoader for
defining the proxy classes and using 32-bit addressing, is the following:
original Proxy code:
proxy     size of   delta to
classes   caches    prev.ln.
--------  --------  --------
       0       400       400
       1       768       368
       2       920       152
       3      1072       152
       4      1224       152
       5      1376       152
       6      1528       152
       7      1680       152
       8      1832       152
       9      1984       152
      10      2136       152
Proxy patched with the variant using FlattenedWeakCache, run on current
JDK8/tl tip (still uses old ConcurrentHashMap implementation with segments):
proxy     size of   delta to
classes   caches    prev.ln.
--------  --------  --------
       0       560       560
       1       936       376
       2      1312       376
       3      1688       376
       4      2064       376
       5      2352       288
       6      2728       376
       7      3016       288
       8      3392       376
       9      3592       200
      10      3872       280
...and the same with current JDK8/lambda tip (using new segment-less
ConcurrentHashMap):
proxy     size of   delta to
classes   caches    prev.ln.
--------  --------  --------
       0       240       240
       1       584       344
       2       768       184
       3       952       184
       4      1136       184
       5      1320       184
       6      1504       184
       7      1688       184
       8      1872       184
       9      2056       184
      10      2240       184
So with the new ConcurrentHashMap, the patched Proxy uses about 32 bytes
more per proxy class than the original code (184 vs. 152 bytes).
Is this satisfactory, or should we also try a variant with two levels of
ConcurrentHashMaps?
Regards, Peter
P.S. My replies to your comments are in-line...
On 04/16/2013 12:58 AM, Mandy Chung wrote:
On 4/13/2013 2:59 PM, Peter Levart wrote:
I also devised an alternative caching mechanism with scalability in
mind which uses WeakReferences for keys (for example ClassLoader)
and values (for example Class) that could be used in this situation
in case adding a field to ClassLoader is not an option:
I would also consider any alternative to avoid adding the
proxyClassCache field in ClassLoader as Alan commented previously.
My observation of the typical usage of proxies is to use the
interface's class loader to define the proxy class. So is it
necessary to maintain a per-loader cache? The per-loader cache maps
from the interface names to a proxy class defined by one loader. I
would think it's reasonable to assume the number of loaders to
define proxy class with the same set of interfaces is small. What
if we make the cache use "interface names" as the key to a set of
proxy class suppliers that can have only one proxy class per
unique defining loader? If the proxy class is being generated, i.e.
by the ProxyClassFactory supplier, the loader is available for
comparison. When there is more than one matching proxy class, it would
have to iterate over all of them in the set.
I would assume yes, proxy class for a particular set of interfaces is
typically defined by one classloader only. But the API allows
specifying different loaders as long as the interfaces implemented by
proxy class are "visible" from the loader that defines the proxy
class. If we're talking about interface names - as opposed to
interfaces - then the possibility that a particular set of interface
names would want to be used to define proxy classes with different
loaders is even bigger, since an interface name can refer to
different interfaces with same name (think of interfaces deployed as
part of an app in an application server, say a set of annotations
used by different apps but deployed as part of each individual app).
Agree. I was tempted to consider making weak reference to the
interface classes as the key but in any case the overhead of
Class.getClassLoader() is still a performance hog. Let's move
forward with the alternative you propose.
The scheme you're proposing might be possible, though not simple: The
factory Supplier<Class> would become a Function<ClassLoader, Class>
and would have to maintain its own set of cached proxy classes.
There would be a single ConcurrentMap<List<String>,
Function<ClassLoader, Class>> to map sets of interface names to
factory Functions, but the cached classes in a particular factory
Function would still have to be weakly referenced. I see some
difficulties in implementing such a scheme:
- expunging cleared WeakReferences could only reliably clear the
cache inside each factory Function but removing the entry from the
map of factory Functions when last proxy class for a particular set
of interface names is expunged would become a difficult task if not
impossible with all the scalability constraints in mind (just
thinking about concurrent requests into same factory Function where
one is requesting new proxy class and the other is expunging cleared
WeakReference which represents the last element in the set of cached
proxy classes).
- one of my past ideas of implementing scalable Proxy.isProxyClass()
was to maintain a Set<Class> in each ClassLoader populated with all
the proxy classes defined by a particular ClassLoader. Benchmarking
such a solution showed that Class.getClassLoader() is a performance hog,
so I scrapped it in favor of ClassValue<Boolean> that is now
incorporated in the patch. In order to "choose" the right proxy class
from the set of proxy classes inside a particular factory Function,
the Class.getClassLoader() method would have to be used, or entries
would have to (weakly) reference a particular ClassLoader associated
with each proxy class.
Thanks for reminding me of your earlier prototype. I suspect the cost of
Class.getClassLoader() is due to its lookup of the caller class every
time it's called.
Even without SecurityManager installed the performance of native
getClassLoader0 was a hog. I don't know why. Isn't there an implicit
reference to defining ClassLoader from every Class object?
Considering all that, such a solution starts to look unappealing. It
might even be more space-hungry than the presented WeakCache.
WeakCache is currently the following:
ConcurrentMap<WeakReferenceWithInterfaceNames<ClassLoader>,
WeakReference<Class>>
another alternative would be:
ConcurrentMap<WeakReference<ClassLoader>,
ConcurrentMap<InterfaceNames, WeakReference<Class>>>
...which might need a little less space than WeakCache (only one
WeakReference per proxy class + one per ClassLoader instead of two
WeakReferences per proxy class) but would require two map lookups
during fast-path retrieval. It might not be performance critical and
the expunging could be performed easily too.
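
A rough sketch of the second, two-level alternative, only to illustrate
the two map lookups on the fast path. This is not code from the patch:
TwoLevelCacheSketch and LoaderKey are hypothetical names, List<String>
stands in for InterfaceNames, the bootstrap (null) loader is not
handled, and expunging of cleared references is omitted.

```java
import java.lang.ref.WeakReference;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of a two-level cache:
// weak(ClassLoader) -> (interface names -> weak(Class)).
final class TwoLevelCacheSketch {

    // Weak key compared by the identity of the referenced ClassLoader.
    static final class LoaderKey extends WeakReference<ClassLoader> {
        private final int hash;

        LoaderKey(ClassLoader loader) {
            super(loader);
            this.hash = System.identityHashCode(loader);
        }

        @Override public int hashCode() { return hash; }

        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof LoaderKey)) return false;
            ClassLoader mine = get();
            return mine != null && mine == ((LoaderKey) o).get();
        }
    }

    private final ConcurrentMap<LoaderKey,
            ConcurrentMap<List<String>, WeakReference<Class<?>>>> map =
            new ConcurrentHashMap<>();

    // Fast path: two map lookups, one per level.
    Class<?> get(ClassLoader loader, List<String> names) {
        ConcurrentMap<List<String>, WeakReference<Class<?>>> perLoader =
                map.get(new LoaderKey(loader));
        if (perLoader == null) return null;
        WeakReference<Class<?>> ref = perLoader.get(names);
        return ref == null ? null : ref.get();
    }

    // Returns the class that ends up cached: an existing live one, or cls.
    Class<?> putIfAbsent(ClassLoader loader, List<String> names, Class<?> cls) {
        LoaderKey key = new LoaderKey(loader);
        ConcurrentMap<List<String>, WeakReference<Class<?>>> perLoader = map.get(key);
        if (perLoader == null) {
            ConcurrentMap<List<String>, WeakReference<Class<?>>> fresh =
                    new ConcurrentHashMap<>();
            perLoader = map.putIfAbsent(key, fresh);
            if (perLoader == null) perLoader = fresh;
        }
        WeakReference<Class<?>> ref = new WeakReference<Class<?>>(cls);
        for (;;) {
            WeakReference<Class<?>> prev = perLoader.putIfAbsent(names, ref);
            if (prev == null) return cls;                 // we installed it
            Class<?> existing = prev.get();
            if (existing != null) return existing;        // someone else won
            // prev was cleared by GC: try to swap it for our reference
            if (perLoader.replace(names, prev, ref)) return cls;
        }
    }
}
```

Each proxy class costs one WeakReference here, plus one LoaderKey per
loader, at the price of the second lookup in get().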
I am fine with either of these alternatives. As you noted, the latter
one would save a little bit of memory for the cases when several proxy
classes are defined per loader e.g. one per each annotation type.
Mandy