Hi Mandy,

I prepared a preview variant of j.l.r.Proxy using WeakCache, just to compare performance. WeakCache is now an interface with a special FlattenedWeakCache implementation, in anticipation of another variant that uses two levels of ConcurrentHashMaps for backing storage behind the same API:

https://dl.dropboxusercontent.com/u/101777488/jdk8-tl/proxy-wc/webrev.01/index.html

As the values (Class objects of proxy classes) must be wrapped in a WeakReference, the same WeakReference instance can be reused as a key in another ConcurrentHashMap to implement a quick look-up for the Proxy.isProxyClass() method, eliminating the need for ClassValue, which is quite space-hungry.
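
To make the idea concrete, here is a minimal sketch of reusing the value's WeakReference as a key in a second map. It is not the code from the webrev: the names ReverseLookupSketch, CacheValue and containsValue are made up for illustration, the cache key is simplified to a String, and expunging of cleared references is left out.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch, not the FlattenedWeakCache from the webrev.
final class ReverseLookupSketch {

    // Weak wrapper for a cached value; equality is identity of the referent,
    // so a temporary wrapper around the same Class finds the stored one.
    static final class CacheValue<V> extends WeakReference<V> {
        private final int hash;

        CacheValue(V value) {
            super(value);
            this.hash = System.identityHashCode(value);
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj == this) return true;
            if (!(obj instanceof CacheValue)) return false;
            Object mine = get();
            // A cleared reference is equal only to itself.
            return mine != null && mine == ((CacheValue<?>) obj).get();
        }
    }

    // Forward map: simplified key -> weakly referenced value.
    private final ConcurrentMap<String, CacheValue<Class<?>>> cache =
            new ConcurrentHashMap<>();
    // Reverse map: the SAME CacheValue instance doubles as the key.
    private final ConcurrentMap<CacheValue<Class<?>>, Boolean> reverse =
            new ConcurrentHashMap<>();

    void put(String key, Class<?> value) {
        CacheValue<Class<?>> ref = new CacheValue<>(value);
        cache.put(key, ref);
        reverse.put(ref, Boolean.TRUE);
    }

    Class<?> get(String key) {
        CacheValue<Class<?>> ref = cache.get(key);
        return ref == null ? null : ref.get();
    }

    // Analogue of the isProxyClass() fast path: one map lookup.
    boolean containsValue(Class<?> candidate) {
        return reverse.containsKey(new CacheValue<>(candidate));
    }
}
```

Only one WeakReference object per cached class is needed for both maps; the price is a short-lived wrapper allocated on each containsValue() call.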

Comparing the performance, here's a summary of all 3 variants (original, patched using a field in ClassLoader and this variant):


Summary (4 Cores x 2 Threads i7 CPU):

Test                     Threads  ns/op Original  Patched (CL field)  Patched (WeakCache)
=======================  =======  ==============  ==================  ===================
Proxy_getProxyClass            1        2,403.27              163.70               206.88
                               4        3,039.01              202.77               303.38
                               8        5,193.58              314.47               442.58

Proxy_isProxyClassTrue         1           95.02               10.78                41.85
                               4        2,266.29               10.80                42.32
                               8        4,782.29               20.53                72.29

Proxy_isProxyClassFalse        1           95.02                1.36                 1.36
                               4        2,186.59                1.36                 1.37
                               8        4,891.15                2.72                 2.94

Annotation_equals              1          240.10              152.29               193.27
                               4        1,864.06              153.81               195.60
                               8        8,639.20              262.09               384.72


The improvement is still quite satisfactory, although this variant is a little slower than the direct-field variant. Scalability is the same as with the direct-field variant.
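
For orientation, the Proxy_getProxyClass and Proxy_isProxyClass* tests exercise the public calls shown below. This is only a sketch of the calls being timed, not the benchmark harness; the class and method names ProxyCallsSketch, lookup and isProxy are made up.

```java
import java.lang.reflect.Proxy;

// Hypothetical helper showing the API calls the benchmark names refer to.
final class ProxyCallsSketch {

    // Looks up (or defines on first use) the proxy class for the given
    // loader and interfaces; repeated calls are served from the cache.
    @SuppressWarnings("deprecation") // getProxyClass is deprecated in later JDKs
    static Class<?> lookup(ClassLoader loader, Class<?>... interfaces) {
        return Proxy.getProxyClass(loader, interfaces);
    }

    // True only for classes that were generated as dynamic proxy classes.
    static boolean isProxy(Class<?> candidate) {
        return Proxy.isProxyClass(candidate);
    }
}
```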

Space consumption of the cache structure is shown below. It is calculated as the deep size of the structure, ignoring interned Strings, Class and ClassLoader objects, using a single non-bootstrap ClassLoader for defining the proxy classes and using 32-bit addressing:

original Proxy code:

proxy     size of   delta to
classes   caches    prev.ln.
--------  --------  --------
       0       400       400
       1       768       368
       2       920       152
       3      1072       152
       4      1224       152
       5      1376       152
       6      1528       152
       7      1680       152
       8      1832       152
       9      1984       152
      10      2136       152

Proxy patched with the variant using FlattenedWeakCache, run on current JDK8/tl tip (still uses old ConcurrentHashMap implementation with segments):

proxy     size of   delta to
classes   caches    prev.ln.
--------  --------  --------
       0       560       560
       1       936       376
       2      1312       376
       3      1688       376
       4      2064       376
       5      2352       288
       6      2728       376
       7      3016       288
       8      3392       376
       9      3592       200
      10      3872       280

...and the same with current JDK8/lambda tip (using new segment-less ConcurrentHashMap):

proxy     size of   delta to
classes   caches    prev.ln.
--------  --------  --------
       0       240       240
       1       584       344
       2       768       184
       3       952       184
       4      1136       184
       5      1320       184
       6      1504       184
       7      1688       184
       8      1872       184
       9      2056       184
      10      2240       184


So with the new ConcurrentHashMap, the patched Proxy uses about 32 bytes more per proxy class than the original code (184 vs. 152 bytes).


Is this satisfactory, or should we also try a variant with two levels of ConcurrentHashMaps?


Regards, Peter


P.S. Comment to your comment in-line...

On 04/16/2013 12:58 AM, Mandy Chung wrote:

On 4/13/2013 2:59 PM, Peter Levart wrote:


I also devised an alternative caching mechanism with scalability in mind which uses WeakReferences for keys (for example ClassLoader) and values (for example Class) that could be used in this situation in case adding a field to ClassLoader is not an option:


I would also consider any alternative to avoid adding the proxyClassCache field in ClassLoader as Alan commented previously.

My observation of the typical usage of proxies is to use the interface's class loader to define the proxy class. So is it necessary to maintain a per-loader cache? The per-loader cache maps from the interface names to a proxy class defined by one loader. I would think it's reasonable to assume the number of loaders defining a proxy class with the same set of interfaces is small. What if we make the cache use "interface names" as the key to a set of proxy class suppliers that can hold only one proxy class per unique defining loader? If the proxy class is being generated (i.e. by the ProxyClassFactory supplier), the loader is available for comparison. When there is more than one matching proxy class, it would have to iterate over all of them in the set.

I would assume yes, the proxy class for a particular set of interfaces is typically defined by one classloader only. But the API allows specifying different loaders as long as the interfaces implemented by the proxy class are "visible" from the loader that defines the proxy class. If we're talking about interface names, as opposed to interfaces, then the possibility that a particular set of interface names would be used to define proxy classes with different loaders is even bigger, since an interface name can refer to different interfaces with the same name (think of interfaces deployed as part of an app in an application server, say a set of annotations used by different apps but deployed as part of each individual app).


Agree. I was tempted to consider making weak reference to the interface classes as the key but in any case the overhead of Class.getClassLoader() is still a performance hog. Let's move forward with the alternative you propose.

The scheme you're proposing might be possible, though not simple: the factory Supplier<Class> would become a Function<ClassLoader, Class> and would have to maintain its own set of cached proxy classes. There would be a single ConcurrentMap<List<String>, Function<ClassLoader, Class>> to map sets of interface names to factory Functions, but the cached classes in a particular factory Function would still have to be weakly referenced. I see some difficulties in implementing such a scheme:

- Expunging cleared WeakReferences could only reliably clear the cache inside each factory Function, but removing the entry from the map of factory Functions when the last proxy class for a particular set of interface names is expunged would become a difficult task, if not impossible, with all the scalability constraints in mind (just think about concurrent requests into the same factory Function where one is requesting a new proxy class and the other is expunging the cleared WeakReference that represents the last element in the set of cached proxy classes).

- One of my past ideas for implementing a scalable Proxy.isProxyClass() was to maintain a Set<Class> in each ClassLoader, populated with all the proxy classes defined by that ClassLoader. Benchmarking such a solution showed that Class.getClassLoader() is a performance hog, so I scrapped it in favor of the ClassValue<Boolean> that is now incorporated in the patch. In order to "choose" the right proxy class from the set of proxy classes inside a particular factory Function, the Class.getClassLoader() method would have to be used, or entries would have to (weakly) reference the particular ClassLoader associated with each proxy class.


Thanks for reminding me of your earlier prototype. I suspect the cost of Class.getClassLoader() is due to its lookup of the caller class every time it's called.

Even without a SecurityManager installed, the performance of the native getClassLoader0 was a hog. I don't know why. Isn't there an implicit reference to the defining ClassLoader from every Class object?


Considering all that, such a solution starts to look unappealing. It might even be more space-hungry than the presented WeakCache.

WeakCache is currently the following:

ConcurrentMap<WeakReferenceWithInterfaceNames<ClassLoader>, WeakReference<Class>>

another alternative would be:

ConcurrentMap<WeakReference<ClassLoader>, ConcurrentMap<InterfaceNames, WeakReference<Class>>>

...which might need a little less space than WeakCache (only one WeakReference per proxy class + one per ClassLoader instead of two WeakReferences per proxy class) but would require two map lookups during fast-path retrieval. It might not be performance critical and the expunging could be performed easily too.
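
A rough sketch of the two-level alternative follows, to show where the second lookup comes from. It is an illustration under assumptions, not a proposed patch: the names TwoLevelCacheSketch and LoaderKey are made up, InterfaceNames is modelled as a List<String>, the loader is assumed to be non-null, expunging of cleared references is omitted, and the factory may be invoked more than once under contention.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical sketch of the two-level layout.
final class TwoLevelCacheSketch {

    // Weak key for a ClassLoader, compared by identity of the referent.
    static final class LoaderKey extends WeakReference<ClassLoader> {
        private final int hash;

        LoaderKey(ClassLoader loader) {
            super(loader);
            this.hash = System.identityHashCode(loader);
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj == this) return true;
            if (!(obj instanceof LoaderKey)) return false;
            ClassLoader mine = get();
            return mine != null && mine == ((LoaderKey) obj).get();
        }
    }

    private final ConcurrentMap<LoaderKey,
            ConcurrentMap<List<String>, WeakReference<Class<?>>>> map =
            new ConcurrentHashMap<>();

    Class<?> get(ClassLoader loader, List<String> interfaceNames,
                 Supplier<Class<?>> factory) {
        Objects.requireNonNull(loader, "sketch assumes a non-null loader");
        // First lookup: per-loader map (one WeakReference per loader).
        ConcurrentMap<List<String>, WeakReference<Class<?>>> perLoader =
                map.computeIfAbsent(new LoaderKey(loader),
                                    k -> new ConcurrentHashMap<>());
        List<String> key =
                Collections.unmodifiableList(new ArrayList<>(interfaceNames));
        while (true) {
            // Second lookup: interface names -> weakly referenced class.
            WeakReference<Class<?>> ref = perLoader.get(key);
            Class<?> cached = (ref == null) ? null : ref.get();
            if (cached != null) {
                return cached;
            }
            Class<?> created = factory.get();
            WeakReference<Class<?>> newRef = new WeakReference<>(created);
            boolean installed = (ref == null)
                    ? perLoader.putIfAbsent(key, newRef) == null
                    : perLoader.replace(key, ref, newRef);
            if (installed) {
                return created;
            }
            // Lost a race with another thread: retry and use its value.
        }
    }
}
```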


I am fine with either of these alternatives. As you noted, the latter one would save a little bit of memory for the cases where several proxy classes are defined per loader, e.g. one per annotation type.

Mandy
