On 28/07/10 13:07, Emmanuel Lecharny wrote:
 On 7/28/10 11:31 AM, Stefan Seelmann wrote:
I was thinking lately about the DN class. I know that OpenDS (and probably UnboundId, but not sure) has a DN.valueOf( "<a DN>" ) factory that returns a
DN, potentially leveraging a cache associated with a ThreadLocal.
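For illustration, here is a minimal sketch of what such a ThreadLocal-backed factory might look like (the DN type and its parse method are stand-ins for the real parsing code, not the actual OpenDS API):

    import java.util.HashMap;
    import java.util.Map;

    final class CachedDnFactory {

        // Stand-in for the real DN type; parse() represents the expensive step.
        static final class DN {
            static DN parse(String s) { return new DN(); }
        }

        // One unsynchronized map per thread: lookups need no locking, at the
        // cost of keeping a separate cache (and duplicate entries) per thread.
        private static final ThreadLocal<Map<String, DN>> CACHE =
                new ThreadLocal<Map<String, DN>>() {
                    @Override
                    protected Map<String, DN> initialValue() {
                        return new HashMap<String, DN>();
                    }
                };

        static DN valueOf(String dnString) {
            Map<String, DN> cache = CACHE.get();
            DN dn = cache.get(dnString);
            if (dn == null) {
                dn = DN.parse(dnString);
                cache.put(dnString, dn);
            }
            return dn;
        }
    }

Note that, as sketched, each per-thread map is unbounded, which ties into the memory concerns discussed below.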

...
I don't think it's such a good idea:
- first, as it's ThreadLocal based, you will have as many caches as you have threads processing requests. Not sure it competes with a single shared cache; not
sure either that we couldn't use the memory in a better way...
An advantage of using ThreadLocal is that you don't need to synchronize
access to the cache. It could be worth measuring the performance.
Using ConcurrentHashMap should not be a major performance penalty. I mean, it *will* be more costly than not having any synchronization, but it sounds acceptable.


Unfortunately a CHM won't help either, since you need to manage cache eviction, assuming that you want the cache to have a finite size. LinkedHashMap has an eviction strategy that can be defined by overriding the removeEldestEntry method, but LHM is not thread-safe.
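For reference, the removeEldestEntry idiom looks something like this (a minimal sketch; the class name and capacity are arbitrary, and thread safety has to be layered on top, e.g. via Collections.synchronizedMap or an external lock):

    import java.util.LinkedHashMap;
    import java.util.Map;

    class LruCache<K, V> extends LinkedHashMap<K, V> {

        private final int maxEntries;

        LruCache(int maxEntries) {
            // accessOrder = true gives least-recently-used ordering
            // instead of the default insertion ordering.
            super(16, 0.75f, true);
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Evict the least recently used entry once the cap is exceeded.
            return size() > maxEntries;
        }
    }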



Another possibility is to use a CopyOnWriteArraySet, but I'm afraid that it will crawl if many new DNs are added.

Yes - this is not an appropriate collection to use.


If there's a measurable difference, I wonder if the OpenDS team did some performance analysis?


I did some testing some time back and I have forgotten the exact figures that I got. I do remember finding a substantial performance improvement when parsing DNs with caching enabled - something like 30ns with caching vs 300ns without for DNs containing 4 RDN components (i.e. about an order of magnitude IIRC).

We implement our DNs using a recursive RDN + parent DN structure, so we are usually able to fast-track the decoding process to just a single RDN parse for DNs having a common ancestor (pretty common).
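In other words, something along these lines (an illustrative sketch only, not the actual OpenDS SDK code; the naive indexOf split ignores escaped commas, which a real parser must handle):

    import java.util.HashMap;
    import java.util.Map;

    final class RecursiveDnDecoder {

        // Stand-in for the real DN type: a leading RDN plus a parent DN.
        static final class DN {
            final String rdn;
            final DN parent;  // null for a single-RDN DN

            DN(String rdn, DN parent) {
                this.rdn = rdn;
                this.parent = parent;
            }
        }

        // Parse the leading RDN, then resolve the remainder through the
        // cache, so DNs sharing a cached ancestor pay for only one RDN parse.
        static DN decode(String dn, Map<String, DN> cache) {
            DN cached = cache.get(dn);
            if (cached != null) {
                return cached;  // whole DN already decoded
            }
            int comma = dn.indexOf(',');  // naive split: ignores escapes
            DN result;
            if (comma < 0) {
                result = new DN(dn, null);
            } else {
                // Recurse on the parent; for a common ancestor this usually
                // hits the cache immediately.
                DN parent = decode(dn.substring(comma + 1), cache);
                result = new DN(dn.substring(0, comma), parent);
            }
            cache.put(dn, result);
            return result;
        }
    }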

We opted for the ThreadLocal approach due to the synchronization limitations of using a single global cache. However, I have often worried about this approach as it will not scale for applications having large numbers of threads, resulting in OOM exceptions.

Another approach I have thought about is to use a single global two-level cache comprising a fixed-size array of LinkedHashMaps (think of it as a Map of Maps), each one having its own synchronization. We then distribute the DNs over the LHMs and amortize the synchronization costs across multiple locks (in a similar manner to CHM).

This idea needs testing. In particular, we'd need to figure out the optimal array size (i.e. number of locks / LHMs). For example, distributing the cache over 16 LHMs is not going to help much for really big multi-threaded apps containing 16000+ threads (1000 threads contending per lock).
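A minimal sketch of that striped design (segment count, capacity, and names are illustrative only; a real implementation would also want a better hash spreader):

    import java.util.LinkedHashMap;
    import java.util.Map;

    final class StripedDnCache<V> {

        private static final int SEGMENTS = 16;         // number of locks / LHMs
        private static final int MAX_PER_SEGMENT = 1000;

        private final Map<String, V>[] segments;

        @SuppressWarnings("unchecked")
        StripedDnCache() {
            segments = new Map[SEGMENTS];
            for (int i = 0; i < SEGMENTS; i++) {
                // Access-ordered, bounded LHM: a simple per-segment LRU.
                segments[i] = new LinkedHashMap<String, V>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                        return size() > MAX_PER_SEGMENT;
                    }
                };
            }
        }

        private Map<String, V> segmentFor(String key) {
            // Spread keys over segments; hash quality matters here.
            return segments[(key.hashCode() & 0x7fffffff) % SEGMENTS];
        }

        V get(String key) {
            Map<String, V> segment = segmentFor(key);
            synchronized (segment) {   // contention amortized per segment
                return segment.get(key);
            }
        }

        void put(String key, V value) {
            Map<String, V> segment = segmentFor(key);
            synchronized (segment) {
                segment.put(key, value);
            }
        }
    }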

A major problem with this approach, if we choose to use it in the OpenDS SDK, is that common ancestor DNs (e.g. "dc=com") are going to end up in a single LHM, so, using our current design (RDN + parent DN), all decoding attempts will usually end up contending on the same lock anyway :-( So we may need to change our DN implementation to better cope with this caching strategy.

We are not alone though: a concurrent Map implementation that can be used for caching in a similar manner to LHM is one of the most frequently requested enhancements to the java.util.concurrent library.


They compared the performance they got with a ThreadLocal cache versus no cache: the gain was noticeable (I don't have the numbers for OpenDS). FYI, DN parsing accounts for more or less 13% of the total CPU needed internally (network excluded) to process a simple search, and normalization costs an extra 10%. There is most certainly a net potential gain in implementing a DN cache!



Agreed, DN parsing can be expensive, especially normalization.

Matt
