Hi Sergio,
If I remember the details of this correctly, here is what is going on.
1. Way back in the mists of time, I did some performance testing of
ConcurrentStringMap (which became ConcurrentDistributedMap). I
discovered that when you couple the large number of locks a CDM
creates with the default lock GC policy of handing a greedy lock back
after 1 minute of idleness, you run into some major
scalability/performance issues. Firstly, the lock GC blocks all
locking operations while it runs, which causes big drops in throughput
when you have large numbers of locks (it was effectively a STW GC).
Secondly, many of the read operations were being slowed down because
the lock GC was eagerly releasing lots of greedy locks (we weren't
hitting the keys often enough), and we were then suffering the latency
of re-getting the greedy lock from the L2. (Imagine what happens when
you have a large key set and it takes longer than the lock idle time
to cycle through them all.)
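To make that concrete, here is a rough sketch of the kind of idle-timeout
lock GC I am describing (all class and method names here are invented for
illustration, not the real lock manager code):

    // Illustrative only -- invented names, not the actual Terracotta classes.
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    class NaiveLockGc {
        static final long IDLE_TIMEOUT_MILLIS = 60_000; // "hand back after 1 minute"

        static class LockState {
            volatile long lastUseMillis;
            volatile boolean greedilyHeld;
        }

        final Map<Object, LockState> locks = new ConcurrentHashMap<>();

        // Runs periodically. Because it synchronizes with every lock/unlock
        // path, all locking stalls while it scans -- with a huge number of
        // lock entries this is effectively a stop-the-world pause.
        synchronized void gcPass(long now) {
            for (Map.Entry<Object, LockState> e : locks.entrySet()) {
                LockState s = e.getValue();
                if (s.greedilyHeld && now - s.lastUseMillis > IDLE_TIMEOUT_MILLIS) {
                    handGreedyLockBackToServer(e.getKey(), s);
                    locks.remove(e.getKey());
                }
            }
        }

        void handGreedyLockBackToServer(Object lockId, LockState s) {
            s.greedilyHeld = false;
            // ... network call to the L2 elided; the next read of this key
            // pays the round trip to re-acquire the greedy lock ...
        }
    }

With a key set that takes longer than a minute to cycle through, every lock
looks idle by the time its key is touched again, so reads keep paying that
re-acquisition round trip.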
2. To fix this I introduced the concept of lock pinning. This meant
that any CDM entry whose value is local in a given L1 keeps the local
lock state pinned in memory in that L1. This has two effects: (1) the
greedy lock is only handed back to the server if it is asked for
(think of it as manual memory management, but for locks); (2) even if
the server does recall the greedy lock, we keep the L1 state object in
memory and reuse it when we subsequently reuse the lock. When the
value object for an entry is flushed from the L1's memory, it unpins
the associated lock, which allows it to be collected by the lock GC
algorithm, which will hand the greedy lock back to the server.
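A minimal sketch of the pinning idea (again with made-up names, not the
actual CDM internals):

    // Illustrative only -- invented names.
    class PinnedLockGc {

        static class LockState {
            volatile long lastUseMillis;
            volatile boolean greedilyHeld;
            volatile boolean pinned; // true while the entry's value is resident in this L1
        }

        // Called when a CDM entry's value is faulted into (or created in) this L1.
        void onValueLocal(LockState s) {
            s.pinned = true;
        }

        // Called when the value object is flushed from this L1's memory.
        void onValueFlushed(LockState s) {
            s.pinned = false; // the lock becomes eligible for lock GC again
        }

        // Lock GC skips pinned locks entirely: the greedy lock is only handed
        // back if the server explicitly recalls it, and even then the local
        // LockState object stays in memory so it can be reused later.
        boolean eligibleForGc(LockState s, long now, long idleTimeoutMillis) {
            return !s.pinned
                && s.greedilyHeld
                && now - s.lastUseMillis > idleTimeoutMillis;
        }
    }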
3. Following all this we also redesigned the entire locking system on
the L1 and L2... (which is really of no consequence in this
discussion, except that it significantly reduced heap usage and
increased performance on both the L1 and L2).
4. The behavior of the L2 LockStore is that it must keep a lock state
object in memory for every lock that is greedily held in an L1. This
basically means one lock for every entry that is in memory on at least
one L1 (modulo hash collisions).
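Tim's point further down about the lock strategy being hashcode based is
why I say "modulo hash collisions": two keys with the same hashcode map to
the same lock. A trivial sketch of that idea (the real lock ID scheme may
well differ):

    // Illustrative only -- the real lock ID derivation may differ.
    class LockIds {
        static long lockIdFor(Object key) {
            return key.hashCode(); // keys with equal hashCode share one lock
        }

        public static void main(String[] args) {
            // "Aa" and "BB" are a classic String.hashCode() collision.
            System.out.println(lockIdFor("Aa") == lockIdFor("BB")); // true -> one lock, two entries
        }
    }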
5. The behavior that you saw in Ehcache was due to a bug in the CDM,
whereby entries that are removed via the applicator methods (i.e. via
a broadcasted remove from another node) did not correctly unpin their
locks on the receiving node. This results in those locks remaining
pinned (and greedily held) even though the associated entry has been
removed from the map. It also means they remain in the L2 LockStore,
hence the "leak" that Nitin was referring to.
6. The reason this needed fixing in both the core code and tim-
concurrent-collections is that there were actually two bugs. Firstly,
we needed to add the unpin call to the applicator remove method of the
CDM, and secondly, we had to fix the unpin and pin methods in core to
work when called from the applicator thread (generally, locking
methods called from the applicator thread should be no-ops, and these
previously were).
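The rough shape of the two fixes, with invented method names (not the
actual tim-concurrent-collections or core source):

    // Illustrative only -- invented names, not the real source.
    class CdmApplicatorRemoveFix {

        // Fix 1 (tim-concurrent-collections): the applicator-driven remove,
        // i.e. a remove broadcast from another node, must unpin the lock,
        // just like a locally initiated remove does.
        void applicatorRemove(Object key, LockManager lockManager) {
            removeLocalEntry(key);
            lockManager.unpinLock(lockIdFor(key)); // this call was the missing piece
        }

        // Fix 2 (core): locking methods are generally no-ops when invoked from
        // the applicator thread, and pin/unpin used to follow that rule, which
        // would have made fix 1 a silent no-op. They now run on that thread too.
        static class LockManager {
            void unpinLock(Object lockId) {
                // previously: if (onApplicatorThread()) return;
                doUnpin(lockId);
            }
            void doUnpin(Object lockId) { /* ... */ }
        }

        void removeLocalEntry(Object key) { /* ... */ }
        static Object lockIdFor(Object key) { return key.hashCode(); }
    }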
In your case, although you have 2M entries, you will only have at most
as many locks in the L2 as there are live values in the L1s. (The
actual number could be less than this due to hash collisions and the
possibility of live objects that have no associated greedy lock.)
Hope this all makes sense.
Chris
On Apr 28, 2010, at 11:14 AM, Sergio Bossa wrote:
On Wed, Apr 28, 2010 at 4:50 PM, Tim Eck <t...@terracottatech.com> wrote:
I believe the lock strategy in ehcache is hashcode based -- the hashcode
of the key determines the lock ID.
[CUT]
Active locks are in memory but they are much smaller in recent TC releases
(like 3.2.1 for example).
So it's like saying there will be one lock per key/value, and so I'm
still curious how you manage, in the latest tim-concurrent-collections
release, to keep the number of locks in the LockStore from growing and
growing: that is, I have a test with 100% inserts, but the number of
locks grows very slowly (around 200k locks with around 2M entries).
Chris would know better than I would, but I think locks are in memory
for any keys read/put by that VM and for which the value is not
flushed. Any other locks should be GC'd.
This isn't clear to me: what do you mean by "flushed"? Maybe you mean
that when a value is flushed from *all* L1s, its lock will be GC'd?
But then what happens when the value is faulted back? Is the lock
created again?
Thanks much!
--
Sergio Bossa
http://www.linkedin.com/in/sergiob
_______________________________________________
tc-dev mailing list
tc-dev@lists.terracotta.org
http://lists.terracotta.org/mailman/listinfo/tc-dev