Hi Sergio,
If I remember the details of this correctly, here is what is going on.
1. Way back in the mists of time, I did some performance testing of
ConcurrentStringMap (which became ConcurrentDistributedMap). I
discovered that when you couple the large number of locks a CDM
creates with the default lock GC policy of handing a greedy lock back
after 1 minute of idleness, you run into some major
scalability/performance issues. Firstly, the lock GC blocks all
locking operations while it runs, which causes big drops in throughput
when you have large numbers of locks (it was effectively a STW GC).
Secondly, many of the read operations were being slowed down because
the lock GC was eagerly releasing lots of greedy locks (we weren't
hitting the keys often enough), and we were then suffering the latency
of re-getting the greedy lock from the L2. (Imagine what happens when
you have a large key set and it takes longer than the lock idle time
to cycle through them all.)
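To make that concrete, here is a rough sketch of the kind of idle-timeout
lock GC I am describing (all class and method names here are invented for
illustration, not the real lock manager code):

    // Illustrative only -- invented names, not the actual Terracotta classes.
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    class NaiveLockGc {
        static final long IDLE_TIMEOUT_MILLIS = 60_000; // "hand back after 1 minute"

        static class LockState {
            volatile long lastUseMillis;
            volatile boolean greedilyHeld;
        }

        final Map<Object, LockState> locks = new ConcurrentHashMap<>();

        // Runs periodically. Because it synchronizes with every lock/unlock
        // path, all locking stalls while it scans -- with a huge number of
        // lock entries this is effectively a stop-the-world pause.
        synchronized void gcPass(long now) {
            for (Map.Entry<Object, LockState> e : locks.entrySet()) {
                LockState s = e.getValue();
                if (s.greedilyHeld && now - s.lastUseMillis > IDLE_TIMEOUT_MILLIS) {
                    handGreedyLockBackToServer(e.getKey(), s);
                    locks.remove(e.getKey());
                }
            }
        }

        void handGreedyLockBackToServer(Object lockId, LockState s) {
            s.greedilyHeld = false;
            // ... network call to the L2 elided; the next read of this key
            // pays the round trip to re-acquire the greedy lock ...
        }
    }

With a key set that takes longer than a minute to cycle through, every lock
looks idle by the time its key is touched again, so reads keep paying that
re-acquisition round trip.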
2. To fix this I introduced the concept of lock pinning. This meant
that any CDM entry whose value is local in a given L1 keeps the local
lock state pinned in memory in that L1. This has two effects: (1) the
greedy lock is only handed back to the server if it is asked for
(think of it as manual memory management, but for locks); (2) even if
the server does recall the greedy lock, we keep the L1 state object in
memory and reuse it when we subsequently reuse the lock. When the
value object for an entry is flushed from the L1's memory, it unpins
the associated lock, which allows it to be collected by the lock GC
algorithm, which will hand the greedy lock back to the server.
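A minimal sketch of the pinning idea (again with made-up names, not the
actual CDM internals):

    // Illustrative only -- invented names.
    class PinnedLockGc {

        static class LockState {
            volatile long lastUseMillis;
            volatile boolean greedilyHeld;
            volatile boolean pinned; // true while the entry's value is resident in this L1
        }

        // Called when a CDM entry's value is faulted into (or created in) this L1.
        void onValueLocal(LockState s) {
            s.pinned = true;
        }

        // Called when the value object is flushed from this L1's memory.
        void onValueFlushed(LockState s) {
            s.pinned = false; // the lock becomes eligible for lock GC again
        }

        // Lock GC skips pinned locks entirely: the greedy lock is only handed
        // back if the server explicitly recalls it, and even then the local
        // LockState object stays in memory so it can be reused later.
        boolean eligibleForGc(LockState s, long now, long idleTimeoutMillis) {
            return !s.pinned
                && s.greedilyHeld
                && now - s.lastUseMillis > idleTimeoutMillis;
        }
    }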
3. Following all this we also redesigned the entire locking system on
the L1 and L2... (which is really of no consequence in this
discussion, except that it significantly reduced heap usage and
increased performance on both the L1 and L2).
4. The behavior of the L2 LockStore is that it must keep a lock state
object in memory for every lock that is greedily held in an L1. This
basically means one lock for every entry that is in memory on at least
one L1 (modulo hash collisions).
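Tim's point further down about the lock strategy being hashcode based is
why I say "modulo hash collisions": two keys with the same hashcode map to
the same lock. A trivial sketch of that idea (the real lock ID scheme may
well differ):

    // Illustrative only -- the real lock ID derivation may differ.
    class LockIds {
        static long lockIdFor(Object key) {
            return key.hashCode(); // keys with equal hashCode share one lock
        }

        public static void main(String[] args) {
            // "Aa" and "BB" are a classic String.hashCode() collision.
            System.out.println(lockIdFor("Aa") == lockIdFor("BB")); // true -> one lock, two entries
        }
    }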
5. The behavior that you saw in Ehcache was due to a bug in the CDM,
whereby entries that are removed via the applicator methods (i.e. via
a broadcasted remove from another node) did not correctly unpin their
locks on the receiving node. This results in those locks remaining
pinned (and greedily held) even though the associated entry has been
removed from the map. It also means they remain in the L2 LockStore,
hence the "leak" that Nitin was referring to.
6. The reason this needed fixing in both the core code and tim-
concurrent-collections is that there were actually two bugs. Firstly,
we needed to add the unpin call to the applicator remove method of the
CDM, and secondly, we had to fix the unpin and pin methods in core to
work when called from the applicator thread (generally, locking
methods called from the applicator thread should be no-ops, and these
previously were).
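The rough shape of the two fixes, with invented method names (not the
actual tim-concurrent-collections or core source):

    // Illustrative only -- invented names, not the real source.
    class CdmApplicatorRemoveFix {

        // Fix 1 (tim-concurrent-collections): the applicator-driven remove,
        // i.e. a remove broadcast from another node, must unpin the lock,
        // just like a locally initiated remove does.
        void applicatorRemove(Object key, LockManager lockManager) {
            removeLocalEntry(key);
            lockManager.unpinLock(lockIdFor(key)); // this call was the missing piece
        }

        // Fix 2 (core): locking methods are generally no-ops when invoked from
        // the applicator thread, and pin/unpin used to follow that rule, which
        // would have made fix 1 a silent no-op. They now run on that thread too.
        static class LockManager {
            void unpinLock(Object lockId) {
                // previously: if (onApplicatorThread()) return;
                doUnpin(lockId);
            }
            void doUnpin(Object lockId) { /* ... */ }
        }

        void removeLocalEntry(Object key) { /* ... */ }
        static Object lockIdFor(Object key) { return key.hashCode(); }
    }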
In your case, although you have 2M entries, you will only have at most
as many locks in the L2 as there are live values in the L1s. (The
actual number could be less than this due to hash collisions and the
possibility of live objects that have no associated greedy lock.)
Hope this all makes sense.
Chris
On Apr 28, 2010, at 11:14 AM, Sergio Bossa wrote:
On Wed, Apr 28, 2010 at 4:50 PM, Tim Eck <t...@terracottatech.com> wrote:
I believe the lock strategy in ehcache is hashcode based -- the hashcode
of the key determines the lock ID.
[CUT]
Active locks are in memory but they are much smaller in recent TC releases
(like 3.2.1 for example).
So it's like saying there will be one lock per key/value, and so I'm
still curious how you manage, in the latest tim-concurrent-collections
release, to keep the number of locks in the LockStore from growing and
growing: that is, I have a test with 100% inserts, but the number of
locks grows very slowly (around 200k locks with around 2M entries).
Chris would know better than I would, but I think locks are in memory
for any keys read/put by that VM and for which the value is not
flushed. Any other locks should be GC'd.
This isn't clear to me: what do you mean by "flushed"? Maybe you mean
that when a value is flushed from *all* L1s, its lock will be GC'd?
But then what happens when the value is faulted back? Is the lock
created again?
Thanks much!
--
Sergio Bossa
http://www.linkedin.com/in/sergiob
_______________________________________________
tc-dev mailing list
tc-dev@lists.terracotta.org
http://lists.terracotta.org/mailman/listinfo/tc-dev