JDBM + MVCC LRUCache concern

Emmanuel Lécharny Wed, 04 Apr 2012 15:22:52 -0700

Hi guys,

since I started to work on index removals last week, I have been getting strange behaviors, which I first blamed on some wrong modification of mine. Today, as I was removing the last call to the OneLevelIndex to replace it with the rdnIndex, the core-integ tests started blocking.


I did a kill -3 to see where it blocks, and here is what I got:

"main" prio=5 tid=7fd9db800800 nid=0x10d310000 waiting on condition [10d30d000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at jdbm.helper.LRUCache.put(LRUCache.java:330)
        at jdbm.recman.SnapshotRecordManager.update(SnapshotRecordManager.java:401)
        at jdbm.btree.BPage.remove(BPage.java:605)
        at jdbm.btree.BPage.remove(BPage.java:611)
        at jdbm.btree.BTree.remove(BTree.java:464)
        at org.apache.directory.server.core.partition.impl.btree.jdbm.JdbmTable.remove(JdbmTable.java:741)
        - locked <7c226be90> (a org.apache.directory.server.core.partition.impl.btree.jdbm.JdbmTable)
        at org.apache.directory.server.core.partition.impl.btree.jdbm.JdbmRdnIndex.drop(JdbmRdnIndex.java:157)
        at org.apache.directory.server.core.partition.impl.btree.jdbm.JdbmRdnIndex.drop(JdbmRdnIndex.java:49)
        at org.apache.directory.server.core.partition.impl.btree.AbstractBTreePartition.delete(AbstractBTreePartition.java:891)

...
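
When kill -3 is not convenient (for instance when the tests run in a forked JVM), roughly the same information can be collected from inside the JVM. This is only a minimal sketch using the standard java.lang.management API; the class name ThreadDumpSketch is hypothetical and is not part of JDBM or ApacheDS:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, for illustration only.
class ThreadDumpSketch
{
    /** Returns the name, state and stack frames of every live thread. */
    static String dump()
    {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();

        // false, false: do not collect lock information, which keeps the
        // call usable on JVMs that do not support monitor usage reporting.
        for ( ThreadInfo info : mx.dumpAllThreads( false, false ) )
        {
            sb.append( '"' ).append( info.getThreadName() ).append( "\" " )
              .append( info.getThreadState() ).append( '\n' );

            for ( StackTraceElement frame : info.getStackTrace() )
            {
                sb.append( "        at " ).append( frame ).append( '\n' );
            }
        }

        return sb.toString();
    }

    public static void main( String[] args )
    {
        System.out.println( dump() );
    }
}
```

A thread stuck in the sleep loop would show up as TIMED_WAITING with Thread.sleep at the top of its frames, like the "main" thread above.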

The associated code in LRUCache is :

    public void put( K key, V value, long newVersion, Serializer serializer,
        boolean neverReplace ) throws IOException, CacheEvictionException
    {
    ...
        while ( true )
        {
        ...
                else
                {
                    entry = this.findNewEntry( key, latchIndex );
                    ...
                }
            }
            catch ( CacheEvictionException e )
            {
                e.printStackTrace(); // Added for debug purposes
                sleepForFreeEntry = totalSleepTime < this.MAX_WRITE_SLEEP_TIME;
                ...
            }
            ...

            if ( sleepForFreeEntry )
            {
                try
                {
                    Thread.sleep( sleepInterval );
                ....
                totalSleepTime += sleepInterval;
            }
            else
            {
                break;
            }
        }

Basically, we try to add a new element to the cache, it's full, we then try to evict one entry, it fails, we get a CacheEvictionException, and we go to sleep for 600 seconds...
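
To make the pattern concrete, here is a small self-contained sketch of such a retry loop. It is not the JDBM code: the class BoundedCacheSketch, its "pinned" flag (standing in for whatever prevents an entry from being evicted) and the choice to throw once the sleep budget is exhausted are all assumptions made for the illustration, and the timings are in milliseconds so that it runs quickly:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a bounded cache; single-threaded, illustration only.
class BoundedCacheSketch
{
    static class EvictionException extends Exception
    {
        EvictionException( String message )
        {
            super( message );
        }
    }

    private final int capacity;
    private final long maxSleepMillis;      // plays the role of MAX_WRITE_SLEEP_TIME
    private final long sleepIntervalMillis; // plays the role of sleepInterval

    // key -> pinned flag; a pinned entry cannot be evicted (assumption)
    private final Map<String, Boolean> entries = new LinkedHashMap<String, Boolean>();

    BoundedCacheSketch( int capacity, long maxSleepMillis, long sleepIntervalMillis )
    {
        this.capacity = capacity;
        this.maxSleepMillis = maxSleepMillis;
        this.sleepIntervalMillis = sleepIntervalMillis;
    }

    /** Returns the total time slept before the put succeeded. */
    long put( String key, boolean pinned ) throws EvictionException, InterruptedException
    {
        long totalSleepTime = 0;

        while ( true )
        {
            // Room available, key already cached, or eviction worked: store and leave.
            if ( entries.containsKey( key ) || entries.size() < capacity || evictOne() )
            {
                entries.put( key, pinned );
                return totalSleepTime;
            }

            // Nothing can be evicted: give up once the sleep budget is used.
            if ( totalSleepTime >= maxSleepMillis )
            {
                throw new EvictionException( "no evictable entry after " + totalSleepTime + " ms" );
            }

            Thread.sleep( sleepIntervalMillis );
            totalSleepTime += sleepIntervalMillis;
        }
    }

    /** Removes the oldest unpinned entry; returns false if every entry is pinned. */
    private boolean evictOne()
    {
        for ( Iterator<Map.Entry<String, Boolean>> it = entries.entrySet().iterator(); it.hasNext(); )
        {
            if ( !it.next().getValue() )
            {
                it.remove();
                return true;
            }
        }

        return false;
    }
}
```

With a capacity of 2 and two pinned entries, a third put() has nothing to evict, so it sleeps in sleepIntervalMillis steps until maxSleepMillis is reached and then fails. If nothing releases entries in the meantime, the caller is simply blocked for the whole budget, which is what the thread dump shows.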

It's systematic, and I guess that the fact we now pound the RdnIndex table way more often than before (just because we no longer call the OneLevelIndex) causes the cache to fill up without being released fast enough.

As we don't set any size for the cache, its default size is 1024. For some of the tests, this might not be enough, as we load a lot of entries (typically the schema elements) plus many others that get added and removed while running tests in revert mode.


If I increase the default size to 65536, the tests are passing.

OK, now, I have to admit I haven't - yet - looked closely at the LRUCache code, and my analysis is just based on what I saw by quickly reading the code, the stack traces I have added and a few blind guesses. However, I think we have a serious issue here. As far as I can tell, the code itself is probably not responsible for this behaviour, but the way we use it is.

Did I miss something? Is there anything we can do, other than increasing the cache size, to get the tests passing?

I'm more concerned about what could occur in real life, when some users load the server up to the point where it just stops responding...


Anyone ?

Thanks !

--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
