On 11 Dec 2013, at 12:48, Stéphane Ducasse <[email protected]> wrote:
> On Dec 11, 2013, at 12:29 PM, Esteban Lorenzano <[email protected]> wrote:
>
>> On Tue, Dec 10, 2013 at 11:09 PM, Stéphane Ducasse <[email protected]> wrote:
>> Sven, thanks a lot
>>
>> yes! really cool!
>>
>> How can we resist pushing this into Pharo 3.0 :)
>>
>> over my dead body :)
>> but in Pharo 4…
>
> you are saying that you will let Squeak get ahead of Pharo :)
> I cannot believe that :).

In any case: https://pharo.fogbugz.com/f/cases/12398/Update-LRUCache-Implementation

There are currently only 3 LRUCache users, so the risk is minimal, I think.

BTW, the reason I started this is that I want better caching for the Monticello stuff. You know, the multiple MBs being used/wasted when you work for a couple of days in the same image, the result of boundless caching using stupid dictionaries. But I am waiting for this reckless young Swiss guy to come back after his thesis is written ;-)

> Stef
>
>> Stef
>>
>> On Dec 10, 2013, at 4:54 PM, Sven Van Caekenberghe <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> Caching is an important technique for trading space (memory consumption) for time (speed of execution) in computation. Used correctly, caching can significantly improve an application's performance.
>>>
>>> Given the key/value nature of a cache, a Dictionary is often the first data structure used to implement one. This is, however, not a good idea: a plain Dictionary is unbounded by definition and can, and eventually will, lead to explosive cache growth (see the sketch after the first example below).
>>>
>>> Quick and dirty hacks to turn a Dictionary into a least-recently-used (LRU) and/or time-to-live (TTL) bounded cache are often incorrect and inefficient.
>>>
>>> The LRUCache implementation currently in Pharo is too simple and not efficient, being O(n) on its main operation.
>>>
>>> Pharo deserves and needs a good LRU/TTL cache implementation that can and should be used uniformly in various places in the system and in code built on top of it. That is why I implemented a proposal.
>>>
>>> Neo-Caching is an implementation of both a good LRU cache and a TTL extension of it. The caches are easy to use, yet have a number of interesting features. The implementation is properly O(1) on its main operations and is 2 to 3 times faster than LRUCache. The package also contains a doubly linked list implementation. The code comes with a full set of unit tests. There are class and method comments and the code is easy to read.
>>>
>>> The code (which has no dependencies) can be found here:
>>>
>>>   http://mc.stfx.eu/Neo
>>>   http://www.smalltalkhub.com/mc/SvenVanCaekenberghe/Neo/main
>>>
>>> There is only one package to load. The code has been written in Pharo 3.0, but it should run on older versions as well.
>>>
>>>
>>> Code
>>>
>>>
>>> Let's create a collection of some 16K keys that contains repetitions with a reasonable distribution:
>>>
>>> data := Array streamContents: [ :out |
>>>   1 to: 4096 do: [ :each |
>>>     each primeFactorsOn: out.
>>>     out nextPut: each ] ].
>>> data := data collect: [ :each | each asWords ].
>>>
>>> By using #asWords the keys have better hash properties. Now we put all of this in a cache:
>>>
>>> cache := NeoLRUCache new.
>>> cache maximumWeight: 512.
>>> cache factory: [ :key | key ].
>>> data do: [ :each | cache at: each ].
>>> cache.
>>>
>>> ==> a NeoLRUCache(#512 512/512 [ 1 ] [ :key | key ] 72%)
>>>
>>> We had a 72% hit ratio, which is good. The cache is of course full, 512/512, with 512 entries (not necessarily the same thing, see further).
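For contrast, here is the naive, unbounded Dictionary variant of the same experiment that the introduction warns about. This is only a sketch using standard Dictionary protocol; the variable naive is ours and is not part of the Neo package:

  "A plain Dictionary used as a cache: nothing is ever evicted."
  naive := Dictionary new.
  data do: [ :each |
    naive at: each ifAbsentPut: [ each ] ].
  naive size.
  "One entry per distinct key, roughly 4096 here, growing without bound."

Bounded at 512 entries, the NeoLRUCache above holds about an eighth of that, yet still answers 72% of the requests from the cache. With an open-ended key stream (URLs, package names, ...) the Dictionary would simply keep growing.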
>>> We can now benchmark the cache:
>>>
>>> [ data do: [ :each | cache at: each ] ] bench.
>>>
>>> ==> '35.8 per second.'
>>>
>>> And compare it to the existing LRUCache:
>>>
>>> cache := LRUCache size: 512 factory: [ :key | key ].
>>>
>>> [ data do: [ :each | cache at: each ] ] bench.
>>>
>>> ==> '12.6 per second.'
>>>
>>>
>>> Features
>>>
>>>
>>> Instead of just counting the number of entries, which is the default behaviour, the concept of weight is used to determine when a cache is full. For each cached value, a block or selector is used to compute its weight. When adding a new entry pushes the total weight over a set maximum, the least recently used item(s) are evicted until the weight is again below the maximum.
>>>
>>> cache
>>>   computeWeight: #sizeInMemory;
>>>   maximumWeight: 16*1024.
>>>
>>> will keep the cache below 16KB of real memory usage. This can be very useful when caching images or MC packages of various sizes, for example.
>>>
>>> By default, no concurrent access protection takes place, but optionally a semaphore can be used for mutual exclusion. This slows down access.
>>>
>>> cache useSemaphore.
>>>
>>> NeoTTLCache extends NeoLRUCache by maintaining a timestamp for each cached value. Upon a cache hit, there is a check whether this timestamp is older than the allowed time to live. If it is, the value is stale and will be recomputed.
>>>
>>> (cache := NeoTTLCache new)
>>>   timeToLive: 10 minutes;
>>>   factory: [ :key | ZnEasy get: key ];
>>>   maximumWeight: 32*1024;
>>>   computeWeight: #contentLength.
>>>
>>> cache at: 'http://zn.stfx.eu/zn/numbers.txt'.
>>>
>>> ==> a ZnResponse(200 OK text/plain;charset=utf-8 71B)
>>>
>>> cache
>>>
>>> ==> a NeoTTLCache(#1 71/32768 #contentLength [ :key | ZnEasy get: key ] 0% 0:00:10:00)
>>>
>>> This is a simple HTTP cache, keeping resolved URLs in memory for at most 10 minutes before refreshing them. It uses #contentLength on ZnResponse to compute the weight. You can see from the print string that there is now 1 entry, with a weight of 71 bytes out of 32768.
>>>
>>>
>>> These are the main points; please have a look at the code. Feedback is welcome.
>>>
>>> Regards,
>>>
>>> Sven
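For readers wondering how an LRU cache can be O(1) on its main operations: the classic technique, which is what a doubly linked list makes possible, is to combine a Dictionary mapping keys to nodes of a doubly linked list kept in recency order. A hit then costs one hash lookup plus constant-time pointer surgery, and eviction pops the node just before the tail. The sketch below illustrates that idea only; the class and selector names (LRUSketch, LRUNode) are hypothetical and this is not the Neo-Caching code, which you should read for the real thing.

  Object subclass: #LRUNode
    instanceVariableNames: 'key value previous next'
    classVariableNames: ''
    category: 'LRU-Sketch'.
  "key, value, previous and next are plain read/write accessors (not shown)."

  Object subclass: #LRUSketch
    instanceVariableNames: 'map head tail maximum factory'
    classVariableNames: ''
    category: 'LRU-Sketch'.

  LRUSketch >> initialize
    super initialize.
    map := Dictionary new.
    maximum := 512.
    "Sentinel nodes at both ends avoid nil checks when unlinking."
    head := LRUNode new.
    tail := LRUNode new.
    head next: tail.
    tail previous: head

  LRUSketch >> factory: aBlock
    "The block used to compute a missing value from its key."
    factory := aBlock

  LRUSketch >> maximum: anInteger
    maximum := anInteger

  LRUSketch >> at: key
    "Answer the value for key, computing it on a miss.
    Lookup, promotion and eviction are all O(1)."
    | node |
    node := map at: key ifAbsent: [ ^ self addKey: key ].
    self promote: node.
    ^ node value

  LRUSketch >> addKey: key
    | node |
    node := LRUNode new.
    node key: key.
    node value: (factory value: key).
    map at: key put: node.
    self linkFirst: node.
    map size > maximum ifTrue: [ self evictLast ].
    ^ node value

  LRUSketch >> promote: node
    "Move node to the front: it is now the most recently used."
    self unlink: node.
    self linkFirst: node

  LRUSketch >> evictLast
    "Drop the least recently used entry, the node just before the tail sentinel."
    | node |
    node := tail previous.
    self unlink: node.
    map removeKey: node key

  LRUSketch >> unlink: node
    node previous next: node next.
    node next previous: node previous

  LRUSketch >> linkFirst: node
    node previous: head.
    node next: head next.
    head next previous: node.
    head next: node

With a maximum of 2, for example, evaluating

  cache := LRUSketch new maximum: 2; factory: [ :key | key * key ]; yourself.
  cache at: 1. cache at: 2. cache at: 3.

computes and caches the squares of 1, 2 and 3, evicting the entry for 1 when 3 arrives.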
