On 11 Dec 2013, at 12:48, Stéphane Ducasse <[email protected]> wrote:
> On Dec 11, 2013, at 12:29 PM, Esteban Lorenzano <[email protected]> wrote:
>
>> On Tue, Dec 10, 2013 at 11:09 PM, Stéphane Ducasse <[email protected]> wrote:
>> Sven, thanks a lot
>>
>> yes! really cool!
>>
>> How can we resist pushing this into Pharo 3.0 :)
>>
>> over my dead body :)
>> but in Pharo 4…
>
> you are saying that you will let Squeak get ahead of Pharo :)
> I cannot believe that :).

In any case: https://pharo.fogbugz.com/f/cases/12398/Update-LRUCache-Implementation

There are currently only 3 LRUCache users, so the risk is minimal, I think.

BTW, the reason I started this is that I want better caching for the Monticello stuff. You know, the multiple MBs being used/wasted when you work for a couple of days in the same image, the result of boundless caching using stupid dictionaries. But I am waiting for this reckless young Swiss guy to come back after his thesis is written ;-)

> Stef
>
>> Stef
>>
>> On Dec 10, 2013, at 4:54 PM, Sven Van Caekenberghe <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> Caching is an important technique for trading space (memory consumption) for time (speed of execution) in computation. Used correctly, caching can significantly improve an application's performance.
>>>
>>> Given the key/value nature of a cache, a Dictionary is often the first data structure used to implement one. This is, however, not a good idea: a plain Dictionary is unbounded by definition and can, and eventually will, lead to explosive cache growth (see the sketch after the first example below).
>>>
>>> Quick and dirty hacks to turn a Dictionary into a least-recently-used (LRU) and/or time-to-live (TTL) bounded cache are often incorrect and inefficient.
>>>
>>> The LRUCache implementation currently in Pharo is too simple and not efficient, being O(n) on its main operation.
>>>
>>> Pharo deserves and needs a good LRU/TTL cache implementation that can and should be used uniformly in various places in the system and in code built on top of it. That is why I implemented a proposal.
>>>
>>> Neo-Caching is an implementation of both a good LRU cache and a TTL extension of it. The caches are easy to use, yet have a number of interesting features. The implementation is properly O(1) on its main operations and is 2 to 3 times faster than LRUCache. The package also contains a doubly linked list implementation. The code comes with a full set of unit tests. There are class and method comments and the code is easy to read.
>>>
>>> The code (which has no dependencies) can be found here:
>>>
>>>   http://mc.stfx.eu/Neo
>>>   http://www.smalltalkhub.com/mc/SvenVanCaekenberghe/Neo/main
>>>
>>> There is only one package to load. The code has been written in Pharo 3.0, but it should run on older versions as well.
>>>
>>>
>>> Code
>>>
>>>
>>> Let's create a collection of some 16K keys that contains repetitions with a reasonable distribution:
>>>
>>> data := Array streamContents: [ :out |
>>>   1 to: 4096 do: [ :each |
>>>     each primeFactorsOn: out.
>>>     out nextPut: each ] ].
>>> data := data collect: [ :each | each asWords ].
>>>
>>> By using #asWords the keys have better hash properties. Now we put all of this in a cache:
>>>
>>> cache := NeoLRUCache new.
>>> cache maximumWeight: 512.
>>> cache factory: [ :key | key ].
>>> data do: [ :each | cache at: each ].
>>> cache.
>>>
>>> ==> a NeoLRUCache(#512 512/512 [ 1 ] [ :key | key ] 72%)
>>>
>>> We had a 72% hit ratio, which is good. The cache is of course full, 512/512, with 512 entries (not necessarily the same thing, see further).
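For contrast, here is the naive, unbounded Dictionary variant of the same experiment that the introduction warns about. This is only a sketch using standard Dictionary protocol; the variable naive is ours and is not part of the Neo package:

  "A plain Dictionary used as a cache: nothing is ever evicted."
  naive := Dictionary new.
  data do: [ :each |
    naive at: each ifAbsentPut: [ each ] ].
  naive size.
  "One entry per distinct key, roughly 4096 here, growing without bound."

Bounded at 512 entries, the NeoLRUCache above holds about an eighth of that, yet still answers 72% of the requests from the cache. With an open-ended key stream (URLs, package names, ...) the Dictionary would simply keep growing.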
>>> We can now benchmark the cache:
>>>
>>> [ data do: [ :each | cache at: each ] ] bench.
>>>
>>> ==> '35.8 per second.'
>>>
>>> And compare it to the existing LRUCache:
>>>
>>> cache := LRUCache size: 512 factory: [ :key | key ].
>>>
>>> [ data do: [ :each | cache at: each ] ] bench.
>>>
>>> ==> '12.6 per second.'
>>>
>>>
>>> Features
>>>
>>>
>>> Instead of just counting the number of entries, which is the default behaviour, the concept of weight is used to determine when a cache is full. For each cached value, a block or selector is used to compute its weight. When adding a new entry pushes the total weight over a set maximum, the least recently used item(s) are evicted until the weight is again below the maximum.
>>>
>>> cache
>>>   computeWeight: #sizeInMemory;
>>>   maximumWeight: 16*1024.
>>>
>>> will keep the cache below 16KB of real memory usage. This can be very useful when caching images or MC packages of various sizes, for example.
>>>
>>> By default, no concurrent access protection takes place, but optionally a semaphore can be used for mutual exclusion. This slows down access.
>>>
>>> cache useSemaphore.
>>>
>>> NeoTTLCache extends NeoLRUCache by maintaining a timestamp for each cached value. Upon a cache hit, there is a check whether this timestamp is older than the allowed time to live. If it is, the value is stale and will be recomputed.
>>>
>>> (cache := NeoTTLCache new)
>>>   timeToLive: 10 minutes;
>>>   factory: [ :key | ZnEasy get: key ];
>>>   maximumWeight: 32*1024;
>>>   computeWeight: #contentLength.
>>>
>>> cache at: 'http://zn.stfx.eu/zn/numbers.txt'.
>>>
>>> ==> a ZnResponse(200 OK text/plain;charset=utf-8 71B)
>>>
>>> cache
>>>
>>> ==> a NeoTTLCache(#1 71/32768 #contentLength [ :key | ZnEasy get: key ] 0% 0:00:10:00)
>>>
>>> This is a simple HTTP cache, keeping resolved URLs in memory for at most 10 minutes before refreshing them. It uses #contentLength on ZnResponse to compute the weight. You can see from the print string that there is now 1 entry, with a weight of 71 bytes out of 32768.
>>>
>>>
>>> These are the main points; please have a look at the code. Feedback is welcome.
>>>
>>> Regards,
>>>
>>> Sven
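For readers wondering how an LRU cache can be O(1) on its main operations: the classic technique, which is what a doubly linked list makes possible, is to combine a Dictionary mapping keys to nodes of a doubly linked list kept in recency order. A hit then costs one hash lookup plus constant-time pointer surgery, and eviction pops the node just before the tail. The sketch below illustrates that idea only; the class and selector names (LRUSketch, LRUNode) are hypothetical and this is not the Neo-Caching code, which you should read for the real thing.

  Object subclass: #LRUNode
    instanceVariableNames: 'key value previous next'
    classVariableNames: ''
    category: 'LRU-Sketch'.
  "key, value, previous and next are plain read/write accessors (not shown)."

  Object subclass: #LRUSketch
    instanceVariableNames: 'map head tail maximum factory'
    classVariableNames: ''
    category: 'LRU-Sketch'.

  LRUSketch >> initialize
    super initialize.
    map := Dictionary new.
    maximum := 512.
    "Sentinel nodes at both ends avoid nil checks when unlinking."
    head := LRUNode new.
    tail := LRUNode new.
    head next: tail.
    tail previous: head

  LRUSketch >> factory: aBlock
    "The block used to compute a missing value from its key."
    factory := aBlock

  LRUSketch >> maximum: anInteger
    maximum := anInteger

  LRUSketch >> at: key
    "Answer the value for key, computing it on a miss.
    Lookup, promotion and eviction are all O(1)."
    | node |
    node := map at: key ifAbsent: [ ^ self addKey: key ].
    self promote: node.
    ^ node value

  LRUSketch >> addKey: key
    | node |
    node := LRUNode new.
    node key: key.
    node value: (factory value: key).
    map at: key put: node.
    self linkFirst: node.
    map size > maximum ifTrue: [ self evictLast ].
    ^ node value

  LRUSketch >> promote: node
    "Move node to the front: it is now the most recently used."
    self unlink: node.
    self linkFirst: node

  LRUSketch >> evictLast
    "Drop the least recently used entry, the node just before the tail sentinel."
    | node |
    node := tail previous.
    self unlink: node.
    map removeKey: node key

  LRUSketch >> unlink: node
    node previous next: node next.
    node next previous: node previous

  LRUSketch >> linkFirst: node
    node previous: head.
    node next: head next.
    head next previous: node.
    head next: node

With a maximum of 2, for example, evaluating

  cache := LRUSketch new maximum: 2; factory: [ :key | key * key ]; yourself.
  cache at: 1. cache at: 2. cache at: 3.

computes and caches the squares of 1, 2 and 3, evicting the entry for 1 when 3 arrives.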
