Re: Caching results of _list functions

2016-05-26 Thread Johs Ensby
Ermouth,
Congratulations on digging out hidden features of CouchDB!
I presume this works with the new 2.0 and 1.7, and that these versions now have a
new feature, caching :)
johs

> On 26 May 2016, at 22:17, ermouth wrote:
> 
> Tested the approach, works like a charm :)
> 
> Surprisingly, the approach appears to suit page rendering less than
> view->list requests that respond with hundreds or thousands of small
> rows. In those cases I sometimes observe a gain of two orders of magnitude
> or even more, i.e. 300-500 ms of query server execution time turns into
> 3-5 ms.
> 
> Even after adding network latency of around 100 ms, the gain is significant.
> 
> For page rendering, network latency dims the gain, because there is no
> visible difference between 100 and 150 ms of lag on the client side.
> 
> ermouth
> 
> 2016-05-19 11:41 GMT+03:00 ermouth:
> 
>>> - a function in a _list of a certain ddoc called directly via the _list
>>> endpoint
>> 
>> It's called via rewrite. Rewrite decides whether to call list. If rewrite
>> already has a result in the cache and the result is not too old, rewrite
>> sends it directly.
>> 
>> If no result is cached or it's too old, rewrite redirects to list.
>> And list, in turn, not only sends the result but caches it as well.
>> 
>>> - could you give us some snippets
>> 
>> You make a branch in the ddoc, say .Cache, with code like:
>> 
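>> (function(){
>> // Closure-held store; lives as long as the query server process
>> var cache = {};
>> exports.Cache = {
>>  // Returns undefined when nothing is stored under key
>>  get: function (key) {return cache[key];},
>>  put: function (key, val) {cache[key] = val;}
>> };
>> })();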
>> 
>> To access the cache in both rewrite and list, you just use
>> 
>> var Cache = require('Cache').Cache;
>> 
>> Now you can use it like Cache.get(key) and Cache.put(key, val) in rewrite
>> and list. Your cache will persist between requests until you change the
>> ddoc or SM crashes.
>> 
>> 
>> ermouth
>> 
>> 2016-05-19 7:42 GMT+03:00 Johs Ensby:
>> 
>>> Hi Ermouth,
>>> this sounds like a bonus feature of significant value:)
>>> Together with the rewrite function this discovery would keep the couchapp
>>> enthusiasts of the community happy for a looong time, I believe.
>>> 
>>> If I understand you right,
>>> - a function in a _list of a certain ddoc called directly via the _list
>>> endpoint (NOT via _rewrite)
>>> - can cache a rendered result (e.g. an HTML page)
>>> - for a subsequent request via _rewrite (of the same ddoc) to pick up the
>>> cached result from RAM
>>> - using a timestamp inside that _rewrite function to compare against the
>>> timestamped name of the cached result to validate that the age is within a
>>> chosen tolerance
>>> 
>>> Following this 2-step approach (_list, then _rewrite)..
>>> - the _list function would have the full request context
>>> - while the _rewrite would have the limited request object available
>>> - the _rewrite and _list endpoints would need to be available
>>> 
>>> Control questions
>>> - does this mean that you cannot point a vhost to _rewrite and run all
>>> calls to _list via rewrite functions?
>>> - if you have one vhost pointing to _rewrite of a ddoc and another to
>>> e.g. _list would they still share the cache?
>>> 
>>> Testing this
>>> - could you give us some snippets on how to test this? I would love to see
>>> this in action and test it against my JS rewrite functions.
>>> 
>>> br
>>> johs
>>> 
>>> 
>>>> On 19 May 2016, at 00:01, ermouth wrote:
>>>> 
>>>> Since JS rewrites landed in 2.0, and I use JS rewrites heavily, I discover
>>>> some tricks with the new feature from time to time. Although I use a
>>>> patched 1.6.1 with JS rewrites, the tricks are suitable for at least
>>>> single-node 2.0 as well.
>>>> 
>>>> A JS rewrite function runs in the context of a query server instance. A QS
>>>> instance is a SpiderMonkey process with 64 MB of RAM by default. It's
>>>> single-threaded.
>>>> 
>>>> Actually, it seems that all query server functions of one particular design
>>>> doc are executed inside one dedicated SpiderMonkey instance. It means they
>>>> all share a global context (and also one thread).
>>>> 
>>>> The ddoc global context persists until the SpiderMonkey process is
>>>> restarted. That happens when the SM instance crashes (runs out of its RAM
>>>> quota, hits an unrecoverable error, etc.), or when the underlying design
>>>> doc is changed.
>>>> 
>>>> It means a _list function can cache rendering results in RAM, and the
>>>> _rewrite function can later respond quickly with the cached version,
>>>> skipping the redirect to the heavy map/list. The cache container is just a
>>>> JS closure, instantiated and accessed using the built-in require().
>>>> 
>>>> Unlike _list, a JS rewrite function does not receive req.info, since it
>>>> would be too slow to tap the DB on each request. It means we cannot
>>>> invalidate the cache on a local sequence change (DB update): the rewrite
>>>> function just does not know the DB was updated.
>>>> 
>>>> However, we can use timestamp-based caching; even a TTL of several seconds
>>>> for cache entries is enough to reduce

Caching results of _list functions

2016-05-18 Thread ermouth
Since JS rewrites landed in 2.0, and I use JS rewrites heavily, I discover
some tricks with the new feature from time to time. Although I use a patched
1.6.1 with JS rewrites, the tricks are suitable for at least single-node 2.0
as well.

A JS rewrite function runs in the context of a query server instance. A QS
instance is a SpiderMonkey process with 64 MB of RAM by default. It's
single-threaded.

Actually, it seems that all query server functions of one particular design
doc are executed inside one dedicated SpiderMonkey instance. It means they
all share a global context (and also one thread).

The ddoc global context persists until the SpiderMonkey process is restarted.
That happens when the SM instance crashes (runs out of its RAM quota, hits an
unrecoverable error, etc.), or when the underlying design doc is changed.

It means a _list function can cache rendering results in RAM, and the
_rewrite function can later respond quickly with the cached version, skipping
the redirect to the heavy map/list. The cache container is just a JS closure,
instantiated and accessed using the built-in require().
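
A minimal sketch of that flow in plain JavaScript, runnable outside CouchDB.
It is an illustration under assumptions, not the code from my setup: the
names renderList and handleRequest are hypothetical stand-ins for the _list
and _rewrite functions, and the closure is created inline here, whereas in a
design doc both functions would obtain it through require().

```javascript
// Closure-held cache. In CouchDB this would be a module in the design doc
// and would live as long as the query server process.
var Cache = (function () {
  var store = {};
  return {
    get: function (key) {
      // Own-property check so keys like "constructor" do not hit the prototype.
      return Object.prototype.hasOwnProperty.call(store, key)
        ? store[key]
        : undefined;
    },
    put: function (key, val) { store[key] = val; }
  };
})();

// Hypothetical stand-in for the expensive _list rendering step.
function renderList(key) {
  var html = '<ul><li>' + key + '</li></ul>';
  Cache.put(key, html); // the list stores what it has just rendered
  return html;
}

// Hypothetical stand-in for the rewrite step: answer from the cache when
// possible, otherwise fall through to the list.
function handleRequest(key) {
  var hit = Cache.get(key);
  if (hit !== undefined) {
    return { source: 'cache', body: hit };
  }
  return { source: 'list', body: renderList(key) };
}
```

The first handleRequest('x') call goes through renderList, and a second call
with the same key is served from the closure without rendering again.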

Unlike _list, a JS rewrite function does not receive req.info, since it would
be too slow to tap the DB on each request. It means we cannot invalidate the
cache on a local sequence change (DB update): the rewrite function just does
not know the DB was updated.
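
For contrast, a hedged sketch of sequence-based invalidation on the _list
side, where req.info is available. It runs as plain JavaScript; the names
seqCache and listWithSeqCache are hypothetical, req only mimics the relevant
part of the request object, and render stands for whatever expensive
rendering the list does.

```javascript
// Hypothetical list-side cache: each entry remembers the update sequence
// it was rendered at.
var seqCache = (function () {
  var store = {};
  return {
    get: function (key, seq) {
      var entry = Object.prototype.hasOwnProperty.call(store, key)
        ? store[key]
        : null;
      // A different sequence means the DB changed since rendering.
      return entry && entry.seq === seq ? entry.body : undefined;
    },
    put: function (key, seq, body) {
      store[key] = { seq: seq, body: body };
    }
  };
})();

// Stand-in for the body of a _list function.
function listWithSeqCache(req, key, render) {
  // Compare as a string, since the sequence is not always a plain number.
  var seq = String(req.info.update_seq);
  var cached = seqCache.get(key, seq);
  if (cached !== undefined) {
    return cached; // DB unchanged, reuse the rendered body
  }
  var body = render();
  seqCache.put(key, seq, body);
  return body;
}
```

This only helps inside _list. The rewrite function has no sequence to compare
against, which is why it needs a different invalidation rule.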

However, we can use timestamp-based caching; even a TTL of several seconds for
cache entries is enough to reduce average response time significantly (say,
by an order of magnitude).
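
A minimal sketch of such a TTL cache, again in plain JavaScript and under
assumptions: makeTtlCache and its parameters are hypothetical names, and the
injectable clock exists only so the expiry logic can be exercised without
waiting.

```javascript
// Timestamp-based cache: an entry is valid for ttlMs milliseconds.
// `now` is an optional clock function, defaulting to Date.now.
function makeTtlCache(ttlMs, now) {
  now = now || Date.now;
  var store = {};
  return {
    put: function (key, val) {
      store[key] = { val: val, at: now() };
    },
    get: function (key) {
      if (!Object.prototype.hasOwnProperty.call(store, key)) {
        return undefined;
      }
      var entry = store[key];
      if (now() - entry.at > ttlMs) {
        delete store[key]; // too old: drop it so the list re-renders
        return undefined;
      }
      return entry.val;
    }
  };
}

// Usage with a fake clock.
var fakeTime = 1000;
var pages = makeTtlCache(5000, function () { return fakeTime; });
pages.put('/index', '<html>...</html>');
fakeTime = 4000;
var fresh = pages.get('/index'); // 3000 ms old, still within the TTL
fakeTime = 7000;
var stale = pages.get('/index'); // 6000 ms old, expired
```

Dropping expired entries on read also keeps the store from holding stale
pages, which matters when the whole process has a fixed RAM quota.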

I've tested the approach a little and found it very promising.

Since this dirty hack is possible with _list functions alone, without JS
rewrites, maybe someone already employs it? Or has at least tried?

ermouth