Re: [Document Cache Size] Is it better to have cache size using number of entries

2014-08-20 Thread Marcel Reutegger
Hi,

On 19/08/14 14:43, Thomas Mueller muel...@adobe.com wrote:
Limiting the cache size by number of entries doesn't make sense. It is a
sure way to run into out-of-memory errors, exactly because the sizes of
documents vary a lot.

I agree with Thomas. I think there must always be a way to tell the system
what the maximum memory is to use for caching purposes. If needed we could
introduce an additional limit for the number of cache entries, but this
must not result in higher memory usage than configured.

Regards
 Marcel



Re: [Document Cache Size] Is it better to have cache size using number of entries

2014-08-20 Thread Thomas Mueller
Hi,

If we need a limit on the number of entries for some other (internal)
reason, like the consistency check, then I understand. If we later find a
way to speed up the consistency check (or if we don't need it, which I
would prefer), then this limit is no longer needed. But I also don't know
how to limit by both the number of entries and memory using the Guava
cache API.
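
For reference, a minimal sketch of the two Guava policies being discussed
(plain com.google.common.cache API; the Doc class below is just a
placeholder, not an Oak type). CacheBuilder accepts either maximumWeight
plus a Weigher, or maximumSize, and refuses a builder that sets both,
which is the limitation mentioned above:

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.Weigher;

public class CacheLimitSketch {

    // Placeholder value type, not an Oak class.
    static class Doc {
        final byte[] data;
        Doc(byte[] data) { this.data = data; }
    }

    public static void main(String[] args) {
        // Policy 1: limit by estimated memory, via maximumWeight plus a
        // Weigher that reports each entry's approximate size in bytes.
        Weigher<String, Doc> byteWeigher =
                (key, doc) -> key.length() + doc.data.length;
        Cache<String, Doc> byMemory = CacheBuilder.newBuilder()
                .maximumWeight(256L * 1024 * 1024)   // ~256 MB budget
                .weigher(byteWeigher)
                .build();

        // Policy 2: limit by entry count, via maximumSize.
        Cache<String, Doc> byCount = CacheBuilder.newBuilder()
                .maximumSize(10_000)
                .build();

        // Calling both maximumWeight and maximumSize on the same builder
        // fails with an IllegalStateException, so the two policies cannot
        // be combined directly.
    }
}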

 why is 256MB -- the default value -- sufficient/insufficient

We don't know. But how do you know that a cache of 10'000 entries is
sufficient? Especially if each entry can be either 1 KB or 1 MB or 20 MB.
The available memory can be divided into different areas, and each
component is given a part of that. Then you look at performance, and see
which component is slow, and you try to find out why. For example, it also
depends on how expensive a cache miss is.

As for the cache size as an amount of memory: the best way to know what a
good number is, is to analyze the performance (how much time is spent
reading, cache hit ratio, ...).
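
For a plain Guava cache, that kind of analysis can be done with the
built-in statistics (recordStats() must be enabled). The sketch below is
generic Guava usage, not Oak code; the CacheLoader is only a stand-in for
the expensive backend read:

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.CacheStats;
import com.google.common.cache.LoadingCache;

public class CacheStatsSketch {

    public static void main(String[] args) {
        // recordStats() must be enabled, otherwise stats() returns zeros.
        LoadingCache<String, String> cache = CacheBuilder.newBuilder()
                .maximumSize(10_000)
                .recordStats()
                .build(new CacheLoader<String, String>() {
                    @Override
                    public String load(String key) {
                        // Stand-in for the expensive backend read (e.g. a
                        // MongoDB query in the DocumentNodeStore case).
                        return "value-for-" + key;
                    }
                });

        for (int i = 0; i < 1000; i++) {
            cache.getUnchecked("key-" + (i % 100));   // mix of hits and misses
        }

        CacheStats stats = cache.stats();
        System.out.printf("hit rate: %.2f%%%n", stats.hitRate() * 100);
        System.out.printf("miss count: %d%n", stats.missCount());
        // Average time spent loading a missing value, i.e. the cost of a miss.
        System.out.printf("avg load penalty: %.3f ms%n",
                stats.averageLoadPenalty() / 1_000_000);
    }
}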

 what should the course of action be when seeing a lot of cache misses: (a)
notify the application team, or (b) increase the cache size.

It depends on the reason for the cache misses. There could be a loop over
many nodes somewhere, in which case a larger cache might not really help
(most caches are not scan resistant). There could be other reasons. But I
don't see how the ability to configure the number of entries in the cache
would help.

Regards,
Thomas

On 19/08/14 16:25, Vikas Saurabh vikas.saur...@gmail.com wrote:

 sysadmin can be provided with a rough idea about the number of (frequently
used) repo nodes, using which the sysadmin can update the cache size.

 I can't follow you, sorry. How would a sysadmin possibly know the number
 of frequently used nodes? And why would he know that, and not the amount
 of memory? And why wouldn't he worry about running into out of memory?

 Even for off-heap caches, I think it's still important to limit the
 memory. Even though you don't get an out-of-memory exception, you would
 still run out of physical memory, at which point the system would get
 extremely slow (virtual memory thrashing).

What I meant was that there was no way for me to guess a good number for
the document cache (e.g. why is 256MB -- the default value --
sufficient/insufficient) given that I knew what type of load I (as
application engineer) planned to put on an author instance. I understand
that memory usage is the bottom line and the sysadmin must configure that
too -- but from a sysadmin's point of view, what should the course of
action be when seeing a lot of cache misses: (a) notify the application
team, or (b) increase the cache size? Yes, at the end of the day there
would be a balance between these two options -- but from an app
engineer's point of view, I have no idea what or how much cache size is
useful/sufficient, or even how to map a given size in bytes to the kind
of access I'd plan on this repository, which kind of nullifies option
(a). I don't know, for sure, about general deployments, but in our case
the engineering team does recommend heap size and other JVM settings
(and possibly tweak levels) to the sysadmin team -- I thought that's how
setups are usually done.

Thanks,
Vikas



Re: [Document Cache Size] Is it better to have cache size using number of entries

2014-08-20 Thread Vikas Saurabh
 why is 256MB -- the default value -- sufficient/insufficient

 We don't know. But how do you know that a cache of 10'000 entries is
 sufficient? Especially if each entry can be either 1 KB or 1 MB or 20 MB.
 The available memory can be divided into different areas, and each
 component is given a part of that. Then you look at performance, and see
 which component is slow, and you try to find out why. For example, it also
 depends on how expensive a cache miss is.

Yes, I agree... I was thinking more about this and realized that what I
was missing while debugging our issue wasn't configuration freedom (in
terms of entries) but rather an easier way to analyze. Cache hits/misses
felt a bit inadequate -- but that can be because we were hitting an
entirely different issue and cache hit/miss numbers weren't a problem for
us. For us, increasing the cache size improved performance, which was the
red herring -- the performance soon dropped back and we were left
wondering whether 2GB (that's the size we went for) wasn't cutting it. We
wondered if there is a way to empirically figure out whether we are
shooting blank arrows or not.
So, yes, I agree that maybe configuration in terms of entries isn't very
useful (from an outside-Oak point of view... internal reasons might still
apply)... but I also feel that we are lacking some tooling (I can't
really comment on what would or wouldn't be useful) to investigate. I
feel hit/miss numbers aren't apparent enough.

 As for the cache size as an amount of memory: the best way to know what a
 good number is, is to analyze the performance (how much time is spent
 reading, cache hit ratio, ...).
We were running simple page loads -- some of them as trivial as a single
page rendering based on one resource (and lots of js, css + page
rendering scripts + page and component nodes). We often found that the
first rendering took significantly more time and subsequent reads
improved -- this is what directed us towards suspecting the cache as
the issue. We did reset the cache stats, and soon the cache hit ratio
reached around 82%-85% (while on a local dev setup it stayed above 98%).
We tried checking which documents exist in the cache (using the script
console and a small groovy script) and there were lots which I wasn't
touching directly (e.g. index etc.)... and I didn't know whether they
were even relevant for our page rendering or not. I agree I could have
started analyzing a subset of the cache entries, but there were lots
(around 31k afair) and I just didn't have enough energy to delve further.
So, it might be that I could have debugged the issue further to confirm
that the cache wasn't the issue -- but on the other hand, I felt that
there should be better ways to understand what's going on.
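
As one illustration of the kind of tooling meant here, a small helper
like the following could group cache keys by their top-level path segment
to show which part of the repository (content vs. index nodes, etc.)
dominates the cache. It assumes the cache is exposed as a Guava Cache
with path-like String keys, which is a simplification -- Oak's document
cache uses its own key format:

import java.util.Map;
import java.util.TreeMap;

import com.google.common.cache.Cache;

public class CacheContentSketch {

    // Group cache keys by their top-level path segment, e.g. "/content" vs
    // "/oak:index", to see which part of the repository fills the cache.
    static Map<String, Integer> countByTopSegment(Cache<String, ?> cache) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : cache.asMap().keySet()) {
            String path = key.startsWith("/") ? key : "/" + key;
            int slash = path.indexOf('/', 1);
            String top = slash > 0 ? path.substring(0, slash) : path;
            counts.merge(top, 1, Integer::sum);
        }
        return counts;
    }
}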

 what should the course of action be when seeing a lot of cache misses: (a)
notify the application team, or (b) increase the cache size.

 It depends on the reason for the cache misses. There could be a loop over
 many nodes somewhere, in which case a larger cache might not really help
 (most caches are not scan resistant). There could be other reasons. But I
 don't see how the ability to configure the number of entries in the cache
 would help.
I agree that I don't have enough understanding of the internals of how
caches work -- but, as I mentioned above, I couldn't quite figure out
a clean way to identify the cache access pattern -- the best I could
figure out was what existed in there.

Thanks,
Vikas


Re: [Document Cache Size] Is it better to have cache size using number of entries

2014-08-19 Thread Chetan Mehrotra
Hi Vikas,

Sizing the cache can be done either by the number of entries or by the
size taken by the cache. Currently in Oak we limit by size; however, as
you mentioned, a limit by count is more deterministic. We use the Guava
Cache and it supports limiting either by size or by number of entries,
i.e. the two policies are mutually exclusive.

So, at a minimum, if you can provide a patch which allows the admin to
choose between the two, it would allow us to experiment and later see
how we can put a max cap on the cache size.
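
One possible workaround, sketched below purely as an idea (this is not
what Oak currently does): keep maximumWeight as the authoritative memory
cap, but have the Weigher charge every entry at least a fixed minimum
weight, so the entry count is implicitly bounded by
maxBytes / minBytesPerEntry:

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.Weigher;

public class DualLimitSketch {

    // Builds a cache whose estimated memory stays under maxBytes and whose
    // entry count can never exceed maxBytes / minBytesPerEntry, because
    // every entry is charged at least minBytesPerEntry of weight.
    static <V> Cache<String, V> build(long maxBytes, int minBytesPerEntry,
                                      Weigher<String, V> realSize) {
        Weigher<String, V> floored = (key, value) ->
                Math.max(realSize.weigh(key, value), minBytesPerEntry);
        return CacheBuilder.newBuilder()
                .maximumWeight(maxBytes)
                .weigher(floored)
                .build();
    }
}

For example, build(256L * 1024 * 1024, 4 * 1024, someWeigher) could never
hold more than 65'536 entries while still respecting the 256 MB cap
(someWeigher here being whatever per-document size estimate is available).
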
Chetan Mehrotra


On Mon, Aug 18, 2014 at 7:55 PM, Vikas Saurabh vikas.saur...@gmail.com wrote:
 we can probably have both and cache respects whichever constraint hits
 first (sort of min(byte size, entry size)).
 First of all I don't know MongoNS implementation details so I can be wrong.

 I'd rather keep the size in bytes as it gives me much more control over
 the memory I have and what I decide to provide to the application. If we
 say, to take an extreme example, 1 document only in the cache and this
 single document exceeds the amount of available memory, I fear an OOM. On
 the other hand, having bytes ensures the application keeps working, and
 it will be the task of the sysadmin to monitor the eventual hit/miss
 ratio and adjust the cache accordingly.

 Yes, a sysadmin can modify the cache size in bytes if the miss ratio
 increases. But, in the current scenario, I couldn't figure out a neat way
 (heuristic/guesswork) to tell whether it's application misbehavior or a
 lack of cache size (note that our issue didn't happen to be related to
 cache size... but the question still did bug us). On the other hand, a
 sysadmin can be provided with a rough idea about the number of
 (frequently used) repo nodes, using which the sysadmin can update the
 cache size.
 Also, I do take the point of avoiding OOMs in the case of pretty large
 documents, which is why we can have both properties (byte size and entry
 count), with the byte constraint being a fail-safe.

 Thanks,
 Vikas


Re: [Document Cache Size] Is it better to have cache size using number of entries

2014-08-19 Thread Thomas Mueller
Hi,

Limiting the cache size by number of entries doesn't make sense. It is a
sure way to run into out-of-memory errors, exactly because the sizes of
documents vary a lot.

 as you mentioned limit by count is more deterministic.

How, or in what way, is it more deterministic?

 sysadmin can be provided with a rough idea about the number of (frequently
used) repo nodes, using which the sysadmin can update the cache size.

I can't follow you, sorry. How would a sysadmin possibly know the number
of frequently used nodes? And why would he know that, and not the amount
of memory? And why wouldn't he worry about running into out of memory?

Even for off-heap caches, I think it's still important to limit the
memory. Even though you don't get an out-of-memory exception, you would
still run out of physical memory, at which point the system would get
extremely slow (virtual memory thrashing).

Regards,
Thomas



On 19/08/14 08:30, Chetan Mehrotra chetan.mehro...@gmail.com wrote:

Hi Vikas,

Sizing the cache can be done either by the number of entries or by the
size taken by the cache. Currently in Oak we limit by size; however, as
you mentioned, a limit by count is more deterministic. We use the Guava
Cache and it supports limiting either by size or by number of entries,
i.e. the two policies are mutually exclusive.

So, at a minimum, if you can provide a patch which allows the admin to
choose between the two, it would allow us to experiment and later see
how we can put a max cap on the cache size.
Chetan Mehrotra


On Mon, Aug 18, 2014 at 7:55 PM, Vikas Saurabh vikas.saur...@gmail.com
wrote:
 we can probably have both and cache respects whichever constraint hits
 first (sort of min(byte size, entry size)).
 First of all I don't know MongoNS implementation details so I can be
wrong.

 I'd rather keep the size in bytes as it gives me much more control over
 the memory I have and what I decide to provide to the application. If we
 say, to take an extreme example, 1 document only in the cache and this
 single document exceeds the amount of available memory, I fear an OOM. On
 the other hand, having bytes ensures the application keeps working, and
 it will be the task of the sysadmin to monitor the eventual hit/miss
 ratio and adjust the cache accordingly.

 Yes, a sysadmin can modify the cache size in bytes if the miss ratio
 increases. But, in the current scenario, I couldn't figure out a neat way
 (heuristic/guesswork) to tell whether it's application misbehavior or a
 lack of cache size (note that our issue didn't happen to be related to
 cache size... but the question still did bug us). On the other hand, a
 sysadmin can be provided with a rough idea about the number of
 (frequently used) repo nodes, using which the sysadmin can update the
 cache size.
 Also, I do take the point of avoiding OOMs in the case of pretty large
 documents, which is why we can have both properties (byte size and entry
 count), with the byte constraint being a fail-safe.

 Thanks,
 Vikas



Re: [Document Cache Size] Is it better to have cache size using number of entries

2014-08-19 Thread Chetan Mehrotra
Hi Thomas,

On Tue, Aug 19, 2014 at 6:13 PM, Thomas Mueller muel...@adobe.com wrote:
 How, or in what way, is it more deterministic?

Missed providing some context there, so here are the details. Currently
we limit the cache by the total size taken. Now, given a system where you
have, say, 32 GB RAM available, the admin needs to decide how much memory
to allocate to the DocumentNodeStore. Currently we do not have a
definitive way to tell, as there are a couple of factors to consider:

1. Number of entries in the Document Cache - The Document cache (which
caches the NodeDocuments) is the most critical cache and is currently
allocated 70% of the cache size. We would like to give it as much memory
as possible, but we also need to take into account the time taken to
perform the consistency check for the entries present in the cache. If
the consistency check takes more than 1 sec, it would delay the
background job in the DocumentNodeStore, and hence the root node version
would become stale.

Now the time taken to perform the consistency check ~= f(n), where n =
the number of entries in the cache, not the size of the cache.

With Mongo we can get a good estimate of the time taken to query the
modCount for 'n' nodes. Going forward we would add some stats collection
to this logic to determine how much time is being spent in the cache
consistency check.

2. Effect of GC with larger heaps - As we run the JVM with a higher heap
size, we need to take into account the delays that might occur with such
large heaps.

So if we can have a cache policy which puts a max cap on the memory
taken and also allows a limit on the number of entries, that would give
us more deterministic control when tuning the cache.
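
To make the trade-off concrete, here is a back-of-the-envelope sketch.
All of the per-entry numbers (the check cost, the average document sizes)
are made-up assumptions for illustration, not measured Oak values; only
the ~1 sec background budget and the 256 MB default come from the
discussion in this thread:

public class CacheBudgetSketch {

    public static void main(String[] args) {
        long backgroundBudgetMillis = 1_000;   // the ~1 sec budget from above
        double perEntryCheckMillis = 0.02;     // assumed cost of one modCount check

        long maxEntriesByCheckTime =
                (long) (backgroundBudgetMillis / perEntryCheckMillis);
        System.out.println("entries affordable by consistency check: "
                + maxEntriesByCheckTime);

        // The same cache expressed as a memory budget implies a very
        // different entry count depending on the average document size.
        long cacheBytes = 256L * 1024 * 1024;  // the 256 MB default
        for (long avgDocBytes : new long[] {1_024, 1_024 * 1_024, 20L * 1_024 * 1_024}) {
            System.out.printf("avg doc %d bytes -> ~%d entries%n",
                    avgDocBytes, cacheBytes / avgDocBytes);
        }
    }
}

With the assumed numbers the consistency check affords ~50'000 entries,
while 256 MB corresponds to anywhere between ~12 and ~260'000 entries
depending on document size -- which is why a count cap and a memory cap
answer different questions.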

Chetan Mehrotra


Re: [Document Cache Size] Is it better to have cache size using number of entries

2014-08-19 Thread Vikas Saurabh
 sysadmin can be provided with a rough idea about the number of (frequently
used) repo nodes, using which the sysadmin can update the cache size.

 I can't follow you, sorry. How would a sysadmin possibly know the number
 of frequently used nodes? And why would he know that, and not the amount
 of memory? And why wouldn't he worry about running into out of memory?

 Even for off-heap caches, I think it's still important to limit the
 memory. Even though you don't get an out-of-memory exception, you would
 still run out of physical memory, at which point the system would get
 extremely slow (virtual memory thrashing).

What I meant was that there was no way for me to guess a good number for
the document cache (e.g. why is 256MB -- the default value --
sufficient/insufficient) given that I knew what type of load I (as
application engineer) planned to put on an author instance. I understand
that memory usage is the bottom line and the sysadmin must configure that
too -- but from a sysadmin's point of view, what should the course of
action be when seeing a lot of cache misses: (a) notify the application
team, or (b) increase the cache size? Yes, at the end of the day there
would be a balance between these two options -- but from an app
engineer's point of view, I have no idea what or how much cache size is
useful/sufficient, or even how to map a given size in bytes to the kind
of access I'd plan on this repository, which kind of nullifies option
(a). I don't know, for sure, about general deployments, but in our case
the engineering team does recommend heap size and other JVM settings
(and possibly tweak levels) to the sysadmin team -- I thought that's how
setups are usually done.

Thanks,
Vikas


Re: [Document Cache Size] Is it better to have cache size using number of entries

2014-08-19 Thread Vikas Saurabh
 We use Guava Cache and
 it supports either limiting by size or by number of entries i.e. the
 two policies are exclusive.
Hmm I totally missed this point.

 So at minimum if you can provide a patch which allows the admin to
 choose between the two it would allow us to experiment and later see
 how we can put a max cap on cache size.
Ok, that can be done, but both constraints have advantages. Having just
an entry limit would make memory usage uncontrollable. I haven't looked
at Guava, but shouldn't it be possible to extend it to have this type of
functionality (constraining both size and entry count)?

Thanks,
Vikas


[Document Cache Size] Is it better to have cache size using number of entries

2014-08-18 Thread Vikas Saurabh
Hi,

We were struggling for the past couple of weeks with severe performance
issues on AEM6/Oak/MongoNS -- fortunately the issue was due to the VM we
were using. So, all seems well for now.

BUT, during the investigation, one of the things that we were worried
about was the document cache missing hits... we tried increasing the
cache size to 1GB.

Although that didn't quite help, as the issue wasn't about cache
size... what worried us more was that there was no good way for us to
specify a cache size. While I agree that a cache is meant to be a memory
hog... the entry size of a document in the cache is quite variable in
nature, and as an admin I can make guesses about JCR nodes and their
access patterns. Document size, otoh, would vary even during the
lifetime of the repository.

Moreover, the only downside (assuming RAM is cheap) of increasing the
cache to a large size is making cache invalidation expensive.

So, IMHO, document caches should be limited in terms of the number of
entries in the cache. And if we still want to have a byte-size
limitation, we can probably have both, with the cache respecting
whichever constraint hits first (in effect a min over the byte-size and
entry-count limits).

We can log an issue for this and provide a patch too -- but it seemed
better to have a conversation going before that.

Thoughts?

Thanks,
Vikas


Re: [Document Cache Size] Is it better to have cache size using number of entries

2014-08-18 Thread Davide Giannella
Hello Vikas,

On 18/08/2014 12:05, Vikas Saurabh wrote:
 Hi,

 ...
 specify a cache size. While I agree that cache should be a memory
 hog... but entry size of a document in cache is quite variable in
 nature and as an admin I can make guesses about JCR nodes and their
 access patterns. Document size, otoh, would vary even during the
 lifetime of repository.

 Moreover, the only downside (assuming RAM is cheap) of increasing
 cache to a large number is making cache invalidation expensive.

 So, IMHO, document caches should be limited in terms of number of
 entries in cache. And if we still want to have byte size limitation,
 we can probably have both and cache respects whichever constraint hits
 first (sort of min(byte size, entry size)).
First of all I don't know MongoNS implementation details so I can be wrong.

I'd rather keep the size in bytes as it gives me much more control over
the memory I have and what I decide to provide to the application. If we
say, to take an extreme example, 1 document only in the cache and this
single document exceeds the amount of available memory, I fear an OOM. On
the other hand, having bytes ensures the application keeps working, and
it will be the task of the sysadmin to monitor the eventual hit/miss
ratio and adjust the cache accordingly.

About cache invalidation I'm not sure, but it could be that the MongoNS
implementation uses off-heap memory for caching.

Cheers
Davide




Re: [Document Cache Size] Is it better to have cache size using number of entries

2014-08-18 Thread Vikas Saurabh
 we can probably have both and cache respects whichever constraint hits
 first (sort of min(byte size, entry size)).
 First of all I don't know MongoNS implementation details so I can be wrong.

 I'd rather keep the size in bytes as it gives me much more control over
 the memory I have and what I decide to provide to the application. If we
 say, to take an extreme example, 1 document only in the cache and this
 single document exceeds the amount of available memory, I fear an OOM. On
 the other hand, having bytes ensures the application keeps working, and
 it will be the task of the sysadmin to monitor the eventual hit/miss
 ratio and adjust the cache accordingly.

Yes, a sysadmin can modify the cache size in bytes if the miss ratio
increases. But, in the current scenario, I couldn't figure out a neat way
(heuristic/guesswork) to tell whether it's application misbehavior or a
lack of cache size (note that our issue didn't happen to be related to
cache size... but the question still did bug us). On the other hand, a
sysadmin can be provided with a rough idea about the number of
(frequently used) repo nodes, using which the sysadmin can update the
cache size.
Also, I do take the point of avoiding OOMs in the case of pretty large
documents, which is why we can have both properties (byte size and entry
count), with the byte constraint being a fail-safe.

Thanks,
Vikas