Re: [Document Cache Size] Is it better to have cache size using number of entries
Hi,

On 19/08/14 14:43, Thomas Mueller muel...@adobe.com wrote:
> Limiting the cache size by number of entries doesn't make sense. It is a sure way to run into out of memory, exactly because the sizes of documents vary a lot.

I agree with Thomas. I think there must always be a way to tell the system the maximum amount of memory to use for caching purposes. If needed we could introduce an additional limit for the number of cache entries, but this must not result in higher memory usage than configured.

Regards
Marcel
Re: [Document Cache Size] Is it better to have cache size using number of entries
Hi,

If we need a limit on the number of entries for some other (internal) reason, like the consistency check, then I understand. If we later find a way to speed up the consistency check (or if we don't need it, which I would prefer), then this is no longer needed. But I also don't know how to limit by both number of entries and memory using the Guava cache API.

> why is 256MB -- the default value -- sufficient/insufficient

We don't know. But how do you know that a cache of 10'000 entries is sufficient? Especially if each entry can be either 1 KB or 1 MB or 20 MB. The available memory can be divided into different areas, and each component is given a part of that. Then you look at performance, see which component is slow, and try to find out why. For example, it also depends on how expensive a cache miss is. As for the cache size in amount of memory: the best way to know what a good number is, is to analyze the performance (how much time is spent reading, cache hit ratio, ...).

> what should the course of action be when seeing a lot of cache misses: (a) notify application team, or (b) increase cache size.

It depends on the reason for the cache misses. There could be a loop over many nodes somewhere, in which case a larger cache might not really help (most caches are not scan resistant). There could be other reasons. But I don't see how the ability to configure the number of entries in the cache would help.

Regards,
Thomas

On 19/08/14 16:25, Vikas Saurabh vikas.saur...@gmail.com wrote:
>>> sysadmin can be provided with a rough idea about relation of (frequently used) repo nodes using which sysadmin can update cache size.
>>
>> I can't follow you, sorry. How would a sysadmin possibly know the number of frequently used nodes? And why would he know that, and not the amount of memory? And why wouldn't he worry about running into out of memory? Even for off-heap caches, I think it's still important to limit the memory. Even though you don't get an out-of-memory exception, you would still run out of physical memory, at which point the system would get extremely slow (virtual memory thrashing).
>
> What I meant was that there was no way for me to guess a good number for the document cache (e.g. why is 256MB -- the default value -- sufficient/insufficient) given that I knew what type of load I (as application engineer) planned to put on an author. I understand that memory usage is the bottom line and the sysadmin must configure that too -- but from a sysadmin's point of view, what should the course of action be when seeing a lot of cache misses: (a) notify the application team, or (b) increase the cache size? Yes, at the end of the day there would be a balance between these two options -- but from an app engineer's point of view, I have no idea what/how much cache size is useful/sufficient, or even how to map a given size in bytes to the kind of access I'd plan on this repository, which kind of nullifies option (a). I don't know, for sure, about general deployments, but in our case the engineering team does recommend heap size and other JVM settings (and possibly tweak levels) to the sysadmin team -- I thought that's how setups usually are done.
>
> Thanks,
> Vikas
Re: [Document Cache Size] Is it better to have cache size using number of entries
>> why is 256MB -- the default value -- sufficient/insufficient
>
> We don't know. But how do you know that a cache of 10'000 entries is sufficient? Especially if each entry can be either 1 KB or 1 MB or 20 MB. The available memory can be divided into different areas, and each component is given a part of that. Then you look at performance, see which component is slow, and try to find out why. For example, it also depends on how expensive a cache miss is.

Yes, I agree... I was thinking more around this and realized that what I was missing while debugging our issue wasn't configuration freedom (in terms of entries) but rather an easier way to analyze. Cache hits/misses felt a bit inadequate -- but that can be because we were hitting an entirely different issue and the cache hit/miss numbers weren't a problem for us. For us, increasing the cache size improved performance, which was the red herring -- the performance soon dropped back and we were left wondering if 2GB (that's the size we went for) wasn't cutting it. We wondered if there was a way to empirically figure out whether we were shooting blank arrows or not. So, yes, I agree that maybe configuration in terms of entries isn't very useful (from an outside-Oak point of view... internal reasons might still apply)... but I also feel that we are lacking some tooling (I can't really comment on what would or wouldn't be useful) to investigate. I feel the hit/miss numbers aren't apparent enough.

> As for the cache size in amount of memory: the best way to know what a good number is, is to analyze the performance (how much time is spent reading, cache hit ratio, ...)

We were running simple page loads -- some of them as trivial as a single page rendering based on one resource (and lots of js, css + page rendering scripts + page and component nodes). We often found that the first rendering took significantly more time and subsequent reads improved -- this is what directed us towards suspecting the cache as the issue. We did reset the cache stats and the cache hit ratio soon reached around 82%-85% (while on a local dev setup it stayed above 98%). We tried checking what documents exist in the cache (using the script console and a small groovy script) and there were lots which I wasn't touching directly (e.g. index etc.)... and I didn't know whether they were even relevant for our page rendering. I agree I could have started analyzing a subset of the cache entries, but there were lots (around 31k AFAIR) and I just didn't have enough energy to delve further. So, it might be that I could have debugged the issue further to confirm that the cache wasn't the problem -- but on the other hand, I felt that there should be better ways to understand what's going on.

>> what should the course of action be when seeing a lot of cache misses: (a) notify application team, or (b) increase cache size.
>
> It depends on the reason for the cache misses. There could be a loop over many nodes somewhere, in which case a larger cache might not really help (most caches are not scan resistant). There could be other reasons. But I don't see how the ability to configure the number of entries in the cache would help.

I agree that I don't have enough understanding of the internals of how caches work -- but, as I mentioned above, I couldn't quite figure out a clean way to identify the cache access pattern -- the best I could figure out was what existed there.

Thanks,
Vikas
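Thomas's suggestion to look at time spent reading rather than raw hit/miss counts can be made concrete with a back-of-envelope calculation. The per-hit and per-miss latencies below are purely hypothetical placeholders (not measured Oak or Mongo figures), but they show why a drop from a ~98% to a ~85% hit ratio can matter far more than the raw numbers suggest:

```java
// Back-of-envelope: translate a cache hit ratio into an expected average
// read latency, given (hypothetical) costs of a hit vs. a miss.
public class CacheReadCost {

    // Expected read time in microseconds for the observed hit/miss counts.
    static double avgReadMicros(long hits, long misses,
                                double hitMicros, double missMicros) {
        double hitRatio = (double) hits / (hits + misses);
        return hitRatio * hitMicros + (1 - hitRatio) * missMicros;
    }

    public static void main(String[] args) {
        // Assumed costs: ~5us for an in-memory hit, ~1200us for a miss
        // that goes to the backing store.
        double at85 = avgReadMicros(85, 15, 5, 1200);
        double at98 = avgReadMicros(98, 2, 5, 1200);
        System.out.printf("85%% hit ratio: %.2fus, 98%% hit ratio: %.2fus%n",
                at85, at98);
    }
}
```

Under these assumed costs, 85% vs. 98% is roughly a 6x difference in average read time, even though both hit ratios look "high" in isolation.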
Re: [Document Cache Size] Is it better to have cache size using number of entries
Hi Vikas,

Sizing the cache can be done either by number of entries or by the size taken by the cache. Currently in Oak we limit by size; however, as you mentioned, a limit by count is more deterministic. We use the Guava Cache and it supports limiting either by size or by number of entries, i.e. the two policies are exclusive. So at minimum, if you can provide a patch which allows the admin to choose between the two, it would allow us to experiment and later see how we can put a max cap on the cache size.

Chetan Mehrotra

On Mon, Aug 18, 2014 at 7:55 PM, Vikas Saurabh vikas.saur...@gmail.com wrote:
>>> we can probably have both and cache respects whichever constraint hits first (sort of min(byte size, entry count)).
>>
>> First of all I don't know the MongoNS implementation details, so I could be wrong. I'd rather keep the size in bytes, as it gives me much more control over the memory I have and what I decide to provide to the application. If, to take an extreme example, we allow 1 document only in the cache and then this single document exceeds the amount of available memory, I fear an OOM. On the other hand, having bytes ensures the application keeps working, and it will be the task of a sysadmin to monitor the eventual hit/miss ratio and adjust the cache accordingly.
>
> Yes, a sysadmin can modify the cache size in bytes if the miss ratio increases. But, in the current scenario, I couldn't figure out a neat way (heuristic/guesswork) to tell whether it's application misbehavior or a lack of cache size (note that our issue didn't happen to be related to cache size... but the question still did bug us). On the other hand, a sysadmin can be provided with a rough idea about the relation of (frequently used) repo nodes, using which the sysadmin can update the cache size. Also, I do take the point of avoiding OOMs in the case of pretty large documents, which is why we can have both properties (byte size and entry count), with the byte constraint being a fail safe.
>
> Thanks,
> Vikas
Re: [Document Cache Size] Is it better to have cache size using number of entries
Hi,

Limiting the cache size by number of entries doesn't make sense. It is a sure way to run into out of memory, exactly because the sizes of documents vary a lot.

> as you mentioned limit by count is more deterministic.

How, or in what way, is it more deterministic?

> sysadmin can be provided with a rough idea about relation of (frequently used) repo nodes using which sysadmin can update cache size.

I can't follow you, sorry. How would a sysadmin possibly know the number of frequently used nodes? And why would he know that, and not the amount of memory? And why wouldn't he worry about running into out of memory? Even for off-heap caches, I think it's still important to limit the memory. Even though you don't get an out-of-memory exception, you would still run out of physical memory, at which point the system would get extremely slow (virtual memory thrashing).

Regards,
Thomas

On 19/08/14 08:30, Chetan Mehrotra chetan.mehro...@gmail.com wrote:
> Hi Vikas,
> Sizing the cache can be done either by number of entries or by the size taken by the cache. Currently in Oak we limit by size; however, as you mentioned, a limit by count is more deterministic. We use the Guava Cache and it supports limiting either by size or by number of entries, i.e. the two policies are exclusive. So at minimum, if you can provide a patch which allows the admin to choose between the two, it would allow us to experiment and later see how we can put a max cap on the cache size.
> Chetan Mehrotra
>
> On Mon, Aug 18, 2014 at 7:55 PM, Vikas Saurabh vikas.saur...@gmail.com wrote:
>>>> we can probably have both and cache respects whichever constraint hits first (sort of min(byte size, entry count)).
>>>
>>> First of all I don't know the MongoNS implementation details, so I could be wrong. I'd rather keep the size in bytes, as it gives me much more control over the memory I have and what I decide to provide to the application. If, to take an extreme example, we allow 1 document only in the cache and then this single document exceeds the amount of available memory, I fear an OOM. On the other hand, having bytes ensures the application keeps working, and it will be the task of a sysadmin to monitor the eventual hit/miss ratio and adjust the cache accordingly.
>>
>> Yes, a sysadmin can modify the cache size in bytes if the miss ratio increases. But, in the current scenario, I couldn't figure out a neat way (heuristic/guesswork) to tell whether it's application misbehavior or a lack of cache size (note that our issue didn't happen to be related to cache size... but the question still did bug us). On the other hand, a sysadmin can be provided with a rough idea about the relation of (frequently used) repo nodes, using which the sysadmin can update the cache size. Also, I do take the point of avoiding OOMs in the case of pretty large documents, which is why we can have both properties (byte size and entry count), with the byte constraint being a fail safe.
>>
>> Thanks,
>> Vikas
Re: [Document Cache Size] Is it better to have cache size using number of entries
Hi Thomas,

On Tue, Aug 19, 2014 at 6:13 PM, Thomas Mueller muel...@adobe.com wrote:
> How, or in what way, is it more deterministic?

I missed providing some context there, so here are the details. Currently we limit the cache by the total size taken. Now, given a system where you have, say, 32 GB RAM available, the admin needs to decide how much memory to allocate to the DocumentNodeStore. Currently we do not have a definitive way to tell that, as there are a couple of factors to consider:

1. Number of entries in the Document Cache - The Document cache (which caches the NodeDocuments) is the most critical cache and is currently allocated 70% of the cache size. We would like to give it as much memory as possible, but then we also need to take into account the time taken to perform the consistency check for the entries present in the cache. If the consistency check takes more than 1 sec, it would delay the background job in the DocumentNodeStore and hence the root node version would become stale. Now, the time taken to perform the consistency check ~= f(n), where n = the number of entries in the cache, not the size of the cache. With Mongo we can get a good estimate of the time taken to query the modCount for 'n' nodes. Going forward we would add some stats collection in this logic to determine how much time is being spent in the cache consistency check.

2. Effect of GC with larger heaps - As we run the JVM with a higher heap size, we need to take into account the delays that might occur with such large heaps.

So if we can have a cache policy which can put a max cap on the memory taken and also allow a limit on the number of entries, that would give more deterministic control for tuning the cache.

Chetan Mehrotra
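The constraint above is simple arithmetic: if the background job must finish within a fixed budget and the consistency check costs roughly a constant amount per cached entry, then the budget caps the entry count regardless of how many bytes the cache holds. A tiny sketch, with purely hypothetical per-entry costs (not measured Oak numbers):

```java
// If consistency-check time ~= f(n) = n * perEntryMillis, then the
// background-job budget (e.g. 1 second) caps n, the number of cached
// entries -- independent of the cache's size in bytes.
public class ConsistencyCheckBudget {

    // Largest entry count whose consistency check fits in the budget.
    static long maxEntries(double budgetMillis, double perEntryMillis) {
        return Math.round(budgetMillis / perEntryMillis);
    }

    public static void main(String[] args) {
        // Hypothetical: 1000 ms budget, 0.05 ms per modCount check (batched).
        System.out.println(maxEntries(1000, 0.05)); // 20000 entries
    }
}
```

This is exactly why a byte limit alone is not enough for this particular concern: 20'000 entries of 1 KB and 20'000 entries of 1 MB cost the same to check, but differ by three orders of magnitude in memory.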
Re: [Document Cache Size] Is it better to have cache size using number of entries
>> sysadmin can be provided with a rough idea about relation of (frequently used) repo nodes using which sysadmin can update cache size.
>
> I can't follow you, sorry. How would a sysadmin possibly know the number of frequently used nodes? And why would he know that, and not the amount of memory? And why wouldn't he worry about running into out of memory? Even for off-heap caches, I think it's still important to limit the memory. Even though you don't get an out-of-memory exception, you would still run out of physical memory, at which point the system would get extremely slow (virtual memory thrashing).

What I meant was that there was no way for me to guess a good number for the document cache (e.g. why is 256MB -- the default value -- sufficient/insufficient) given that I knew what type of load I (as application engineer) planned to put on an author. I understand that memory usage is the bottom line and the sysadmin must configure that too -- but from a sysadmin's point of view, what should the course of action be when seeing a lot of cache misses: (a) notify the application team, or (b) increase the cache size? Yes, at the end of the day there would be a balance between these two options -- but from an app engineer's point of view, I have no idea what/how much cache size is useful/sufficient, or even how to map a given size in bytes to the kind of access I'd plan on this repository, which kind of nullifies option (a). I don't know, for sure, about general deployments, but in our case the engineering team does recommend heap size and other JVM settings (and possibly tweak levels) to the sysadmin team -- I thought that's how setups usually are done.

Thanks,
Vikas
Re: [Document Cache Size] Is it better to have cache size using number of entries
> We use Guava Cache and it supports either limiting by size or by number of entries i.e. the two policies are exclusive.

Hmm, I totally missed this point.

> So at minimum if you can provide a patch which allows the admin to choose between the two it would allow us to experiment and later see how we can put a max cap on cache size.

Ok, that can be done, but both constraints have advantages. Having just an entry limit would make memory usage uncontrollable. I haven't looked at Guava, but shouldn't it be possible to extend it to have this type of functionality (constraining both size and entry count)?

Thanks,
Vikas
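For what it's worth, the "whichever constraint hits first" behavior is straightforward to prototype outside Guava. Below is a minimal sketch using a JDK LinkedHashMap in access order as the LRU; it is not Oak's implementation, just an illustration that a byte cap and an entry cap can coexist, with the byte cap acting as the fail-safe against huge documents:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache bounded by BOTH a byte budget and an entry count: eviction
// runs until both limits hold, so whichever constraint hits first wins.
public class DualLimitCache<K> {
    private final long maxBytes;
    private final int maxEntries;
    private long usedBytes = 0;
    private final LinkedHashMap<K, byte[]> map =
            new LinkedHashMap<>(16, 0.75f, true); // access order => LRU

    public DualLimitCache(long maxBytes, int maxEntries) {
        this.maxBytes = maxBytes;
        this.maxEntries = maxEntries;
    }

    public synchronized void put(K key, byte[] value) {
        byte[] old = map.put(key, value);
        if (old != null) usedBytes -= old.length;
        usedBytes += value.length;
        // Evict least-recently-used entries until both limits are satisfied.
        Iterator<Map.Entry<K, byte[]>> it = map.entrySet().iterator();
        while ((usedBytes > maxBytes || map.size() > maxEntries) && it.hasNext()) {
            usedBytes -= it.next().getValue().length;
            it.remove();
        }
    }

    public synchronized byte[] get(K key) { return map.get(key); }
    public synchronized int size() { return map.size(); }
    public synchronized long bytes() { return usedBytes; }
}
```

A production version would also need per-entry weighing of real documents (not byte arrays) and eviction statistics, but the core point stands: the entry limit gives the deterministic bound on count, while the byte limit guarantees the memory budget is never exceeded.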
[Document Cache Size] Is it better to have cache size using number of entries
Hi,

We were struggling for the past couple of weeks with severe performance issues on AEM6/Oak/MongoNS -- fortunately the issue was due to the VM we were using. So, all seems well for now. BUT, during the investigation, one of the things that worried us was the document cache missing hits... we tried increasing the cache size to 1GB. That didn't quite help, as the issue wasn't about cache size... but what worried us more was that there was no good way for us to pick a cache size. While I agree that a cache is, by nature, a memory hog... the entry size of a document in the cache is quite variable, and as an admin I can make guesses about JCR nodes and their access patterns. Document size, otoh, would vary even during the lifetime of the repository. Moreover, the only downside (assuming RAM is cheap) of increasing the cache to a large size is making cache invalidation expensive. So, IMHO, document caches should be limited in terms of the number of entries in the cache. And if we still want to have a byte size limitation, we can probably have both and the cache respects whichever constraint hits first (sort of min(byte size, entry count)). We can log an issue for this and provide a patch too -- but it seemed better to have a conversation going before that.

Thoughts?

Thanks,
Vikas
Re: [Document Cache Size] Is it better to have cache size using number of entries
Hello Vikas,

On 18/08/2014 12:05, Vikas Saurabh wrote:
> Hi,
> ... specify a cache size. While I agree that a cache is, by nature, a memory hog... the entry size of a document in the cache is quite variable, and as an admin I can make guesses about JCR nodes and their access patterns. Document size, otoh, would vary even during the lifetime of the repository. Moreover, the only downside (assuming RAM is cheap) of increasing the cache to a large size is making cache invalidation expensive. So, IMHO, document caches should be limited in terms of the number of entries in the cache. And if we still want to have a byte size limitation, we can probably have both and the cache respects whichever constraint hits first (sort of min(byte size, entry count)).

First of all, I don't know the MongoNS implementation details, so I could be wrong. I'd rather keep the size in bytes, as it gives me much more control over the memory I have and what I decide to provide to the application. If, to take an extreme example, we allow 1 document only in the cache and then this single document exceeds the amount of available memory, I fear an OOM. On the other hand, having bytes ensures the application keeps working, and it will be the task of a sysadmin to monitor the eventual hit/miss ratio and adjust the cache accordingly.

About cache invalidation I'm not sure, but it could be that the MongoNS implementation uses off-heap memory for caching.

Cheers
Davide
Re: [Document Cache Size] Is it better to have cache size using number of entries
>> we can probably have both and cache respects whichever constraint hits first (sort of min(byte size, entry count)).
>
> First of all I don't know the MongoNS implementation details, so I could be wrong. I'd rather keep the size in bytes, as it gives me much more control over the memory I have and what I decide to provide to the application. If, to take an extreme example, we allow 1 document only in the cache and then this single document exceeds the amount of available memory, I fear an OOM. On the other hand, having bytes ensures the application keeps working, and it will be the task of a sysadmin to monitor the eventual hit/miss ratio and adjust the cache accordingly.

Yes, a sysadmin can modify the cache size in bytes if the miss ratio increases. But, in the current scenario, I couldn't figure out a neat way (heuristic/guesswork) to tell whether it's application misbehavior or a lack of cache size (note that our issue didn't happen to be related to cache size... but the question still did bug us). On the other hand, a sysadmin can be provided with a rough idea about the relation of (frequently used) repo nodes, using which the sysadmin can update the cache size. Also, I do take the point of avoiding OOMs in the case of pretty large documents, which is why we can have both properties (byte size and entry count), with the byte constraint being a fail safe.

Thanks,
Vikas