Hi,

If we need a limit on the number of entries for some other (internal) reason,
like the consistency check, then I understand. If we later find a way to speed
up the consistency check (or if we don't need it, which I would prefer), then
this is no longer needed. But I also don't know how to limit by both the
number of entries and memory using the Guava cache API.
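(As an aside on that API point, a small sketch with made-up key/value types
and an estimated-size weigher: CacheBuilder lets you cap either the number of
entries (maximumSize) or the total weight (maximumWeight plus a Weigher), but
not both on the same builder.)

    import com.google.common.cache.Cache;
    import com.google.common.cache.CacheBuilder;
    import com.google.common.cache.Weigher;

    public class CacheLimitSketch {

        public static void main(String[] args) {

            // Cap by (estimated) memory: total weight limited to ~256 MB,
            // where the weigher estimates each entry's size in bytes.
            Cache<String, byte[]> byMemory = CacheBuilder.newBuilder()
                    .maximumWeight(256L * 1024 * 1024)
                    .weigher(new Weigher<String, byte[]>() {
                        @Override
                        public int weigh(String key, byte[] value) {
                            return value.length;
                        }
                    })
                    .build();

            // Cap by number of entries instead.
            Cache<String, byte[]> byCount = CacheBuilder.newBuilder()
                    .maximumSize(10000)
                    .build();

            // Setting maximumSize() and maximumWeight() on the same builder
            // is rejected by CacheBuilder with an IllegalStateException.
        }
    }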
> why is 256MB -- the default value -- sufficient/insufficient

We don't know. But how do you know that a cache of 10'000 "entries" is
sufficient? Especially if each entry can be either 1 KB, 1 MB, or 20 MB.

The available memory can be divided into different areas, and each component
is given a part of that. Then you look at the performance, see which component
is slow, and try to find out why. For example, it also depends on how
expensive a cache miss is. As for the cache size in amount of memory: the best
way to find out what a good number is, is to analyze the performance (how much
time is spent reading, the cache hit ratio, ...) -- see the small sketch at
the end of this mail.

> what should the course of action when seeing a lot of cache misses: (a)
> notify application team, or (b) increase cache size.

It depends on the reason for the cache misses. There could be a loop over many
nodes somewhere, in which case a larger cache might not really help (most
caches are not scan resistant). There could be other reasons. But I don't see
how the ability to configure the number of entries in the cache would help.

Regards,
Thomas


On 19/08/14 16:25, "Vikas Saurabh" <[email protected]> wrote:

>>> sysadmin can be provided with a rough idea about relation of (frequently
>>> used) repo nodes using which sysadmin can update cache size.
>>
>> I can't follow you, sorry. How would a sysadmin possibly know the number
>> of frequently used nodes? And why would he know that, and not the amount
>> of memory? And why wouldn't he worry about running into out of memory?
>>
>> Even for off-heap caches, I think it's still important to limit the
>> memory. Even though you don't get an out-of-memory exception, you would
>> still run out of physical memory, at which point the system would get
>> extremely slow (virtual memory thrashing).
>
> What I meant was that there was no way for me to guess a good number for
> the document cache (e.g. why is 256MB -- the default value --
> sufficient/insufficient) given that I knew what type of load I (as an
> application engineer) plan to put on an author. I understand that memory
> usage is the bottom line and the sysadmin must configure that too -- but
> from a sysadmin's point of view, what should the course of action be when
> seeing a lot of cache misses: (a) notify the application team, or (b)
> increase the cache size? Yes, at the end of the day there would be a
> balance between these two options -- but from an app engineer's point of
> view, I have no idea what/how much cache size is useful/sufficient, or
> even how to map a given size in bytes to the kind of access I'd plan on
> this repository, which kind of nullifies option (a). I don't know, for
> sure, about general deployments, but in our case the engineering team does
> recommend the heap size and other JVM settings (and possibly tweak levels)
> to the sysadmin team -- I thought that's how setups are usually done.
>
> Thanks,
> Vikas
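PS: the sketch mentioned above -- one way to measure the hit ratio, using
Guava's built-in statistics. The cache configuration and keys here are only
illustrative.

    import com.google.common.cache.Cache;
    import com.google.common.cache.CacheBuilder;
    import com.google.common.cache.CacheStats;

    public class CacheStatsSketch {

        public static void main(String[] args) {

            // recordStats() must be enabled, otherwise stats() reports all zeros.
            Cache<String, String> cache = CacheBuilder.newBuilder()
                    .maximumSize(1000)
                    .recordStats()
                    .build();

            // ... run the expected workload against the cache ...
            cache.put("a", "1");
            cache.getIfPresent("a"); // hit
            cache.getIfPresent("b"); // miss

            CacheStats stats = cache.stats();
            System.out.println("hit rate:   " + stats.hitRate());
            System.out.println("miss count: " + stats.missCount());
            // With a LoadingCache (or get(key, loader)), averageLoadPenalty()
            // additionally shows how much time is spent loading on a miss.
        }
    }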
