Hi Håvard,

Unbound does not have a cleaning strategy for its cache.
Records are not evicted when their TTL expires.
Instead, Unbound will try to fill all of the configured memory with data.
Then, when a new entry needs to be cached and there is no space left, the least recently used entry (based on an LRU list) is dropped to free space for the new entry.
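Very roughly, the behaviour is that of a size-capped LRU cache. Here is an illustrative Python sketch of that idea (not Unbound's actual implementation; the per-entry size accounting and data structures here are simplified assumptions):

```python
from collections import OrderedDict

class LRUCache:
    """Sketch of LRU eviction under a memory cap.

    Entries are never removed just because a TTL expires; the least
    recently used entry is dropped only when a new entry would not
    otherwise fit within the configured size.
    """

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.entries = OrderedDict()  # key -> (value, size)

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key][0]

    def put(self, key, value, size):
        if key in self.entries:
            self.used -= self.entries.pop(key)[1]
        # Evict least recently used entries until the new one fits.
        while self.entries and self.used + size > self.max_bytes:
            _, (_, old_size) = self.entries.popitem(last=False)
            self.used -= old_size
        self.entries[key] = (value, size)
        self.used += size
```

This also explains the growth you observe below: memory is reclaimed by eviction, not by expiry, so usage keeps rising until the configured cap is reached.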

Unfortunately, the right cache size is not trivial to determine, because it depends heavily on client traffic patterns.

Monitoring the cache-hit rate under different memory configurations can suggest a preferred size for a specific installation.
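For example, the hit rate can be computed from the "name=value" lines that "unbound-control stats_noreset" prints, using the total.num.queries and total.num.cachehits counters. A small Python sketch (the sample numbers are made up):

```python
def cache_hit_rate(stats_text):
    """Compute the cache-hit rate from 'unbound-control stats' output.

    The output is one 'name=value' pair per line; we only use the
    total.num.queries and total.num.cachehits counters.
    """
    stats = {}
    for line in stats_text.splitlines():
        name, _, value = line.partition("=")
        if value:
            stats[name] = float(value)
    queries = stats.get("total.num.queries", 0.0)
    hits = stats.get("total.num.cachehits", 0.0)
    return hits / queries if queries else 0.0

# Example with made-up numbers:
sample = "total.num.queries=1500\ntotal.num.cachehits=1200\n"
```

Comparing this ratio across runs with different rrset-cache-size / msg-cache-size settings shows where extra memory stops paying off.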

Best regards,
-- Yorgos

On 30/08/2024 18:15, Havard Eidnes via Unbound-users wrote:
Hm,

no response to the earlier message?

The host in question here is a (backup) recursive resolver node
for our customers, and it has daily query volume peaks somewhere
between 1000 and 1500 qps.

Since my previous message, I've upgraded to unbound 1.21.0, and in
conjunction with that I've started collecting some of the "cache
statistics" from unbound using "unbound-control stats_noreset".

I'm running with the default "cache-max-ttl" setting of 86400, and
analyzing a cache dump shows that all the RRSets have a TTL less than
this threshold.  See attached plot #1 for the distribution per 60s
bin.  No big surprise there.

Now, wouldn't one therefore expect that the RRsets cached during one
day would no longer occupy the cache the following day?

That is however not the behaviour I observe.  If I plot
mem.cache.rrset and rrset.cache.count over time, I get a steady
increase, see the attached plots #2 and #3 respectively.  As a result,
I get ever increasing memory consumption, instead of a stabilized
memory usage after a while.

As stated earlier, my unbound is configured with

   # Put some limits on virtual memory consumption
   # in attempt to avoid being killed due to "out of swap"...
   rrset-cache-size: 8G
   msg-cache-size: 4G
   key-cache-size: 800m

OK, so I've not yet reached any of those thresholds; the RRset
cache memory is at the moment closing in on 1.4G, while unbound
itself as a process is nearing 3.6G:

   PID USERNAME PRI NICE   SIZE   RES STATE       TIME   WCPU    CPU COMMAND
   124 unbound   85    0  3594M 3457M kqueue/0  906:08 12.06% 12.06% unbound


In my perhaps naive assumptions as an operator, I would have
thought that the memory used to store a now-expired RRset would
be released, and the count of cached RRsets would be decremented,
but despite having a cap on TTL, the count of cached RRsets seems
to be ever increasing, and the same goes for the consumed memory.
Is the memory only going to be released when one of the
configured sizes is approached or exceeded?

I'm trying to understand how I as an operator am supposed to
configure unbound so as to not grow ever larger, falling victim
to having to use swap (slow!), or eventually exceeding the
configured swap space (ouch!).

It's not behaving the way I anticipated.

"Help!"

Best regards,

- Håvard
