On 23.11.2018 18:36, via Unbound-users wrote:

 however, those concerns are in a way off topic for this mailing list, so allow me to ask a more direct unbound question. why does the cache bloat? you're using LRU replacement, and these records are never accessed. therefore while they can push other more vital things out of the cache, decreasing cache hit rate, they should be primary targets for replacement whenever other data is looking for a place to land. i understand that this cache churn has a cost, in bandwidth and in CPU, but not in memory -- once the cache reaches its working set maximum, it ought to grow no further. what could i be misunderstanding about this?

I trust your understanding is correct, and "bloating" was indeed a misleading word on my part. To refer back to the initial post: "Since I am observing a lot of DNS Tunnel “users” , the cache started to store totally useless records of type TXT and NULL."

In this context those queries, which as I understand it can be very frequent in a DNS tunnel (depending on its purpose), replace legitimate records once the maximum cache size is reached. And as you stated, churning the cache comes at a cost. I am wondering what legitimate purpose there is for the resolver not only to cache NULL records but even to serve them to clients, other than perhaps some corporate edge or niche cases, considering that at least RFC 1035 does not specify a legitimate purpose for NULL records (as of today).
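To illustrate the point, here is a minimal sketch of a fixed-size LRU cache. It is not Unbound's implementation; the class name, the capacity of 100, and the example names are all made up for the illustration. It only shows that memory stays bounded while never-reused tunnel records push legitimate ones out.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first
        self.evictions = 0

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used
            self.evictions += 1


cache = LRUCache(capacity=100)

# 50 legitimate records (hypothetical names).
for i in range(50):
    cache.put(("host%d.example.org" % i, "A"), "192.0.2.1")

# 1000 unique tunnel names that are stored once and never read again.
for i in range(1000):
    cache.put(("chunk%d.tunnel.example" % i, "NULL"), b"payload")

legit_left = sum(1 for (_name, rtype) in cache.entries if rtype == "A")
print(len(cache.entries), legit_left, cache.evictions)
```

The cache never holds more than 100 entries, so there is no growth in memory, but every legitimate record has been replaced along the way.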

a second unbound-related topic is cache management itself. it is unusual for the splay between a name and its descendants to number in the millions. it happens for arpa, and popular TLDs such as COM, NET, ORG, and DE. as a cache management strategy, consider whether to more rapidly discard descendants of a high splay apex, unless they are accessed at least once. and in defiance of my fear-related argument above, when the cache is full beyond some threshold like 90%, consider using the "splay is high, subsequent access of descendants is zero" as a signal to (a) not cache new descendant data, and (b) syslog it. there isn't a dnstap message-tag for this condition yet, but there ought to be. splay is easy to keep track of unless your cache is flat.
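The heuristic in the quoted paragraph could be sketched roughly as follows. This is only an illustration under several assumptions: the `SplayTracker` class, the threshold of 50 descendants, and the way the caller supplies the apex are all hypothetical, nothing here exists in Unbound, and eviction is left out entirely.

```python
import logging
from collections import defaultdict

log = logging.getLogger("splay-sketch")  # stands in for syslog


class SplayTracker:
    """Tracks, per apex, how many descendants are cached and how often they are reused."""

    def __init__(self, capacity, high_splay=1000, full_ratio=0.9):
        self.capacity = capacity
        self.high_splay = high_splay    # assumed threshold for "splay is high"
        self.full_ratio = full_ratio    # the "90% full" threshold
        self.cached = {}                # name -> apex
        self.splay = defaultdict(int)   # apex -> cached descendants
        self.reuse = defaultdict(int)   # apex -> hits on cached descendants

    def should_cache(self, apex):
        nearly_full = len(self.cached) >= self.full_ratio * self.capacity
        unused_high_splay = (
            self.splay[apex] >= self.high_splay and self.reuse[apex] == 0
        )
        if nearly_full and unused_high_splay:
            log.warning("not caching below %s: high splay, no reuse", apex)
            return False
        return True

    def insert(self, name, apex):
        if name in self.cached:
            return True
        if not self.should_cache(apex):
            return False
        self.cached[name] = apex
        self.splay[apex] += 1
        return True

    def hit(self, name):
        apex = self.cached.get(name)
        if apex is not None:
            self.reuse[apex] += 1


tracker = SplayTracker(capacity=100, high_splay=50)
for i in range(10):
    tracker.insert("host%d.example.org" % i, "example.org")
tracker.hit("host0.example.org")

accepted = sum(
    tracker.insert("chunk%d.tunnel.example" % i, "tunnel.example")
    for i in range(200)
)
print(accepted, tracker.insert("www.example.org", "example.org"))
```

In this run the tunnel apex is cut off once the cache passes 90% of its capacity, while a new name under the low-splay apex is still accepted.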

After reading this, I thought that something like the Rate-limiting Fetches Per Zone feature implemented in BIND would be helpful to have in Unbound too. It is described as a setting "which defines the maximum number of simultaneous iterative queries to any one domain that the server will permit before blocking new queries for data in or beneath that zone."
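As a rough sketch of the idea, and not of how BIND actually implements it, a per-zone limit on simultaneous fetches could look like this. The class name, the limit of 2, and the zone names are assumptions for the example, and the code is not thread-safe.

```python
from collections import defaultdict


class PerZoneFetchLimiter:
    """Caps the number of simultaneous outstanding fetches per zone."""

    def __init__(self, max_per_zone):
        self.max_per_zone = max_per_zone
        self.in_flight = defaultdict(int)  # zone -> outstanding fetches
        self.refused = defaultdict(int)    # zone -> fetches turned away

    def try_start(self, zone):
        """Return True if a new fetch for this zone may start."""
        if self.in_flight[zone] >= self.max_per_zone:
            self.refused[zone] += 1
            return False
        self.in_flight[zone] += 1
        return True

    def finish(self, zone):
        """Call when a fetch completes or times out."""
        if self.in_flight[zone] > 0:
            self.in_flight[zone] -= 1


limiter = PerZoneFetchLimiter(max_per_zone=2)

# Five simultaneous fetches below the same (hypothetical) tunnel zone.
results = [limiter.try_start("tunnel.example") for _ in range(5)]

# Other zones are unaffected by the busy one.
other_ok = limiter.try_start("example.org")

# Once one fetch completes, a slot is free again.
limiter.finish("tunnel.example")
retry_ok = limiter.try_start("tunnel.example")

print(results, other_ok, retry_ok)
```

Only the zone that exceeds its limit is throttled, so a busy tunnel domain would not starve lookups for other zones.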


