On 09/05/18 10:21, Dominik DL6ER wrote:
> Dear Geert and mailinglist members,
>> Thing I wonder about is how the cache size clipping was discovered.
> I recently sent a SIGUSR1 to our dnsmasq because a user said that some
> queries have continuously been answered NXDOMAIN although they
> shouldn't. As they have been answered within less than a millisecond, I
> thought it must be an issue with the cache. To make a long story short,
> it was their fault (an ordinary typo).
> However, by chance, I found
> May 4 18:06:13 dnsmasq: cache size 150, 2191/57999 cache
> insertions re-used unexpired cache entries.
> which seemed odd to me. I then looked at the man page and increased the
> cache size limit to 100,000 in the config but the startup message told
> me that the cache size is only 10,000. From there it did not take long
> to find where this clipping is done, and I figured removing it might
> just be the best solution.
So with the previous cache size, less than 4% of queries cause eviction
of a name from the cache before the time-to-live expires and it
disappears anyway. Only a small fraction of those will be re-queried so
the number of extra cache-misses is in the noise. I'd say that the cache
size you had before was pretty much optimal, but in response you tried
to increase it to more than the total number of cache insertions ever,
during that run of dnsmasq.
Of course you might have been restarting dnsmasq frequently, so that's a
sample over a short period, but I'd guess otherwise. Those numbers say
that 100,000 makes no sense at all. Whenever I've looked at this, the
optimal cache size is smaller than most people think.
The local copy of dnsmasq here, which is handling a network of a dozen
machines, has a tiny cache, and very similar stats. It just doesn't need
to be any bigger. (And that instance is doing DNSSEC, so the cache is
holding all sorts of DNSSEC records too.)
cache size 600, 50/28526 cache insertions re-used unexpired cache entries.
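
For anyone wanting to repeat the arithmetic on their own logs, here is a
minimal Python sketch. It assumes the statistics line has exactly the
format of the two lines quoted in this thread; the helper name
eviction_stats is made up for illustration and is not part of dnsmasq.

```python
import re

# Assumed format of the statistics line, copied from the two log lines
# quoted in this thread (not taken from the dnsmasq source).
STATS_RE = re.compile(
    r"cache size (\d+), (\d+)/(\d+) cache insertions "
    r"re-used unexpired cache entries"
)


def eviction_stats(line):
    """Return (cache_size, evictions, insertions, eviction_fraction)."""
    m = STATS_RE.search(line)
    if m is None:
        raise ValueError("not a cache statistics line")
    size, evicted, inserted = (int(g) for g in m.groups())
    return size, evicted, inserted, evicted / inserted


samples = [
    "cache size 150, 2191/57999 cache insertions re-used unexpired cache entries.",
    "cache size 600, 50/28526 cache insertions re-used unexpired cache entries.",
]

for line in samples:
    size, evicted, inserted, frac = eviction_stats(line)
    print(f"cache size {size}: {evicted}/{inserted} = {frac:.2%} "
          "of insertions displaced an unexpired entry")
```

With the figures above this works out to roughly 3.78% for the 150-entry
cache and 0.18% for the 600-entry one.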
>> I'm trying to say that the performance penalty that Simon warns us about
>> might be canceled by high computing power.
> I agree, but you should probably not be running a caching DNS server
> with hundreds of active clients on a really low-power embedded machine
> like the good old Raspberry Pi in its first version.
> We have a dedicated (small) server for DNS, DHCP and email. dnsmasq is
> able to handle DNS blazingly fast - even with a maximum cache size of
> 100,000 (I may even want to increase this further if it prevents
> deletion of unexpired cache entries).
Be careful, the cache might have been set to 100,000 but based on the
numbers above, the average number of names in the cache will have been
much lower. The absolute upper-bound is 57999, but expiry of TTL and
re-querying of common names with short TTLs will make it smaller still.
We can't be sure that 100,000 is OK unless we make tests to actually
fill the cache with 100,000 unique names with long TTLs, and then test
_reverse_ lookups on that cache.
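
As a starting point for such a test, the following Python sketch only
generates the data: 100,000 unique forward names, their addresses, and
the matching reverse (PTR) query names. The zone cachetest.example and
the 10.0.0.0 base address are hypothetical, and the sketch assumes an
upstream server under the tester's control that answers these names
with long TTLs. Sending the forward queries through dnsmasq to fill the
cache, and then timing the reverse lookups, is deliberately left out.

```python
import ipaddress


def make_test_records(count, zone="cachetest.example", base="10.0.0.0"):
    """Yield (forward_name, address, reverse_name) for `count` unique hosts.

    `zone` and `base` are placeholders for illustration only.
    """
    start = int(ipaddress.IPv4Address(base))
    for i in range(count):
        addr = ipaddress.IPv4Address(start + i)
        # reverse_pointer gives the in-addr.arpa name used for PTR queries
        yield f"host{i}.{zone}", str(addr), addr.reverse_pointer


records = list(make_test_records(100_000))

# Show the first few entries as a sanity check.
for name, addr, ptr in records[:3]:
    print(name, addr, ptr)
```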
> I'm just trying to make clear that removing this artificial limit may
> improve the situation for those on beefier hardware but not impact the
> others as they are responsible for what they set when they decide to
> manually tweak their settings in this regard. It's a value where I think
> the hand-holding dnsmasq is doing for possibly supporting embedded
> devices better is just too much. In the end, Simon has to say whether
> or not this artificial clipping can be removed. I think yes, because it
> doesn't affect anyone who has not changed the default value and allows
> the others to use any value for cache size they deem right for their
> hardware and application.
The problem is that people naturally think "more is better" but my
experience is that more is not necessarily better, and too big is
definitely worse, though I don't have an up-to-date idea about how big
is too big on the various classes of modern hardware.
> Best regards,
> Dnsmasq-discuss mailing list