These suggestions - like most performance articles - are oriented toward
achieving the highest performance with large configurations. E.g. "How
big can/should you go to support big loads?"
That's useful for many users. But there are also many people who run
smaller operations, where the goal is to provide adequate (or even
exceptional) performance with a minimum footprint. When BIND is one of
many services, overall performance can be improved by minimizing BIND's
resource requirements. This is also true in embedded applications,
where footprint matters.
So a discussion of how to optimize for the smaller cases would be a
useful part of, or complement to, the proposed article: What do you
trade off? Which knobs can one turn down, and how far? E.g. "How small
can/should you go when your loads are smaller?"
FWIW, a wizard - even just a spreadsheet - that encapsulates known
performance results might also be useful. E.g. Given a processor,
number/size of zones, query rate, & type, produce a memory size, disk &
network I/O rates, and starting configuration parameters... Obviously,
this could become arbitrarily complicated, but a simple spreadsheet with
configuration (hardware & software) and performance data that's
searchable would give people a good starting point. Especially if it's
real-world. (It can be challenging to map artificial
"performance"/stress tests done in a development/verification
environment to the real world...) While full automation can be fun,
it's amazing how much one can get out of a spreadsheet with autofilter.
(For the next level, pivot tables and/or charts...)
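
As a rough sketch of the kind of lookup such a spreadsheet would
support (Python only for illustration; the field names and every row
below are made-up placeholders, NOT measurements):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One row of a hypothetical results sheet (all fields are assumptions)."""
    cpu_cores: int
    zones: int       # number of zones served in that deployment
    qps: int         # sustained queries per second that was handled
    role: str        # "resolver" or "authoritative"
    memory_mb: int   # memory footprint reported for that deployment
    notes: str


# Placeholder rows so the sketch runs; real-world data would go here.
RESULTS = [
    Result(1, 10, 500, "authoritative", 64, "placeholder row A"),
    Result(2, 100, 5000, "authoritative", 256, "placeholder row B"),
    Result(4, 1000, 50000, "authoritative", 2048, "placeholder row C"),
    Result(2, 0, 2000, "resolver", 512, "placeholder row D"),
]


def candidates(results, role, zones, qps):
    """Rows of the given role that handled at least this load, smallest memory first."""
    fits = [r for r in results
            if r.role == role and r.zones >= zones and r.qps >= qps]
    return sorted(fits, key=lambda r: r.memory_mb)


def smallest_fit(results, role, zones, qps):
    """The lowest-memory row that covers the load, or None if nothing does."""
    fits = candidates(results, role, zones, qps)
    return fits[0] if fits else None
```

This is just the autofilter idea in code: filter on role and load, sort
by footprint, and start from the smallest configuration that someone
has already seen work.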
Timothe Litt
ACM Distinguished Engineer
--
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.
On 07-Jul-20 21:57, Victoria Risk wrote:
> A while ago we created a KB article with tips on how to improve your
> performance with our Kea DHCP server. The tips were fairly obvious to
> our developers and this was pretty successful. We would like to do
> something similar for BIND, provide a dozen or so tips for how to
> maximize your throughput with BIND. However, as usual, everything is
> more complicated with BIND.
>
> Can those of you who care about performance, who have worked to
> improve your performance, share some of your suggestions that have the
> most impact? Please also comment if you think any of these ideas
> below are stupid or dangerous. I have combined advice for resolvers
> and for authoritative servers; I hope it is clear which is which...
>
> The ideas we have fall into four general categories:
>
> System design
> 1a) Use a load balancer to specialize your resolvers and maximize your
> cache hit ratio. A load balancer is traditionally designed to spread
> the traffic out evenly among a pool of servers, but it can also be
> used to concentrate related queries on one server to make its cache as
> hot as possible. For example, if all queries for domains in .info are
> sent to one server in a pool, there is a better chance that an answer
> will be in the cache there.
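
To illustrate the idea in 1a, here is a minimal sketch of mapping a
query name to one resolver in a pool by hashing its trailing labels.
It is an assumption about how such steering could work, not how any
particular load balancer does it; pick_resolver and the pool names are
hypothetical.

```python
import hashlib


def pick_resolver(qname, pool, labels=1):
    """Map a query name to one resolver, keyed on its last `labels` labels.

    With labels=1 every name under the same TLD lands on the same server,
    so that server's cache sees all the related queries.
    """
    if not pool:
        raise ValueError("empty resolver pool")
    parts = [p for p in qname.lower().rstrip(".").split(".") if p]
    key = ".".join(parts[-labels:]) if parts else "."
    digest = hashlib.sha256(key.encode("ascii", "replace")).digest()
    # Simple modulo placement: resizing the pool remaps most keys.
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]


# Hypothetical pool; both names share the .info TLD, so they map together.
POOL = ["resolver-a", "resolver-b", "resolver-c"]
same = pick_resolver("example.info", POOL) == pick_resolver("WWW.Other.INFO.", POOL)
```

Keying on the TLD alone will skew load badly toward whichever server
gets the busiest TLD, so a real deployment would presumably need a
finer key or weighting; that trade-off is not addressed here.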
>
> 1b) If you have a large authoritative system with many servers,
> consider dedicating some machines to propagate transfers. These
> machines, called transfer servers, would not answer client queries,
> but just send notifies and process IXFR requests.
>
> 1c) Deploy ghost secondaries. If you store copies of authoritative
> zones on resolvers (resolvers as undelegated secondaries), you can
> avoid querying those authoritative zones. The most obvious uses of
> this would be mirroring the root zone locally or mirroring your own
> authoritative zones on your resolver.
>
> We have other system design ideas that we suspect would help, but we
> are not sure, so I will wait to see if anyone suggests them.
>
> OS settings and the system environment
> 2a) Run on bare metal if possible, not on virtual machines or in the
> cloud. (Any idea how much difference this makes? The only reference we
> can cite is pretty out of date:
> https://indico.dns-oarc.net/event/19/contributions/234/attachments/217/411/DNS_perf_OARC_Apr_14.pdf)
>
> 2b) Consider using --with-tuning=large.
> (https://kb.isc.org/docs/aa-01314) This is a compile-time option, so
> not something you can switch on and off during production.
>
> 2c) Consider which R/W lock choice you want to use -
> https://kb.isc.org/docs/choosing-a-read-write-lock-implementation-to-use-with-named
> For the highest tested query rates (> 100,000 queries per second),
> pthreads read-write locks with hyper-threading /enabled/ seem to be
> the best-performing choice by far.
>
> 2d) Pay attention to your choice of NICs. We have found wide
> variations in their performance. (Can anyone suggest what specifically
> to look for?)
>
> 2e) Make sure your socket send buffers are big enough. (Not sure if
> this is obsolete advice. Do we need to tell people how to tell if
> their buffers are causing delays?)
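
On 2e, one small piece of this can be checked from user space: what
buffer sizes the OS actually grants a UDP socket. The sketch below
only reports those sizes. It does not show what named itself requests
or whether the buffers are causing delays, and udp_buffer_sizes is a
hypothetical helper name.

```python
import socket


def udp_buffer_sizes(requested=None):
    """Return (send, receive) buffer sizes in bytes for a fresh UDP socket.

    If `requested` is given, ask for that size first; the OS may clamp it
    (on Linux, to net.core.wmem_max / net.core.rmem_max) and Linux reports
    double the value it accepted.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        if requested is not None:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
            s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return (s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
                s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))


default_snd, default_rcv = udp_buffer_sizes()
```

Comparing udp_buffer_sizes() with udp_buffer_sizes(4 * 1024 * 1024)
shows whether the system limits would clamp a larger request.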
>
> 2f) When the number of C