> Hello Nicolas,
> The choices made for dnsmasq sound overly complex, peculiar and
> subject to incompatibilities with the vast majority of other
> softwares.
> What's wrong with listening only on a single interface when asked to?
>
> For instance, when nginx is configured to be listening only on the
> loopback interface, it does not "take over" all interfaces but listen
> only on 127.0.0.1 and ::1.
Dnsmasq works the way it does because of choices made 20 years ago, but
those are still valid today.
A DHCPv4 server doesn't work entirely at the IP level of the network
stack: it exists at layer 2 (data link) as well. That's because a DHCP
server has to interact with a client _before_ the client has an IP
address and before the client knows the address of the DHCP server.
In practice this means that the first thing a client does is to send a
packet which is broadcast (at layer 2) to the universal IP broadcast
address (255.255.255.255) with source address 0.0.0.0.
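
To make that first packet concrete, here is a minimal Python sketch of
a DHCPDISCOVER payload, laid out as in RFC 2131. It is an illustration
only, not code from dnsmasq; the function name build_discover and the
choice to set the broadcast flag are my assumptions.

```python
import struct

# Addressing of the client's first packet: the client has no IP address yet.
CLIENT_SRC = ("0.0.0.0", 68)          # source address and DHCP client port
SERVER_DST = ("255.255.255.255", 67)  # limited broadcast and DHCP server port

DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])


def build_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal DHCPDISCOVER payload (hypothetical helper)."""
    if len(mac) != 6:
        raise ValueError("expected a 6-byte Ethernet address")
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,            # op: BOOTREQUEST
        1,            # htype: Ethernet
        6,            # hlen: hardware address length
        0,            # hops
        xid,          # transaction id chosen by the client
        0,            # secs
        0x8000,       # flags: ask the server to broadcast its reply
        b"\0" * 4,    # ciaddr: client has no address yet
        b"\0" * 4,    # yiaddr
        b"\0" * 4,    # siaddr
        b"\0" * 4,    # giaddr
        mac,          # chaddr: struct pads this to 16 bytes with NULs
        b"",          # sname: unused, padded to 64 bytes
        b"",          # file: unused, padded to 128 bytes
    )
    # Option 53 (message type) = 1 (DISCOVER), followed by option 255 (end).
    options = bytes([53, 1, 1, 255])
    return header + DHCP_MAGIC_COOKIE + options
```

The ciaddr field is all zeros for the same reason the IP source address
is 0.0.0.0: the only thing identifying the client at this point is its
hardware address.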
A server which is not listening on the wildcard address will not receive
that packet from the IP stack on most OSs, so if you want the DHCP
server to operate purely at IP level, it has to listen on the wildcard
address, and that's what dnsmasq does.
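
As a rough sketch of what "listening on the wildcard address" means at
the sockets API, using Python's standard socket module. This is not
dnsmasq's code, and open_wildcard_udp is a name I made up for the
example.

```python
import socket


def open_wildcard_udp(port: int = 67) -> socket.socket:
    """Open a UDP socket bound to the wildcard address (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Needed so replies can be sent to 255.255.255.255.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    # 0.0.0.0 is the wildcard: the IP stack delivers datagrams for any
    # local address, including the limited broadcast address.
    sock.bind(("0.0.0.0", port))
    return sock
```

Binding to port 67 normally needs root; pass another port (or 0 for an
ephemeral one) to try it unprivileged.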
There is another way around this, which the ISC dhcpd uses (and I think
Kea also, but I'm not sure) which allows it to listen on unicast
addresses, but it has significant other downsides. The technique is to
bypass the OS's IP stack to receive the problematic first packets. This
is done by opening a RAW socket, which returns every packet that arrives
at the machine, complete with all the link-layer and IP level headers.
The DHCP server can then look through these, discard most of them and
find the broadcast packets to 255.255.255.255.
This method has lots of downsides:
1) The DHCP server sees _every_ IP packet the machine receives and the
kernel has to do a context switch to the DHCP server for each of
those packets. On a server which is heavily loaded doing lots of stuff
which isn't DHCP, that's lots of extra CPU cycles. 20 years ago this was
an insoluble problem. Today, you could use BPF to select the packets you
want, since Linux now supports BPF.
2) The DHCP server sees raw packets with all the headers, so it has to
understand the format of all those headers. It's no good just adding
support for a new link layer to the kernel: you have to teach the
DHCP server about it too.
3) Raw sockets (and kernel packet filters like BPF) are pretty
non-portable, so you have to carry and maintain different code for each
platform you support, rather than relying on the POSIX APIs.
4) The code to handle RAW sockets and layer 2 packet formats is
non-trivial in complexity and size. Early dnsmasq was small and simple.
It's not so much, these days, but it still attempts to do a lot with a
small amount of resources.
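
To give a feel for points 1) and 2), here is a sketch in Python of the
user-space filtering a raw-socket server has to do on every frame it is
handed. It is not ISC's code; is_dhcp_broadcast is a hypothetical name,
and the sketch assumes plain Ethernet II framing with no VLAN tag and
ignores IP fragmentation, which is exactly the sort of detail a real
implementation has to cope with.

```python
import struct

ETH_HEADER_LEN = 14
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17
DHCP_SERVER_PORT = 67


def is_dhcp_broadcast(frame: bytes) -> bool:
    """Return True if a raw Ethernet frame is UDP to 255.255.255.255:67."""
    if len(frame) < ETH_HEADER_LEN + 20 + 8:
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return False
    ver_ihl = frame[ETH_HEADER_LEN]
    if ver_ihl >> 4 != 4:
        return False
    ip_header_len = (ver_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    if ip_header_len < 20 or len(frame) < ETH_HEADER_LEN + ip_header_len + 8:
        return False
    if frame[ETH_HEADER_LEN + 9] != IPPROTO_UDP:
        return False
    if frame[ETH_HEADER_LEN + 16:ETH_HEADER_LEN + 20] != b"\xff\xff\xff\xff":
        return False
    # UDP destination port is the second 16-bit field of the UDP header.
    (dst_port,) = struct.unpack_from(
        "!H", frame, ETH_HEADER_LEN + ip_header_len + 2
    )
    return dst_port == DHCP_SERVER_PORT
```

Every frame that fails one of these checks has still cost a trip from
the kernel to the server process, unless a kernel-side filter such as
BPF throws it away first.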
Using raw sockets does have the upside that it makes it easy to run
multiple DHCP servers on a single machine.
I made the tradeoff that multiple DHCP servers on a single machine is
not a common or useful enough feature to take the hits needed to do it.
ISC had jumped the other way, which made my decision easier: anyone
wanting that feature could run ISC.
This fundamental tradeoff has persisted to this day. Over time, however,
adaptations have been made which cover most of the reasons to have more
than one DHCP server, without taking the hit of moving to raw sockets.
1) It's possible to run multiple dnsmasq instances on a single machine
that do DHCP as long as each one is configured to do DHCP on exactly one
interface. This uses code to bind IP sockets to interfaces and the
SO_REUSEADDR and SO_REUSEPORT socket options. This covers the OpenStack
use-case.
2) A dnsmasq instance can act as a DHCP server for some subnets and a
DHCP relay for others. That covers the case of needing a server and a
relay on the same machine, and gives a strategy when you really need to
mix dnsmasq with another DHCP server: run the other server on a
different machine or VM and get dnsmasq to relay to it.
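
The socket setup behind point 1) looks roughly like the following
Python sketch. It is not the dnsmasq source: open_shared_dhcp_socket is
a made-up name, and using SO_BINDTODEVICE to tie the socket to an
interface is my assumption about the Linux mechanism (other platforms
need different calls).

```python
import socket
from typing import Optional


def open_shared_dhcp_socket(
    port: int = 67, interface: Optional[str] = None
) -> socket.socket:
    """Open a wildcard UDP socket that other instances can share the port with."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Both options let several processes bind the same address and port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    if interface is not None:
        # Linux-specific and may need elevated privileges: only traffic
        # arriving on this interface is delivered to the socket.
        sock.setsockopt(
            socket.SOL_SOCKET, socket.SO_BINDTODEVICE, interface.encode() + b"\0"
        )
    sock.bind(("0.0.0.0", port))
    return sock
```

With one such socket per instance, each tied to a different interface,
the instances all sit on the wildcard address and port 67 without
seeing each other's clients.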
TL;DR: Dnsmasq not doing what you want is not a bug, it's a feature. It
reflects the results of a valid tradeoff. There are downsides as well as
upsides to changing it, and we don't feel it's useful to do so at this time.
Cheers,
Simon.