> Hello Nicolas,
> The choices made for dnsmasq sound overly complex, peculiar and
> subject to incompatibilities with the vast majority of other
> softwares.
> What's wrong with listening only on a single interface when asked to?
>
> For instance, when nginx is configured to be listening only on the
> loopback interface, it does not "take over" all interfaces but listen
> only on 127.0.0.1 and ::1.

Dnsmasq works the way it does because of choices made 20 years ago, but those are still valid today.

A DHCPv4 server doesn't work entirely at the IP level of the network stack: it exists at layer 2 (the link layer) as well. That's because a DHCP server has to interact with a client _before_ the client has an IP address and before the client knows the address of the DHCP server.

In practice this means that the first thing a client does is send a packet, broadcast at layer 2, to the universal IP broadcast address (255.255.255.255) with source address 0.0.0.0.
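To make that first packet concrete, here is a minimal sketch of the payload a client might put in it. The function name and the choice of a single option are illustrative assumptions, not anything taken from dnsmasq; the field layout follows the standard BOOTP/DHCP message format. The client would send this from 0.0.0.0 port 68 to 255.255.255.255 port 67.

```python
import struct

DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])  # marks the start of the options area


def build_dhcp_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal DHCPDISCOVER payload (UDP body only, no IP/UDP headers).

    Hypothetical helper for illustration; a real client sends more options.
    """
    if len(mac) != 6:
        raise ValueError("expected a 6-byte Ethernet MAC address")
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,                       # op: BOOTREQUEST
        1,                       # htype: Ethernet
        6,                       # hlen: length of the MAC address
        0,                       # hops
        xid,                     # transaction id chosen by the client
        0,                       # secs
        0x8000,                  # flags: ask the server to broadcast its reply
        bytes(4),                # ciaddr: 0.0.0.0, the client has no address yet
        bytes(4),                # yiaddr
        bytes(4),                # siaddr
        bytes(4),                # giaddr
        mac.ljust(16, b"\x00"),  # chaddr, padded to 16 bytes
        bytes(64),               # sname
        bytes(128),              # file
    )
    # Option 53 (message type), length 1, value 1 (DISCOVER), then option 255 (end).
    options = bytes([53, 1, 1, 255])
    return header + DHCP_MAGIC_COOKIE + options
```

Note that nothing in the payload identifies a server: the client only knows its own hardware address at this point.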

A server which is not listening on the wildcard address will not receive that packet from the IP stack on most OSs. So if you want the DHCP server to operate purely at the IP level, it has to listen on the wildcard address, and that's what dnsmasq does.
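As a rough illustration of what "listening on the wildcard address" means at the socket level, here is a minimal sketch. It is not dnsmasq's code, and the function name is made up for the example. A real DHCP server would use port 67, which needs privileges, and would also have to work out which interface each packet arrived on; both are left out here.

```python
import socket


def open_wildcard_udp_socket(port: int) -> socket.socket:
    """Open a UDP socket bound to the wildcard address 0.0.0.0.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Allow replies to be sent to broadcast addresses such as 255.255.255.255.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    # Binding to 0.0.0.0 rather than one unicast address is what lets the
    # IP stack hand over packets addressed to the broadcast address.
    sock.bind(("0.0.0.0", port))
    return sock
```

Calling `open_wildcard_udp_socket(67)` as root would give a socket of the kind described above; any unprivileged port works for experimenting.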

There is another way around this, which the ISC dhcpd uses (and I think Kea also, but I'm not sure) which allows it to listen on unicast addresses, but it has significant other downsides. The technique is to bypass the OS's IP stack to receive the problematic first packets. This is done by opening a raw socket, which returns every packet that arrives at the machine, complete with all the link-layer and IP-level headers. The DHCP server can then look through these, discard most of them and find the broadcast packets to 255.255.255.255.
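The "look through these and discard most of them" step could be sketched roughly as follows. This is an illustrative filter, not ISC's implementation: the function name is hypothetical, and it assumes plain Ethernet framing with no VLAN tags and ignores IP fragmentation.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17
DHCP_SERVER_PORT = 67
ETH_HEADER_LEN = 14


def is_dhcp_broadcast(frame: bytes) -> bool:
    """Return True if a raw Ethernet frame carries UDP to 255.255.255.255:67.

    Hypothetical helper: assumes untagged Ethernet and unfragmented IPv4.
    """
    if len(frame) < ETH_HEADER_LEN + 20 + 8:
        return False
    # Ethernet header: 6 bytes destination, 6 bytes source, 2 bytes ethertype.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return False
    ip = ETH_HEADER_LEN
    if frame[ip] >> 4 != 4:
        return False
    # The IPv4 header length is given in 32-bit words and may include options.
    ihl = (frame[ip] & 0x0F) * 4
    if ihl < 20 or len(frame) < ip + ihl + 8:
        return False
    if frame[ip + 9] != IPPROTO_UDP:
        return False
    if frame[ip + 16:ip + 20] != b"\xff\xff\xff\xff":
        return False
    # UDP header: source port, then destination port.
    (dst_port,) = struct.unpack_from("!H", frame, ip + ihl + 2)
    return dst_port == DHCP_SERVER_PORT
```

Every frame the machine receives would have to pass through a check like this, and each new link type needs its own header handling before the IP header can even be located.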

This method has lots of downsides:

1) The DHCP server sees _every_ IP packet the machine receives and the kernel has to do a context switch to the DHCP server for each of those packets. On a server which is heavily loaded doing lots of stuff which isn't DHCP, that's lots of extra CPU cycles. 20 years ago this was an insoluble problem. Today, you could use BPF to select the packets you want, since Linux now supports BPF.

2) The DHCP server sees raw packets with all the headers, so it has to understand the format of all those headers. It's no good just adding support for a new link-layer type to the kernel: you have to teach the DHCP server about it too.

3) Raw sockets (and kernel packet filters like BPF) are pretty non-portable, so you have to carry and maintain different code for each platform you support, rather than relying on the POSIX APIs.

4) The code to handle raw sockets and layer 2 packet formats is non-trivial in complexity and size. Early dnsmasq was small and simple. It's not so much, these days, but it still attempts to do a lot with a small amount of resources.

Using raw sockets does have the upside that it makes it easy to run multiple DHCP servers on a single machine.

I made the tradeoff that multiple DHCP servers on a single machine is not a common or useful enough feature to take the hits needed to do it. ISC had jumped the other way, which made my decision easier: anyone wanting that feature could run ISC.

This fundamental tradeoff has persisted to this day. Over time, however, adaptations have been made which cover most of the reasons to have more than one DHCP server, without taking the hit of moving to raw sockets.

1) It's possible to run multiple dnsmasq instances on a single machine that do DHCP as long as each one is configured to do DHCP on exactly one interface. This uses code to bind IP sockets to interfaces and the SO_REUSEADDR and SO_REUSEPORT socket options. This covers the OpenStack use-case.

2) A dnsmasq instance can act as a DHCP server for some subnets and a DHCP relay for others. That covers the case of needing a server and a relay on the same machine, and gives a strategy when you really need to mix dnsmasq with another DHCP server: run the other server on a different machine or VM and get dnsmasq to relay to it.
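For the first adaptation, the socket setup could look roughly like the sketch below. It is not dnsmasq's code: the function name is hypothetical, and it assumes Linux, where SO_BINDTODEVICE is one way to tie a socket to an interface (the message above does not say which call dnsmasq uses). Binding to a device may need elevated privileges, so the interface argument is optional.

```python
import socket
from typing import Optional


def open_shared_udp_socket(port: int, interface: Optional[str] = None) -> socket.socket:
    """Open a wildcard UDP socket that other processes can share the port with.

    Hypothetical helper; assumes Linux socket options.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Both options must be set before bind() so a second instance can
    # bind the same address and port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    if interface is not None:
        # Assumption: restrict the socket to one interface, so each
        # instance only sees traffic arriving on its own interface.
        sock.setsockopt(
            socket.SOL_SOCKET,
            socket.SO_BINDTODEVICE,
            interface.encode() + b"\x00",
        )
    sock.bind(("0.0.0.0", port))
    return sock
```

With this setup, two instances could each call something like `open_shared_udp_socket(67, "eth0")` and `open_shared_udp_socket(67, "eth1")` (interface names are examples) without the second bind failing.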



TL;DR: Dnsmasq not doing what you want is not a bug, it's a feature. It reflects the results of a valid tradeoff. There are downsides as well as upsides to changing it, and we don't feel it's useful to do so at this time.


Cheers,

Simon.

