Lutz Pressler wrote:
Hello,
at least 2.25 (and probably since 2.23 as then sending a server identifier
with DHCPNAK had been introduced)
I checked: it's only 2.25 which has this problem: 2.23 and 2.24 are OK.
The trigger for the problem was moving the call to narrow_context()
earlier in the code, to cope with interfaces which have more than one
address, each on a different subnet. That happened in 2.25.
crashes "in case of a machine moving whilst
it has a lease". Reason (rfc2131.c line 631ff):
if (!(context = narrow_context(context, mess->yiaddr)))
{
/* If a machine moves networks whilst it has a lease, we catch that
here. */
message = _("wrong network");
/* ensure we broadcast NAK */
unicast_dest = 0;
}
... so in this case "context" is NULL.
But in the section starting line 664 "context" is dereferenced:
if (message)
{
log_packet("NAK", &mess->yiaddr, chaddr, iface_name, message);
mess->siaddr.s_addr = mess->yiaddr.s_addr = 0;
bootp_option_put(mess, NULL, NULL);
p = option_put(p, end, OPTION_MESSAGE_TYPE, 1, DHCPNAK);
--> p = option_put(p, end, OPTION_SERVER_IDENTIFIER, INADDRSZ,
ntohl(context->local.s_addr));
p = option_put_string(p, end, OPTION_MESSAGE, message);
/* This fixes a problem with the DHCP spec, broadcasting a NAK to a
host on
a distant subnet which unicast a REQ to us won't work. */
if (!unicast_dest || mess->giaddr.s_addr != 0 ||
--> mess->ciaddr.s_addr == 0 || is_same_net(context->local, mess->ciaddr,
context->netmask))
{
mess->flags |= htons(0x8000); /* broadcast */
mess->ciaddr.s_addr = 0;
}
}
The second case is no problem because of lazy evaluation as
"unicast_dest" is zero and tested earlier - but the first case is:
The attached patch changes "ntohl(context->local.s_addr)" into
"context ? ntohl(context->local.s_addr) : NULL" and fixes the
problem for me - but probably sending a NULL server identifier
is not correct. As I understand RFC 2131, the option has to be
included, but maybe one should use a local address of the
interface the request came in?
That's a great analysis, thanks. The correct way to cope with
context == NULL as it use the value of context->local from before
the call to narrow_context(). That will give a good value for an address
of the local interface.
Since this is a relatively bad bug, and only in 2.25 which is a week
old, I think the best course is to release 2.26, and deprecate 2.25.
Have a nice day,
Thanks for your input.
Cheers,
Simon.