During testing, Xiang hit an interesting race. In particular, while
exercising the DHCP test address logic, the DHCP client occasionally
failed with:
    open_ip_lif: bge1: cannot bring up IPMP group interface: Invalid argument
    unable to open socket for bge1: Invalid argument
Specifically, to bring the IPMP interface up, the DHCP client issues an
SIOCGLIFFLAGS to fetch the current flags, then immediately sets IFF_UP and
issues an SIOCSLIFFLAGS. However, suppose that between the SIOCGLIFFLAGS
and the SIOCSLIFFLAGS, one of the flags in IFF_IPMP_CANTCHANGE changes (in
this case, IFF_FAILED). Then, from the point of view of the kernel, it
appears that dhcpagent is attempting to clear IFF_FAILED, so it fails the
SIOCSLIFFLAGS with EINVAL.
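
To make the window concrete, here is a toy user-space model of the
sequence. It is only a sketch: the DEMO_ names and flag values are made
up for illustration (they are not the real <net/if.h> values), and
demo_set_flags() merely stands in for the strict check described above,
not the actual kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits only; NOT the real <net/if.h> values. */
#define	DEMO_IFF_UP		UINT64_C(0x1)
#define	DEMO_IFF_RUNNING	UINT64_C(0x2)
#define	DEMO_IFF_FAILED		UINT64_C(0x4)
#define	DEMO_IFF_STANDBY	UINT64_C(0x8)

/* Hypothetical stand-in for the protected set discussed above. */
#define	DEMO_IPMP_CANTCHANGE	(DEMO_IFF_FAILED | DEMO_IFF_STANDBY)

/*
 * Toy model of the strict behavior: if the request differs from the
 * current flags in any protected bit, fail the whole request.
 */
static int
demo_set_flags(uint64_t *kflags, uint64_t req)
{
	if (((*kflags ^ req) & DEMO_IPMP_CANTCHANGE) != 0)
		return (EINVAL);
	*kflags = req;
	return (0);
}

/*
 * Replays the get-then-set sequence. If fail_in_window is nonzero,
 * the "kernel" sets DEMO_IFF_FAILED between the two steps, so the
 * stale snapshot looks like an attempt to clear it.
 */
static int
demo_bring_up(uint64_t *kflags, int fail_in_window)
{
	uint64_t snap;

	assert(kflags != NULL);
	snap = *kflags;				/* models SIOCGLIFFLAGS */
	if (fail_in_window)
		*kflags |= DEMO_IFF_FAILED;	/* flag changes underneath us */
	return (demo_set_flags(kflags, snap | DEMO_IFF_UP)); /* SIOCSLIFFLAGS */
}
```

With no change in the window, demo_bring_up() succeeds; with the flag
flipped in the window, the same request comes back EINVAL even though
the caller never meant to touch DEMO_IFF_FAILED.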
Note that this can't happen with the (existing) list of IFF_CANTCHANGE
flags, because the kernel simply ignores those, rather than returning an
error. I decided to return EINVAL for attempts to change a flag in
IFF_IPMP_CANTCHANGE in the interest of making it clear to the application
that its request was invalid, but it seems that's easier said than done.
I see two non-hacky possibilities:
 1. We change IFF_IPMP_CANTCHANGE to work like IFF_CANTCHANGE.
    That is, the kernel silently strips off any flags in
    IFF_IPMP_CANTCHANGE and processes the rest of the request.
    This will fix the problem, but is a bit confusing since,
    e.g., a request to set IFF_STANDBY on the IPMP interface
    will appear to succeed but in fact do nothing.

 2. We split IFF_IPMP_CANTCHANGE into IFF_IPMP_CANTCHANGE and
    IFF_IPMP_INVALID. The first set (IFF_RUNNING and IFF_FAILED)
    acts like IFF_CANTCHANGE, in that those flags are silently
    ignored. A request to set any flag in the second set (all the
    other flags, which can *never* be set on an IPMP interface)
    explicitly fails with EINVAL.
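
For what it's worth, option 2 might look roughly like the following.
Again this is only a sketch under assumptions: the DEMO_ names and bit
values are invented, DEMO_IFF_STANDBY stands in for "all the other
flags", and demo_setflags_split() is a hypothetical helper rather than
proposed kernel code.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative flag bits only; NOT the real <net/if.h> values. */
#define	DEMO_IFF_UP		UINT64_C(0x1)
#define	DEMO_IFF_RUNNING	UINT64_C(0x2)
#define	DEMO_IFF_FAILED		UINT64_C(0x4)
#define	DEMO_IFF_STANDBY	UINT64_C(0x8)

/* Kernel-owned flags: the caller's values are silently ignored. */
#define	DEMO_SPLIT_CANTCHANGE	(DEMO_IFF_RUNNING | DEMO_IFF_FAILED)
/* Flags that may never be set on an IPMP interface: rejected. */
#define	DEMO_SPLIT_INVALID	(DEMO_IFF_STANDBY)

static int
demo_setflags_split(uint64_t *kflags, uint64_t req)
{
	/* Asking for a never-valid flag is a genuine caller error. */
	if ((req & DEMO_SPLIT_INVALID) != 0)
		return (EINVAL);

	/*
	 * For kernel-owned flags, keep whatever the kernel has now and
	 * discard the caller's (possibly stale) view of them.
	 */
	req &= ~DEMO_SPLIT_CANTCHANGE;
	req |= (*kflags & DEMO_SPLIT_CANTCHANGE);

	*kflags = req;
	return (0);
}
```

Under this model a stale snapshot of DEMO_IFF_FAILED no longer fails
the request, while a request to set DEMO_IFF_STANDBY still gets EINVAL.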
Thoughts? Other alternatives?
--
meem