Having spent some time looking at everyone's postings on this thread,
here are some responses and a proposal up for bashing.
sowmini> Correct, so why bother to include the flags at all (since the only info
sowmini> the routing socket message can have is "ce1", and not
sowmini> something like "ce1:5",
meem>
meem> Aside from the desire to have the structure definition be compatible with
meem> BSD? Yes, there are a lot of rough edges with our current routing socket
meem> implementation.
BSD does not have an IFA_UP. The only ifa_flags that I could see
(from browsing FreeBSD) are RTF_CLONING and IFA_ROUTE (an alias for RTF_UP).
IFA_ROUTE is set if the interface route has been installed and marked
RTF_UP.
Thus if compatibility is what we want, then we should have reported IFA_ROUTE
with RTM_*ADDR instead of IFF_UP. But that, of course, is an incompatible
change at this point for legacy applications that currently muddle with
IFF_UP on NEWADDR, so the compatibility issue is with our own history
in this case.
meem> What are the BSD semantics for RTM_IFANNOUNCE?
The data passed is

    struct if_announcemsghdr {
        u_short ifan_msglen;            /* to skip over non-understood messages */
        u_char  ifan_version;           /* future binary compatibility */
        u_char  ifan_type;              /* message type */
        u_short ifan_index;             /* index for associated ifp */
        char    ifan_name[IFNAMSIZ];    /* if name, e.g. "en0" */
        u_short ifan_what;              /* what type of announcement */
    };

ifan_what can be IFAN_ARRIVAL (sent from if_attach()) or IFAN_DEPARTURE
(from if_detach()). Thus this could correspond to plumb/unplumb events
on Solaris, with the interface being up if it is plumbed.
It would provide a useful way to track if the interface is present (as
in "plumbed for IP") or not, but does not solve the existing issues
with the intertwining of IFF_UP and the 0'th logical interface
(discussed further below).
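
As a rough sketch of how a routing socket consumer might map these
announcements onto plumb/unplumb events: the struct below is a local
mirror of the BSD definition quoted above, and the DEMO_* constants, the
fixed name length and the function name are assumptions made for
illustration. A real consumer would take the definitions from the system
headers rather than hard-coding them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_IFNAMSIZ        16   /* assumption: stands in for IFNAMSIZ */
#define DEMO_RTM_IFANNOUNCE  0x11 /* assumption: use the value from <net/route.h> */
#define DEMO_IFAN_ARRIVAL    0    /* assumption: use the value from <net/if.h> */
#define DEMO_IFAN_DEPARTURE  1    /* assumption: use the value from <net/if.h> */

/* Local mirror of the BSD message layout, for illustration only. */
struct demo_if_announcemsghdr {
    unsigned short ifan_msglen;
    unsigned char  ifan_version;
    unsigned char  ifan_type;
    unsigned short ifan_index;
    char           ifan_name[DEMO_IFNAMSIZ];
    unsigned short ifan_what;
};

enum demo_event { DEMO_EV_NONE, DEMO_EV_PLUMB, DEMO_EV_UNPLUMB };

/*
 * Classify one message read from a routing socket. Returns DEMO_EV_NONE
 * for short buffers and for anything that is not an announcement;
 * otherwise copies the interface name into 'name' and reports whether
 * the interface arrived (plumb) or departed (unplumb).
 */
enum demo_event
demo_classify(const void *buf, size_t len, char name[DEMO_IFNAMSIZ])
{
    struct demo_if_announcemsghdr m;

    assert(buf != NULL && name != NULL);
    if (len < sizeof (m))
        return (DEMO_EV_NONE);
    memcpy(&m, buf, sizeof (m));        /* avoid unaligned access */
    if (m.ifan_msglen > len || m.ifan_type != DEMO_RTM_IFANNOUNCE)
        return (DEMO_EV_NONE);
    memcpy(name, m.ifan_name, DEMO_IFNAMSIZ);
    name[DEMO_IFNAMSIZ - 1] = '\0';     /* never trust the terminator */
    if (m.ifan_what == DEMO_IFAN_ARRIVAL)
        return (DEMO_EV_PLUMB);
    if (m.ifan_what == DEMO_IFAN_DEPARTURE)
        return (DEMO_EV_UNPLUMB);
    return (DEMO_EV_NONE);
}
```

Note that nothing in the message says anything about logical interfaces
or IFF_UP, which is why it only helps with the "is it plumbed" question.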
meem> In general, with routing sockets I think someone needs to step back and
meem> figure out whether there's an overall semantic model that both provides
meem> BSD compatibility and deals properly with logical interfaces. (I can't
meem> see it, but maybe there is an approach.) If there is no such approach,
The first thing that we'd need to do is to fix the bugs and
streamline the number of routing socket messages to something
meaningful. Today we end up sending confusing messages for simple
actions. E.g., 'ifconfig ce1 plumb' sends up RTM_IFINFO, RTM_DELETE,
RTM_DELADDR. I would only have expected one RTM_IFINFO (or RTM_IFANNOUNCE,
if we choose that path). 'ifconfig ce1 up' then sends up another 6 RTM
messages, including several RTM_*ADDR messages, even though the only thing
that I changed is the interface flag.
A logical interface is an address, so it should really only trigger
RTM_*ADDR messages. The only exception I can think of is the
case when the last logical address goes down (or the first one comes up),
which may affect the "enabled" state in the interface flags itself, in
which case we should also send an RTM_IFINFO.
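
The rule above can be sketched as a small decision function. This only
illustrates the semantics being proposed, not what the kernel does today.
The names (demo_msgs_for_addr_change, DEMO_MSG_*) are made up, and the
sketch assumes that "up" maps to RTM_NEWADDR, "down" maps to RTM_DELADDR,
and that the first-up/last-down transition always changes the interface
state.

```c
#include <assert.h>

/* Hypothetical bit values standing in for the routing socket messages. */
#define DEMO_MSG_NEWADDR 0x1u
#define DEMO_MSG_DELADDR 0x2u
#define DEMO_MSG_IFINFO  0x4u

/*
 * going_up:  nonzero if the address is being brought up, zero if it is
 *            being taken down.
 * others_up: number of OTHER addresses on the same interface that are up.
 * Returns a bitmask of the messages the proposed rule would send.
 */
unsigned
demo_msgs_for_addr_change(int going_up, int others_up)
{
    unsigned msgs;

    assert(others_up >= 0);
    msgs = going_up ? DEMO_MSG_NEWADDR : DEMO_MSG_DELADDR;
    if (others_up == 0)
        msgs |= DEMO_MSG_IFINFO;    /* first address up or last one down */
    return (msgs);
}
```

With this rule, bringing up ce1:3 while ce1:1 is already up yields a single
address message, and only the first-up or last-down case adds RTM_IFINFO.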
Then we would come to the hard questions, which get us back
to the original issues list. As I understand it, the issues were
- deleted vs disabled addresses. Points that have been brought up:
- We sometimes want to disable addresses (as a light-weight way of
putting them out of commission, as in Meem's Cluster example).
- Jim pointed out some issues about IFF_UP's history which were the
reasons for reusing IFF_UP as the enable/disable flag for DAD.
The ramification of this choice is that one needs to mark an interface
as "up" to retry DAD, while the kernel will keep marking it down
as long as DAD keeps failing.
In general (for everything other than the 0'th address) the IFF_UP
flag is good enough for enable/disable, and removeif
defines address deletion. Address 0 is the anomaly: we can't delete
it without taking all the other addresses down as well,
and tweaking the UP/DOWN flag on the 0'th address is misleading for
an application that does a GLIFFLAGS on net0 and finds it ~UP
even though other addresses (and the interface itself) are up.
A possible long term solution would be
- have the 0'th address be AF_LINK. The IFF_UP flag on this AF_LINK
is the interface flag. The IFF_UP flag on every subsequent IP
address (starting at :1) is the "enabled"/"disabled" flag on the address.
Turning off IFF_UP on the AF_LINK implies that all attached addresses
are also disabled.
GLIFCONF and getifaddrs() will report only UP addresses when
LIFC_NEW_APP is set. Suppress the AF_LINK in GLIFCONF output
when LIFC_NEW_APP is not set, so that legacy programs will see all
the IP addresses (including the downed ones, but not the AF_LINK) as
they do today.
Any IP address can be deleted in this model, without implying unplumb
of the entire interface. Deletion of the AF_LINK address implies
unplumbing of the interface.
Caveat:
Implementing all this is non-trivial given that there are pervasive
assumptions in the kernel about the 0'th ipif.
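
To make the long term model concrete, here is a toy sketch of the flag
semantics and the GLIFCONF filtering described above. It models an
interface as an array whose entry 0 is the AF_LINK address. All of the
types and names (demo_addr, demo_addr_enabled, demo_lifconf_reports,
DEMO_*) are invented for illustration, and the reading that the AF_LINK
entry itself is reported to LIFC_NEW_APP callers when it is up is my
assumption.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_IFF_UP 0x1u    /* stand-in for IFF_UP */

enum demo_family { DEMO_AF_LINK, DEMO_AF_INET, DEMO_AF_INET6 };

struct demo_addr {
    enum demo_family family;
    unsigned         flags;
};

/*
 * addrs[0] must be the AF_LINK entry. An address is enabled only if both
 * the AF_LINK entry (the interface flag) and the address itself are UP.
 */
int
demo_addr_enabled(const struct demo_addr *addrs, size_t n, size_t i)
{
    assert(n > 0 && i < n && addrs[0].family == DEMO_AF_LINK);
    if ((addrs[0].flags & DEMO_IFF_UP) == 0)
        return (0);             /* link down disables every address */
    return ((addrs[i].flags & DEMO_IFF_UP) != 0);
}

/*
 * Would entry i appear in GLIFCONF output?
 * new_app != 0 models LIFC_NEW_APP: only enabled entries are reported.
 * new_app == 0 models legacy callers: every IP address is reported,
 * up or down, but the AF_LINK entry is suppressed.
 */
int
demo_lifconf_reports(const struct demo_addr *addrs, size_t n, size_t i,
    int new_app)
{
    assert(n > 0 && i < n);
    if (new_app)
        return (demo_addr_enabled(addrs, n, i));
    return (addrs[i].family != DEMO_AF_LINK);
}
```

The point of the model is that no IP address is special any more: only the
AF_LINK entry carries interface-wide meaning.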
A short term solution:
- libipadm will fake the deletion of the 0'th address by writing
0.0.0.0 or :: into it (and also marking it ~IFF_UP).
- Any address can be disabled by marking it ~IFF_UP
Caveat:
This still leaves the confusion around a disabled net0, while net0:1
is IFF_UP. The long term solution should solve that.
- can we actually hide logical interface details from the SIOC* input and
output? If we can't, how do we report logical interfaces in routing sockets
and functions such as if_indextoname?
I don't have a good answer for this one. My own take is that we should
hide the logical interface details as far as possible (hence my
suggestion that the particular issue that Anders flagged in
http://www.opensolaris.org/jive/thread.jspa?threadID=101449&tstart=0
should be handled by having some flexibility in the zoneid checks for ioctls
in non-global zones).
We should provide some library function that takes a sockaddr as
input and returns a logical interface name as output to help out
applications that have to muddle through GLIFCONF today.
And while we are about it, we should also support getifaddrs() as Meem
suggested. I'd venture that the ifa_name returned should be the
logical interface name, since this is just a better GLIFCONF.
Note that BSD's getifaddrs is itself essentially a wrapper around
GIFCONF, and actually reports AF_LINK addresses as well (i.e., it
would be a LIFC_NEW_APP in my proposal above).
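
A minimal sketch of the sockaddr-to-name helper, layered on getifaddrs()
on a system that already provides it. The function names are made up, and
whether ifa_name carries the logical interface name (e.g. "ce1:5") or only
the physical one depends on the platform's getifaddrs(), so treat that as
an assumption. IPv6 scope ids are ignored for brevity.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <ifaddrs.h>
#include <assert.h>
#include <string.h>

/* Returns 1 if a and b hold the same IPv4 or IPv6 address, else 0. */
int
demo_sockaddr_equal(const struct sockaddr *a, const struct sockaddr *b)
{
    if (a == NULL || b == NULL || a->sa_family != b->sa_family)
        return (0);
    if (a->sa_family == AF_INET) {
        const struct sockaddr_in *x = (const void *)a;
        const struct sockaddr_in *y = (const void *)b;
        return (x->sin_addr.s_addr == y->sin_addr.s_addr);
    }
    if (a->sa_family == AF_INET6) {
        const struct sockaddr_in6 *x = (const void *)a;
        const struct sockaddr_in6 *y = (const void *)b;
        return (memcmp(&x->sin6_addr, &y->sin6_addr,
            sizeof (x->sin6_addr)) == 0);
    }
    return (0);     /* other families are not handled in this sketch */
}

/*
 * Copy the name of the interface that owns 'sa' into buf.
 * Returns 0 on success, -1 if no match was found or on error.
 */
int
demo_addr_to_ifname(const struct sockaddr *sa, char *buf, size_t len)
{
    struct ifaddrs *list, *ifa;
    int rc = -1;

    assert(buf != NULL);
    if (len == 0 || getifaddrs(&list) != 0)
        return (-1);
    for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        /* ifa_addr may be NULL; demo_sockaddr_equal() copes with that */
        if (demo_sockaddr_equal(sa, ifa->ifa_addr)) {
            strncpy(buf, ifa->ifa_name, len - 1);
            buf[len - 1] = '\0';
            rc = 0;
            break;
        }
    }
    freeifaddrs(list);
    return (rc);
}
```

An application would call demo_addr_to_ifname() with the local address of
a socket (from getsockname()) instead of walking GLIFCONF output itself.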
thoughts?
--Sowmini