This is a proposal to change the expected behavior of DLPI providers
with respect to looping back copies of transmitted packets to the
same or other streams that are sharing the device. Brief summary:
Never loop back a packet to its sending stream.
Always loop back a packet to any other stream that would receive
that packet if it had arrived from the medium.
DLPI Improved Loopback Handling
Mike Ditto
DRAFT
Thu Mar 9 01:01:38 PST 2006
In this document, "user" means a user of a data link service (DLS),
implemented by one stream connected to a data link service provider, as
described in the Data Link Provider Interface specification.
The problem:
Current DLPI device drivers do not allow two users of the same DLS provider
(e.g. a NIC) to communicate with each other. All DLS users can send and
receive packets on the physical media, and two DLS users speaking independent
protocols (e.g. AppleTalk and TCP/IP) can share access to a single provider
without any need to see each other's packets. But if two simultaneous DLS
users want to participate in the same protocol, for example having two
different implementations of the same network protocol in simultaneous use,
they cannot see each other's packets, even when the packets are properly
addressed to the local host's (NIC's) MAC address.
All current Solaris DLPI drivers provide a special hack that allows a
promiscuous listening user to see the packets sent by other users.
Promiscuous listening has a specific intended function, namely to disable some
portion of the inbound packet selection filter, widening the subset of
received traffic that will be passed to the associated user. The fact that it
also causes a change in the transmit processing of other users is a practical
wart, created to support monitoring tools such as snoop(1M). This special
loopback handling is not explicitly part of the DLPI standard, but is expected
and universally provided by Solaris drivers.
Aside from the snoop case, the need for one user to see another user's packets
is rare -- thus the longstanding lack of this capability on Solaris without
much complaint. A new application, Ethernet bridging, exposes this lack and
calls for a solution.
Complications:
If DLS users in general are to receive a copy of every transmitted packet that
matches their protocol of interest, there is a degenerate case where users
will receive copies of their own transmitted packets that happen to match
their receive filter, such as packets sent to broadcast or multicast
addresses. This would require extra processing in the DLS user, and might
break existing network layer protocol implementations.
Proposed solution:
I would like to solve the problem by requiring drivers to provide orthogonal
loopback behavior. By this I mean that any transmitted packet will--in
principle--be presented to all other users of that link interface, whether in
promiscuous mode or not, but subject to their respective receive filters, of
course. By "in principle" I mean that the behavior must be as if this were
done, although there might be optimizations that avoid this extra work when it
wouldn't result in a packet being seen by any other users.
Note that we won't ever loop a packet back to the same user that requested its
transmission, only to other users of the same link provider. Aside from being
the most useful and performant behavior, this mimics the behavior of most LAN
hardware.
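
The delivery rule above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not code from any existing driver:
struct dls_user, rx_filter_accepts() and loopback_to_others() are hypothetical
names, the receive filter is simplified (per-user multicast group membership
is not modeled), and a counter stands in for passing the copy upstream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical per-user (per-stream) state; real drivers keep their own. */
struct dls_user {
    uint16_t sap;           /* SAP the user is bound to */
    bool     promisc_phys;  /* DL_PROMISC_PHYS enabled */
    bool     promisc_sap;   /* DL_PROMISC_SAP enabled */
    bool     promisc_multi; /* DL_PROMISC_MULTI enabled */
    unsigned delivered;     /* copies passed upstream (illustration only) */
};

static const uint8_t bcast_mac[MAC_LEN] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};

/*
 * Simplified receive filter (an assumption of this sketch): the same
 * decision that would be made for a packet arriving from the medium.
 */
static bool
rx_filter_accepts(const struct dls_user *u, const uint8_t *local_mac,
    const uint8_t *dst, uint16_t sap)
{
    bool addr_ok, sap_ok;

    if (memcmp(dst, local_mac, MAC_LEN) == 0 ||
        memcmp(dst, bcast_mac, MAC_LEN) == 0)
        addr_ok = true;
    else if (dst[0] & 0x01)     /* multicast; group membership not modeled */
        addr_ok = u->promisc_multi || u->promisc_phys;
    else                        /* unicast to some other station */
        addr_ok = u->promisc_phys;

    sap_ok = (u->sap == sap) || u->promisc_sap;
    return addr_ok && sap_ok;
}

/*
 * Present a transmitted packet to every user except the sender, subject to
 * each user's receive filter. Returns the number of copies delivered.
 */
unsigned
loopback_to_others(struct dls_user *users, size_t nusers,
    const struct dls_user *sender, const uint8_t *local_mac,
    const uint8_t *dst, uint16_t sap)
{
    unsigned copies = 0;

    assert(users != NULL && sender != NULL);
    for (size_t i = 0; i < nusers; i++) {
        if (&users[i] == sender)
            continue;           /* never loop back to the sending user */
        if (rx_filter_accepts(&users[i], local_mac, dst, sap)) {
            users[i].delivered++;
            copies++;
        }
    }
    return copies;
}
```

Note that promiscuous mode plays no special role in the loop itself; it only
widens the individual user's filter, which is the orthogonality being asked
for.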
As a driver implementation note, current logic that considers performing
transmit packet loopback whenever any user has one or more promiscuous levels
enabled must be changed to perform the loopback whenever any of these
conditions is true:
At least one user is in DL_PROMISC_PHYS mode.
At least one user is in DL_PROMISC_MULTI mode and the destination
address of the present packet is a multicast address.
At least one user is in DL_PROMISC_SAP mode and the destination
address of the present packet is equal to the local MAC address.
At least one user other than the current user is bound to a SAP
that matches the present packet.
To efficiently test these conditions, some data can be pre-computed for fast
access in the transmit path.
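
As a rough sketch of what such pre-computed data might look like, assuming
per-provider counters that are updated whenever a user binds, unbinds, or
changes promiscuous level: struct lb_state, users_bound_to() and
tx_needs_loopback() are hypothetical names, and the small linear SAP table is
just a placeholder for whatever lookup structure a real driver would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN  6
#define MAX_SAPS 8

/* Hypothetical per-provider summary, maintained outside the transmit path. */
struct lb_state {
    uint8_t  local_mac[MAC_LEN];
    unsigned n_promisc_phys;    /* users in DL_PROMISC_PHYS mode */
    unsigned n_promisc_multi;   /* users in DL_PROMISC_MULTI mode */
    unsigned n_promisc_sap;     /* users in DL_PROMISC_SAP mode */
    size_t   n_saps;            /* valid entries in saps[] */
    struct { uint16_t sap; unsigned users; } saps[MAX_SAPS];
};

/* Number of users currently bound to the given SAP. */
static unsigned
users_bound_to(const struct lb_state *st, uint16_t sap)
{
    for (size_t i = 0; i < st->n_saps; i++)
        if (st->saps[i].sap == sap)
            return st->saps[i].users;
    return 0;
}

/*
 * Transmit-path test: should loopback be considered for this packet?
 * Mirrors the four conditions listed above. A true result only means that
 * the per-user delivery pass (which still excludes the sender and applies
 * each receive filter) is worth running.
 */
bool
tx_needs_loopback(const struct lb_state *st, uint16_t sender_sap,
    const uint8_t *dst, uint16_t pkt_sap)
{
    unsigned bound;

    assert(st != NULL && dst != NULL);
    if (st->n_promisc_phys > 0)
        return true;
    /* Group bit set; this also covers the broadcast address. */
    if (st->n_promisc_multi > 0 && (dst[0] & 0x01))
        return true;
    if (st->n_promisc_sap > 0 &&
        memcmp(dst, st->local_mac, MAC_LEN) == 0)
        return true;

    /* Count users bound to the packet's SAP, not counting the sender. */
    bound = users_bound_to(st, pkt_sap);
    if (sender_sap == pkt_sap && bound > 0)
        bound--;
    return bound > 0;
}
```

In the common case of a single user per SAP and no promiscuous users, this
costs a few integer comparisons and one small lookup per transmitted packet.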
Analysis of Compatibility Issues:
It is hopeless to expect that all drivers will be enhanced to support this new
functionality. We do expect to enhance all the relevant drivers provided by
Sun, and we will implement the functionality in the GLD framework, which will
automatically enhance many, but not all, existing non-Sun drivers. Any Solaris
feature that might benefit from the new behavior must be implemented in a way
that still works on old drivers, or must clearly communicate to the customer
that the feature only works with certain drivers, with the latter option not
available to existing committed features.
The new requirement that packets not be looped back to the user that
transmitted them is a change in behavior for DL_PROMISC_PHYS users. If a user
is in DL_PROMISC_PHYS mode and transmits a packet, a copy will no longer be
queued for reading. Applications that both receive promiscuously and transmit
are unusual, but some might exist that are impacted by this change. In the
case of the bridge module, this is a beneficial change.
There is one potentially incompatible behavior change resulting from the IP
and ARP streams now being able to receive packets transmitted by other users.
An application that uses DLPI to transmit IP packets will have those packets
delivered to the local IP stack if their destination MAC address is the local
MAC address or the broadcast address or a multicast address currently of
interest to IP. Uses of such applications are unusual and probably don't send
to the local MAC address. If they do, the new behavior is likely to be a
useful improvement, but there could be some situations where it causes a
problem.
Practical Considerations for the Ethernet Bridging Project:
Without any of the changes proposed by this case and without any changes to IP,
Ethernet bridging is possible, but the local host will be unable to
communicate with some of the remote hosts on the bridged LAN -- namely, the
ones "across" the bridge from the interface which is plumbed to IP. The
feature might still be useful in some situations even with this limitation --
for example one could build a "stealth mode" bridging firewall which was not
meant to allow any packets to reach the firewall's IP stack anyway. But it
would not be useful in at least one of the major intended uses: Solaris as
the OS for Xen domain 0. In that environment we need the (domain 0) OS's
IP stack to have full connectivity with the hosts "across" the bridge, which
are in fact the guest domains.
We could work around the DLPI driver issue by modifying IP to put its stream
in promiscuous (DL_PROMISC_PHYS) mode. This would allow it to receive the
packets coming across the bridge when using a plain ordinary NIC driver. But
it would drastically increase the amount of traffic that IP must classify and
discard. It would also trigger some slower-path processing in most--if not
all--existing drivers. So this could not be made the default operating mode
for IP; it would have to be enabled when needed, perhaps manually or perhaps
automatically triggered by some action by the bridge module.
With the changes proposed by this case, IP will see all of the bridge's
transmissions that it's supposed to see, and the bridge won't get useless
copies of its own transmissions. If one driver does not have these changes,
we don't need to do anything special to keep basic bridging functionality
working (except rely on the existing code that discards the useless copies of
every transmitted packet that will be passed up by the DLPI driver). But in that
case, IP is subject to the limitations described in the two preceding paragraphs:
either the local IP stack will be isolated from a portion of the LAN, or we
must cause the IP stack to use DL_PROMISC_PHYS mode with the resulting
performance penalties.
For at least the initial implementation of Solaris on Xen domain 0, bridging
is a key component in the inter-domain network architecture. Even though the
inter-domain traffic travels over an in-memory virtual LAN segment between
domains, it is actually the driver for the physical NIC that will determine
whether IP sees packets from the user domains ("across the bridge"). So it
seems that Xen Solaris would have to use some form of the IP-in-promiscuous-
mode approach when older NIC drivers are in use. However, the Clearview
project plans to deliver a "DLPI shim" component that allows (causes) all
non-GLD DLPI drivers to act as MAC drivers beneath the GLD framework,
so that will probably solve this problem.
An alternative design for Ethernet bridging -- the one taken by the free
bridge-utils implementation for Linux -- modifies the plumbing and
administration procedures for IP such that it appears that IP is connected to
a pseudointerface provided by the bridge instead of to the NIC. So far, we
have chosen not to use this approach because of its disruptive change to the
networking administration model. It would be undesirable for the network
configuration procedures (specifically the interface name used by ifconfig et
al) to substantially change just because you reboot Solaris with (or without)
Xen, even when you are not actually using any Xen user domains or features.