Consider a threaded IKE daemon that has a socket open and bound to local
port 500.  (We'll leave out port 4500 because the same issues apply there,
modulo setting UDP_NAT_T_ENDPOINT...)

Now let's consider that this threaded daemon must not just receive all
packets for local port 500, but also send replies using the same address
pair on which each packet arrived (the reply's source must be the local
address the peer originally sent to).

An existing open-source IKE implementation has provided these functions:

extern int recvfromto(int s, void *buf, size_t buflen, int,
    struct sockaddr *from, int *fromlen, struct sockaddr *to, int *tolen);
extern int sendfromto(int s, const void *buf, size_t buflen,
    struct sockaddr *from, struct sockaddr *to);

(Please ignore the missing "flags" for now... ;)

The implementation of recvfromto() is relatively straightforward using X/Open
sockets and the IP_RECVDSTADDR and IPV6_RECVPKTINFO socket options for IPv4
and IPv6, respectively.
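
For concreteness, here is a minimal IPv4-only sketch of that receive path.
It is not the open-source implementation's code: the names enable_dstaddr()
and recvfromto4() are made up for illustration, and the #else branch assumes
a Linux-style IP_PKTINFO on stacks that lack IP_RECVDSTADDR.  The ancillary
data carries only the destination address, so the sketch takes the local
port from getsockname().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes struct in_pktinfo on glibc */
#endif
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: ask the stack to report each datagram's destination. */
static int
enable_dstaddr(int s)
{
    int on = 1;
#if defined(IP_RECVDSTADDR)
    return setsockopt(s, IPPROTO_IP, IP_RECVDSTADDR, &on, sizeof (on));
#else   /* assumption: Linux-style IP_PKTINFO is available instead */
    return setsockopt(s, IPPROTO_IP, IP_PKTINFO, &on, sizeof (on));
#endif
}

/* Hypothetical IPv4-only variant: *from is the peer, *to is our local side. */
static ssize_t
recvfromto4(int s, void *buf, size_t buflen, int flags,
    struct sockaddr_in *from, struct sockaddr_in *to)
{
    union { struct cmsghdr align; char space[256]; } cbuf;
    struct sockaddr_in self;
    socklen_t slen = sizeof (self);
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *cm;
    ssize_t n;

    assert(buf != NULL && from != NULL && to != NULL);
    iov.iov_base = buf;
    iov.iov_len = buflen;
    memset(&msg, 0, sizeof (msg));
    msg.msg_name = from;
    msg.msg_namelen = sizeof (*from);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = cbuf.space;
    msg.msg_controllen = sizeof (cbuf.space);

    n = recvmsg(s, &msg, flags);
    if (n < 0)
        return -1;

    memset(to, 0, sizeof (*to));
    to->sin_family = AF_INET;
    /* The control message holds the address only; the port is the socket's. */
    if (getsockname(s, (struct sockaddr *)&self, &slen) == 0)
        to->sin_port = self.sin_port;

    for (cm = CMSG_FIRSTHDR(&msg); cm != NULL; cm = CMSG_NXTHDR(&msg, cm)) {
        if (cm->cmsg_level != IPPROTO_IP)
            continue;
#if defined(IP_RECVDSTADDR)
        if (cm->cmsg_type == IP_RECVDSTADDR)
            memcpy(&to->sin_addr, CMSG_DATA(cm), sizeof (struct in_addr));
#else
        if (cm->cmsg_type == IP_PKTINFO) {
            struct in_pktinfo pi;
            memcpy(&pi, CMSG_DATA(cm), sizeof (pi));
            to->sin_addr = pi.ipi_addr;     /* header destination address */
        }
#endif
    }
    return n;
}
```

The IPv6 case would follow the same shape with IPV6_RECVPKTINFO and a
struct in6_pktinfo control message.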

The implementation of sendfromto() poses a tricky problem.  There is no
explicit set-local-source option in the IPv4 codepath, and IPv6 has hints
about local-address selection (via IPV6_SRC_PREFERENCES), but nothing
explicit.  The open-source code chose to create a new socket, bind to the
local port-and-address, and then perform a sendto().  The problem with this
approach is documented in their comments:

                /*
                 * Use newly opened socket for sending packets.
                 * NOTE: this is unsafe, because if the peer is quick enough
                 * the packet from the peer may be queued into sendsock.
                 * Better approach is to prepare bind'ed udp sockets for
                 * each of the interface addresses.
                 */
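
To make the race concrete, the new-socket approach can be sketched roughly
as below.  This is a reconstruction for illustration, not the open-source
project's actual code: the name sendfromto_newsock() is hypothetical, it
handles IPv4 only, and it assumes the long-lived listening socket was also
bound with SO_REUSEADDR so that the second bind() to the same port succeeds.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical IPv4-only sketch of "open, bind, send, close" per datagram. */
static ssize_t
sendfromto_newsock(const void *buf, size_t buflen,
    const struct sockaddr_in *from, const struct sockaddr_in *to)
{
    int on = 1;
    ssize_t n = -1;
    int s;

    assert(buf != NULL && from != NULL && to != NULL);
    s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0)
        return -1;

    /* Assumption: SO_REUSEADDR lets us share the port with the listener. */
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof (on)) == 0 &&
        bind(s, (const struct sockaddr *)from, sizeof (*from)) == 0) {
        /*
         * From here until close(), this socket may be chosen to receive
         * datagrams addressed to *from; that is the hazard the quoted
         * comment describes.
         */
        n = sendto(s, buf, buflen, 0,
            (const struct sockaddr *)to, sizeof (*to));
    }

    /* Anything the peer managed to queue on s is discarded here. */
    close(s);
    return n;
}
```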

Their "better approach" is employed by our IKEv1 daemon, but it runs into
file-descriptor limits (when many local addresses exist), and it requires
monitoring the routing socket for local-address additions and deletions.

What I would like to know is how one can build sendfromto() without having to
resort to one socket per local address.  Some ideas that occur to me, all of
which involve employing a secondary transmission socket, and all of which may
be utterly bogus, include:

        - After a successful sendto(), call shutdown() on the dedicated
          sending socket, then read any lingering datagrams using
          recvfromto() and inject them into the packet-processing code.

                This assumes that I understand how shutdown() works, which
                may be completely wrong.  I thought shutdown() detached the
                socket from the network (in other words, nuked the conn_t)
                but kept any data queued up intact.

        - Insert something into the sendmsg() path that explicitly sets the
          local IP & port for a single datagram.

                This involves kernel modifications, which would be awful if
                anyone requested such a daemon to run on older kernels.

        - Use a raw socket and construct the whole IP and UDP headers.

                Can we do this for packet transmission?  I know we can for
                ICMP headers a la ping(1M), but can we for UDP as well?
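
For the raw-socket idea, the header construction itself might look like the
sketch below.  It only fills a buffer with an IPv4 header plus a UDP header
and payload; whether a given kernel will transmit such a buffer over a raw
socket with IP_HDRINCL is exactly the open question above, and stacks are
known to differ in the byte order they expect for some header fields and in
which fields they overwrite.  The names inet_cksum() and build_udp4() are
hypothetical, and the UDP checksum is left as zero ("not computed"), which
RFC 768 permits over IPv4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* RFC 1071 Internet checksum over len bytes, returned in host order. */
static uint16_t
inet_cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    for (; len > 1; p += 2, len -= 2)
        sum += ((uint32_t)p[0] << 8) | p[1];
    if (len == 1)
        sum += (uint32_t)p[0] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/*
 * Hypothetical builder: writes a 20-byte IPv4 header, an 8-byte UDP header
 * and the payload into pkt.  src/dst are raw address bytes; ports are in
 * host order.  Returns the packet length, or 0 if it does not fit.
 */
static size_t
build_udp4(uint8_t *pkt, size_t cap, const uint8_t src[4],
    const uint8_t dst[4], uint16_t sport, uint16_t dport,
    const void *payload, size_t plen)
{
    size_t total = 20 + 8 + plen;
    size_t ulen = 8 + plen;
    uint16_t ck;

    assert(pkt != NULL && payload != NULL);
    if (total > cap || total > 0xffff)
        return 0;

    memset(pkt, 0, 28);
    pkt[0] = 0x45;                      /* version 4, header length 5 words */
    pkt[2] = (uint8_t)(total >> 8);     /* total length, network order */
    pkt[3] = (uint8_t)(total & 0xff);
    pkt[8] = 64;                        /* TTL (arbitrary choice) */
    pkt[9] = 17;                        /* protocol: UDP */
    memcpy(pkt + 12, src, 4);
    memcpy(pkt + 16, dst, 4);
    ck = inet_cksum(pkt, 20);           /* checksum field is still zero */
    pkt[10] = (uint8_t)(ck >> 8);
    pkt[11] = (uint8_t)(ck & 0xff);

    pkt[20] = (uint8_t)(sport >> 8);
    pkt[21] = (uint8_t)(sport & 0xff);
    pkt[22] = (uint8_t)(dport >> 8);
    pkt[23] = (uint8_t)(dport & 0xff);
    pkt[24] = (uint8_t)(ulen >> 8);     /* UDP length: header + payload */
    pkt[25] = (uint8_t)(ulen & 0xff);
    /* pkt[26..27] stay zero: UDP checksum not computed. */
    memcpy(pkt + 28, payload, plen);
    return total;
}
```

Sending the result would additionally need a privileged raw socket, which
this sketch deliberately leaves out.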

Any clues are appreciated.

Thanks,
Dan
_______________________________________________
networking-discuss mailing list