On Tue, Nov 22, 2022 at 8:00 PM Ryota Ozaki <ozaki.ry...@gmail.com> wrote:
>
> On Tue, Nov 22, 2022 at 12:49 AM Greg Troxel <g...@lexort.com> wrote:
> >
> >
> > Ryota Ozaki <ozaki.ry...@gmail.com> writes:
> >
> > > In the specification DLT_NULL assumes a protocol family in the host
> > > byte order followed by a payload.  Interfaces of DLT_NULL use
> > > bpf_mtap_af to pass an mbuf with a protocol family prepended.  All
> > > interfaces follow the spec and work well.
> > >
> > > OTOH, bpf_write to interfaces of DLT_NULL is a bit of a sad situation.
> > > Data written to an interface of DLT_NULL is treated as raw data
> > > (I don't know why); the data is passed to the interface's output
> > > routine as is with dst (sa_family=AF_UNSPEC).  tun seems to be able
> > > to handle such raw data but the others can't handle the data (probably
> > > the data will be dropped like if_loop).
> >
> > Summarizing and commenting to make sure I'm not confused
> >
> >   on receive/read, DLT_NULL prepends AF in host byte order
> >   on transmit/write, it just sends with AF_UNSPEC
> >
> > This seems broken as it is asymmetric, and is bad because it throws
> > away information that is hard to reliably recreate.  On the other hand
> > this is for link-layer formats, and it seems that some interfaces have
> > an AF that is not really part of what is transmitted, even though
> > really it is.  For example tun is using an IP proto byte to specify AF
> > and really this is part of the link protocol.  Except we pretend it
> > isn't.
>
> I found the following sentence in bpf.4:
>
>   A packet can be sent out on the network by writing to a bpf file
>   descriptor.  The writes are unbuffered, meaning only one packet can be
>   processed per write.  Currently, only writes to Ethernets and SLIP links
>   are supported.
>
> So bpf_write to interfaces of DLT_NULL may be simply unsupported on
> NetBSD...
>
> >
> > > Correcting bpf_write to assume a prepended protocol family will
> > > save some interfaces like gif and gre but won't save others like stf
> > > and wg.  Even worse, the change may break existing users of tun
> > > that want to treat data as is (though I don't know if such users
> > > exist).
> > >
> > > BTW, prepending a protocol family on tun is a different protocol from
> > > DLT_NULL of bpf.  tun has three protocol modes and doesn't always
> > > prepend a protocol family.  (And also the network byte order is used
> > > on tun as gert says while DLT_NULL assumes the host byte order.)
> >
> > wow.
> >
> > > So my fix will:
> > > - keep DLT_NULL of if_loop to not break bpf_mtap_af, and
> > > - leave DLT_NULL handling in bpf_write unchanged except for if_loop,
> > >   so as not to bother existing users.
> > > The patch looks like this:
> > >
> > > @@ -447,6 +448,14 @@ bpf_movein(struct uio *uio, int linktype,
> > > uint64_t mtu, struct mbuf **mp,
> > >                 m0->m_len -= hlen;
> > >         }
> > >
> > > +       if (linktype == DLT_NULL && ifp->if_type == IFT_LOOP) {
> > > +               uint32_t af;
> > > +               memcpy(&af, mtod(m0, void *), sizeof(af));
> > > +               sockp->sa_family = af;
> > > +               m0->m_data += sizeof(af);
> > > +               m0->m_len -= sizeof(af);
> > > +       }
> > > +
> > >         *mp = m0;
> > >         return (0);
> >
> > That seems ok to me.
>
> Thanks.
>
> >
> >
> > I think the long-term right fix is to define DLT_AF which has an AF word
> > in host order on receive and transmit always, and to modify interfaces
> > to use it whenever they are AF aware at all.  In this case tun would
> > fill in the AF word from the IP proto field, and you'd get a
> > transformed/regularized AF word when really the "link layer packet" had
> > the IP proto field.  But that's ok as it's just cleanup and reversible.
>
> I think introducing DLT_AF is a bit of a tough task because DLT_* definitions
> are managed by us.
  ^ are NOT managed, I meant to say...
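
For anyone following along, here is a rough userland sketch of the DLT_NULL
framing discussed in this thread: a 4-byte protocol family in host byte
order, followed by the payload.  It is not kernel code and not the patch
itself.  The names dlt_null_encode, dlt_null_decode and DLT_NULL_HDRLEN are
made up for illustration; the decode side only mirrors what the quoted
bpf_movein hunk does (copy the AF word out, then skip past it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>	/* AF_INET, AF_INET6 for callers */

/* Hypothetical name: size of the DLT_NULL pseudo-header (one 32-bit AF). */
#define DLT_NULL_HDRLEN	sizeof(uint32_t)

/*
 * Hypothetical helper: build a DLT_NULL frame in buf.
 * Returns the total frame length, or 0 if buf is too small.
 */
size_t
dlt_null_encode(uint8_t *buf, size_t buflen, uint32_t af,
    const void *payload, size_t paylen)
{
	assert(buf != NULL);
	if (buflen < DLT_NULL_HDRLEN || paylen > buflen - DLT_NULL_HDRLEN)
		return 0;
	/* Host byte order: the AF word is copied as is, without htonl(). */
	memcpy(buf, &af, DLT_NULL_HDRLEN);
	if (paylen > 0)
		memcpy(buf + DLT_NULL_HDRLEN, payload, paylen);
	return DLT_NULL_HDRLEN + paylen;
}

/*
 * Hypothetical helper: split a DLT_NULL frame into AF and payload.
 * memcpy is used because the frame may not be 4-byte aligned.
 * Returns 0 on success, -1 if the frame is shorter than the header.
 */
int
dlt_null_decode(const uint8_t *frame, size_t framelen, uint32_t *af,
    const uint8_t **payload, size_t *paylen)
{
	if (framelen < DLT_NULL_HDRLEN)
		return -1;
	memcpy(af, frame, DLT_NULL_HDRLEN);
	*payload = frame + DLT_NULL_HDRLEN;
	*paylen = framelen - DLT_NULL_HDRLEN;
	return 0;
}
```

A writer would call dlt_null_encode(buf, sizeof(buf), AF_INET, pkt, pktlen)
and pass the resulting bytes to write(2) on the bpf descriptor.  Per the
discussion above, this would only be expected to work where bpf_write
actually strips the AF word, and it is not the tun framing, which uses
network byte order according to the thread.
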
ozaki-r