On Wed, Oct 12, 2011 at 4:01 PM, Ben Pfaff <[email protected]> wrote:
> diff --git a/datapath/vport-vxlan.c b/datapath/vport-vxlan.c
> new file mode 100644
> index 0000000..a6e5439
> --- /dev/null
> +++ b/datapath/vport-vxlan.c
> +#define VXLAN_DST_PORT 49170
> +#define VXLAN_IPSEC_SRC_PORT 49171
I think ideally the IPsec source port should fall outside the range
used for hashing. It doesn't matter all that much right now because
we don't allow simultaneous encrypted and un-encrypted connections to
the same remote host but it will be easier to add support for it in
the future if we separate them.
> +#define VXLAN_FLAGS 0x08000000 /* struct vxlanhdr.vx_flags required value. */
The original CAPWAP implementation did something like this where it
exactly matched all the fields, including the ones that it didn't care
about. In retrospect, I think that was a mistake because it can
unnecessarily hinder protocol changes. When I modified it for the key
extension I made it skip irrelevant fields, which I think is a little
nicer and I would be inclined to do the same here. The VXLAN header
also isn't nearly as bad as the CAPWAP header as far as having odd
size fields, which is why originally used all fixed values.
> +static struct sk_buff *vxlan_update_header(const struct vport *vport,
> + const struct tnl_mutable_config
> *mutable,
> + struct dst_entry *dst,
> + struct sk_buff *skb)
> +{
> + struct udphdr *udph = udp_hdr(skb);
> + struct vxlanhdr *vxh = (struct vxlanhdr *)(udph + 1);
> +
> + if (mutable->flags & TNL_F_OUT_KEY_ACTION)
> + vxh->vx_vni = htonl(be64_to_cpu(OVS_CB(skb)->tun_id) << 8);
> +
> + udph->source = get_src_port(skb, mutable);
> + udph->len = htons(skb->len - skb_transport_offset(skb));
Since VXLAN relies on IP fragmentation instead of an internal
mechanism like CAPWAP, we need to allow local fragmentation like we do
with GRE.
> +/* Called with rcu_read_lock and BH disabled. */
> +static int vxlan_rcv(struct sock *sk, struct sk_buff *skb)
> +{
> + struct vport *vport;
> + struct vxlanhdr *vxh;
> + const struct tnl_mutable_config *mutable;
> + struct iphdr *iph;
> + __be64 key;
> +
> + if (unlikely(!pskb_may_pull(skb, VXLAN_HLEN + ETH_HLEN)))
> + goto error;
> +
> + vxh = vxlan_hdr(skb);
> + if (unlikely(vxh->vx_flags != htonl(VXLAN_FLAGS) ||
> + vxh->vx_vni & htonl(0xff)))
> + goto error;
> +
> + __skb_pull(skb, VXLAN_HLEN);
> + skb_postpull_rcsum(skb, skb_transport_header(skb), VXLAN_HLEN +
> ETH_HLEN);
Can we move this down below the port lookup to avoid doing unnecessary
work for bad packets?
> +static int vxlan_init(void)
> +{
> + int err;
> + struct sockaddr_in sin;
> +
> + err = sock_create(AF_INET, SOCK_DGRAM, 0, &vxlan_rcv_socket);
> + if (err)
> + goto error;
We actually want to move away from always registering handlers at
module load time and do it on demand when the first tunnel of this
type is created. The other types don't do it yet but it would be good
for new code, especially as the number of tunnel types increases.
> @@ -950,7 +971,8 @@
> <group title="Tunnel Options">
> <p>
> These options apply to interfaces with <ref column="type"/> of
> - <code>gre</code>, <code>ipsec_gre</code>, and <code>capwap</code>.
> + <code>gre</code>, <code>ipsec_gre</code>, <code>capwap</code>, and
> + <code>vxlan</code>.
Shouldn't this also include vxlan_ipsec?
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev