On Fri, Sep 7, 2012 at 12:56 PM, Kyle Mestery (kmestery)
<[email protected]> wrote:
> On Sep 6, 2012, at 10:24 PM, Jesse Gross wrote:
>> On Wed, Sep 5, 2012 at 2:58 PM, Kyle Mestery <[email protected]> wrote:
>>> @@ -1185,7 +1207,12 @@ int ovs_flow_metadata_from_nlattrs(u32 *priority, 
>>> u16 *in_port, __be64 *tun_id,
>>>                                break;
>>>
>>>                        case OVS_KEY_ATTR_TUN_ID:
>>> -                               *tun_id = nla_get_be64(nla);
>>> +                               tun_key->tun_id = nla_get_be64(nla);
>>> +                               break;
>>> +
>>> +                       case OVS_KEY_ATTR_IPV4_TUNNEL:
>>> +                               memcpy(tun_key, nla_data(nla),
>>> +                                      sizeof(*tun_key));
>>
>> We should enforce consistency here as well (this one is probably more
>> likely to be broken anyways because flow setups just echo back what
>> the kernel supplied but this is generated "by hand").  In this case we
>> can't enforce that both come together.
>>
> I don't think there is an implicit order to the netlink attributes here, so 
> what I did
> was something like this:
>
> diff --git a/datapath/flow.c b/datapath/flow.c
> index 1c4eb99..5be492e 100644
> --- a/datapath/flow.c
> +++ b/datapath/flow.c
> @@ -1187,9 +1187,10 @@ int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 
> *in_port,
>  {
>         const struct nlattr *nla;
>         int rem;
> +       __be64 tun_id = 0;
>
>         *in_port = DP_MAX_PORTS;
> -       memset(tun_key, 0, sizeof(struct ovs_key_ipv4_tunnel));
> +       memset(tun_key, 0, sizeof(*tun_key));
>         *priority = 0;
>
>         nla_for_each_nested(nla, attr, rem) {
> @@ -1205,12 +1206,19 @@ int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 
> *in_port,
>                                 break;
>
>                         case OVS_KEY_ATTR_TUN_ID:
> -                               tun_key->tun_id = nla_get_be64(nla);
> +                               tun_id = nla_get_be64(nla);
> +                               if (tun_key->tun_id != 0 &&
> +                                   tun_key->tun_id != tun_id)
> +                                       return -EINVAL;
>                                 break;
>
>                         case OVS_KEY_ATTR_IPV4_TUNNEL:
> +                               if (tun_key->tun_id != 0)
> +                                       tun_id = tun_key->tun_id;
>                                 memcpy(tun_key, nla_data(nla),
>                                        sizeof(*tun_key));
> +                               if (tun_id != 0 && tun_id != tun_key->tun_id)
> +                                       return -EINVAL;
>                                 break;

It's true that there's no ordering requirements here.  However, we do
have to be able to accept either or both attributes.  Current
userspace will generate OVS_KEY_ATTR_TUN_ID but it looks like this
will only store the value unless we also get OVS_KEY_ATTR_IPV4_TUNNEL.
 I guess I would probably also add an explicit flag to indicate that
tun_id has been set rather than relying on nonzero to have that
meaning.

>>> diff --git a/datapath/flow.h b/datapath/flow.h
>>> index 02c563a..5d4fcde 100644
>>> --- a/datapath/flow.h
>>> +++ b/datapath/flow.h
>>> @@ -42,7 +42,6 @@ struct sw_flow_actions {
>>>
>>> struct sw_flow_key {
>>>        struct {
>>> -               __be64  tun_id;         /* Encapsulating tunnel ID. */
>>>                u32     priority;       /* Packet QoS priority. */
>>>                u16     in_port;        /* Input switch port (or 
>>> DP_MAX_PORTS). */
>>>        } phy;
>>> @@ -92,6 +91,9 @@ struct sw_flow_key {
>>>                        } nd;
>>>                } ipv6;
>>>        };
>>> +        struct {
>>> +               struct ovs_key_ipv4_tunnel tun_key;  /* Encapsulating 
>>> tunnel key. */
>>> +       } tun;
>>> };
>>
>> The solution for this is a little more complicated because for
>> efficiency we only match on as much of the flow struct as is
>> necessary.  Since we currently calculate how far we need to search
>> (i.e. TCP/IPv4) without taking into account a tunnel header at the end
>> this will never match on the tunnel.  The other less serious problem
>> is that this reduces the effectiveness of the partial length match
>> when tunneling because IPv4 packets will have to match over a series
>> of zeros where the IPv6 header exists.  Basically we want to stick
>> tunnel headers at the end of the current match, where ever that might
>> be.
>>
>> One way of doing this is to add a copy of the tun_key to each union
>> that could possibly be the end of the struct.  We essentially do that
>> with the TCP/UDP ports which could follow either a IPv4 or IPv6
>> header.  However in the case of tunnels there are many more possible
>> end locations (I count 7) and whereas L4 ports logically follows the
>> L3 header that's not really the case with tunnels, we're just moving
>> them around because we can and the benefit is large.
>>
>> I think what I would do is to make struct tun completely independent
>> of struct sw_flow_key and then have it "float" over where ever we
>> decide is the end of the flow struct.  It's a little messy if you want
>> random access to it because you have to compute the size of the
>> earlier members but in practice don't think we ever do so it shouldn't
>> be too bad.
>>
> I think I see what you're looking for here. I'm going to make this a separate 
> patch
> built on top of the other stuff, since it seems this may be a little more 
> complicated
> as you indicate. I'll send it out with the other patch, though.

Breaking it out into a separate patch definitely makes sense.  I'll
just want to apply them both at the same time so that we don't have
any performance regressions.
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev

Reply via email to