On Fri, Sep 7, 2012 at 12:56 PM, Kyle Mestery (kmestery)
<[email protected]> wrote:
> On Sep 6, 2012, at 10:24 PM, Jesse Gross wrote:
>> On Wed, Sep 5, 2012 at 2:58 PM, Kyle Mestery <[email protected]> wrote:
>>> @@ -1185,7 +1207,12 @@ int ovs_flow_metadata_from_nlattrs(u32 *priority,
>>> u16 *in_port, __be64 *tun_id,
>>> break;
>>>
>>> case OVS_KEY_ATTR_TUN_ID:
>>> - *tun_id = nla_get_be64(nla);
>>> + tun_key->tun_id = nla_get_be64(nla);
>>> + break;
>>> +
>>> + case OVS_KEY_ATTR_IPV4_TUNNEL:
>>> + memcpy(tun_key, nla_data(nla),
>>> + sizeof(*tun_key));
>>
>> We should enforce consistency here as well (this one is probably more
>> likely to be broken anyways because flow setups just echo back what
>> the kernel supplied but this is generated "by hand"). In this case we
>> can't enforce that both come together.
>>
> I don't think there is an implicit order to the netlink attributes here, so
> what I did
> was something like this:
>
> diff --git a/datapath/flow.c b/datapath/flow.c
> index 1c4eb99..5be492e 100644
> --- a/datapath/flow.c
> +++ b/datapath/flow.c
> @@ -1187,9 +1187,10 @@ int ovs_flow_metadata_from_nlattrs(u32 *priority, u16
> *in_port,
> {
> const struct nlattr *nla;
> int rem;
> + __be64 tun_id = 0;
>
> *in_port = DP_MAX_PORTS;
> - memset(tun_key, 0, sizeof(struct ovs_key_ipv4_tunnel));
> + memset(tun_key, 0, sizeof(*tun_key));
> *priority = 0;
>
> nla_for_each_nested(nla, attr, rem) {
> @@ -1205,12 +1206,19 @@ int ovs_flow_metadata_from_nlattrs(u32 *priority, u16
> *in_port,
> break;
>
> case OVS_KEY_ATTR_TUN_ID:
> - tun_key->tun_id = nla_get_be64(nla);
> + tun_id = nla_get_be64(nla);
> + if (tun_key->tun_id != 0 &&
> + tun_key->tun_id != tun_id)
> + return -EINVAL;
> break;
>
> case OVS_KEY_ATTR_IPV4_TUNNEL:
> + if (tun_key->tun_id != 0)
> + tun_id = tun_key->tun_id;
> memcpy(tun_key, nla_data(nla),
> sizeof(*tun_key));
> + if (tun_id != 0 && tun_id != tun_key->tun_id)
> + return -EINVAL;
> break;
It's true that there's no ordering requirements here. However, we do
have to be able to accept either or both attributes. Current
userspace will generate OVS_KEY_ATTR_TUN_ID but it looks like this
will only store the value unless we also get OVS_KEY_ATTR_IPV4_TUNNEL.
I guess I would probably also add an explicit flag to indicate that
tun_id has been set rather than relying on nonzero to have that
meaning.
>>> diff --git a/datapath/flow.h b/datapath/flow.h
>>> index 02c563a..5d4fcde 100644
>>> --- a/datapath/flow.h
>>> +++ b/datapath/flow.h
>>> @@ -42,7 +42,6 @@ struct sw_flow_actions {
>>>
>>> struct sw_flow_key {
>>> struct {
>>> - __be64 tun_id; /* Encapsulating tunnel ID. */
>>> u32 priority; /* Packet QoS priority. */
>>> u16 in_port; /* Input switch port (or
>>> DP_MAX_PORTS). */
>>> } phy;
>>> @@ -92,6 +91,9 @@ struct sw_flow_key {
>>> } nd;
>>> } ipv6;
>>> };
>>> + struct {
>>> + struct ovs_key_ipv4_tunnel tun_key; /* Encapsulating
>>> tunnel key. */
>>> + } tun;
>>> };
>>
>> The solution for this is a little more complicated because for
>> efficiency we only match on as much of the flow struct as is
>> necessary. Since we currently calculate how far we need to search
>> (i.e. TCP/IPv4) without taking into account a tunnel header at the end
>> this will never match on the tunnel. The other less serious problem
>> is that this reduces the effectiveness of the partial length match
>> when tunneling because IPv4 packets will have to match over a series
>> of zeros where the IPv6 header exists. Basically we want to stick
>> tunnel headers at the end of the current match, where ever that might
>> be.
>>
>> One way of doing this is to add a copy of the tun_key to each union
>> that could possibly be the end of the struct. We essentially do that
>> with the TCP/UDP ports which could follow either a IPv4 or IPv6
>> header. However in the case of tunnels there are many more possible
>> end locations (I count 7) and whereas L4 ports logically follows the
>> L3 header that's not really the case with tunnels, we're just moving
>> them around because we can and the benefit is large.
>>
>> I think what I would do is to make struct tun completely independent
>> of struct sw_flow_key and then have it "float" over where ever we
>> decide is the end of the flow struct. It's a little messy if you want
>> random access to it because you have to compute the size of the
>> earlier members but in practice don't think we ever do so it shouldn't
>> be too bad.
>>
> I think I see what you're looking for here. I'm going to make this a separate
> patch
> built on top of the other stuff, since it seems this may be a little more
> complicated
> as you indicate. I'll send it out with the other patch, though.
Breaking it out into a separate patch definitely makes sense. I'll
just want to apply them both at the same time so that we don't have
any performance regressions.
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev