On Wed, Dec 14, 2011 at 4:12 PM, Ben Pfaff <b...@nicira.com> wrote:
> Why do we care?  Current userspace isn't really affected.  At most, we
> get flow statistics wrong by one packet (I guess multiple packets is
> theoretically possible?), if and when this happens.  The place where it
> keeps coming up is in situations where we want userspace to be able to
> associate metadata with a packet sent from kernel to user.  This came up
> in August regarding sflow:
>        http://openvswitch.org/pipermail/dev/2011-August/010384.html

In theory the number of packets that you lose is unbounded: if traffic
suddenly spikes, a CPU can hold onto the flow indefinitely.  I don't
think it really matters in practice, though it is possible to do better
than we do today.

> So far I've thought of the following ways for userspace to figure out
> when a kernel flow is really gone:
>
>        1. Use a timer.  Not really satisfying.
>
>        2. Somehow learn when a kernel grace period has elapsed.  Not a
>           good idea since it's so bound to the particular kernel
>           implementation.

Yes, I agree these aren't particularly appealing.

>        3. Somehow actually eliminate the problem with deleting flows,
>           so that when userspace receives the response to the flow
>           deletion we know that no more packets can go through the
>           flow.  I don't know how to do this efficiently.

I'm not sure that this is really a question of efficiency so much as
complexity.  Basically you have to make userspace able to tolerate
blocking while the flow is deleted, and then use synchronize_rcu()
when removing flows.  Presumably this would mean that you need some
kind of worker threads; a rough sketch is below.
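To make that concrete, here is roughly the workqueue shape I mean.  All
of the struct and helper names (flow_del_work, flow_free(), etc.) are
made up for illustration; this isn't the actual datapath code.

/* Hypothetical sketch: defer the reply to a flow deletion until a full
 * grace period has passed, so userspace knows the flow is really gone. */
#include <linux/kernel.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/workqueue.h>
#include <net/genetlink.h>

struct sw_flow;                             /* opaque here */
extern void flow_free(struct sw_flow *);    /* imagined free helper */

struct flow_del_work {
        struct work_struct work;
        struct sw_flow *flow;       /* already unlinked from the flow table */
        struct sk_buff *reply;      /* pre-built reply skb */
        struct genl_info info;      /* saved so the worker can reply */
};

static void flow_del_worker(struct work_struct *work)
{
        struct flow_del_work *w =
                container_of(work, struct flow_del_work, work);

        /* Sleep until no CPU can still be forwarding through the flow. */
        synchronize_rcu();
        flow_free(w->flow);

        /* Only now does the requester hear that the deletion finished. */
        genlmsg_reply(w->reply, &w->info);
        kfree(w);
}

The deletion handler would unlink the flow from the table, fill in the
work item, and schedule_work() it instead of replying directly.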

>        4. Have the RCU callback for flow deletion send the Netlink
>           broadcast message that tells userspace that the flow is gone.
>           The Netlink client that sent the actual deletion request
>           would still get a synchronous response, but the broadcast
>           would be delayed until the flow was really gone.  I think
>           this might be practical, but I don't know the exact
>           restrictions on RCU callbacks.

I think that should be OK.  RCU callbacks just run in softirq context,
so the restrictions aren't too severe.  It will make sending the
messages more likely to fail, though, because they have to be allocated
in atomic context.
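For concreteness, something shaped like this is what I'd imagine.  The
flow_notify_fill(), flow_notify_multicast(), and flow_free() helpers
are imagined, and the struct is a trimmed-down stand-in; the point is
just the GFP_ATOMIC allocation inside the callback.

/* Hypothetical sketch: send the "flow is really gone" broadcast from the
 * RCU callback.  The callback runs in softirq context, so the skb has to
 * be allocated with GFP_ATOMIC and the send can fail under pressure. */
#include <linux/kernel.h>
#include <linux/rcupdate.h>
#include <net/genetlink.h>

struct sw_flow {                    /* stand-in for the real flow struct */
        struct rcu_head rcu;
        /* ... saved portid, dp_ifindex, final stats, ... */
};

extern int flow_notify_fill(struct sk_buff *, const struct sw_flow *);
extern void flow_notify_multicast(struct sk_buff *, gfp_t);
extern void flow_free(struct sw_flow *);

static void flow_destroy_rcu(struct rcu_head *rcu)
{
        struct sw_flow *flow = container_of(rcu, struct sw_flow, rcu);
        struct sk_buff *skb;

        skb = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
        if (skb) {
                /* Fill in the deletion notification using only fields
                 * saved in the flow itself. */
                if (flow_notify_fill(skb, flow) == 0)
                        flow_notify_multicast(skb, GFP_ATOMIC);
                else
                        nlmsg_free(skb);
        }

        flow_free(flow);
}

/* At deletion time, after unlinking the flow and replying synchronously
 * to the caller as today:
 *
 *      call_rcu(&flow->rcu, flow_destroy_rcu);
 */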

You'd also have to be more careful about what you access when sending
the notification.  You'd first need to store the caller's Netlink PID
and other information to make sure that it gets back to the right
place.  We also currently access the datapath structure, which could
be bad if both flows and datapaths are deleted in the same grace
period (in practice they will almost certainly be destroyed in order,
but it's not something that I'd want to assume).  Also, I think
there's an annoying ordering issue on module unload once you allow
datapaths to be removed automatically on shutdown.  You basically have
to unregister the datapath family, clean out all the datapaths, do an
RCU barrier, and then unregister the rest of the families.
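Roughly this unload order, in other words; the family and cleanup names
below are placeholders rather than the real symbols:

/* Illustrative module-exit ordering only. */
#include <linux/module.h>
#include <linux/rcupdate.h>
#include <net/genetlink.h>

extern struct genl_family dp_datapath_family, dp_flow_family,
                          dp_vport_family, dp_packet_family;
extern void destroy_all_datapaths(void);    /* imagined cleanup helper */

static void __exit dp_cleanup(void)
{
        /* 1. Stop new datapath requests from arriving. */
        genl_unregister_family(&dp_datapath_family);

        /* 2. Tear down the remaining datapaths; this queues the RCU
         *    callbacks that may still send flow/vport notifications. */
        destroy_all_datapaths();

        /* 3. Wait for every outstanding RCU callback to run, so nothing
         *    touches the other families after they are unregistered. */
        rcu_barrier();

        /* 4. Only now unregister the rest of the families. */
        genl_unregister_family(&dp_flow_family);
        genl_unregister_family(&dp_vport_family);
        genl_unregister_family(&dp_packet_family);
}
module_exit(dp_cleanup);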

It seems like if we improve the accuracy of flow stats, it would be
nice to do the same for datapaths and ports as well, although that's
not quite as important and could potentially use a different strategy.