Regards
_Sugesh

> -----Original Message-----
> From: Jesse Gross [mailto:je...@kernel.org]
> Sent: Friday, March 25, 2016 12:38 AM
> To: Chandran, Sugesh <sugesh.chand...@intel.com>
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] [RFC PATCH] tunneling: Improving vxlan performance
> using DPDK flow director feature.
> 
> On Fri, Mar 18, 2016 at 8:50 AM, Chandran, Sugesh
> <sugesh.chand...@intel.com> wrote:
> > Hi Jesse,
> > Please find my answers inline.
> >
> > Regards
> > _Sugesh
> >
> >
> >> -----Original Message-----
> >> From: Jesse Gross [mailto:je...@kernel.org]
> >> Sent: Thursday, March 17, 2016 11:50 PM
> >> To: Chandran, Sugesh <sugesh.chand...@intel.com>
> >> Cc: dev@openvswitch.org
> >> Subject: Re: [ovs-dev] [RFC PATCH] tunneling: Improving vxlan
> >> performance using DPDK flow director feature.
> >>
> >> On Thu, Mar 17, 2016 at 3:43 PM, Chandran, Sugesh
> >> <sugesh.chand...@intel.com> wrote:
> >> > Hi,
> >> >
> >> > This patch proposes an approach that uses the Flow Director feature on
> >> > the Intel Fortville NICs to boost VxLAN tunneling performance. In our
> >> > testing we verified that the VxLAN performance is almost doubled with
> >> > this patch.
> >> > The solution programs the NIC to report a flow ID along with the VxLAN
> >> > packets, and it is matched by OVS in software. There may be corner
> >> > cases that need to be addressed in this approach. For example, there is
> >> > a possibility of a race condition where the NIC reports a flow ID that
> >> > matches a different flow in OVS. This happens when a rule is evicted by
> >> > a new rule with the same flow ID + hash in the OVS software. Packets
> >> > may hit the wrong new rule in OVS until the flow is deleted in the
> >> > hardware too.
> >> >
> >> > It is a hardware-specific implementation (it only works with Intel
> >> > Fortville NICs) for now; however, the proposal works with any
> >> > programmable NIC. This RFC proves that OVS can offer very high-speed
> >> > tunneling performance using flow programmability in NICs. I am looking
> >> > for comments/suggestions on adding this support (such as configuring
> >> > it, enabling it for all programmable NICs, etc.) to the OVS userspace
> >> > datapath to improve performance.
> >>
> >> This is definitely very interesting to see. Can you post some more
> >> specific performance numbers?
> > [Sugesh]
> > VxLAN DECAP performance (Unidirectional, Single flow, Single CPU Core)
> > -----------------------------------------------------------------------
> > PKT-IN  - 9.3 Mpps, Pkt size - 114 byte VxLAN packets (64 byte payload)
> > PKT-OUT - 5.6 Mpps (without optimization)
> > PKT-OUT - 9.3 Mpps (after the optimization; it hits the input line rate)
> >
> > VxLAN ENCAP-DECAP performance (Bidirectional, Single CPU Core)
> > -----------------------------------------------------------------------
> > PKT-IN  - 9.3 Mpps, PKT SIZE - 114 byte VxLAN packets (64 byte payload) -->
> > PKT-IN  - 14 Mpps,  PKT SIZE - 64 byte UDP packets <--
> > PKT-OUT - 3.6 Mpps (without optimization)
> > PKT-OUT - 5.3 Mpps (using the patch)
> 
> Thanks, that is interesting to see, particularly for a gateway-type use case
> where an appliance is translating between encapsulated and non-
> encapsulated packets.
> 
> >> Is this really specific to VXLAN? I'm sure that it could be
> >> generalized to other tunneling protocols (Geneve would be nice given
> >> that OVN is using it and I know Fortville supports it). But shouldn't
> >> it apply to non-tunneled traffic as well?
> > Yes, this can be applied to any tunneling protocol, provided the NIC
> > hardware is programmed to handle those packets.
> > We haven’t tested it for non-tunneled packets. The performance
> > improvement for non-tunneled packets is uncertain, due to the fact
> > that there is a limit on the number of hardware flows (8K on FVL), and
> > software still has to spend cycles matching the flow IDs reported
> > by hardware. This improves tunneling performance in all cases,
> > because tunnel packets need two lookups instead of one.
> 
> Looking at the code some more, I think there are basically two sources of
> optimization here:
>  * Accelerating the EMC by avoiding netdev_flow_key_equal_mf() on the
> assumption that the rule you've installed points exactly to the correct flow.
> However, I don't think this is legal because the flows that you are
> programming the hardware with don't capture the full set of values in an OVS
> flow. For example, in the case of tunnels, there is no match on DMAC.

[Sugesh] We can program the hardware to match on all the fields that we want,
including the tunnel fields in the outer header.
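
For illustration only, a minimal sketch of programming such a perfect filter
with the legacy DPDK flow director API might look like the code below. The
helper name and the match/queue values are placeholders, not code from the
patch; the important part is RTE_ETH_FDIR_REPORT_ID, which makes the NIC
return soft_id in mbuf->hash.fdir.hi for matching packets:

#include <string.h>
#include <rte_ethdev.h>
#include <rte_eth_ctrl.h>

/* Sketch only: install an FDIR perfect filter on the outer IPv4/UDP header of
 * VxLAN traffic and ask the NIC to report a software-chosen ID for matching
 * packets.  Addresses and ports are expected in network byte order. */
static int
fdir_add_outer_udp_filter(uint8_t port_id, uint16_t rx_queue,
                          uint32_t outer_src_ip, uint32_t outer_dst_ip,
                          uint16_t udp_dst_port, uint32_t soft_id)
{
    struct rte_eth_fdir_filter filter;

    memset(&filter, 0, sizeof filter);
    filter.soft_id = soft_id;              /* Returned in mbuf->hash.fdir.hi. */
    filter.input.flow_type = RTE_ETH_FLOW_NONFRAG_IPV4_UDP;
    filter.input.flow.udp4_flow.ip.src_ip = outer_src_ip;
    filter.input.flow.udp4_flow.ip.dst_ip = outer_dst_ip;
    filter.input.flow.udp4_flow.dst_port = udp_dst_port;  /* e.g. VxLAN 4789 */
    filter.action.rx_queue = rx_queue;
    filter.action.behavior = RTE_ETH_FDIR_ACCEPT;
    filter.action.report_status = RTE_ETH_FDIR_REPORT_ID;

    return rte_eth_dev_filter_ctrl(port_id, RTE_ETH_FILTER_FDIR,
                                   RTE_ETH_FILTER_ADD, &filter);
}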

>  * Chaining together the multiple lookups used by tunnels on the assumption
> that the outer VXLAN source port distinguishes the inner flow. This would
> allow avoiding netdev_flow_key_equal_mf() a second time. This is definitely
> not legal because the VXLAN source port is only capturing a small subset of
> the total data that OVS is using.

[Sugesh] From our analysis we found that optimizing only one lookup gives no
significant performance boost compared with the overhead. This is due to the
fact that the second netdev_flow_key_equal_mf() still needs the tunnel
information to match on a flow. We found in our tests that most CPU cycles are
spent on extracting header fields from the packets rather than on the lookup.

The proposal is to avoid the header field extraction by using an additional
unique software flow ID to match on. The two flows for a tunnel are marked
with this ID when they are installed in the EMC. The hardware reports this ID,
along with a hash (to mitigate hash collisions in the EMC), for every incoming
packet that matches a hardware rule. This is used in the EMC along with the
hash to find the flow. Currently OVS compares the hash + key (built from the
header fields) to match a flow. The inner flow matches on the same unique ID
and a hardware-flow flag instead of the source port. We have modified the code
a little more, so that it saves the hardware ID in the matching flow for every
emc_insert:


         emc_insert(flow_cache, &keys[i], flow);
+#ifdef DPDK_I40E_TNL_OFFLOAD_ENABLE
+        struct rte_mbuf *mbuf = (struct rte_mbuf *) packet;
+        flow->hw_rule_id = (mbuf->ol_flags & PKT_RX_FDIR_ID)
+                           ? mbuf->hash.fdir.hi : 0;
+#endif /* DPDK_I40E_TNL_OFFLOAD_ENABLE */
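
For clarity, the lookup-side counterpart could look roughly like the sketch
below, based on the structure of emc_lookup() in lib/dpif-netdev.c. The
hw_rule_id field and the has_fdir_id flag are illustrative names only, not the
final patch code:

/* Sketch only: if the NIC tagged the packet with a flow director ID
 * (PKT_RX_FDIR_ID), match on the saved hw_rule_id plus the EMC hash and skip
 * the full miniflow comparison; otherwise fall back to the normal path. */
static inline struct dp_netdev_flow *
emc_lookup_with_hw_hint(struct emc_cache *cache,
                        const struct netdev_flow_key *key,
                        bool has_fdir_id, uint32_t fdir_id)
{
    struct emc_entry *current_entry;

    EMC_FOR_EACH_POS_WITH_HASH (cache, current_entry, key->hash) {
        if (current_entry->key.hash == key->hash
            && emc_entry_alive(current_entry)
            && (has_fdir_id
                ? current_entry->flow->hw_rule_id == fdir_id
                : netdev_flow_key_equal_mf(&current_entry->key, &key->mf))) {
            return current_entry->flow;
        }
    }

    return NULL;
}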
> 
> Please correct me if I am wrong.
> 
> I'm not sure that I really see any advantage in using a Flow Director perfect
> filter to return a software defined hash value compared to just using the RSS
> hash directly as we are doing today. I think the main case where it would be
> useful is if hardware wildcarding was used to skip the EMC altogether and its
> size constraints. If that was done then I think that this would no longer be
> specialized to VXLAN at all.
[Sugesh] This may give a performance improvement when we have a large set of
rules that overflows the EMC. But a typical use case, where 80-90% of the rules
hit the EMC, doesn’t get any performance benefit out of it. Please correct me
if I am wrong here. The intention here is to optimize the tunneling performance
in all the use cases.
> 
> >> It looks like this is adding a hardware flow when a new flow is added
> >> to the datapath. How does this affect flow setup performance?
> >>
> > We haven’t performed any stress tests with a large number of flows to
> > verify the flow setup performance. What is the expectation here? Currently,
> > how many rules can be set up per second in OVS?
> 
> It's hard to give a concrete number here since flow setup performance
> depends on the complexity of the flow table and, of course, the machine. In
> general, the goal is to avoid needing to do flow setups in response to traffic
> but this depends on the use case. At a minimum, it would be good to
> understand the difference in performance as a result of this change and try
> to minimize any impact. Since this is really just a hint and we'll need to 
> deal
> with mismatch between software and hardware in any case, perhaps it
> makes sense to program the hardware flows asynchronously.
[Sugesh] Thank you for the input. We will test and measure the hardware flow
programming overhead and share the results. We will also look at the
possibility of programming the hardware flows asynchronously.
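
As a rough idea of what asynchronous programming could look like (sketch only;
none of these names exist in OVS), the flow installation path would only
enqueue a request and a background thread would drain the queue and talk to
the NIC:

#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

struct hw_flow_request {
    struct hw_flow_request *next;
    uint32_t soft_id;                /* ID the NIC should report back. */
    /* ... match fields for the hardware filter ... */
};

static struct hw_flow_request *hw_queue_head;
static pthread_mutex_t hw_queue_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t hw_queue_cond = PTHREAD_COND_INITIALIZER;

/* Called from the flow installation path: O(1) and never blocks on the NIC. */
static void
hw_flow_program_async(struct hw_flow_request *req)
{
    pthread_mutex_lock(&hw_queue_mutex);
    req->next = hw_queue_head;
    hw_queue_head = req;
    pthread_cond_signal(&hw_queue_cond);
    pthread_mutex_unlock(&hw_queue_mutex);
}

/* Background thread: drains the queue and programs the NIC, e.g. with a
 * flow director call like the one sketched earlier. */
static void *
hw_flow_thread(void *arg)
{
    (void) arg;
    for (;;) {
        struct hw_flow_request *req;

        pthread_mutex_lock(&hw_queue_mutex);
        while (hw_queue_head == NULL) {
            pthread_cond_wait(&hw_queue_cond, &hw_queue_mutex);
        }
        req = hw_queue_head;
        hw_queue_head = req->next;
        pthread_mutex_unlock(&hw_queue_mutex);

        /* program_hw_filter(req);  -- hypothetical call into the NIC driver. */
        free(req);
    }
    return NULL;
}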
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
