> -----Original Message-----
> From: Hu, Jiayu
> Sent: Friday, January 5, 2018 2:13 PM
> To: dev@dpdk.org
> Cc: Richardson, Bruce <bruce.richard...@intel.com>; Chen, Junjie J
> <junjie.j.c...@intel.com>; Tan, Jianfeng <jianfeng....@intel.com>;
> step...@networkplumber.org; Yigit, Ferruh <ferruh.yi...@intel.com>;
> Ananyev, Konstantin <konstantin.anan...@intel.com>; Yao, Lei A
> <lei.a....@intel.com>; Hu, Jiayu <jiayu...@intel.com>
> Subject: [PATCH v4 1/2] gro: code cleanup
>
> - Remove needless checks and variables
> - For better understanding, update the programmer guide and rename
>   internal functions and variables
> - To support tunneled GRO, move common internal functions from
>   gro_tcp4.c to gro_tcp4.h
> - Comply with RFC 6864 when processing the IPv4 ID field
>
> Signed-off-by: Jiayu Hu <jiayu...@intel.com>
> Reviewed-by: Junjie Chen <junjie.j.c...@intel.com>
Tested-by: Lei Yao <lei.a....@intel.com>
I have tested this patch with the following traffic flow:
NIC1(In kernel)-->NIC2(pmd, GRO on)-->vhost-user->virtio-net(in VM)
The iperf test with 1 stream shows that GRO VxLAN can improve the
performance from 6 Gbps (GRO off) to 16 Gbps (GRO on).
> ---
>  .../prog_guide/generic_receive_offload_lib.rst     | 246 ++++++++-------
>  doc/guides/prog_guide/img/gro-key-algorithm.svg    | 223 ++++++++++++++
>  lib/librte_gro/gro_tcp4.c                          | 339 +++++++--------------
>  lib/librte_gro/gro_tcp4.h                          | 253 ++++++++++-----
>  lib/librte_gro/rte_gro.c                           | 102 +++----
>  lib/librte_gro/rte_gro.h                           |  92 +++---
>  6 files changed, 750 insertions(+), 505 deletions(-)
>  create mode 100644 doc/guides/prog_guide/img/gro-key-algorithm.svg
>
> diff --git a/doc/guides/prog_guide/generic_receive_offload_lib.rst b/doc/guides/prog_guide/generic_receive_offload_lib.rst
> index 22e50ec..c2d7a41 100644
> --- a/doc/guides/prog_guide/generic_receive_offload_lib.rst
> +++ b/doc/guides/prog_guide/generic_receive_offload_lib.rst
> @@ -32,128 +32,162 @@ Generic Receive Offload Library
>  ===============================
>
>  Generic Receive Offload (GRO) is a widely used SW-based offloading
> -technique to reduce per-packet processing overhead. It gains performance
> -by reassembling small packets into large ones. To enable more flexibility
> -to applications, DPDK implements GRO as a standalone library. Applications
> -explicitly use the GRO library to merge small packets into large ones.
> -
> -The GRO library assumes all input packets have correct checksums. In
> -addition, the GRO library doesn't re-calculate checksums for merged
> -packets. If input packets are IP fragmented, the GRO library assumes
> -they are complete packets (i.e. with L4 headers).
> -
> -Currently, the GRO library implements TCP/IPv4 packet reassembly.
> -
> -Reassembly Modes
> -----------------
> -
> -The GRO library provides two reassembly modes: lightweight and
> -heavyweight mode. If applications want to merge packets in a simple way,
> -they can use the lightweight mode API. If applications want more
> -fine-grained controls, they can choose the heavyweight mode API.
> -
> -Lightweight Mode
> -~~~~~~~~~~~~~~~~
> -
> -The ``rte_gro_reassemble_burst()`` function is used for reassembly in
> -lightweight mode. It tries to merge N input packets at a time, where
> -N should be less than or equal to ``RTE_GRO_MAX_BURST_ITEM_NUM``.
> -
> -In each invocation, ``rte_gro_reassemble_burst()`` allocates temporary
> -reassembly tables for the desired GRO types. Note that the reassembly
> -table is a table structure used to reassemble packets and different GRO
> -types (e.g. TCP/IPv4 GRO and TCP/IPv6 GRO) have different reassembly table
> -structures. The ``rte_gro_reassemble_burst()`` function uses the reassembly
> -tables to merge the N input packets.
> -
> -For applications, performing GRO in lightweight mode is simple. They
> -just need to invoke ``rte_gro_reassemble_burst()``. Applications can get
> -GROed packets as soon as ``rte_gro_reassemble_burst()`` returns.
> -
> -Heavyweight Mode
> -~~~~~~~~~~~~~~~~
> -
> -The ``rte_gro_reassemble()`` function is used for reassembly in heavyweight
> -mode. Compared with the lightweight mode, performing GRO in heavyweight mode
> -is relatively complicated.
> -
> -Before performing GRO, applications need to create a GRO context object
> -by calling ``rte_gro_ctx_create()``. A GRO context object holds the
> -reassembly tables of desired GRO types. Note that all update/lookup
> -operations on the context object are not thread safe. So if different
> -processes or threads want to access the same context object simultaneously,
> -some external syncing mechanisms must be used.
> -
> -Once the GRO context is created, applications can then use the
> -``rte_gro_reassemble()`` function to merge packets. In each invocation,
> -``rte_gro_reassemble()`` tries to merge input packets with the packets
> -in the reassembly tables. If an input packet is an unsupported GRO type,
> -or other errors happen (e.g. SYN bit is set), ``rte_gro_reassemble()``
> -returns the packet to applications.
Otherwise, the input packet is either
> -merged or inserted into a reassembly table.
> -
> -When applications want to get GRO processed packets, they need to use
> -``rte_gro_timeout_flush()`` to flush them from the tables manually.
> +technique to reduce per-packet processing overhead. By reassembling
> +small packets into larger ones, GRO lets applications process fewer,
> +larger packets, thus reducing the number of packets to be
> +processed. To benefit DPDK-based applications, like Open vSwitch,
> +DPDK also provides its own GRO implementation. In DPDK, GRO is implemented
> +as a standalone library. Applications explicitly use the GRO library to
> +reassemble packets.
> +
> +Overview
> +--------
> +
> +In the GRO library, there are many GRO types, which are defined by packet
> +types. Each GRO type is in charge of processing one kind of packet. For
> +example, TCP/IPv4 GRO processes TCP/IPv4 packets.
> +
> +Each GRO type has a reassembly function, which defines its own algorithm
> +and table structure to reassemble packets. We assign input packets to the
> +corresponding GRO functions by MBUF->packet_type.
> +
> +The GRO library doesn't check if input packets have correct checksums and
> +doesn't re-calculate checksums for merged packets. The GRO library
> +assumes the packets are complete (i.e., MF==0 && frag_off==0) when IP
> +fragmentation is possible (i.e., DF==0). Additionally, it complies with
> +RFC 6864 when processing the IPv4 ID field.
>
> -TCP/IPv4 GRO
> -------------
> +Currently, the GRO library provides GRO support for TCP/IPv4 packets.
> +
> +Two Sets of API
> +---------------
> +
> +For different usage scenarios, the GRO library provides two sets of API.
> +One is the lightweight mode API, which enables applications to
> +merge a small number of packets rapidly; the other is the
> +heavyweight mode API, which gives applications fine-grained control
> +and supports merging a large number of packets.
> +
> +Lightweight Mode API
> +~~~~~~~~~~~~~~~~~~~~
> +
> +The lightweight mode has only one function, ``rte_gro_reassemble_burst()``,
> +which processes N packets at a time. Using the lightweight mode API to
> +merge packets is very simple: calling ``rte_gro_reassemble_burst()`` is
> +enough. The GROed packets are returned to applications as soon as it
> +finishes.
> +
> +In ``rte_gro_reassemble_burst()``, table structures of different GRO
> +types are allocated on the stack. This design simplifies applications'
> +operations. However, limited by the stack size, the maximum number of
> +packets that ``rte_gro_reassemble_burst()`` can process in an invocation
> +should be less than or equal to ``RTE_GRO_MAX_BURST_ITEM_NUM``.
> +
> +Heavyweight Mode API
> +~~~~~~~~~~~~~~~~~~~~
> +
> +Compared with the lightweight mode, using the heavyweight mode API is
> +relatively complex. First, applications need to create a GRO context
> +with ``rte_gro_ctx_create()``. ``rte_gro_ctx_create()`` allocates table
> +structures on the heap and stores their pointers in the GRO context.
> +Second, applications use ``rte_gro_reassemble()`` to merge packets.
> +If input packets have invalid parameters, ``rte_gro_reassemble()``
> +returns them to applications. For example, packets of unsupported GRO
> +types or TCP SYN packets are returned. Otherwise, the input packets are
> +either merged with the existing packets in the tables or inserted into the
> +tables. Finally, applications use ``rte_gro_timeout_flush()`` to flush
> +packets from the tables when they want to get the GROed packets.
> +
> +Note that update/lookup operations on the GRO context are not thread
> +safe. So if different processes or threads want to access the same
> +context object simultaneously, some external synchronization mechanism
> +must be used.
> +
> +Reassembly Algorithm
> +--------------------
> +
> +The reassembly algorithm is used for reassembling packets.
In the GRO
> +library, different GRO types can use different algorithms. In this
> +section, we introduce the algorithm used by TCP/IPv4 GRO.
>
> -TCP/IPv4 GRO supports merging small TCP/IPv4 packets into large ones,
> -using a table structure called the TCP/IPv4 reassembly table.
> +Challenges
> +~~~~~~~~~~
>
> -TCP/IPv4 Reassembly Table
> -~~~~~~~~~~~~~~~~~~~~~~~~~
> +The reassembly algorithm determines the efficiency of GRO. There are two
> +challenges in the algorithm design:
>
> -A TCP/IPv4 reassembly table includes a "key" array and an "item" array.
> -The key array keeps the criteria to merge packets and the item array
> -keeps the packet information.
> +- a high-cost algorithm/implementation would cause packet drops in a
> +  high-speed network.
>
> -Each key in the key array points to an item group, which consists of
> -packets which have the same criteria values but can't be merged. A key
> -in the key array includes two parts:
> +- packet reordering makes it hard to merge packets. For example, Linux
> +  GRO fails to merge packets when it encounters packet reordering.
>
> -* ``criteria``: the criteria to merge packets. If two packets can be
> -  merged, they must have the same criteria values.
> +The above two challenges require our algorithm to be:
>
> -* ``start_index``: the item array index of the first packet in the item
> -  group.
> +- lightweight enough to scale to fast networking speeds
>
> -Each element in the item array keeps the information of a packet. An item
> -in the item array mainly includes three parts:
> +- capable of handling packet reordering
>
> -* ``firstseg``: the mbuf address of the first segment of the packet.
> +In DPDK GRO, we use a key-based algorithm to address the two challenges.
>
> -* ``lastseg``: the mbuf address of the last segment of the packet.
> +Key-based Reassembly Algorithm
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +:numref:`figure_gro-key-algorithm` illustrates the procedure of the
> +key-based algorithm.
Packets are classified into "flows" by some header
> +fields (we call them the "key"). To process an input packet, the algorithm
> +first searches for a matching "flow" (i.e., one with the same key value),
> +then checks all packets in the "flow" and tries to find a
> +"neighbor" for it. If it finds a "neighbor", it merges the two packets.
> +If it can't find a "neighbor", it stores the packet into its "flow". If it
> +can't find a matching "flow", it inserts a new "flow" and stores the packet
> +into that "flow".
> +
> +.. note::
> +        When packets in the same "flow" can't be merged, the cause is
> +        always packet reordering.
> +
> +The key-based algorithm has two characteristics:
> +
> +- classifying packets into "flows" is a simple way to accelerate packet
> +  aggregation (addresses challenge 1).
> +
> +- storing out-of-order packets makes it possible to merge them later
> +  (addresses challenge 2).
> +
> +.. _figure_gro-key-algorithm:
> +
> +.. figure:: img/gro-key-algorithm.*
> +   :align: center
> +
> +   Key-based Reassembly Algorithm
> +
> +TCP/IPv4 GRO
> +------------
>
> -* ``next_pkt_index``: the item array index of the next packet in the same
> -  item group. TCP/IPv4 GRO uses ``next_pkt_index`` to chain the packets
> -  that have the same criteria value but can't be merged together.
> +The table structure used by TCP/IPv4 GRO contains two arrays: a flow array
> +and an item array. The flow array keeps flow information, and the item
> +array keeps packet information.
>
> -Procedure to Reassemble a Packet
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Header fields used to define a TCP/IPv4 flow include:
>
> -To reassemble an incoming packet needs three steps:
> +- source and destination Ethernet address, IP address and TCP port
>
> -#. Check if the packet should be processed. Packets with one of the
> -   following properties aren't processed and are returned immediately:
> +- TCP acknowledgment number
>
> -   * FIN, SYN, RST, URG, PSH, ECE or CWR bit is set.
> +TCP/IPv4 packets whose FIN, SYN, RST, URG, PSH, ECE or CWR bit is set
> +won't be processed.
>
> -   * L4 payload length is 0.
> +Header fields deciding if two packets are neighbors include:
>
> -#. Traverse the key array to find a key which has the same criteria
> -   value with the incoming packet. If found, go to the next step.
> -   Otherwise, insert a new key and a new item for the packet.
> +- TCP sequence number
>
> -#. Locate the first packet in the item group via ``start_index``. Then
> -   traverse all packets in the item group via ``next_pkt_index``. If a
> -   packet is found which can be merged with the incoming one, merge them
> -   together. If one isn't found, insert the packet into this item group.
> -   Note that to merge two packets is to link them together via mbuf's
> -   ``next`` field.
> +- IPv4 ID. For packets whose DF bit is 0, the IPv4 ID of the later
> +  packet should be larger by 1.
>
> -When packets are flushed from the reassembly table, TCP/IPv4 GRO updates
> -packet header fields for the merged packets. Note that before reassembling
> -the packet, TCP/IPv4 GRO doesn't check if the checksums of packets are
> -correct. Also, TCP/IPv4 GRO doesn't re-calculate checksums for merged
> -packets.
> +.. note::
> +        We comply with RFC 6864 when processing the IPv4 ID field.
> +        Specifically, we check IPv4 ID fields for the packets whose DF
> +        bit is 0 and ignore IPv4 ID fields for the packets whose DF bit
> +        is 1. Additionally, packets with different DF bit values can't
> +        be merged.
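To make the flow and neighbor rules in the TCP/IPv4 GRO section above concrete, here is a minimal standalone sketch of the key-based procedure. It is not the code from this patch: it works on plain structs instead of mbufs, the flow key omits the Ethernet addresses, and every name in it (gro_process, check_neighbor, struct pkt and so on) is hypothetical. It only models the documented rules: same key means same flow, neighbors are decided by TCP sequence number and (when DF is 0) IPv4 ID, and packets with different DF bits are never merged.

```c
#include <stdint.h>
#include <string.h>

#define MAX_FLOWS 4
#define MAX_ITEMS 8
#define INVALID_IDX UINT32_MAX

/* Hypothetical, simplified key; the documented key also has Ethernet addresses. */
struct flow_key {
	uint32_t ip_src, ip_dst, recv_ack;
	uint16_t src_port, dst_port;
};

struct pkt {
	struct flow_key key;
	uint32_t sent_seq;    /* TCP sequence number */
	uint16_t payload_len; /* TCP payload length */
	uint16_t ip_id;       /* IPv4 ID */
	uint8_t df;           /* IPv4 DF bit */
};

/* One stored packet, possibly the result of earlier merges. */
struct item {
	uint32_t sent_seq, payload_len, next;
	uint16_t ip_id, nb_merged; /* ip_id: ID of the last merged packet */
	uint8_t df, used;
};

struct flow { struct flow_key key; uint32_t start; uint8_t used; };
struct table { struct flow flows[MAX_FLOWS]; struct item items[MAX_ITEMS]; };

enum gro_result { GRO_MERGED, GRO_STORED, GRO_NEW_FLOW, GRO_TABLE_FULL };

static void table_init(struct table *t) { memset(t, 0, sizeof(*t)); }

static int key_equal(const struct flow_key *a, const struct flow_key *b)
{
	return a->ip_src == b->ip_src && a->ip_dst == b->ip_dst &&
		a->recv_ack == b->recv_ack &&
		a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Return 1 to append p to it, -1 to prepend, 0 if they are not neighbors. */
static int check_neighbor(const struct item *it, const struct pkt *p)
{
	if (it->df != p->df)
		return 0; /* different DF bits never merge */
	if (p->sent_seq == it->sent_seq + it->payload_len &&
	    (p->df || p->ip_id == (uint16_t)(it->ip_id + 1)))
		return 1;
	if (p->sent_seq + p->payload_len == it->sent_seq &&
	    (p->df || (uint16_t)(p->ip_id + it->nb_merged) == it->ip_id))
		return -1;
	return 0;
}

static enum gro_result gro_process(struct table *t, const struct pkt *p)
{
	uint32_t f, i, slot, prev = INVALID_IDX, nf = MAX_FLOWS;

	/* Step 1: search for a flow with the same key. */
	for (f = 0; f < MAX_FLOWS; f++)
		if (t->flows[f].used && key_equal(&t->flows[f].key, &p->key))
			break;
	if (f < MAX_FLOWS) {
		/* Step 2: search the flow's items for a neighbor. */
		for (i = t->flows[f].start; i != INVALID_IDX; i = t->items[i].next) {
			struct item *it = &t->items[i];
			int cmp = check_neighbor(it, p);

			if (cmp > 0)
				it->ip_id = p->ip_id;       /* track the latest ID */
			else if (cmp < 0)
				it->sent_seq = p->sent_seq; /* new first byte */
			if (cmp != 0) {
				it->payload_len += p->payload_len;
				it->nb_merged++;
				return GRO_MERGED;
			}
			prev = i;
		}
	} else {
		for (nf = 0; nf < MAX_FLOWS && t->flows[nf].used; nf++)
			;
		if (nf == MAX_FLOWS)
			return GRO_TABLE_FULL;
	}
	/* Step 3: no neighbor (or no flow): store the packet as a new item. */
	for (slot = 0; slot < MAX_ITEMS && t->items[slot].used; slot++)
		;
	if (slot == MAX_ITEMS)
		return GRO_TABLE_FULL;
	t->items[slot] = (struct item){ .sent_seq = p->sent_seq,
		.payload_len = p->payload_len, .next = INVALID_IDX,
		.ip_id = p->ip_id, .nb_merged = 1, .df = p->df, .used = 1 };
	if (f < MAX_FLOWS) {
		t->items[prev].next = slot; /* chain after the flow's last item */
		return GRO_STORED;
	}
	t->flows[nf] = (struct flow){ .key = p->key, .start = slot, .used = 1 };
	return GRO_NEW_FLOW;
}
```

Feeding it an in-order packet, its successor, an out-of-order packet and then the predecessor of the first one gives new flow, merged, stored, merged. A packet that is contiguous in sequence space but has a different DF bit is stored rather than merged, which is the behaviour the note above describes.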
> diff --git a/doc/guides/prog_guide/img/gro-key-algorithm.svg > b/doc/guides/prog_guide/img/gro-key-algorithm.svg > new file mode 100644 > index 0000000..94e42f5 > --- /dev/null > +++ b/doc/guides/prog_guide/img/gro-key-algorithm.svg > @@ -0,0 +1,223 @@ > +<?xml version="1.0" encoding="UTF-8" standalone="no"?> > +<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" > "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd"> > +<!-- Generated by Microsoft Visio 11.0, SVG Export, v1.0 gro-key- > algorithm.svg Page-1 --> > +<svg xmlns="http://www.w3.org/2000/svg" > xmlns:xlink="http://www.w3.org/1999/xlink" > xmlns:ev="http://www.w3.org/2001/xml-events" > + > xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/ > " width="6.06163in" height="2.66319in" > + viewBox="0 0 436.438 191.75" xml:space="preserve" color- > interpolation-filters="sRGB" class="st10"> > + <v:documentProperties v:langID="1033" v:viewMarkup="false"/> > + > + <style type="text/css"> > + <![CDATA[ > + .st1 {fill:url(#grad30-4);stroke:#404040;stroke- > linecap:round;stroke-linejoin:round;stroke-width:0.25} > + .st2 {fill:#000000;font-family:Calibri;font-size:1.00001em} > + .st3 {font-size:1em;font-weight:bold} > + .st4 {fill:#000000;font-family:Calibri;font-size:1.00001em;font- > weight:bold} > + .st5 {font-size:1em;font-weight:normal} > + .st6 {marker-end:url(#mrkr5-38);stroke:#404040;stroke- > linecap:round;stroke-linejoin:round;stroke-width:1} > + .st7 {fill:#404040;fill-opacity:1;stroke:#404040;stroke- > opacity:1;stroke-width:0.28409090909091} > + .st8 {fill:none;stroke:none;stroke-linecap:round;stroke- > linejoin:round;stroke-width:0.25} > + .st9 {fill:#000000;font-family:Calibri;font-size:0.833336em} > + .st10 {fill:none;fill-rule:evenodd;font- > size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3} > + ]]> > + </style> > + > + <defs id="Patterns_And_Gradients"> > + <linearGradient id="grad30-4" v:fillPattern="30" > v:foreground="#c6d09f" v:background="#d1dab4" x1="0" 
y1="1" x2="0" > y2="0"> > + <stop offset="0" style="stop-color:#c6d09f;stop- > opacity:1"/> > + <stop offset="1" style="stop-color:#d1dab4;stop- > opacity:1"/> > + </linearGradient> > + <linearGradient id="grad30-35" v:fillPattern="30" > v:foreground="#f0f0f0" v:background="#ffffff" x1="0" y1="1" x2="0" y2="0"> > + <stop offset="0" style="stop-color:#f0f0f0;stop- > opacity:1"/> > + <stop offset="1" style="stop-color:#ffffff;stop- > opacity:1"/> > + </linearGradient> > + </defs> > + <defs id="Markers"> > + <g id="lend5"> > + <path d="M 2 1 L 0 0 L 1.98117 -0.993387 C 1.67173 - > 0.364515 1.67301 0.372641 1.98465 1.00043 " style="stroke:none"/> > + </g> > + <marker id="mrkr5-38" class="st7" v:arrowType="5" > v:arrowSize="2" v:setback="6.16" refX="-6.16" orient="auto" > + markerUnits="strokeWidth" > overflow="visible"> > + <use xlink:href="#lend5" transform="scale(-3.52,- > 3.52) "/> > + </marker> > + </defs> > + <g v:mID="0" v:index="1" v:groupContext="foregroundPage"> > + <title>Page-1</title> > + <v:pageProperties v:drawingScale="1" v:pageScale="1" > v:drawingUnits="0" v:shadowOffsetX="9" v:shadowOffsetY="-9"/> > + <v:layer v:name="Connector" v:index="0"/> > + <g id="shape1-1" v:mID="1" v:groupContext="shape" > transform="translate(0.25,-117.25)"> > + <title>Rounded rectangle</title> > + <desc>Categorize into an existed “flow”</desc> > + <v:userDefs> > + <v:ud v:nameU="visVersion" > v:val="VT0(14):26"/> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <v:textBlock v:margins="rect(4,4,4,4)"/> > + <v:textRect cx="90" cy="173.75" width="180" > height="36"/> > + <path d="M171 191.75 A9.00007 9.00007 -180 0 0 180 > 182.75 L180 164.75 A9.00007 9.00007 -180 0 0 171 155.75 L9 155.75 > + A9.00007 9.00007 -180 0 0 -0 > 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L171 191.75 Z" > + class="st1"/> > + <text x="8.91" y="177.35" class="st2" > v:langID="1033"><v:paragraph 
v:horizAlign="1"/><v:tabList/>Categorize into > an <tspan > + > class="st3">existed</tspan><tspan class="st3" v:langID="2052"> > </tspan>“<tspan class="st3">flow</tspan>”</text> </g> > + <g id="shape2-9" v:mID="2" v:groupContext="shape" > transform="translate(0.25,-58.75)"> > + <title>Rounded rectangle.2</title> > + <desc>Search for a “neighbor”</desc> > + <v:userDefs> > + <v:ud v:nameU="visVersion" > v:val="VT0(14):26"/> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <v:textBlock v:margins="rect(4,4,4,4)"/> > + <v:textRect cx="90" cy="173.75" width="180" > height="36"/> > + <path d="M171 191.75 A9.00007 9.00007 -180 0 0 180 > 182.75 L180 164.75 A9.00007 9.00007 -180 0 0 171 155.75 L9 155.75 > + A9.00007 9.00007 -180 0 0 -0 > 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L171 191.75 Z" > + class="st1"/> > + <text x="32.19" y="177.35" class="st2" > v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Search for a > “<tspan > + > class="st3">neighbor</tspan>”</text> </g> > + <g id="shape3-14" v:mID="3" v:groupContext="shape" > transform="translate(225.813,-117.25)"> > + <title>Rounded rectangle.3</title> > + <desc>Insert a new “flow” and store the > packet</desc> > + <v:userDefs> > + <v:ud v:nameU="visVersion" > v:val="VT0(14):26"/> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <v:textBlock v:margins="rect(4,4,4,4)"/> > + <v:textRect cx="105.188" cy="173.75" width="210.38" > height="36"/> > + <path d="M201.37 191.75 A9.00007 9.00007 -180 0 0 > 210.37 182.75 L210.37 164.75 A9.00007 9.00007 -180 0 0 201.37 155.75 > + L9 155.75 A9.00007 9.00007 - > 180 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L201.37 191.75 > + Z" class="st1"/> > + <text x="5.45" y="177.35" class="st2" > v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Insert a <tspan > + 
class="st3">new > </tspan>“<tspan class="st3">flow</tspan>” and <tspan class="st3">store > </tspan>the packet</text> </g> > + <g id="shape4-21" v:mID="4" v:groupContext="shape" > transform="translate(225.25,-58.75)"> > + <title>Rounded rectangle.4</title> > + <desc>Store the packet</desc> > + <v:userDefs> > + <v:ud v:nameU="visVersion" > v:val="VT0(14):26"/> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <v:textBlock v:margins="rect(4,4,4,4)"/> > + <v:textRect cx="83.25" cy="173.75" width="166.5" > height="36"/> > + <path d="M157.5 191.75 A9.00007 9.00007 -180 0 0 > 166.5 182.75 L166.5 164.75 A9.00007 9.00007 -180 0 0 157.5 155.75 L9 > + 155.75 A9.00007 9.00007 -180 > 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L157.5 191.75 Z" > + class="st1"/> > + <text x="42.81" y="177.35" class="st4" > v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Store <tspan > + class="st5">the > packet</tspan></text> </g> > + <g id="shape5-26" v:mID="5" v:groupContext="shape" > transform="translate(0.25,-0.25)"> > + <title>Rounded rectangle.5</title> > + <desc>Merge the packet</desc> > + <v:userDefs> > + <v:ud v:nameU="visVersion" > v:val="VT0(14):26"/> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <v:textBlock v:margins="rect(4,4,4,4)"/> > + <v:textRect cx="90" cy="173.75" width="180" > height="36"/> > + <path d="M171 191.75 A9.00007 9.00007 -180 0 0 180 > 182.75 L180 164.75 A9.00007 9.00007 -180 0 0 171 155.75 L9 155.75 > + A9.00007 9.00007 -180 0 0 -0 > 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L171 191.75 Z" > + class="st1"/> > + <text x="46.59" y="177.35" class="st4" > v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Merge <tspan > + class="st5">the > packet</tspan></text> </g> > + <g id="shape6-31" v:mID="6" v:groupContext="shape" > v:layerMember="0" 
transform="translate(81.25,-175.75)"> > + <title>Dynamic connector</title> > + <v:userDefs> > + <v:ud v:nameU="visVersion" > v:val="VT0(14):26"/> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <path d="M9 191.75 L9 208.09" class="st6"/> > + </g> > + <g id="shape7-39" v:mID="7" v:groupContext="shape" > v:layerMember="0" transform="translate(81.25,-117.25)"> > + <title>Dynamic connector.7</title> > + <v:userDefs> > + <v:ud v:nameU="visVersion" > v:val="VT0(14):26"/> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <path d="M9 191.75 L9 208.09" class="st6"/> > + </g> > + <g id="shape8-45" v:mID="8" v:groupContext="shape" > v:layerMember="0" transform="translate(81.25,-58.75)"> > + <title>Dynamic connector.8</title> > + <v:userDefs> > + <v:ud v:nameU="visVersion" > v:val="VT0(14):26"/> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <path d="M9 191.75 L9 208.09" class="st6"/> > + </g> > + <g id="shape9-51" v:mID="9" v:groupContext="shape" > v:layerMember="0" transform="translate(180.25,-126.25)"> > + <title>Dynamic connector.9</title> > + <v:userDefs> > + <v:ud v:nameU="visVersion" > v:val="VT0(14):26"/> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <path d="M0 182.75 L39.4 182.75" class="st6"/> > + </g> > + <g id="shape10-57" v:mID="10" v:groupContext="shape" > v:layerMember="0" transform="translate(180.25,-67.75)"> > + <title>Dynamic connector.10</title> > + <v:userDefs> > + <v:ud v:nameU="visVersion" > v:val="VT0(14):26"/> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <path d="M0 182.75 L38.84 182.75" class="st6"/> 
> + </g> > + <g id="shape11-63" v:mID="11" v:groupContext="shape" > transform="translate(65.5,-173.5)"> > + <title>Sheet.11</title> > + <desc>packet</desc> > + <v:userDefs> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <v:textBlock v:margins="rect(4,4,4,4)"/> > + <v:textRect cx="24.75" cy="182.75" width="49.5" > height="18"/> > + <rect x="0" y="173.75" width="49.5" height="18" > class="st8"/> > + <text x="8.46" y="186.35" class="st2" > v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>packet</text> > </g> > + <g id="shape14-66" v:mID="14" v:groupContext="shape" > transform="translate(98.125,-98.125)"> > + <title>Sheet.14</title> > + <desc>find a “flow”</desc> > + <v:userDefs> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <v:textBlock v:margins="rect(4,4,4,4)"/> > + <v:textRect cx="32.0625" cy="183.875" width="64.13" > height="15.75"/> > + <rect x="0" y="176" width="64.125" height="15.75" > class="st8"/> > + <text x="6.41" y="186.88" class="st9" > v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>find a > “flow”</text> </g> > + <g id="shape15-69" v:mID="15" v:groupContext="shape" > transform="translate(99.25,-39.625)"> > + <title>Sheet.15</title> > + <desc>find a “neighbor”</desc> > + <v:userDefs> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <v:textBlock v:margins="rect(4,4,4,4)"/> > + <v:textRect cx="40.5" cy="183.875" width="81" > height="15.75"/> > + <rect x="0" y="176" width="81" height="15.75" > class="st8"/> > + <text x="5.48" y="186.88" class="st9" > v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>find a > “neighbor”</text> </g> > + <g id="shape13-72" v:mID="13" v:groupContext="shape" > transform="translate(181.375,-79)"> > + <title>Sheet.13</title> > + 
<desc>not find</desc> > + <v:userDefs> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <v:textBlock v:margins="rect(4,4,4,4)"/> > + <v:textRect cx="21.375" cy="183.875" width="42.75" > height="15.75"/> > + <rect x="0" y="176" width="42.75" height="15.75" > class="st8"/> > + <text x="5.38" y="186.88" class="st9" > v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>not find</text> > </g> > + <g id="shape12-75" v:mID="12" v:groupContext="shape" > transform="translate(181.375,-137.5)"> > + <title>Sheet.12</title> > + <desc>not find</desc> > + <v:userDefs> > + <v:ud v:nameU="msvThemeColors" > v:val="VT0(36):26"/> > + <v:ud v:nameU="msvThemeEffects" > v:val="VT0(16):26"/> > + </v:userDefs> > + <v:textBlock v:margins="rect(4,4,4,4)"/> > + <v:textRect cx="21.375" cy="183.875" width="42.75" > height="15.75"/> > + <rect x="0" y="176" width="42.75" height="15.75" > class="st8"/> > + <text x="5.38" y="186.88" class="st9" > v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>not find</text> > </g> > + </g> > +</svg> > diff --git a/lib/librte_gro/gro_tcp4.c b/lib/librte_gro/gro_tcp4.c > index 03e5ccf..27af23e 100644 > --- a/lib/librte_gro/gro_tcp4.c > +++ b/lib/librte_gro/gro_tcp4.c > @@ -6,8 +6,6 @@ > #include <rte_mbuf.h> > #include <rte_cycles.h> > #include <rte_ethdev.h> > -#include <rte_ip.h> > -#include <rte_tcp.h> > > #include "gro_tcp4.h" > > @@ -44,20 +42,20 @@ gro_tcp4_tbl_create(uint16_t socket_id, > } > tbl->max_item_num = entries_num; > > - size = sizeof(struct gro_tcp4_key) * entries_num; > - tbl->keys = rte_zmalloc_socket(__func__, > + size = sizeof(struct gro_tcp4_flow) * entries_num; > + tbl->flows = rte_zmalloc_socket(__func__, > size, > RTE_CACHE_LINE_SIZE, > socket_id); > - if (tbl->keys == NULL) { > + if (tbl->flows == NULL) { > rte_free(tbl->items); > rte_free(tbl); > return NULL; > } > - /* INVALID_ARRAY_INDEX indicates empty key */ > + /* 
INVALID_ARRAY_INDEX indicates an empty flow */
>  	for (i = 0; i < entries_num; i++)
> -		tbl->keys[i].start_index = INVALID_ARRAY_INDEX;
> -	tbl->max_key_num = entries_num;
> +		tbl->flows[i].start_index = INVALID_ARRAY_INDEX;
> +	tbl->max_flow_num = entries_num;
>
>  	return tbl;
>  }
> @@ -69,116 +67,15 @@ gro_tcp4_tbl_destroy(void *tbl)
>
>  	if (tcp_tbl) {
>  		rte_free(tcp_tbl->items);
> -		rte_free(tcp_tbl->keys);
> +		rte_free(tcp_tbl->flows);
>  	}
>  	rte_free(tcp_tbl);
>  }
>
> -/*
> - * merge two TCP/IPv4 packets without updating checksums.
> - * If cmp is larger than 0, append the new packet to the
> - * original packet. Otherwise, pre-pend the new packet to
> - * the original packet.
> - */
> -static inline int
> -merge_two_tcp4_packets(struct gro_tcp4_item *item_src,
> -		struct rte_mbuf *pkt,
> -		uint16_t ip_id,
> -		uint32_t sent_seq,
> -		int cmp)
> -{
> -	struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;
> -	uint16_t tcp_datalen;
> -
> -	if (cmp > 0) {
> -		pkt_head = item_src->firstseg;
> -		pkt_tail = pkt;
> -	} else {
> -		pkt_head = pkt;
> -		pkt_tail = item_src->firstseg;
> -	}
> -
> -	/* check if the packet length will be beyond the max value */
> -	tcp_datalen = pkt_tail->pkt_len - pkt_tail->l2_len -
> -		pkt_tail->l3_len - pkt_tail->l4_len;
> -	if (pkt_head->pkt_len - pkt_head->l2_len + tcp_datalen >
> -			TCP4_MAX_L3_LENGTH)
> -		return 0;
> -
> -	/* remove packet header for the tail packet */
> -	rte_pktmbuf_adj(pkt_tail,
> -			pkt_tail->l2_len +
> -			pkt_tail->l3_len +
> -			pkt_tail->l4_len);
> -
> -	/* chain two packets together */
> -	if (cmp > 0) {
> -		item_src->lastseg->next = pkt;
> -		item_src->lastseg = rte_pktmbuf_lastseg(pkt);
> -		/* update IP ID to the larger value */
> -		item_src->ip_id = ip_id;
> -	} else {
> -		lastseg = rte_pktmbuf_lastseg(pkt);
> -		lastseg->next = item_src->firstseg;
> -		item_src->firstseg = pkt;
> -		/* update sent_seq to the smaller value */
> -		item_src->sent_seq = sent_seq;
> -	}
> -	item_src->nb_merged++;
> -
> -	/* update mbuf metadata
for the merged packet */
> -	pkt_head->nb_segs += pkt_tail->nb_segs;
> -	pkt_head->pkt_len += pkt_tail->pkt_len;
> -
> -	return 1;
> -}
> -
> -static inline int
> -check_seq_option(struct gro_tcp4_item *item,
> -		struct tcp_hdr *tcp_hdr,
> -		uint16_t tcp_hl,
> -		uint16_t tcp_dl,
> -		uint16_t ip_id,
> -		uint32_t sent_seq)
> -{
> -	struct rte_mbuf *pkt0 = item->firstseg;
> -	struct ipv4_hdr *ipv4_hdr0;
> -	struct tcp_hdr *tcp_hdr0;
> -	uint16_t tcp_hl0, tcp_dl0;
> -	uint16_t len;
> -
> -	ipv4_hdr0 = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt0, char *) +
> -			pkt0->l2_len);
> -	tcp_hdr0 = (struct tcp_hdr *)((char *)ipv4_hdr0 + pkt0->l3_len);
> -	tcp_hl0 = pkt0->l4_len;
> -
> -	/* check if TCP option fields equal. If not, return 0. */
> -	len = RTE_MAX(tcp_hl, tcp_hl0) - sizeof(struct tcp_hdr);
> -	if ((tcp_hl != tcp_hl0) ||
> -			((len > 0) && (memcmp(tcp_hdr + 1,
> -					tcp_hdr0 + 1,
> -					len) != 0)))
> -		return 0;
> -
> -	/* check if the two packets are neighbors */
> -	tcp_dl0 = pkt0->pkt_len - pkt0->l2_len - pkt0->l3_len - tcp_hl0;
> -	if ((sent_seq == (item->sent_seq + tcp_dl0)) &&
> -			(ip_id == (item->ip_id + 1)))
> -		/* append the new packet */
> -		return 1;
> -	else if (((sent_seq + tcp_dl) == item->sent_seq) &&
> -			((ip_id + item->nb_merged) == item->ip_id))
> -		/* pre-pend the new packet */
> -		return -1;
> -	else
> -		return 0;
> -}
> -
>  static inline uint32_t
>  find_an_empty_item(struct gro_tcp4_tbl *tbl)
>  {
> -	uint32_t i;
> -	uint32_t max_item_num = tbl->max_item_num;
> +	uint32_t max_item_num = tbl->max_item_num, i;
>
>  	for (i = 0; i < max_item_num; i++)
>  		if (tbl->items[i].firstseg == NULL)
> @@ -187,13 +84,12 @@ find_an_empty_item(struct gro_tcp4_tbl *tbl)
>  }
>
>  static inline uint32_t
> -find_an_empty_key(struct gro_tcp4_tbl *tbl)
> +find_an_empty_flow(struct gro_tcp4_tbl *tbl)
>  {
> -	uint32_t i;
> -	uint32_t max_key_num = tbl->max_key_num;
> +	uint32_t max_flow_num = tbl->max_flow_num, i;
>
> -	for (i = 0; i < max_key_num; i++)
> -		if
(tbl->keys[i].start_index == INVALID_ARRAY_INDEX) > + for (i = 0; i < max_flow_num; i++) > + if (tbl->flows[i].start_index == INVALID_ARRAY_INDEX) > return i; > return INVALID_ARRAY_INDEX; > } > @@ -201,10 +97,11 @@ find_an_empty_key(struct gro_tcp4_tbl *tbl) > static inline uint32_t > insert_new_item(struct gro_tcp4_tbl *tbl, > struct rte_mbuf *pkt, > - uint16_t ip_id, > - uint32_t sent_seq, > + uint64_t start_time, > uint32_t prev_idx, > - uint64_t start_time) > + uint32_t sent_seq, > + uint16_t ip_id, > + uint8_t is_atomic) > { > uint32_t item_idx; > > @@ -219,9 +116,10 @@ insert_new_item(struct gro_tcp4_tbl *tbl, > tbl->items[item_idx].sent_seq = sent_seq; > tbl->items[item_idx].ip_id = ip_id; > tbl->items[item_idx].nb_merged = 1; > + tbl->items[item_idx].is_atomic = is_atomic; > tbl->item_num++; > > - /* if the previous packet exists, chain the new one with it */ > + /* If the previous packet exists, chain them together. */ > if (prev_idx != INVALID_ARRAY_INDEX) { > tbl->items[item_idx].next_pkt_idx = > tbl->items[prev_idx].next_pkt_idx; > @@ -232,12 +130,13 @@ insert_new_item(struct gro_tcp4_tbl *tbl, > } > > static inline uint32_t > -delete_item(struct gro_tcp4_tbl *tbl, uint32_t item_idx, > +delete_item(struct gro_tcp4_tbl *tbl, > + uint32_t item_idx, > uint32_t prev_item_idx) > { > uint32_t next_idx = tbl->items[item_idx].next_pkt_idx; > > - /* set NULL to firstseg to indicate it's an empty item */ > + /* NULL indicates an empty item. 
*/ > tbl->items[item_idx].firstseg = NULL; > tbl->item_num--; > if (prev_item_idx != INVALID_ARRAY_INDEX) > @@ -247,53 +146,33 @@ delete_item(struct gro_tcp4_tbl *tbl, uint32_t > item_idx, > } > > static inline uint32_t > -insert_new_key(struct gro_tcp4_tbl *tbl, > - struct tcp4_key *key_src, > +insert_new_flow(struct gro_tcp4_tbl *tbl, > + struct tcp4_flow_key *src, > uint32_t item_idx) > { > - struct tcp4_key *key_dst; > - uint32_t key_idx; > + struct tcp4_flow_key *dst; > + uint32_t flow_idx; > > - key_idx = find_an_empty_key(tbl); > - if (key_idx == INVALID_ARRAY_INDEX) > + flow_idx = find_an_empty_flow(tbl); > + if (unlikely(flow_idx == INVALID_ARRAY_INDEX)) > return INVALID_ARRAY_INDEX; > > - key_dst = &(tbl->keys[key_idx].key); > + dst = &(tbl->flows[flow_idx].key); > > - ether_addr_copy(&(key_src->eth_saddr), &(key_dst->eth_saddr)); > - ether_addr_copy(&(key_src->eth_daddr), &(key_dst->eth_daddr)); > - key_dst->ip_src_addr = key_src->ip_src_addr; > - key_dst->ip_dst_addr = key_src->ip_dst_addr; > - key_dst->recv_ack = key_src->recv_ack; > - key_dst->src_port = key_src->src_port; > - key_dst->dst_port = key_src->dst_port; > + ether_addr_copy(&(src->eth_saddr), &(dst->eth_saddr)); > + ether_addr_copy(&(src->eth_daddr), &(dst->eth_daddr)); > + dst->ip_src_addr = src->ip_src_addr; > + dst->ip_dst_addr = src->ip_dst_addr; > + dst->recv_ack = src->recv_ack; > + dst->src_port = src->src_port; > + dst->dst_port = src->dst_port; > > - /* non-INVALID_ARRAY_INDEX value indicates this key is valid */ > - tbl->keys[key_idx].start_index = item_idx; > - tbl->key_num++; > + tbl->flows[flow_idx].start_index = item_idx; > + tbl->flow_num++; > > - return key_idx; > + return flow_idx; > } > > -static inline int > -is_same_key(struct tcp4_key k1, struct tcp4_key k2) > -{ > - if (is_same_ether_addr(&k1.eth_saddr, &k2.eth_saddr) == 0) > - return 0; > - > - if (is_same_ether_addr(&k1.eth_daddr, &k2.eth_daddr) == 0) > - return 0; > - > - return ((k1.ip_src_addr == k2.ip_src_addr) 
&& > - (k1.ip_dst_addr == k2.ip_dst_addr) && > - (k1.recv_ack == k2.recv_ack) && > - (k1.src_port == k2.src_port) && > - (k1.dst_port == k2.dst_port)); > -} > - > -/* > - * update packet length for the flushed packet. > - */ > static inline void > update_header(struct gro_tcp4_item *item) > { > @@ -315,84 +194,106 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt, > struct ipv4_hdr *ipv4_hdr; > struct tcp_hdr *tcp_hdr; > uint32_t sent_seq; > - uint16_t tcp_dl, ip_id; > + uint16_t tcp_dl, ip_id, frag_off, hdr_len; > + uint8_t is_atomic; > > - struct tcp4_key key; > + struct tcp4_flow_key key; > uint32_t cur_idx, prev_idx, item_idx; > - uint32_t i, max_key_num; > + uint32_t i, max_flow_num, left_flow_num; > int cmp; > + uint8_t find; > > eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *); > ipv4_hdr = (struct ipv4_hdr *)((char *)eth_hdr + pkt->l2_len); > tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len); > + hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len; > > /* > - * if FIN, SYN, RST, PSH, URG, ECE or > - * CWR is set, return immediately. > + * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE > + * or CWR set. > */ > if (tcp_hdr->tcp_flags != TCP_ACK_FLAG) > return -1; > - /* if payload length is 0, return immediately */ > - tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len - > - pkt->l4_len; > - if (tcp_dl == 0) > + /* > + * Don't process the packet whose payload length is less than or > + * equal to 0. > + */ > + tcp_dl = pkt->pkt_len - hdr_len; > + if (tcp_dl <= 0) > return -1; > > - ip_id = rte_be_to_cpu_16(ipv4_hdr->packet_id); > + /* > + * Save IPv4 ID for the packet whose DF bit is 0. For the packet > + * whose DF bit is 1, IPv4 ID is ignored. > + */ > + frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset); > + is_atomic = (frag_off & IPV4_HDR_DF_FLAG) == IPV4_HDR_DF_FLAG; > + ip_id = is_atomic ? 
0 : rte_be_to_cpu_16(ipv4_hdr->packet_id); > sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq); > > ether_addr_copy(&(eth_hdr->s_addr), &(key.eth_saddr)); > ether_addr_copy(&(eth_hdr->d_addr), &(key.eth_daddr)); > key.ip_src_addr = ipv4_hdr->src_addr; > key.ip_dst_addr = ipv4_hdr->dst_addr; > + key.recv_ack = tcp_hdr->recv_ack; > key.src_port = tcp_hdr->src_port; > key.dst_port = tcp_hdr->dst_port; > - key.recv_ack = tcp_hdr->recv_ack; > > - /* search for a key */ > - max_key_num = tbl->max_key_num; > - for (i = 0; i < max_key_num; i++) { > - if ((tbl->keys[i].start_index != INVALID_ARRAY_INDEX) && > - is_same_key(tbl->keys[i].key, key)) > - break; > + /* Search for a matched flow. */ > + max_flow_num = tbl->max_flow_num; > + left_flow_num = tbl->flow_num; > + find = 0; > + for (i = 0; i < max_flow_num && left_flow_num; i++) { > + if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) { > + if (is_same_tcp4_flow(tbl->flows[i].key, key)) { > + find = 1; > + break; > + } > + left_flow_num--; > + } > } > > - /* can't find a key, so insert a new key and a new item. */ > - if (i == tbl->max_key_num) { > - item_idx = insert_new_item(tbl, pkt, ip_id, sent_seq, > - INVALID_ARRAY_INDEX, start_time); > + /* > + * Fail to find a matched flow. Insert a new flow and store the > + * packet into the flow. > + */ > + if (find == 0) { > + item_idx = insert_new_item(tbl, pkt, start_time, > + INVALID_ARRAY_INDEX, sent_seq, ip_id, > + is_atomic); > if (item_idx == INVALID_ARRAY_INDEX) > return -1; > - if (insert_new_key(tbl, &key, item_idx) == > + if (insert_new_flow(tbl, &key, item_idx) == > INVALID_ARRAY_INDEX) { > - /* > - * fail to insert a new key, so > - * delete the inserted item > - */ > + /* Fail to insert a new flow. 
*/ > delete_item(tbl, item_idx, INVALID_ARRAY_INDEX); > return -1; > } > return 0; > } > > - /* traverse all packets in the item group to find one to merge */ > - cur_idx = tbl->keys[i].start_index; > + /* > + * Check all packets in the flow and try to find a neighbor for > + * the input packet. > + */ > + cur_idx = tbl->flows[i].start_index; > prev_idx = cur_idx; > do { > cmp = check_seq_option(&(tbl->items[cur_idx]), tcp_hdr, > - pkt->l4_len, tcp_dl, ip_id, sent_seq); > + sent_seq, ip_id, pkt->l4_len, tcp_dl, 0, > + is_atomic); > if (cmp) { > if (merge_two_tcp4_packets(&(tbl->items[cur_idx]), > - pkt, ip_id, > - sent_seq, cmp)) > + pkt, cmp, sent_seq, ip_id, 0)) > return 1; > /* > - * fail to merge two packets since the packet > - * length will be greater than the max value. > - * So insert the packet into the item group. > + * Fail to merge the two packets, as the packet > + * length is greater than the max value. Store > + * the packet into the flow. > */ > - if (insert_new_item(tbl, pkt, ip_id, sent_seq, > - prev_idx, start_time) == > + if (insert_new_item(tbl, pkt, start_time, prev_idx, > + sent_seq, ip_id, > + is_atomic) == > INVALID_ARRAY_INDEX) > return -1; > return 0; > @@ -401,12 +302,9 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt, > cur_idx = tbl->items[cur_idx].next_pkt_idx; > } while (cur_idx != INVALID_ARRAY_INDEX); > > - /* > - * can't find a packet in the item group to merge, > - * so insert the packet into the item group. > - */ > - if (insert_new_item(tbl, pkt, ip_id, sent_seq, prev_idx, > - start_time) == INVALID_ARRAY_INDEX) > + /* Fail to find a neighbor, so store the packet into the flow. 
*/ > + if (insert_new_item(tbl, pkt, start_time, prev_idx, sent_seq, > + ip_id, is_atomic) == INVALID_ARRAY_INDEX) > return -1; > > return 0; > @@ -418,46 +316,35 @@ gro_tcp4_tbl_timeout_flush(struct gro_tcp4_tbl > *tbl, > struct rte_mbuf **out, > uint16_t nb_out) > { > - uint16_t k = 0; > + uint32_t max_flow_num = tbl->max_flow_num; > uint32_t i, j; > - uint32_t max_key_num = tbl->max_key_num; > + uint16_t k = 0; > > - for (i = 0; i < max_key_num; i++) { > - /* all keys have been checked, return immediately */ > - if (tbl->key_num == 0) > + for (i = 0; i < max_flow_num; i++) { > + if (unlikely(tbl->flow_num == 0)) > return k; > > - j = tbl->keys[i].start_index; > + j = tbl->flows[i].start_index; > while (j != INVALID_ARRAY_INDEX) { > if (tbl->items[j].start_time <= flush_timestamp) { > out[k++] = tbl->items[j].firstseg; > if (tbl->items[j].nb_merged > 1) > update_header(&(tbl->items[j])); > /* > - * delete the item and get > - * the next packet index > + * Delete the packet and get the next > + * packet in the flow. > */ > - j = delete_item(tbl, j, > - INVALID_ARRAY_INDEX); > + j = delete_item(tbl, j, > INVALID_ARRAY_INDEX); > + tbl->flows[i].start_index = j; > + if (j == INVALID_ARRAY_INDEX) > + tbl->flow_num--; > > - /* > - * delete the key as all of > - * packets are flushed > - */ > - if (j == INVALID_ARRAY_INDEX) { > - tbl->keys[i].start_index = > - INVALID_ARRAY_INDEX; > - tbl->key_num--; > - } else > - /* update start_index of the key */ > - tbl->keys[i].start_index = j; > - > - if (k == nb_out) > + if (unlikely(k == nb_out)) > return k; > } else > /* > - * left packets of this key won't be > - * timeout, so go to check other keys. > + * The left packets in this flow won't be > + * timeout. Go to check other flows. 
> */ > break; > } > diff --git a/lib/librte_gro/gro_tcp4.h b/lib/librte_gro/gro_tcp4.h > index d129523..c2b66a8 100644 > --- a/lib/librte_gro/gro_tcp4.h > +++ b/lib/librte_gro/gro_tcp4.h > @@ -5,17 +5,20 @@ > #ifndef _GRO_TCP4_H_ > #define _GRO_TCP4_H_ > > +#include <rte_ip.h> > +#include <rte_tcp.h> > + > #define INVALID_ARRAY_INDEX 0xffffffffUL > #define GRO_TCP4_TBL_MAX_ITEM_NUM (1024UL * 1024UL) > > /* > - * the max L3 length of a TCP/IPv4 packet. The L3 length > - * is the sum of ipv4 header, tcp header and L4 payload. > + * The max length of a IPv4 packet, which includes the length of the L3 > + * header, the L4 header and the data payload. > */ > -#define TCP4_MAX_L3_LENGTH UINT16_MAX > +#define MAX_IPV4_PKT_LENGTH UINT16_MAX > > -/* criteria of mergeing packets */ > -struct tcp4_key { > +/* Header fields representing a TCP/IPv4 flow */ > +struct tcp4_flow_key { > struct ether_addr eth_saddr; > struct ether_addr eth_daddr; > uint32_t ip_src_addr; > @@ -26,77 +29,76 @@ struct tcp4_key { > uint16_t dst_port; > }; > > -struct gro_tcp4_key { > - struct tcp4_key key; > +struct gro_tcp4_flow { > + struct tcp4_flow_key key; > /* > - * the index of the first packet in the item group. > - * If the value is INVALID_ARRAY_INDEX, it means > - * the key is empty. > + * The index of the first packet in the flow. > + * INVALID_ARRAY_INDEX indicates an empty flow. > */ > uint32_t start_index; > }; > > struct gro_tcp4_item { > /* > - * first segment of the packet. If the value > + * The first MBUF segment of the packet. If the value > * is NULL, it means the item is empty. > */ > struct rte_mbuf *firstseg; > - /* last segment of the packet */ > + /* The last MBUF segment of the packet */ > struct rte_mbuf *lastseg; > /* > - * the time when the first packet is inserted > - * into the table. If a packet in the table is > - * merged with an incoming packet, this value > - * won't be updated. We set this value only > - * when the first packet is inserted into the > - * table. 
> + * The time when the first packet is inserted into the table. > + * This value won't be updated, even if the packet is merged > + * with other packets. > */ > uint64_t start_time; > /* > - * we use next_pkt_idx to chain the packets that > - * have same key value but can't be merged together. > + * next_pkt_idx is used to chain the packets that > + * are in the same flow but can't be merged together > + * (e.g. caused by packet reordering). > */ > uint32_t next_pkt_idx; > - /* the sequence number of the packet */ > + /* TCP sequence number of the packet */ > uint32_t sent_seq; > - /* the IP ID of the packet */ > + /* IPv4 ID of the packet */ > uint16_t ip_id; > - /* the number of merged packets */ > + /* The number of merged packets */ > uint16_t nb_merged; > + /* Indicate if IPv4 ID can be ignored */ > + uint8_t is_atomic; > }; > > /* > - * TCP/IPv4 reassembly table structure. > + * TCP/IPv4 reassembly table structure > */ > struct gro_tcp4_tbl { > /* item array */ > struct gro_tcp4_item *items; > - /* key array */ > - struct gro_tcp4_key *keys; > + /* flow array */ > + struct gro_tcp4_flow *flows; > /* current item number */ > uint32_t item_num; > - /* current key num */ > - uint32_t key_num; > + /* current flow num */ > + uint32_t flow_num; > /* item array size */ > uint32_t max_item_num; > - /* key array size */ > - uint32_t max_key_num; > + /* flow array size */ > + uint32_t max_flow_num; > }; > > /** > * This function creates a TCP/IPv4 reassembly table. > * > * @param socket_id > - * socket index for allocating TCP/IPv4 reassemble table > + * Socket index for allocating the TCP/IPv4 reassemble table > * @param max_flow_num > - * the maximum number of flows in the TCP/IPv4 GRO table > + * The maximum number of flows in the TCP/IPv4 GRO table > * @param max_item_per_flow > - * the maximum packet number per flow. 
> + * The maximum number of packets per flow > * > * @return > - * if create successfully, return a pointer which points to the > - * created TCP/IPv4 GRO table. Otherwise, return NULL. > + * - Return the table pointer on success. > + * - Return NULL on failure. > */ > void *gro_tcp4_tbl_create(uint16_t socket_id, > uint16_t max_flow_num, > @@ -106,62 +108,56 @@ void *gro_tcp4_tbl_create(uint16_t socket_id, > * This function destroys a TCP/IPv4 reassembly table. > * > * @param tbl > - * a pointer points to the TCP/IPv4 reassembly table. > + * Pointer pointing to the TCP/IPv4 reassembly table. > */ > void gro_tcp4_tbl_destroy(void *tbl); > > /** > - * This function searches for a packet in the TCP/IPv4 reassembly table > - * to merge with the inputted one. To merge two packets is to chain them > - * together and update packet headers. Packets, whose SYN, FIN, RST, PSH > - * CWR, ECE or URG bit is set, are returned immediately. Packets which > - * only have packet headers (i.e. without data) are also returned > - * immediately. Otherwise, the packet is either merged, or inserted into > - * the table. Besides, if there is no available space to insert the > - * packet, this function returns immediately too. > + * This function merges a TCP/IPv4 packet. It doesn't process the packet, > + * which has SYN, FIN, RST, PSH, CWR, ECE or URG set, or doesn't have > + * payload. > * > - * This function assumes the inputted packet is with correct IPv4 and > - * TCP checksums. And if two packets are merged, it won't re-calculate > - * IPv4 and TCP checksums. Besides, if the inputted packet is IP > - * fragmented, it assumes the packet is complete (with TCP header). > + * This function doesn't check if the packet has correct checksums and > + * doesn't re-calculate checksums for the merged packet. Additionally, > + * it assumes the packets are complete (i.e., MF==0 && frag_off==0), > + * when IP fragmentation is possible (i.e., DF==0). 
It returns the > + * packet, if the packet has invalid parameters (e.g. SYN bit is set) > + * or there is no available space in the table. > * > * @param pkt > - * packet to reassemble. > + * Packet to reassemble > * @param tbl > - * a pointer that points to a TCP/IPv4 reassembly table. > + * Pointer pointing to the TCP/IPv4 reassembly table > * @start_time > - * the start time that the packet is inserted into the table > + * The time when the packet is inserted into the table > * > * @return > - * if the packet doesn't have data, or SYN, FIN, RST, PSH, CWR, ECE > - * or URG bit is set, or there is no available space in the table to > - * insert a new item or a new key, return a negative value. If the > - * packet is merged successfully, return an positive value. If the > - * packet is inserted into the table, return 0. > + * - Return a positive value if the packet is merged. > + * - Return zero if the packet isn't merged but stored in the table. > + * - Return a negative value for invalid parameters or no available > + * space in the table. > */ > int32_t gro_tcp4_reassemble(struct rte_mbuf *pkt, > struct gro_tcp4_tbl *tbl, > uint64_t start_time); > > /** > - * This function flushes timeout packets in a TCP/IPv4 reassembly table > - * to applications, and without updating checksums for merged packets. > - * The max number of flushed timeout packets is the element number of > - * the array which is used to keep flushed packets. > + * This function flushes timeout packets in a TCP/IPv4 reassembly table, > + * and without updating checksums. > * > * @param tbl > - * a pointer that points to a TCP GRO table. > + * TCP/IPv4 reassembly table pointer > * @param flush_timestamp > - * this function flushes packets which are inserted into the table > - * before or at the flush_timestamp. > + * Flush packets which are inserted into the table before or at the > + * flush_timestamp. > * @param out > - * pointer array which is used to keep flushed packets. 
> + * Pointer array used to keep flushed packets > * @param nb_out > - * the element number of out. It's also the max number of timeout > + * The element number in 'out'. It also determines the maximum number > of > * packets that can be flushed finally. > * > * @return > - * the number of packets that are returned. > + * The number of flushed packets > */ > uint16_t gro_tcp4_tbl_timeout_flush(struct gro_tcp4_tbl *tbl, > uint64_t flush_timestamp, > @@ -173,10 +169,131 @@ uint16_t gro_tcp4_tbl_timeout_flush(struct > gro_tcp4_tbl *tbl, > * reassembly table. > * > * @param tbl > - * pointer points to a TCP/IPv4 reassembly table. > + * TCP/IPv4 reassembly table pointer > * > * @return > - * the number of packets in the table > + * The number of packets in the table > */ > uint32_t gro_tcp4_tbl_pkt_count(void *tbl); > + > +/* > + * Check if two TCP/IPv4 packets belong to the same flow. > + */ > +static inline int > +is_same_tcp4_flow(struct tcp4_flow_key k1, struct tcp4_flow_key k2) > +{ > + return (is_same_ether_addr(&k1.eth_saddr, &k2.eth_saddr) && > + is_same_ether_addr(&k1.eth_daddr, &k2.eth_daddr) > && > + (k1.ip_src_addr == k2.ip_src_addr) && > + (k1.ip_dst_addr == k2.ip_dst_addr) && > + (k1.recv_ack == k2.recv_ack) && > + (k1.src_port == k2.src_port) && > + (k1.dst_port == k2.dst_port)); > +} > + > +/* > + * Check if two TCP/IPv4 packets are neighbors. 
> + */ > +static inline int > +check_seq_option(struct gro_tcp4_item *item, > + struct tcp_hdr *tcph, > + uint32_t sent_seq, > + uint16_t ip_id, > + uint16_t tcp_hl, > + uint16_t tcp_dl, > + uint16_t l2_offset, > + uint8_t is_atomic) > +{ > + struct rte_mbuf *pkt_orig = item->firstseg; > + struct ipv4_hdr *iph_orig; > + struct tcp_hdr *tcph_orig; > + uint16_t len, l4_len_orig; > + > + iph_orig = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt_orig, char *) + > + l2_offset + pkt_orig->l2_len); > + tcph_orig = (struct tcp_hdr *)((char *)iph_orig + pkt_orig->l3_len); > + l4_len_orig = pkt_orig->l4_len; > + > + /* Check if TCP option fields equal */ > + len = RTE_MAX(tcp_hl, l4_len_orig) - sizeof(struct tcp_hdr); > + if ((tcp_hl != l4_len_orig) || ((len > 0) && > + (memcmp(tcph + 1, tcph_orig + 1, > + len) != 0))) > + return 0; > + > + /* Don't merge packets whose DF bits are different */ > + if (unlikely(item->is_atomic ^ is_atomic)) > + return 0; > + > + /* Check if the two packets are neighbors */ > + len = pkt_orig->pkt_len - l2_offset - pkt_orig->l2_len - > + pkt_orig->l3_len - l4_len_orig; > + if ((sent_seq == item->sent_seq + len) && (is_atomic || > + (ip_id == item->ip_id + item->nb_merged))) > + /* Append the new packet */ > + return 1; > + else if ((sent_seq + tcp_dl == item->sent_seq) && (is_atomic || > + (ip_id + 1 == item->ip_id))) > + /* Pre-pend the new packet */ > + return -1; > + > + return 0; > +} > + > +/* > + * Merge two TCP/IPv4 packets without updating checksums. > + * If cmp is larger than 0, append the new packet to the > + * original packet. Otherwise, pre-pend the new packet to > + * the original packet. 
> + */ > +static inline int > +merge_two_tcp4_packets(struct gro_tcp4_item *item, > + struct rte_mbuf *pkt, > + int cmp, > + uint32_t sent_seq, > + uint16_t ip_id, > + uint16_t l2_offset) > +{ > + struct rte_mbuf *pkt_head, *pkt_tail, *lastseg; > + uint16_t hdr_len, l2_len; > + > + if (cmp > 0) { > + pkt_head = item->firstseg; > + pkt_tail = pkt; > + } else { > + pkt_head = pkt; > + pkt_tail = item->firstseg; > + } > + > + /* Check if the IPv4 packet length is greater than the max value */ > + hdr_len = l2_offset + pkt_head->l2_len + pkt_head->l3_len + > + pkt_head->l4_len; > + l2_len = l2_offset > 0 ? pkt_head->outer_l2_len : pkt_head->l2_len; > + if (unlikely(pkt_head->pkt_len - l2_len + pkt_tail->pkt_len - hdr_len > > + MAX_IPV4_PKT_LENGTH)) > + return 0; > + > + /* Remove the packet header */ > + rte_pktmbuf_adj(pkt_tail, hdr_len); > + > + /* Chain two packets together */ > + if (cmp > 0) { > + item->lastseg->next = pkt; > + item->lastseg = rte_pktmbuf_lastseg(pkt); > + } else { > + lastseg = rte_pktmbuf_lastseg(pkt); > + lastseg->next = item->firstseg; > + item->firstseg = pkt; > + /* Update sent_seq and ip_id */ > + item->sent_seq = sent_seq; > + item->ip_id = ip_id; > + } > + item->nb_merged++; > + > + /* Update MBUF metadata for the merged packet */ > + pkt_head->nb_segs += pkt_tail->nb_segs; > + pkt_head->pkt_len += pkt_tail->pkt_len; > + > + return 1; > +} > #endif > diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c > index d6b8cd1..7176c0e 100644 > --- a/lib/librte_gro/rte_gro.c > +++ b/lib/librte_gro/rte_gro.c > @@ -23,11 +23,14 @@ static gro_tbl_destroy_fn > tbl_destroy_fn[RTE_GRO_TYPE_MAX_NUM] = { > static gro_tbl_pkt_count_fn tbl_pkt_count_fn[RTE_GRO_TYPE_MAX_NUM] > = { > gro_tcp4_tbl_pkt_count, NULL}; > > +#define IS_IPV4_TCP_PKT(ptype) (RTE_ETH_IS_IPV4_HDR(ptype) && \ > + ((ptype & RTE_PTYPE_L4_TCP) == RTE_PTYPE_L4_TCP)) > + > /* > - * GRO context structure, which is used to merge packets. 
It keeps > - * many reassembly tables of desired GRO types. Applications need to > - * create GRO context objects before using rte_gro_reassemble to > - * perform GRO. > + * GRO context structure. It keeps the table structures, which are > + * used to merge packets, for different GRO types. Before using > + * rte_gro_reassemble(), applications need to create the GRO context > + * first. > */ > struct gro_ctx { > /* GRO types to perform */ > @@ -65,7 +68,7 @@ rte_gro_ctx_create(const struct rte_gro_param *param) > param->max_flow_num, > param->max_item_per_flow); > if (gro_ctx->tbls[i] == NULL) { > - /* destroy all created tables */ > + /* Destroy all created tables */ > gro_ctx->gro_types = gro_types; > rte_gro_ctx_destroy(gro_ctx); > return NULL; > @@ -85,8 +88,6 @@ rte_gro_ctx_destroy(void *ctx) > uint64_t gro_type_flag; > uint8_t i; > > - if (gro_ctx == NULL) > - return; > for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) { > gro_type_flag = 1ULL << i; > if ((gro_ctx->gro_types & gro_type_flag) == 0) > @@ -103,62 +104,54 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts, > uint16_t nb_pkts, > const struct rte_gro_param *param) > { > - uint16_t i; > - uint16_t nb_after_gro = nb_pkts; > - uint32_t item_num; > - > - /* allocate a reassembly table for TCP/IPv4 GRO */ > + /* Allocate a reassembly table for TCP/IPv4 GRO */ > struct gro_tcp4_tbl tcp_tbl; > - struct gro_tcp4_key tcp_keys[RTE_GRO_MAX_BURST_ITEM_NUM]; > + struct gro_tcp4_flow > tcp_flows[RTE_GRO_MAX_BURST_ITEM_NUM]; > struct gro_tcp4_item tcp_items[RTE_GRO_MAX_BURST_ITEM_NUM] > = {{0} }; > > struct rte_mbuf *unprocess_pkts[nb_pkts]; > - uint16_t unprocess_num = 0; > + uint32_t item_num; > int32_t ret; > - uint64_t current_time; > + uint16_t i, unprocess_num = 0, nb_after_gro = nb_pkts; > > - if ((param->gro_types & RTE_GRO_TCP_IPV4) == 0) > + if (unlikely((param->gro_types & RTE_GRO_TCP_IPV4) == 0)) > return nb_pkts; > > - /* get the actual number of packets */ > + /* Get the maximum number of packets */ > 
item_num = RTE_MIN(nb_pkts, (param->max_flow_num * > - param->max_item_per_flow)); > + param->max_item_per_flow)); > item_num = RTE_MIN(item_num, > RTE_GRO_MAX_BURST_ITEM_NUM); > > for (i = 0; i < item_num; i++) > - tcp_keys[i].start_index = INVALID_ARRAY_INDEX; > + tcp_flows[i].start_index = INVALID_ARRAY_INDEX; > > - tcp_tbl.keys = tcp_keys; > + tcp_tbl.flows = tcp_flows; > tcp_tbl.items = tcp_items; > - tcp_tbl.key_num = 0; > + tcp_tbl.flow_num = 0; > tcp_tbl.item_num = 0; > - tcp_tbl.max_key_num = item_num; > + tcp_tbl.max_flow_num = item_num; > tcp_tbl.max_item_num = item_num; > > - current_time = rte_rdtsc(); > - > for (i = 0; i < nb_pkts; i++) { > - if ((pkts[i]->packet_type & (RTE_PTYPE_L3_IPV4 | > - RTE_PTYPE_L4_TCP)) == > - (RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)) > { > - ret = gro_tcp4_reassemble(pkts[i], > - &tcp_tbl, > - current_time); > + if (IS_IPV4_TCP_PKT(pkts[i]->packet_type)) { > + /* > + * The timestamp is ignored, since all packets > + * will be flushed from the tables. 
> + */ > + ret = gro_tcp4_reassemble(pkts[i], &tcp_tbl, 0); > if (ret > 0) > - /* merge successfully */ > + /* Merge successfully */ > nb_after_gro--; > - else if (ret < 0) { > - unprocess_pkts[unprocess_num++] = > - pkts[i]; > - } > + else if (ret < 0) > + unprocess_pkts[unprocess_num++] = pkts[i]; > } else > unprocess_pkts[unprocess_num++] = pkts[i]; > } > > - /* re-arrange GROed packets */ > if (nb_after_gro < nb_pkts) { > - i = gro_tcp4_tbl_timeout_flush(&tcp_tbl, current_time, > - pkts, nb_pkts); > + /* Flush all packets from the tables */ > + i = gro_tcp4_tbl_timeout_flush(&tcp_tbl, 0, pkts, nb_pkts); > + /* Copy unprocessed packets */ > if (unprocess_num > 0) { > memcpy(&pkts[i], unprocess_pkts, > sizeof(struct rte_mbuf *) * > @@ -174,31 +167,28 @@ rte_gro_reassemble(struct rte_mbuf **pkts, > uint16_t nb_pkts, > void *ctx) > { > - uint16_t i, unprocess_num = 0; > struct rte_mbuf *unprocess_pkts[nb_pkts]; > struct gro_ctx *gro_ctx = ctx; > + void *tcp_tbl; > uint64_t current_time; > + uint16_t i, unprocess_num = 0; > > - if ((gro_ctx->gro_types & RTE_GRO_TCP_IPV4) == 0) > + if (unlikely((gro_ctx->gro_types & RTE_GRO_TCP_IPV4) == 0)) > return nb_pkts; > > + tcp_tbl = gro_ctx->tbls[RTE_GRO_TCP_IPV4_INDEX]; > current_time = rte_rdtsc(); > > for (i = 0; i < nb_pkts; i++) { > - if ((pkts[i]->packet_type & (RTE_PTYPE_L3_IPV4 | > - RTE_PTYPE_L4_TCP)) == > - (RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)) > { > - if (gro_tcp4_reassemble(pkts[i], > - gro_ctx->tbls > - [RTE_GRO_TCP_IPV4_INDEX], > + if (IS_IPV4_TCP_PKT(pkts[i]->packet_type)) { > + if (gro_tcp4_reassemble(pkts[i], tcp_tbl, > current_time) < 0) > unprocess_pkts[unprocess_num++] = pkts[i]; > } else > unprocess_pkts[unprocess_num++] = pkts[i]; > } > if (unprocess_num > 0) { > - memcpy(pkts, unprocess_pkts, > - sizeof(struct rte_mbuf *) * > + memcpy(pkts, unprocess_pkts, sizeof(struct rte_mbuf *) * > unprocess_num); > } > > @@ -224,6 +214,7 @@ rte_gro_timeout_flush(void *ctx, > flush_timestamp, > out, max_nb_out); > 
} > + > return 0; > } > > @@ -232,19 +223,20 @@ rte_gro_get_pkt_count(void *ctx) > { > struct gro_ctx *gro_ctx = ctx; > gro_tbl_pkt_count_fn pkt_count_fn; > + uint64_t gro_types = gro_ctx->gro_types, flag; > uint64_t item_num = 0; > - uint64_t gro_type_flag; > uint8_t i; > > - for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) { > - gro_type_flag = 1ULL << i; > - if ((gro_ctx->gro_types & gro_type_flag) == 0) > + for (i = 0; i < RTE_GRO_TYPE_MAX_NUM && gro_types; i++) { > + flag = 1ULL << i; > + if ((gro_types & flag) == 0) > continue; > > + gro_types ^= flag; > pkt_count_fn = tbl_pkt_count_fn[i]; > - if (pkt_count_fn == NULL) > - continue; > - item_num += pkt_count_fn(gro_ctx->tbls[i]); > + if (pkt_count_fn) > + item_num += pkt_count_fn(gro_ctx->tbls[i]); > } > + > return item_num; > } > diff --git a/lib/librte_gro/rte_gro.h b/lib/librte_gro/rte_gro.h > index 81a2eac..7979a59 100644 > --- a/lib/librte_gro/rte_gro.h > +++ b/lib/librte_gro/rte_gro.h > @@ -31,8 +31,8 @@ extern "C" { > /**< TCP/IPv4 GRO flag */ > > /** > - * A structure which is used to create GRO context objects or tell > - * rte_gro_reassemble_burst() what reassembly rules are demanded. > + * Structure used to create GRO context objects or used to pass > + * application-determined parameters to rte_gro_reassemble_burst(). > */ > struct rte_gro_param { > uint64_t gro_types; > @@ -78,26 +78,23 @@ void rte_gro_ctx_destroy(void *ctx); > > /** > * This is one of the main reassembly APIs, which merges numbers of > - * packets at a time. It assumes that all inputted packets are with > - * correct checksums. That is, applications should guarantee all > - * inputted packets are correct. Besides, it doesn't re-calculate > - * checksums for merged packets. If inputted packets are IP fragmented, > - * this function assumes them are complete (i.e. with L4 header). After > - * finishing processing, it returns all GROed packets to applications > - * immediately. > + * packets at a time. 
It doesn't check if input packets have correct > + * checksums and doesn't re-calculate checksums for merged packets. > + * It assumes the packets are complete (i.e., MF==0 && frag_off==0), > + * when IP fragmentation is possible (i.e., DF==0). The GROed packets > + * are returned as soon as the function finishes. > * > * @param pkts > - * a pointer array which points to the packets to reassemble. Besides, > - * it keeps mbuf addresses for the GROed packets. > + * Pointer array pointing to the packets to reassemble. Besides, it > + * keeps MBUF addresses for the GROed packets. > * @param nb_pkts > - * the number of packets to reassemble. > + * The number of packets to reassemble > * @param param > - * applications use it to tell rte_gro_reassemble_burst() what rules > - * are demanded. > + * Application-determined parameters for reassembling packets. > * > * @return > - * the number of packets after been GROed. If no packets are merged, > - * the returned value is nb_pkts. > + * The number of packets after being GROed. If no packets are merged, > + * the return value is equal to nb_pkts. > */ > uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts, > uint16_t nb_pkts, > @@ -107,32 +104,28 @@ uint16_t rte_gro_reassemble_burst(struct > rte_mbuf **pkts, > * @warning > * @b EXPERIMENTAL: this API may change without prior notice > * > - * Reassembly function, which tries to merge inputted packets with > - * the packets in the reassembly tables of a given GRO context. This > - * function assumes all inputted packets are with correct checksums. > - * And it won't update checksums if two packets are merged. Besides, > - * if inputted packets are IP fragmented, this function assumes they > - * are complete packets (i.e. with L4 header). > + * Reassembly function, which tries to merge input packets with the > + * existing packets in the reassembly tables of a given GRO context. 
> + * It doesn't check if input packets have correct checksums and doesn't > + * re-calculate checksums for merged packets. Additionally, it assumes > + * the packets are complete (i.e., MF==0 && frag_off==0), when IP > + * fragmentation is possible (i.e., DF==0). > * > - * If the inputted packets don't have data or are with unsupported GRO > - * types etc., they won't be processed and are returned to applications. > - * Otherwise, the inputted packets are either merged or inserted into > - * the table. If applications want get packets in the table, they need > - * to call flush API. > + * If the input packets have invalid parameters (e.g. no data payload, > + * unsupported GRO types), they are returned to applications. Otherwise, > + * they are either merged or inserted into the table. Applications need > + * to flush packets from the tables with the flush API, if they want to get the > + * GROed packets. > * > * @param pkts > - * packet to reassemble. Besides, after this function finishes, it > - * keeps the unprocessed packets (e.g. without data or unsupported > - * GRO types). > + * Packets to reassemble. It's also used to store the unprocessed packets. > * @param nb_pkts > - * the number of packets to reassemble. > + * The number of packets to reassemble > * @param ctx > - * a pointer points to a GRO context object. > + * GRO context object pointer > * > * @return > - * return the number of unprocessed packets (e.g. without data or > - * unsupported GRO types). If all packets are processed (merged or > - * inserted into the table), return 0. > + * The number of unprocessed packets. > */ > uint16_t rte_gro_reassemble(struct rte_mbuf **pkts, > uint16_t nb_pkts, > @@ -142,29 +135,28 @@ uint16_t rte_gro_reassemble(struct rte_mbuf > **pkts, > * @warning > * @b EXPERIMENTAL: this API may change without prior notice > * > - * This function flushes the timeout packets from reassembly tables of > - * desired GRO types. 
The max number of flushed timeout packets is the > - * element number of the array which is used to keep the flushed packets. > + * This function flushes the timeout packets from the reassembly tables > + * of desired GRO types. The max number of flushed packets is the > + * element number of 'out'. > * > - * Besides, this function won't re-calculate checksums for merged > - * packets in the tables. That is, the returned packets may be with > - * wrong checksums. > + * Additionally, the flushed packets may have incorrect checksums, since > + * this function doesn't re-calculate checksums for merged packets. > * > * @param ctx > - * a pointer points to a GRO context object. > + * GRO context object pointer. > * @param timeout_cycles > - * max TTL for packets in reassembly tables, measured in nanosecond. > + * The max TTL for packets in reassembly tables, measured in nanoseconds. > * @param gro_types > - * this function only flushes packets which belong to the GRO types > - * specified by gro_types. > + * This function flushes packets whose GRO types are specified by > + * gro_types. > * @param out > - * a pointer array that is used to keep flushed timeout packets. > + * Pointer array used to keep flushed packets. > * @param max_nb_out > - * the element number of out. It's also the max number of timeout > + * The element number of 'out'. It's also the max number of timeout > * packets that can be flushed finally. > * > * @return > - * the number of flushed packets. If no packets are flushed, return 0. > + * The number of flushed packets. > */ > uint16_t rte_gro_timeout_flush(void *ctx, > uint64_t timeout_cycles, > @@ -180,10 +172,10 @@ uint16_t rte_gro_timeout_flush(void *ctx, > * of a given GRO context. > * > * @param ctx > - * pointer points to a GRO context object. > + * GRO context object pointer. > * > * @return > - * the number of packets in all reassembly tables. > + * The number of packets in the tables. 
> */ > uint64_t rte_gro_get_pkt_count(void *ctx); > > -- > 2.7.4
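
For anyone reading the rte_gro_get_pkt_count() hunk in isolation, here is a
standalone sketch of the reworked loop: it copies the enabled-type bitmask,
clears each flag as it is handled, and stops as soon as no flags remain.
Everything prefixed demo_/DEMO_ is hypothetical and only stands in for the
library's internal types and tables; the value 64 for the type limit is an
assumption, and this is not the library code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for RTE_GRO_TYPE_MAX_NUM (value assumed). */
#define DEMO_TYPE_MAX_NUM 64

typedef uint32_t (*demo_pkt_count_fn)(void *tbl);

/* Hypothetical, simplified context: a type bitmask plus one table per type. */
struct demo_ctx {
	uint64_t gro_types;
	void *tbls[DEMO_TYPE_MAX_NUM];
};

/* Hypothetical table: just a counter of stored packets. */
static uint32_t
demo_tbl_count(void *tbl)
{
	return *(uint32_t *)tbl;
}

/* Only types 0 and 2 provide a count callback in this sketch. */
static demo_pkt_count_fn demo_count_fns[DEMO_TYPE_MAX_NUM] = {
	[0] = demo_tbl_count,
	[2] = demo_tbl_count,
};

static uint64_t
demo_get_pkt_count(const struct demo_ctx *ctx)
{
	uint64_t types, flag, item_num = 0;
	uint8_t i;

	assert(ctx != NULL);
	types = ctx->gro_types;

	/* "&& types" ends the loop once every enabled flag was visited. */
	for (i = 0; i < DEMO_TYPE_MAX_NUM && types; i++) {
		flag = 1ULL << i;
		if ((types & flag) == 0)
			continue;

		types ^= flag;	/* clear the flag that is handled now */
		if (demo_count_fns[i])
			item_num += demo_count_fns[i](ctx->tbls[i]);
	}

	return item_num;
}
```

With only low-numbered types enabled, the loop exits after a few iterations
instead of walking all DEMO_TYPE_MAX_NUM slots, which appears to be the point
of the extra loop condition in the patch.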
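
The updated doc comments say the reassembly functions assume complete packets
(MF==0 && frag_off==0) without checking. An application that cannot guarantee
this could filter packets before handing them to GRO. Below is a minimal
sketch of such a check on a raw IPv4 header; demo_ipv4_is_complete() and the
DEMO_ macros are hypothetical names and not part of the GRO API. It relies
only on the standard IPv4 header layout, where bytes 6 and 7 carry the flags
and the 13-bit fragment offset in network byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IPv4 flags/fragment-offset field, host byte order after assembly. */
#define DEMO_IPV4_MF_FLAG     0x2000	/* More Fragments bit */
#define DEMO_IPV4_OFFSET_MASK 0x1fff	/* 13-bit fragment offset */

/*
 * Return true when the header describes an unfragmented packet,
 * i.e. MF == 0 and fragment offset == 0. 'ipv4_hdr' must point to
 * at least the first 8 bytes of an IPv4 header.
 */
static bool
demo_ipv4_is_complete(const uint8_t *ipv4_hdr)
{
	uint16_t frag;

	assert(ipv4_hdr != NULL);

	/* Assemble the big-endian 16-bit field byte by byte. */
	frag = (uint16_t)((ipv4_hdr[6] << 8) | ipv4_hdr[7]);

	return (frag & (DEMO_IPV4_MF_FLAG | DEMO_IPV4_OFFSET_MASK)) == 0;
}
```

Packets for which this returns false would be kept out of the array passed to
the reassembly call and forwarded unchanged.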