On 2014-08-06 08:41, Bill Fischofer wrote: > Add ODP_PMR_LEN term to odp_pmr_term_e enum
Is this needed? > > Signed-off-by: Bill Fischofer <[email protected]> > --- > classification_design.dox | 900 ++ > images/classification_flow.eps | 33518 > +++++++++++++++++++++++++++++++++++++++ > images/classification_flow.png | Bin 0 -> 35193 bytes > 3 files changed, 34418 insertions(+) > create mode 100644 classification_design.dox > create mode 100644 images/classification_flow.eps > create mode 100644 images/classification_flow.png > > diff --git a/classification_design.dox b/classification_design.dox > new file mode 100644 > index 0000000..bf0209e > --- /dev/null > +++ b/classification_design.dox > @@ -0,0 +1,900 @@ > +/* Copyright (c) 2014, Linaro Limited > + * All rights reserved > + * > + * SPDX-License-Identifier: BSD-3-Clause > + */ > + > +/*! > +@page classification_design ODP Design - Classification API > +For the implementation of the ODP classification API please see @ref > odp_classify.h > + > +@tableofcontents > + > +@section introduction Introduction > +This document defines the Classification APIs supported by ODP v1.0. Remove v1.0 > +Classification is logically composed of two stages: Parsing and Rule > Matching. > +Parsing takes a raw packet and validates its structure and identifies fields > of interest in the various headers that comprise the layers of the packet. > +Rule Matching, in turn, takes the result of parsing and sorts packets into > Classes of Service (CoS) based on application-defined rule sets. > +@subsection use_of_terms Use of Terms > +The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", > "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this > document are to be interpreted as described in [RFC > 2119](https://tools.ietf.org/html/rfc2119). 
> +@subsection purpose Purpose > +ODP is a framework for software-based packet forwarding/filtering > applications, and the purpose of the Packet Classifier API is to enable > applications to program the platform hardware or software implementation to > assist in prioritization, classification and scheduling of each packet, so > that the software application can run faster, scale better and adhere to QoS > requirements. > +The following API abstractions are not modelled after any existing product > implementation, but are instead defined in terms of what a typical data-plane > application may require from such a platform, without sacrificing simplicity > and while avoiding ambiguity. > +Certain terms that are used within the context of existing products in > relation to packet parsing and classification, such as “access lists”, are > avoided so as not to suggest any relationship between the abstractions > used within this API and any particular manner in which they may be > implemented in hardware. > +These are the key ODP objects that the parser needs to employ, that are > presently defined in ODP: > +@subsubsection odp_pktio odp_pktio > +odp_pktio specifies an individual packet I/O channel instance. > +In other words, it would translate to a physical interface or a logical > port, or in the case of channelized protocols (e.g., > [Interlaken](https://www.google.com/url?q=https%3A%2F%2Fwww.cortina-systems.com%2Fimages%2Fdocuments%2F400023_Interlaken_Technology_White_Paper.pdf&sa=D&sntz=1&usg=AFQjCNEBdJTBmA1XaNGY3pmumQTfgSi1oA)) > it would map to a logical channel on that interface. > +Since the classifier API deals exclusively with ingress, this object > represents the source of packets into the classifier. > +In order to support any non-trivial use case, the classifier API needs to be > able to assign multiple odp_queue instances for any single odp_pktio object, > and may also assign any odp_queue instance to more than one odp_pktio object. 
> +@subsubsection odp_queue odp_queue > +odp_queue specifies a logical queue for packets, and in the case of ingress, > this would represent a stream of packets which share several attributes, that > are delivered to the ODP application for processing. > +The per-queue attributes currently defined are: queue type, sync (ordering); > priority; and schedule group (set of processor cores). > +@subsubsection odp_buffer_pool odp_buffer_pool > +odp_buffer_pool specifies a collection of buffers of same size and > alignment, as well as a set of policies such as flow control and processor > affinity. > +The classifier API refers to such pools that are designated for storing > ingress packets. > +@section functional_description Functional Description > +Following is the functionality that is required of the classifier API, and > its underlying implementation. > +The details and order of the following paragraph is informative, and is only > intended to help convey the functional scope of a classifier and provide > context for the API. > +In reality, implementations may execute many of these steps concurrently, or > in different order while maintaining the evident dependencies: > + > +-# Apply a set of \e classification \e rules to the header of an incoming > packet, identify the header fields, e.g., \e ethertype, IP version, IP > protocol, transport layer port numbers, IP DiffServ, VLAN id, 802.1p priority. > + > +-# Store these fields as packet meta data for application use, and for the > remainder of parser operations. > +The \e odp_pktio is also stored as one of the meta data fields for > subsequent use. > + > +-# Compute an \e odp_cos (Class of Service) value from a subset of supported > fields from 1) above. Remove (Class of Service)? this has been covered above right? > + > + > +-# Based on the \e odp_cos from 3) above, select the \e odp_queue through > which the packet is delivered to the application. 
> + > +-# Validate the packet data integrity (checksums, FCS) and correctness > (e.g., length fields) and store the validation result, along with optional > error layer and type indicator, in packet meta data. > +Optionally, if a packet fails validation, override the \e odp_cos selection > in step 3 to a class of service designated for errored packets. > + > +-# Since the selected \e odp_queue may require preservation of packet order, > i.e., SYNC_ATOMIC or SYNC_ORDERED, optionally select the packet header fields > from which the parser calculates an \e odp_flow_signature, which may be a > unique flow identifier or a hash, such that the packets which are assigned > the same \e odp_flow_signature are scheduled in the same order they are > received. > + > +-# Based on the \e odp_cos from 3) above, select the \e odp_buffer_pool that > should be used to acquire a buffer to store the packet data and meta data. > + > +-# Allocate a buffer from \e odp_buffer_pool selected in 7) above and > logically store the packet data and meta data to the allocated buffer, or in > accordance with class-of-service drop policy and subject to pool buffer > availability, optionally discard the packet. > + > +-# Enqueue the buffer into the \e odp_queue selected in 4) above. > + > +The above is an abstract description of the classifier functionality, and > may be applied to a variety of applications in many different ways. > +The ultimate meaning of how this functionality applies to an application > also depends on other ODP modules, so the above may not give a complete > picture. > +For instance, the exact meaning of \e priority, which is a per-queue > attribute, is influenced by the ODP scheduler semantics, and the system > behavior under stress depends on the ODP buffer pool module behavior. 
> + > +For the sole purpose of illustrating the above abstract functionality, here > is an example of a Layer-2 (IEEE 802.1D) bridge application: > +Such a forwarding application that also adheres to IEEE 802.1p/q priority, > which has 8 traffic priority levels, might create 8 \e odp_buffer_pool > instances, one for each PCP priority level, and 8 \e odp_queue instances one > per priority level. > +Incoming packets will be inspected for a VLAN header; the PCP field will be > extracted, and used to select both the pool and the queue. > +Because each queue will be assigned a priority value, the packets with > highest PCP values will be scheduled before any packet with a lower PCP value. > +Also, in a case of congestion, buffer pools for lower priority packets will > be depleted earlier than the pools containing packets of the high priority, > and hence the lower priority packets will be dropped (assuming that is the > only flow control method that is supported in the platform) while higher > priority packets will continue to be received into buffers and processed. > +@subsection flow_diagram Classification Processing Flow Diagram > +@image html classification_flow.png "Figure 1: Classification Flow Diagram" > +@image latex classification_flow.eps "Figure 1: Classification Flow Diagram" I can't see the full image when I open the pdf > + > +@section api_elements API Elements > +While the above description refers to the abstracted packet classifier, the > following is the description of the API designed to program the packet > classifier, and is intended to add clarity to the functions provided further > below. > +@subsection cos_creation Class of Service Creation and Binding > +To program the classifier, a class-of-service instance must be created, > which will contain the packet filtering resources that it may require. > +All subsequent calls refer to one or more of these resources. 
> +Each class of service instance must be associated with a single queue or > queue group, which will be the destination of all packets matching that > particular filter. > +The queue assignment is implemented as a separate function call such that > the queue may be modified at any time, without tearing down the filters that > define the class of service. > +In other words, it is possible to change the destination queue for a class > of service defined by its filters quickly and dynamically. > +Optionally, on platforms that support multiple packet buffer pools, each > class of service may be assigned a different pool such that when buffers are > exhausted for one class of service, other classes are not negatively impacted > and continue to be processed. > + > +@subsection default_packet_handling Default packet handling > +There SHOULD be one \b odp_cos assigned to each port with the \c > odp_cos_pktio_set() function, which will function as the default > class-of-service for all packets received from an ingress port that do not > match any of the filters defined subsequently. > +At minimum this default class-of-service MUST have a queue and a buffer pool > assigned to it on platforms that support multiple packet buffer pools. > +Multiple odp_pktio instances (i.e., multiple ports) MAY each have their own > default odp_cos, or MAY share an odp_cos with other ports, based on > application requirements. > + > +@subsection packet_classification Packet Classification > +For each odp_pktio port, the API allows the assignment of a class-of-service > to a packet using one of three methods: > + > +-# The packet may be assigned a specific class-of-service based on its > Layer-2 (802.1p/802.1Q VLAN tag) priority field. > +Since the standard field defines 8 discrete priority levels, the API allows > an odp_cos to be assigned to each of these priority levels with the \c > odp_cos_with_l2_priority() function. 
> + > +-# Similarly, a class-of-service may be assigned using the Layer-3 (IP > DiffServ) header field. > +The application supplies an array of \e odp_cos values that covers the > entire range of the standard protocol header field, where array elements do > not need to contain unique values. > +There is also a need to specify if Layer-3 priority takes precedence over > Layer-2 priority in a packet with both headers present. > + > +-# Additionally, the application may also program a number of \e pattern \e > matching \e rules that assign a class-of-service for packets with header > fields matching specified values. > +The field-matching rules take precedence over the previously described > priority-based assignment of a class-of-service. > +Using these matching rules the application should be able, for example, to > identify all packets containing VoIP traffic based on the protocol being UDP > and on specific destination or source port numbers, and appropriately assign > these packets a class-of-service that maps to a higher priority queue, > assuring voice packets a lower, bounded latency. > + > +@subsection scaling_and_flow Scaling and Flow Discrimination > +In addition to classifying packets and routing them to those queues with the > appropriate priority, and optionally limiting their memory consumption by > designating certain classes of packets to specific buffer pools, the > classifier API also facilitates the scaling of data-plane applications on > multi-core systems by creating a mechanism to define which packet headers > need to be combined to result in a value representing a specific packet flow. > +The classifier generates a signature, which can be a checksum or hash of > arbitrary strength that covers those packet header fields that are identified > by the application as identifying flows. 
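To check that I read the flow signature paragraph above correctly, here is a minimal standalone sketch of "hash only the fields the application selected". Everything in it is my assumption rather than something the patch defines: the `flow_field` indices merely stand in for `odp_cos_hdr_flow_fields_e`, the `parsed_hdr` layout is illustrative, and FNV-1a is just a convenient hash, since the text allows "a checksum or hash of arbitrary strength".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field indices, standing in for odp_cos_hdr_flow_fields_e. */
enum flow_field {
    FLD_L3_SAP, FLD_L3_DAP, FLD_L4_PROTO, FLD_L4_SAP, FLD_L4_DAP, FLD_COUNT
};

/* Parsed header values, one 32-bit slot per field (illustrative layout). */
struct parsed_hdr {
    uint32_t field[FLD_COUNT];
};

/*
 * Hash the fields whose bit is set in field_set.
 * FNV-1a is my choice for the sketch, not something the patch mandates.
 */
static uint32_t flow_signature(const struct parsed_hdr *hdr, uint16_t field_set)
{
    uint32_t hash = 2166136261u; /* FNV-1a 32-bit offset basis */

    assert(hdr != NULL);
    for (int f = 0; f < FLD_COUNT; f++) {
        if (!(field_set & (1u << f)))
            continue; /* field is not part of the flow definition */
        for (int byte = 0; byte < 4; byte++) {
            hash ^= (hdr->field[f] >> (8 * byte)) & 0xffu;
            hash *= 16777619u; /* FNV-1a 32-bit prime */
        }
    }
    return hash;
}
```

With a field set of source address, destination address and protocol, two packets that differ only in their source port get the same signature and so would stay in order relative to each other, which is what I understand the intent to be.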
> + > +The \e flow \e signatures that result from hashing are then stored with the > packet meta data (along with its class-of-service and its ingress \e > odp_pktio port), and subsequently may be utilized by the implementation of a > scheduler queue to maintain the order of packets with the same flow > signature, while allowing packets with different signatures to be processed > concurrently and independently on different processing cores. > + > +@subsection packet_meta_data Packet meta data Elements > +Here are the specific information elements that SHOULD be stored within the > packet meta data structure: > +- Protocol fields that are decoded and extracted by the parsing phase > +- Flow-signature calculated from a prescribed collection of protocol fields > +- The class-of-service identifier that is selected for the packet > +- The ingress port identifier > +- The result of packet validation, including an indication of the type of > error detected, if any > + > +The ODP packet API module SHALL provide accessors for retrieving the above > meta data fields from the container buffer in an implementation-independent > manner. > + > +@section api_definitions API Definitions > +@subsection data_types Data Types > +The following data types are referenced in the API descriptions described > below. > +The names are part of the ODP API and MUST be present in any conforming > implementation, however the type values shown here are illustrative and > implementations SHOULD either use these or substitute their own type values > that are appropriate to the underlying platform. > + > +@verbatim > +/** > + * 'odp_pktio_t' value to indicate any port > + */ > +#define ODP_PKTIO_ANY ((odp_pktio_t)~0) > + > + > +/** > + * 'odp_pktio_t' value to indicate an error > + */ > +#define ODP_PKTIO_INVALID ((odp_pktio_t)0) > + > + > +/** > + * Class of service instance type > + */ > +typedef uint32_t odp_cos_t; > + > + > +/** > + * flow signature type, only used for packet meta data field. 
> + */ > +typedef uint32_t odp_flowsig_t; > + > + > +/** > + * This value is returned from odp_cos_create() on failure, > + * May also be used as a “sink” class of service that > + * results in packets being discarded. > + */ > +#define ODP_COS_INVALID ((odp_cos_t)~0) > +@endverbatim > + > +@subsection cos_routines Class of Service Routines > +Conforming ODP implementations MUST provide the following Classification > APIs: > +@subsubsection cos_create odp_cos_create > +@verbatim > +/** > + * Create a class-of-service > + * > + * @param name is a string intended for debugging purposes. > + * > + * @return Class of service instance identifier, > + * or ODP_COS_INVALID on error. > + */ > + > +odp_cos_t odp_cos_create(const char *name); > +@endverbatim > + > +This routine is used to create a class of service that can be the target of > classifier rules. > +The number of such classes supported is implementation-defined. > +Attempts to create more than are supported by the implementation will result > in an \c ODP_COS_INVALID return and errno being set to \c > ODP_IMPLEMENTATION_LIMIT. > + > +@subsubsection cos_destroy odp_cos_destroy > +@verbatim > +/** > + * Discard a class-of-service along with all its associated resources > + * > + * @param cos_id class-of-service instance. > + * > + * @return 0 on success, -1 on error. > + */ > + > +int odp_cos_destroy(odp_cos_t cos_id); > +@endverbatim > + > +This routine is the bracketing routine for odp_cos_create(). > +It is used to destroy an existing CoS. > +It is the caller’s responsibility to ensure that no active pattern matching > rules refer to the CoS prior to calling this routine. > +Results are unpredictable if this restriction is not met. > +@subsubsection cos_set_queue odp_cos_set_queue > +@verbatim > +/** > + * Assign a queue for a class-of-service > + * > + * @param cos_id class-of-service instance. 
> + * > + * @param queue_id is the identifier of a queue where all packets > + * of this specific class of service will be enqueued. > + * > + * @return 0 on success, negative error code on failure. > + */ > + > +int odp_cos_set_queue(odp_cos_t cos_id, odp_queue_t queue_id); > +@endverbatim > + > +This routine associates a target queue with a CoS such that all packets > assigned to this CoS will be enqueued to the specified queue_id at the end of > classification processing. > +@subsubsection cos_set_queue_group odp_cos_set_queue_group > +@verbatim > +/** > + * Assign a homogenous queue-group to a class-of-service. > + * > + * @param cos_id identifier of class-of-service instance > + * @param queue_group_id identifier of the queue group to receive packets > + * associated with this class of service. > + * > + * @return 0 on success, negative error code on failure. > + */ > + > +int odp_cos_set_queue_group(odp_cos_t cos_id, odp_queue_group_t > queue_group_id); > +@endverbatim > + > +This routine associates a target queue group with a CoS such that all > packets assigned to this CoS will be distributed to the specified > queue_group_id at the end of classification processing. > +@subsubsection cos_set_pool odp_cos_set_pool > +@verbatim > +/** > + * Assign packet buffer pool for specific class-of-service > + * > + * @param cos_id class-of-service instance. > + * @param pool_id is a buffer pool identifier where all packet buffers > + * will be sourced to store packet that belong to this > + * class of service. > + * > + * @return 0 on success negative error code on failure. > + * > + * > + */ > + > +int odp_cos_set_pool(odp_cos_t cos_id, odp_buffer_pool_t pool_id); > +@endverbatim > + > +This OPTIONAL routine associates a target buffer pool with a CoS such that > all packets assigned to this CoS will be stored in packet buffers allocated > from the designated pool_id. 
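For what it is worth, this is how I would expect the create/bind routines above to behave, written as a tiny in-memory mock so that it compiles on its own. It is a sketch under assumptions, not the implementation: `MOCK_MAX_COS` is a made-up limit, `odp_queue_t` and `odp_buffer_pool_t` are placeholder typedefs because the patch does not define them, and the errno handling for the limit case is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t odp_cos_t;
typedef uint32_t odp_queue_t;       /* placeholder, not defined by the patch */
typedef uint32_t odp_buffer_pool_t; /* placeholder, not defined by the patch */

#define ODP_COS_INVALID ((odp_cos_t)~0)
#define MOCK_MAX_COS 4 /* hypothetical implementation limit */

struct mock_cos {
    int in_use;
    const char *name;
    odp_queue_t queue;
    odp_buffer_pool_t pool;
};

static struct mock_cos cos_table[MOCK_MAX_COS];

/* Hand out the first free slot, or ODP_COS_INVALID once the limit is hit. */
odp_cos_t odp_cos_create(const char *name)
{
    assert(name != NULL);
    for (odp_cos_t id = 0; id < MOCK_MAX_COS; id++) {
        if (!cos_table[id].in_use) {
            cos_table[id].in_use = 1;
            cos_table[id].name = name;
            return id;
        }
    }
    return ODP_COS_INVALID;
}

/* Release a slot; the caller must have removed any rules that refer to it. */
int odp_cos_destroy(odp_cos_t cos_id)
{
    if (cos_id >= MOCK_MAX_COS || !cos_table[cos_id].in_use)
        return -1;
    cos_table[cos_id].in_use = 0;
    return 0;
}

/* Rebinding the queue does not touch anything else in the CoS. */
int odp_cos_set_queue(odp_cos_t cos_id, odp_queue_t queue_id)
{
    if (cos_id >= MOCK_MAX_COS || !cos_table[cos_id].in_use)
        return -1;
    cos_table[cos_id].queue = queue_id;
    return 0;
}

int odp_cos_set_pool(odp_cos_t cos_id, odp_buffer_pool_t pool_id)
{
    if (cos_id >= MOCK_MAX_COS || !cos_table[cos_id].in_use)
        return -1;
    cos_table[cos_id].pool = pool_id;
    return 0;
}
```

The point of keeping odp_cos_set_queue() separate from creation shows up here: the queue can be swapped at any time without recreating the CoS.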
> + > + > +@subsection cos_drop_policy Class of Service Drop Policy Routines > +These routines control how drop policies are to be observed for a given > class of service. > +@subsubsection drop_data_types Data types > +~~~~~{.c} > +enum odp_cos_drop_e { > + ODP_COS_DROP_POOL, /**< Follow buffer pool drop policy */ > + ODP_COS_DROP_NEVER, /**< Never drop, ignoring buffer pool > policy */ > +}; > +typedef enum odp_cos_drop_e odp_drop_t; > +~~~~~ > + > +@subsubsection cos_set_drop odp_cos_set_drop > +@verbatim > +/** > + * Assign packet drop policy for specific class-of-service > + * > + * @param cos_id class-of-service instance. > + * @param drop_policy is the desired packet drop policy for this class. > + * > + * @return 0 on success, negative error code on failure. > + */ > + > +int odp_cos_set_drop(odp_cos_t cos_id, odp_drop_t drop_policy); > +@endverbatim > + > +This routine sets the drop policy for a class of service. > +It is an OPTIONAL routine. > +If an implementation does not provide this function it MUST supply a > definition of it that simply returns ODP_FUNCTION_NOT_AVAILABLE. > +@subsubsection pktio_set_default_cos odp_pktio_set_default_cos > +@verbatim > +/** > + * Setup per-port default class-of-service > + * > + * @param pktio_in ingress port identifier. > + * @param default_cos class-of-service set to all packets arriving > + * at the 'pktio_in' ingress port, unless overridden by subsequent > + * header-based filters. > + * > + * @return 0 on success, negative error code on failure. > + * > + * > + * @note This may replace the default queue per pktio. > + */ > + > +int odp_pktio_set_default_cos(odp_pktio_t pktio_in, odp_cos_t default_cos); > +@endverbatim > + > +This routine specifies a default class of service for a given pktio instance. > +Incoming packets on the specified pktio are assigned to this class of > service if no other pattern matching rule applies. 
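On the OPTIONAL odp_cos_set_drop() routine above: if I read the MUST correctly, a platform without drop policy support would ship a stub along these lines. This is only a sketch. The numeric value of `ODP_FUNCTION_NOT_AVAILABLE` is my invention, since the patch names the code but never defines it, and the typedef below is written against the enum tag that the patch actually declares.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t odp_cos_t;

enum odp_cos_drop_e {
    ODP_COS_DROP_POOL, /* follow buffer pool drop policy */
    ODP_COS_DROP_NEVER /* never drop, ignoring buffer pool policy */
};
typedef enum odp_cos_drop_e odp_drop_t;

/* Hypothetical value: the patch does not say what this code is. */
#define ODP_FUNCTION_NOT_AVAILABLE (-2)

/* Stub for a platform that cannot apply a per-class drop policy. */
int odp_cos_set_drop(odp_cos_t cos_id, odp_drop_t drop_policy)
{
    (void)cos_id;      /* unused in the stub */
    (void)drop_policy; /* unused in the stub */
    return ODP_FUNCTION_NOT_AVAILABLE;
}
```

An application can then probe for the feature at run time instead of failing at link time, which I assume is the reason for requiring the stub.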
> +@subsubsection pktio_set_error_cos odp_pktio_set_error_cos > +@verbatim > +/** > + * Setup per-port error class-of-service > + * > + * @param pktio_in ingress port identifier. > + * @param error_cos class-of-service set to all packets arriving > + * at the 'pktio_in' ingress port that contain an error. > + * > + * @return 0 on success, negative error code on failure. > + */ > + > +int odp_pktio_set_error_cos(odp_pktio_t pktio_in, odp_cos_t error_cos); > +@endverbatim > + > +This OPTIONAL function assigns a class-of-service used to handle packets > containing various types of errors. > +The specific error types include L2 FCS and optionally L3/L4 checksum > errors, malformed headers, etc., depending on platform capabilities. > +The specified error_cos MAY simply discard these packets or deliver them via > a queue to the application for further processing. > +@subsubsection pktio_set_skip odp_pktio_set_skip > +@verbatim > +/** > + * Setup per-port header offset > + * > + * @param pktio_in ingress port identifier. > + * @param offset is the number of bytes the classifier must skip. > + * > + * @return Success or ODP_FUNCTION_NOT_AVAILABLE > + */ > + > +int odp_pktio_set_skip(odp_pktio_t pktio_in, size_t offset); > +@endverbatim > + > +This OPTIONAL function applies to ports that carry additional headers > preceding the standard Ethernet header. > +Such headers are typically vendor-specific and thus the classifier is not > required to parse such headers, but the size of a custom header is critical > for the classifier to be able to parse standard protocol headers that > normally follow. > +@subsubsection cos_set_headroom odp_cos_set_headroom > +@verbatim > +/** > + * Specify per-class-of-service buffer headroom > + * > + * @param cos_id class-of-service instance. > + * @param req_room number of bytes of space preceding packet data to > reserve > + * for use as headroom. Must not exceed the > implementation > + * defined ODP_PACKET_MAX_HEADROOM. 
> + * > + * @return Success or ODP_PARAMETER_ERROR, > + * or ODP_FUNCTION_NOT_AVAILABLE > + */ > + > +int odp_cos_set_headroom(odp_cos_t cos_id, size_t req_room); > +@endverbatim > + > +This OPTIONAL routine specifies the number of bytes of headroom that should > be reserved for each packet assigned to this class of service. > +Each implementation defines an ODP_PACKET_MAX_HEADROOM limit that sets an > upper bound on the size of the headroom that can be reserved for a packet. > +@subsubsection cos_with_l2_priority odp_cos_with_l2_priority > +@verbatim > +/** > + * Request to override per-port class of service > + * based on Layer-2 priority field if present. > + * > + * @param pktio_in ingress port identifier. > + * @param num_qos is the number of QoS levels, typically 8. > + * @param qos_table are the values of the Layer-2 QoS header field. > + * @param cos_table is the class-of-service assigned to each of the > + * allowed Layer-2 QOS levels. > + * @return 0 on success negative error code on failure. > + */ > + > +int odp_cos_with_l2_priority(odp_pktio_t pktio_in, > + size_t num_qos, > + uint8_t qos_table[], /**< 'num_qos' > elements */ > + odp_cos_t cos_table[]); /**< 'num_qos' > elements */ > +@endverbatim > + > +This routine is used to assign classes of service based on the layer 2 (L2) > priority associated with input packets received on the specified pktio_in. > +For each of the values in qos_table[], the corresponding value in > cos_table[] will be assigned. > +@subsubsection cos_with_l3_dscp odp_cos_with_l3_dscp > +@verbatim > +/** > + * > + * @param pktio_in ingress port identifier. > + * @param num_qos is the number of allowed Layer-3 QoS levels. > + * @param qos_table are the values of the Layer-3 QoS header field. > + * @param cos_table is the class-of-service assigned to each of the > + * allowed Layer-3 QOS levels. > + * @param l3_preference when true, Layer-3 QoS overrides L2 QoS when > present. 
> + * > + * @return 0 on success, negative error code on failure. > + */ > + > +int odp_cos_with_l3_qos(odp_pktio_t pktio_in, > + size_t num_qos, > + uint8_t qos_table[], /**< 'num_qos' > elements */ > + odp_cos_t cos_table[], /**< 'num_qos' > elements */ > + odp_bool_t l3_preference); > +@endverbatim > + > +This OPTIONAL routine is used to assign classes of service based on the > layer 3 (L3) Differentiated Services (DS) designation. > +This is the DSCP field of an IPv4 header or the first six bits of the > Traffic Class of an IPv6 header. > +For each of the values in qos_table[], the corresponding value in > cos_table[] will be assigned. > +The l3_preference flag is used to control whether the CoS assigned by this > routine takes precedence over the CoS assigned by odp_cos_with_l2_priority() > in the event that both apply to the same packet. > + > +@subsection pmrs Pattern Matching Rules > +While the above routines permit class of service assignments to be made > based on static criteria, the real power of classification is the ability to > identify flows based on the variable contents of packet headers. > +To do this ODP provides support for defining pattern matching rules (PMRs) > that operate based on values contained in specified header fields. > + > +Associated with PMRs are enums that are used to specify standard packet > header fields: > +@subsubsection cos_hdr_flow_fields odp_cos_hdr_flow_fields_e > +@verbatim > +/** > + * Packet header field enumeration > + * for fields that may be used to calculate > + * the flow signature, if present in a packet. 
> + */ > + > +enum odp_cos_hdr_flow_fields_e { > + ODP_COS_FHDR_IN_PKTIO, /**< Ingress port number */ > + ODP_COS_FHDR_L2_SAP, /**< Ethernet Source MAC address */ > + ODP_COS_FHDR_L2_DAP, /**< Ethernet Destination MAC address > */ > + ODP_COS_FHDR_L2_VID, /**< Ethernet VLAN ID */ > + ODP_COS_FHDR_L3_FLOW, /**< IPv6 flow_id */ > + ODP_COS_FHDR_L3_SAP, /**< IP source address */ > + ODP_COS_FHDR_L3_DAP, /**< IP destination address */ > + ODP_COS_FHDR_L4_PROTO, /**< IP protocol (e.g. TCP/UDP/ICMP) */ > + ODP_COS_FHDR_L4_SAP, /**< Transport source port */ > + ODP_COS_FHDR_L4_DAP, /**< Transport destination port */ > + ODP_COS_FHDR_IPSEC_SPI, /**< IPsec session identifier */ > + ODP_COS_FHDR_LD_VNI, /**< NVGRE/VXLAN network identifier */ > + ODP_COS_FHDR_USER /**< Application-specific header > field(s) */ > +}; > +@endverbatim > + > +Conforming ODP implementations SHOULD implement efficient flow set > management routines such as these: > + > +~~~~~{.c} > +/** > + * Set of header fields that take part in flow signature hash calculation: > + * bit positions per 'odp_cos_hdr_flow_fields_e' enumeration. > + */ > +typedef uint16_t odp_cos_flow_set_t; > + > + > +/** > + * Set a member of the flow signature fields data set > + */ > +static inline odp_cos_flow_set_t > +odp_cos_flow_set( odp_cos_flow_set_t set, > + enum odp_cos_hdr_flow_fields_e field) > +{ > + return set | (1U << field); > +} > + > + > +/** > + * Test a member of the flow signature fields data set > + */ > +static inline bool > +odp_cos_flow_is_set( odp_cos_flow_set_t set, > + enum odp_cos_hdr_flow_fields_e field) > +{ > + return (set & (1U << field)) != 0; > +} > +~~~~~ > + > +These routines are intended to be used in support of the following flow > signature APIs: > + > +@subsubsection cos_class_flow_sig odp_cos_class_flow_signature > +@verbatim > +/** > + * Set up set of headers used to calculate a flow signature > + * based on class-of-service. 
> + * > + * @param cos_id class of service instance identifier > + * @param req_data_set requested data-set for flow signature calculation > + * > + * @return data-set that was successfully applied. All-zeros data set > + * indicates a failure to assign any of the requested fields, or other > + * error. > + */ > + > +odp_cos_flow_set_t > +odp_cos_class_flow_signature(odp_cos_t cos_id, > + odp_cos_flow_set_t req_data_set); > +@endverbatim > + > +This OPTIONAL routine associates a flow set with a class of service for flow > signature calculation. > + > +@subsubsection cos_port_flow_sig odp_cos_port_flow_signature > +@verbatim > +/** > + * Set up set of headers used to calculate a flow signature > + * based on ingress port. > + * > + * @param pktio_in ingress port identifier. > + * @param req_data_set requested data-set for flow signature calculation > + * > + * @return data-set that was successfully applied. An all-zeros data-set > + * indicates a failure to assign any of the requested fields, or other > + * error. 
> + */ > + > +odp_cos_flow_set_t > +odp_cos_port_flow_signature(odp_pktio_t pktio_in, > + odp_cos_flow_set_t req_data_set); > +@endverbatim > + > +@subsection pmr_routines Pattern Matching Rules Routines > +The following data structures SHOULD be implemented to support the > definition of pattern matching routines by conforming ODP implementations: > + > +~~~~~{.c} > +/** > + * PMR - Packet Matching Rule > + * Up to 32 bits of ternary matching of one of the available header fields > + */ > + > + > +#define ODP_PMR_INVAL ((odp_pmr_t)NULL) > +typedef struct odp_pmr_s *odp_pmr_t; > +~~~~~ > + > +@subsection terms Terms > +Terms are the elements of a PMR and are identified by the following enum: > + > +@verbatim > +enum odp_pmr_term_e { > + ODP_PMR_LEN, /**< Total length of received packet */ > + ODP_PMR_ETHTYPE_0, /**< Initial (outer) Ethertype only > (*val=uint16_t)*/ > + ODP_PMR_ETHTYPE_X, /**< Ethertype of most inner VLAN tag > (*val=uint16_t)*/ > + ODP_PMR_VLAN_ID_0, /**< First VLAN ID (outer) (*val=uint16_t) */ > + ODP_PMR_VLAN_ID_X, /**< Last VLAN ID (inner) (*val=uint16_t) */ > + ODP_PMR_DMAC, /**< destination MAC address (*val=uint64_t) */ > + ODP_PMR_IPPROTO, /**< IP Protocol or IPv6 Next Header > (*val=uint8_t) */ > + ODP_PMR_UDP_DPORT, /**< Destination UDP port, implies IPPROTO=17 > */ > + ODP_PMR_TCP_DPORT, /**< Destination TCP port implies IPPROTO=6 */ > + ODP_PMR_UDP_SPORT, /**< Source UDP Port (*val=uint16_t) */ > + ODP_PMR_TCP_SPORT, /**< Source TCP port (*val=uint16_t) */ > + ODP_PMR_SIP_ADDR, /**< Source IP address (uint32_t) */ > + ODP_PMR_DIP_ADDR, /**< Destination IP address (uint32_t) */ > + ODP_PMR_SIP6_ADDR, /**< Source IP address (uint8_t[16]) */ > + ODP_PMR_DIP6_ADDR, /**< Destination IP address (uint8_t[16]) */ > + ODP_PMR_IPSEC_SPI, /**< IPsec session identifier(*val=uint32_t) */ > + ODP_PMR_LD_VNI, /**< NVGRE/VXLAN network identifier > (*val=uint32_t) */ > + > + > + /** Inner header may repeat above values with this offset */ > + 
ODP_PMR_INNER_HDR_OFF=32 > +}; > +@endverbatim > + > +@subsubsection tunnel_considerations Tunnel Considerations > +Note that PMRs may be extended to support tunnels and tenants (NVGRE, > VXLAN) via the ODP_PMR_INNER_HDR_OFF enum. > +This enum is intended to be used as an “adder” to a PMR to indicate that the > term refers to an inner header. > +For example, the term ODP_PMR_DMAC would refer to the destination MAC > address of the packet if the packet is not a tunnel, or of the outer header > (the tunnel) if the packet is a tunnel. > +To refer to the inner (tenant) destination MAC, the term would be specified > as ODP_PMR_INNER_HDR_OFF+ODP_PMR_DMAC. > + > +@subsection pmr_apis PMR APIs > +The following APIs are provided to enable an ODP application to specify PMRs > as a series of individual or cascaded terms: > +@subsubsection pmr_create_match odp_pmr_create_match > +@verbatim > +/** > + * Create a packet match rule with mask and value > + * > + * @param term is one value of the enumerated values supported > + * @param val is the value to match against the packet header > + * in native byte order. > + * @param mask is the mask to indicate which bits of the header > + * should be matched ('1') and which should be ignored ('0') > + * @param val_sz size of the ‘val’ and ‘mask’ arguments, > + * that must match the value size requirement of the > + * specific ‘term’. > + * > + * @return a handle of the matching rule or ODP_PMR_INVAL on error > + */ > + > +odp_pmr_t odp_pmr_create_match(enum odp_pmr_term_e term, > + const void *val, const void *mask, size_t > val_sz); > +@endverbatim > + > +This routine creates a PMR that matches a single value to a term. > + > +@subsubsection pmr_create_range odp_pmr_create_range > +@verbatim > +/** > + * Create a packet match rule with value range > + * > + * @param term is one value of the enumerated values supported > + * @param val1 is the lower bound of the header field range. 
> + * @param val2 is the upper bound of the header field range. > + * @param val_sz size of the ‘val1’ and ‘val2’ arguments, > + * that must match the value size requirement of the > + * specific ‘term’. > + * > + * @return a handle of the matching rule or ODP_PMR_INVAL on error > + * @note: Range is inclusive [val1..val2]. > + */ > + > +odp_pmr_t odp_pmr_create_range(enum odp_pmr_term_e term, > + const void *val1, const void *val2, > size_t val_sz); > +@endverbatim > + > +This routine creates a PMR that matches an inclusive range of values to a > term. > + > +@subsubsection pmr_destroy odp_pmr_destroy > +@verbatim > +/** > + * Invalidate a packet match rule and release its resources > + * > + * @param pmr_id the identifier of the PMR to be destroyed > + * > + * @return Success or ODP_PMR_INVALID if the specified pmr_id is not found. > + */ > + > +int odp_pmr_destroy(odp_pmr_t pmr_id); > +@endverbatim > + > +This routine destroys a previously created PMR. > +If the PMR is currently associated with an active class of service, it is > unpredictable at which point in the packet flow the match defined by the PMR > is deactivated. > +However, implementations MUST ensure that a PMR is either matched or not > matched in its entirety such that dynamic changes to PMRs do not result in > partial matches. > + > +@subsubsection pktio_pmr_cos odp_pktio_pmr_cos > +@verbatim > +/** > + * Apply a PMR to a pktio to assign a CoS. > + * > + * @param pmr_id the id of the PMR to be activated > + * @param src_pktio the pktio to which this PMR is to be applied > + * @param dst_cos the CoS to be assigned by this PMR > + * > + * @return Success or ODP_PARAMETER_ERROR > + */ > + > +int odp_pktio_pmr_cos(odp_pmr_t pmr_id, odp_pktio_t src_pktio, odp_cos_t > dst_cos); > +@endverbatim > + > +This routine links a pktio to a corresponding class of service via a > specified PMR.
> +Any packet received on the specified src_pktio that matches the specified > pmr_id will be assigned to the specified dst_cos. > +If multiple PMRs match the implementation MAY define an inherent precedence > or it MAY be unpredictable as to which PMR will determine the assigned CoS. > +For this reason applications SHOULD NOT be written to use conflicting or > ambiguous PMR definitions. > + > +@subsubsection cos_pmr_cos odp_cos_pmr_cos > +@verbatim > +/** > + * Cascade a PMR to refine packets from one CoS to another. > + * > + * @param pmr_id the id of the PMR to be activated > + * @param src_cos the id of the CoS to be filtered > + * @param dst_cos the id of the CoS to be assigned to packets filtered > + * from src_cos that match pmr_id. > + * > + * @return Success or ODP_PARAMETER_ERROR if an input is in error > + * or ODP_IMPLEMENTATION_LIMIT if cascade depth is exceeded > + */ > + > +int odp_cos_pmr_cos(odp_pmr_t pmr_id, odp_cos_t src_cos, odp_cos_t dst_cos); > +@endverbatim > + > +This routine is used to cascade PMRs by passing packets assigned to the > src_cos through another PMR. > +Those matching are reassigned to the specified dst_cos. > +Note that this process can be repeated to an implementation-defined maximum > supported cascade depth. > +When cascades are defined, the actual class of service assigned to a packet > is the result of the longest chain of PMRs that can be matched against the > packet. > + > +For example, suppose the following sequence of PMRs is in effect: > + > +@verbatim > +odp_pktio_pmr_cos(pmr_idA, pktio_id, cos_idA); > +odp_cos_pmr_cos(pmr_idB, cos_idA, cos_idB); > +odp_cos_pmr_cos(pmr_idC, cos_idB, cos_idC); > +odp_cos_pmr_cos(pmr_idD, cos_idC, cos_idD); > +@endverbatim > + > +If a packet arrives on pktio_id that matches pmr_idA it is assigned to > cos_idA. > +But since it is now on cos_idA it is further filtered by pmr_idB and if it > matches is reassigned to cos_idB. 
> +This process continues until no further, more specific match is found; the > last CoS assigned in this way is the final CoS that the packet receives. > + > +Note that given this rule set, a packet that matches pmr_idA and pmr_idC but > not pmr_idB would be assigned to cos_idA, because the rule that can assign > packets to cos_idC is only applicable to packets that are assigned to > cos_idB, not cos_idA. > + > +Using cascaded PMRs it is possible to build quite sophisticated filters (up > to the implementation limits supported by a given platform). > +For example, one could add additional rules to the above set: > + > +@verbatim > +odp_cos_pmr_cos(pmr_idAC, cos_idA, cos_idC); > +odp_cos_pmr_cos(pmr_idAD, cos_idA, cos_idD); > +@endverbatim > + > +These rules cover cases where some packets on cos_idA should be further > sorted to cos_idB while others should be sorted directly to cos_idC or > cos_idD. > +Again, it is the application’s responsibility to ensure that the cascades > remain unambiguous and that loops are avoided (e.g., having rules that bounce > packets between cos_idA and cos_idB endlessly). > + > +@subsection pmr_stats PMR Statistics > +Conforming ODP implementations SHOULD maintain statistics regarding PMRs and > provide the following routines for retrieving them: > + > +@subsubsection pmr_match_count odp_pmr_match_count > +@verbatim > +/** > + * Retrieve packet matcher statistics > + * > + * @param pmr_id the id of the PMR from which to retrieve the count > + * > + * @return The current number of matches for a given matcher instance.
> + */ > + > +signed long odp_pmr_match_count(odp_pmr_t pmr_id); > +@endverbatim > + > +@subsubsection pmr_terms_cap odp_pmr_terms_cap > +@verbatim > +/** > + * Inquire about matching terms supported by the classifier > + * > + * @return A mask with one bit per enumerated term, one for each value of > odp_pmr_term_e > + */ > + > +unsigned long long odp_pmr_terms_cap(void); > +@endverbatim > + > +@subsubsection pmr_terms_avail odp_pmr_terms_avail > +@verbatim > +/** > + * Return the number of packet matching terms available for use > + * > + * @return The number of packet matcher resources available for use. > + */ > + > +unsigned odp_pmr_terms_avail(void); > +@endverbatim > + > +@subsection pmr_composite_rules Pattern Matching Composite Routines > +As a shorthand, applications MAY express pattern matching rules using a > table rather than constructing them term-by-term. > +ODP implementations MUST support both methods of rule specification but MAY > have implementation-specific restrictions on the complexity of table-based > rules they support. > +Note that some implementations MAY be able to implement tables directly > while others MAY choose to implement tables by internally generating the > equivalent set of term-generating calls. > + > +@subsubsection pmr_table_structure PMR Table Structure > +@verbatim > +/** > + * The following structure is used to define composite packet matching rules > + * in the form of an array of individual match or range rules. > + * The underlying platform may not support all or any specific combination > + * of value match or range rules, and the application should take care > + * to inspect the return value when installing such rules, and perform > + * appropriate fallback action.
> + */ > + > +typedef struct odp_pmr_match_t { > + enum odp_pmr_match_type_e { > + ODP_PMR_MASK, /**< Match a masked set of bits > */ > + ODP_PMR_RANGE, /**< Match an integer range */ > + } match_type; > + union { > + struct { > + enum odp_pmr_term_e term; > + const void *val; > + const void *mask; > + unsigned int val_sz; > + } mask; /**< Match a masked set of bits */ > + struct { > + enum odp_pmr_term_e term; > + const void *val1; > + const void *val2; > + unsigned int val_sz; > + } range; /**< Match an integer range */ > + }; > +} odp_pmr_match_t; > + > + > +/** An opaque handle to a composite packet match rule-set */ > +typedef struct odp_pmr_set_s *odp_pmr_set_t; > +@endverbatim > + > +The above structure is used with the following APIs to implement table-based > PMRs: > + > +@subsubsection pmr_match_set_create odp_pmr_match_set_create > +@verbatim > +/** > + * Create a composite packet match rule > + * > + * @param num_terms is the number of terms in the match rule. > + * @param terms is an array of num_terms entries, one entry per > + * term desired. > + * @param dst_cos is the class-of-service to be assigned to packets > + * that match the compound rule-set, or a subset thereof, > + * if partly applied. > + * @param pmr_set_id is the returned handle to the composite rule set. > + * > + * @return The return value may be a negative number indicating a general > + * error, or a positive number indicating the number of ‘terms’ elements that > + * have been successfully mapped to the underlying platform classification > engine, > + * and may be in the range from 1 to ‘num_terms’. > + */ > + > +int odp_pmr_match_set_create(int num_terms, odp_pmr_match_t *terms, > + odp_pmr_set_t *pmr_set_id); > +@endverbatim > + > +This routine is used to create a PMR match set. > +It is equivalent to a cascade of PMRs except that there are no > “intermediate” classes of service defined. > +Instead, the entire match set either matches or does not match as a single > entity.
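To make the table-based form easier to picture, here is a minimal sketch of how an application might fill a two-entry odp_pmr_match_t array (one masked match, one range) and interpret the documented return value of odp_pmr_match_set_create(). It rests on several assumptions: the type definitions are local, trimmed copies of the ones proposed in the patch, the enum values are placeholders, and build_udp_port_table() and pmr_set_fully_mapped() are hypothetical helper names that are not part of the proposed API. Nothing here calls into an ODP implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed local copy of the proposed enum; numeric values are placeholders. */
enum odp_pmr_term_e { ODP_PMR_DIP_ADDR, ODP_PMR_UDP_DPORT };

/* Local copy of the table entry type proposed in the patch. */
typedef struct odp_pmr_match_t {
    enum odp_pmr_match_type_e { ODP_PMR_MASK, ODP_PMR_RANGE } match_type;
    union {
        struct {
            enum odp_pmr_term_e term;
            const void *val;
            const void *mask;
            unsigned int val_sz;
        } mask;
        struct {
            enum odp_pmr_term_e term;
            const void *val1;
            const void *val2;
            unsigned int val_sz;
        } range;
    };
} odp_pmr_match_t;

/* Example values in native byte order: 10.0.1.0/24 and UDP ports 5000..5999. */
static const uint32_t dip_val  = 0x0a000100u;
static const uint32_t dip_mask = 0xffffff00u;
static const uint16_t dport_lo = 5000;
static const uint16_t dport_hi = 5999;

/* Hypothetical helper: fill a two-term table, return the number of terms
 * written, or -1 if the caller's array is too small. */
int build_udp_port_table(odp_pmr_match_t *terms, int max_terms)
{
    assert(terms != NULL);
    if (max_terms < 2)
        return -1;

    terms[0].match_type  = ODP_PMR_MASK;
    terms[0].mask.term   = ODP_PMR_DIP_ADDR;
    terms[0].mask.val    = &dip_val;
    terms[0].mask.mask   = &dip_mask;
    terms[0].mask.val_sz = (unsigned int)sizeof(dip_val);

    terms[1].match_type   = ODP_PMR_RANGE;
    terms[1].range.term   = ODP_PMR_UDP_DPORT;
    terms[1].range.val1   = &dport_lo;
    terms[1].range.val2   = &dport_hi;
    terms[1].range.val_sz = (unsigned int)sizeof(dport_lo);

    return 2;
}

/* Hypothetical helper: per the documented return convention, a negative
 * value is an error and 1..num_terms is the number of terms mapped. */
int pmr_set_fully_mapped(int rc, int num_terms)
{
    return rc > 0 && rc == num_terms;
}
```

An application would pass the filled array to odp_pmr_match_set_create() and, if fewer than num_terms entries were mapped, fall back to something else such as cascading individual PMRs for the remaining terms.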
> + > +@subsubsection pmr_match_set_destroy odp_pmr_match_set_destroy > +@verbatim > +/** > + * Function to delete a composite packet match rule set > + * > + * All of the resources pertaining to the match set associated with the > + * class-of-service will be released, but the class-of-service will > + * remain intact. > + * > + * @param pmr_set_id a composite rule-set handle returned when created. > + * > + * @note Depending on the implementation details, destroying a rule-set > + * may not guarantee the availability of hardware resources to create the > + * same or an essentially similar rule-set. > + */ > + > +int odp_pmr_match_set_destroy(odp_pmr_set_t pmr_set_id); > +@endverbatim > + > +This routine destroys a PMR match set previously created by > odp_pmr_match_set_create(). > + > +@subsubsection pktio_pmr_match_set_cos odp_pktio_pmr_match_set_cos > +@verbatim > +/** > + * Apply a PMR Match Set to a pktio to assign a CoS. > + * > + * @param pmr_set_id the id of the PMR match set to be activated > + * @param src_pktio the pktio to which this PMR match set is to be applied > + * @param dst_cos the CoS to be assigned by this PMR match set > + * > + * @return Success or ODP_PARAMETER_ERROR > + */ > + > +int odp_pktio_pmr_match_set_cos(odp_pmr_set_t pmr_set_id, odp_pktio_t > src_pktio, > + odp_cos_t dst_cos); > +@endverbatim > + > +This routine is the same as odp_pktio_pmr_cos() except that it operates on > PMR match sets rather than individual PMRs. > + > +@section items_pending Items pending resolution > +- Revise the ‘odp_packet_io.h’ API with respect to the default input queue > per ‘pktio’ instance. > +- Revise the ‘odp_queue.h’ API to support an arbitrary priority range, > typically 8 priority levels, where the numeric priority values are > platform-specific. > +- Add specific packet meta data fields to the packet buffer to hold all meta > data parsed and generated by the classifier, for later application use.
> + > +@section implementation_notes Implementation Notes > +The following sections are not part of the specification, but shed light on > the intent of the specification in several areas, describing some specific > implementation approaches to these aspects. > + > +@subsection supporting_multi_pools Supporting multiple buffer pools > +The support of multiple buffer pools for containing packet buffers is > optional, and may not be supported by some platforms. > +The importance of this feature stems from the need to protect a networking > application in the event of congestion, or an attempted denial of service > attack. > +Separating different classes of service to dedicated buffer pools allows the > system to limit the memory resources that may be consumed by a particular > type of traffic, thereby reserving buffer resources for other classes of > traffic. > + > +In a software implementation, a packet would already be stored in memory > when the classifier is invoked, and so it seems the classifier is unable to > insert itself into the process of selecting a buffer pool. > +Copying the packet into a new buffer allocated from a different pool would > cost an extra copy per packet, so it is not a desirable solution. > + > +The recommended solution is to implement buffer pools in the form of buffer > counters, while the actual buffers all belong to a single free list when not > used to store a packet. > +In such an implementation, the classifier will be able to associate a packet > already occupying a buffer to a different pool than the default by > incrementing the buffer counter of the newly selected pool, and decrementing > the counter representing the default pool. > +If however the selected pool counter has already reached a certain limit, > the classifier would be able to, e.g., discard the packet instead of > incrementing the destination pool counter, and thereby enforce the desirable > semantics of distinct buffer pools per class of service.
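The counter scheme described above might look roughly like the sketch below. This is only an illustration under assumptions: pool_counter_t, pool_reassign() and pool_release() are hypothetical names and not part of any ODP header, and a real implementation would need atomic counters or locking, which this single-threaded sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-pool accounting record; the buffers themselves are
 * assumed to live on one shared free list. */
typedef struct {
    unsigned in_use; /* buffers currently charged to this pool */
    unsigned limit;  /* hard limit on buffers charged to this pool */
} pool_counter_t;

/* Move the charge for one buffer from the default pool to the pool the
 * classifier selected. Returns false, leaving both counters untouched,
 * when the destination is at its limit; the caller would then discard
 * the packet and call pool_release() on the default pool. */
bool pool_reassign(pool_counter_t *from, pool_counter_t *to)
{
    assert(from != NULL && to != NULL);
    if (from == to)
        return true;
    if (to->in_use >= to->limit)
        return false;
    assert(from->in_use > 0);
    to->in_use++;
    from->in_use--;
    return true;
}

/* Return one buffer's charge to the pool it is currently accounted to. */
void pool_release(pool_counter_t *pool)
{
    assert(pool != NULL && pool->in_use > 0);
    pool->in_use--;
}
```

No packet data is copied at any point; only the two counters change, which is what makes the approach workable when the packet is already in memory by the time the classifier runs.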
> + > +Other possible actions that may be taken in response to running out of > buffers or running low on buffers include back-pressure and > random-early-detect with a discard probability inversely proportional to the > number of free buffers in a pool. > +A related implementation topic is the ability to begin dropping some packets > before a buffer pool is entirely exhausted. > +This is typically referred to as <em>Random Early Detect</em> (or “RED”). > +This is deemed to be a feature of the buffer pool implementation on a given > platform, where in addition to a hard limit on the number of buffers that can > be allocated to a pool, there can also be an option to discard packets with a > probability that increases as the number of outstanding buffers approaches > that hard limit. > + > +@subsection resolving_gaps Resolving gaps between the API and hardware > capabilities > +On platforms that support hardware packet accelerators, it is possible that > the packet parsing and classification functionality is sufficient to address > only a portion of the functionality specified within this document. > +This gap may potentially be bridged by augmenting the hardware > classification capabilities with software logic implemented as part of the > platform. > +In that case, the platform will have to carve out a fraction of the > processing resources and dedicate those to the software classification logic, > which would be invoked for packets that the hardware platform was unable to > classify completely. > +At the time of this writing, it is believed however that the performance > penalty that will be incurred as a result of software augmentation is > unjustified for most applications, i.e., > +it is preferable to lose the precision of packet prioritization while > maintaining full hardware packet processing speed.
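Going back to the Random Early Detect paragraph in the buffer pool notes, one way to read "a probability that increases as the number of outstanding buffers approaches that hard limit" is a linear ramp between a lower threshold and the hard limit. The sketch below assumes exactly that; the linear shape, the min_thresh parameter and both function names are illustrative choices that the document does not specify. The random sample is passed in by the caller so that the decision itself stays deterministic.

```c
#include <assert.h>

/* Hypothetical helper: drop probability in parts per thousand.
 * 0 at or below min_thresh, 1000 at or above limit, linear in between. */
unsigned red_drop_permille(unsigned in_use, unsigned min_thresh, unsigned limit)
{
    assert(min_thresh < limit);
    if (in_use <= min_thresh)
        return 0;
    if (in_use >= limit)
        return 1000;
    return (unsigned)(((unsigned long long)(in_use - min_thresh) * 1000ULL) /
                      (limit - min_thresh));
}

/* Hypothetical helper: decide whether to drop, given a caller-supplied
 * random sample that is reduced to the range 0..999. */
int red_should_drop(unsigned in_use, unsigned min_thresh, unsigned limit,
                    unsigned sample)
{
    return (sample % 1000u) < red_drop_permille(in_use, min_thresh, limit);
}
```

With min_thresh = 50 and limit = 100, a pool holding 75 buffers gives a drop probability of 500 per thousand, so roughly half of the arriving packets would be discarded at that fill level.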
> + > +@subsection loopback_case The case for loopback ports, and some of their uses > +In some applications, it may be desirable to be able to run a single packet > through the classifier more than once. > +For example, an encrypted IPsec packet is received from a physical port. > +The encrypted packet is assigned a class of service based on its outer > unencrypted header fields. > +Later, processing the packet entails decrypting the payload of the packet, > authenticating it, and removing the original outer headers, which reveals a > new set of protocol headers which need to be used to re-classify the packet, > and assign it a new priority and buffer pool. > +An elegant solution for this use case would be to take advantage of > “loopback” logical ports that may be implemented in certain platforms, by > transmitting the decapsulated packet into a loopback port. > +The same packet is then received from the loopback port and is examined by > the classifier in accordance with the rules assigned to the loopback > odp_pktio logical port instance. > +A similar mechanism may be applied to tunnel termination processing, > fragment reassembly, etc. > + > +@section related_topics Related Topics > +The following section discusses aspects of the ODP API that are not integral > to the classifier, which only applies to ingress preprocessing. > +This section covers miscellaneous aspects of the API that need to be > addressed, and are related to packet buffer processing and egress > post-processing. > +Additional packet buffer manipulation APIs > +The need for the following calls arises from encapsulation and > decapsulation, i.e., removing some headers and adding others, which changes > the size of the headers of a packet during processing.
> + > +@subsection initial_headroom Configuring initial packet buffer headroom > +The following function is provided to configure the pktio receive mechanism > to (optionally) reserve some headroom between the start of the first buffer > and the first byte of packet data, which subsequently could be used to > increase the header size “in-place”, without allocating additional gather > list elements. > +If the request is granted, at least <req_bytes> bytes will be reserved in > front of the packet data: > +@verbatim > +int odp_pktio_set_headroom(odp_pktio_t port_id, unsigned req_bytes); > +@endverbatim > +The return value should be negative if the request cannot be satisfied, or > positive otherwise indicating the actual minimum headroom reserved. > +Note that the implementation may reserve more than the requested amount of > headroom, and hence on platforms that are unable to support per-port (or per > CoS) headroom configuration, a system-wide headroom configuration may be set > to the largest of all such requests, and thus satisfy the requirement. > +In addition to the above per-port headroom configuration call, there should > be an optional, per-CoS call that allows the reservation of different amounts > of packet buffer headroom for packets that match certain criteria: for > example, the following call allows the application to request that only > packets that are expected to be encapsulated in a tunnel be augmented with a > large headroom amount, while packets that are received from a tunnel, and are > IP fragments, be assigned a different headroom requirement (see definition > for odp_cos_set_headroom() above). > +Egress packet scheduling, prioritization and ordering > + > + > +Open Issues > +* Parallel matching rules relative precedence. > +* Specify application-defined header field declaration APIs.
> +* Review RFC 4301 for match requirements for IPsec SA, consider the use of > L4 port ranges instead of or in addition to value & mask matching criteria. > +* Consider which types of packet checks should route a packet through the > error CoS: L2 is a safe choice, but L3/L4 checksum or other exceptions > deserve consideration. > +Usage Examples > +The following is a simple sample configuration using the API elements > described above. > +TBD. TBD?? Cheers, Anders
