Re: [dpdk-dev] [RFC 17.08] flow_classify: add librte_flow_classify library

Ferruh Yigit Thu, 18 May 2017 05:13:14 -0700

On 5/17/2017 5:10 PM, Ananyev, Konstantin wrote:
>>> Hi Ferruh,
>>> Please see my comments/questions below.
>>> Thanks
>>> Konstantin
>>>
>>>> +
>>>> +/**
>>>> + * @file
>>>> + *
>>>> + * RTE Flow Classify Library
>>>> + *
>>>> + * This library provides flow record information with some measured properties.
>>>> + *
>>>> + * Application can select from a variety of flow types based on various flow keys.
>>>> + *
>>>> + * Library only maintains flow records between rte_flow_classify_stats_get()
>>>> + * calls and with a maximum limit.
>>>> + *
>>>> + * Flow records are provided as a linked list of rte_flow_classify_stat_xxx
>>>> + * structures.
>>>> + *
>>>> + * Library is responsible for allocating and freeing memory for the flow record
>>>> + * table. The previous table is freed on the next rte_flow_classify_stats_get()
>>>> + * call, and all tables are freed with rte_flow_classify_type_reset() or
>>>> + * rte_flow_classify_type_set(x, 0). Memory for the table is allocated on the
>>>> + * fly while creating records.
>>>> + *
>>>> + * A rte_flow_classify_type_set() with a valid type will register Rx/Tx
>>>> + * callbacks and start filling flow record table.
>>>> + * With rte_flow_classify_stats_get(), the pointer is sent to the caller while
>>>> + * the library continues collecting records.
>>>> + *
>>>> + *  Usage:
>>>> + *  - application calls rte_flow_classify_type_set() for a device
>>>> + *  - library creates Rx/Tx callbacks for packets and starts filling flow table
>>>
>>> Is it necessary to use an RX callback here?
>>> Could the library provide an API like
>>> collect(port_id, input_mbuf[], pkt_num) instead?
>>> Then the user would have a choice: either set up a callback or call
>>> collect() directly.
>>
>> This was also a comment from Morten; I will update the RFC to use a direct
>> API call.
>>
>>>
>>>> + *    for that type of flow (currently only one flow type supported)
>>>> + *  - application calls rte_flow_classify_stats_get() to get a pointer to the
>>>> + *    linked-list flow table. Library assigns this pointer to another value and
>>>> + *    keeps collecting flow data. On the next rte_flow_classify_stats_get(), the
>>>> + *    library first frees the previous table, passes the current table to the
>>>> + *    application, and keeps collecting data.
>>>
>>> Ok, but that means that you can't use stats_get() for the same type
>>> from 2 different threads without explicit synchronization?
>>
>> Correct.
>> And multiple threads shouldn't be calling this API. It doesn't store
>> previous flow data, so each thread calling it would only get a piece
>> of the information. Do you see any use case where multiple threads would
>> call this API?
> 
> One example would be when you have multiple queues per port,
> managed/monitored by different cores.
> BTW, how are you going to collect the stats in that way?
> 
>>
>>>
>>>> + *  - application calls rte_flow_classify_type_reset(), library unregisters the
>>>> + *    callbacks and frees all flow table data.
>>>> + *
>>>> + */
>>>> +
>>>> +enum rte_flow_classify_type {
>>>> +  RTE_FLOW_CLASSIFY_TYPE_GENERIC = (1 << 0),
>>>> +  RTE_FLOW_CLASSIFY_TYPE_MAX,
>>>> +};
>>>> +
>>>> +#define RTE_FLOW_CLASSIFY_TYPE_MASK (((RTE_FLOW_CLASSIFY_TYPE_MAX - 1) << 1) - 1)
>>>> +
>>>> +/**
>>>> + * Global configuration struct
>>>> + */
>>>> +struct rte_flow_classify_config {
>>>> +  uint32_t type; /* bitwise enum rte_flow_classify_type values */
>>>> +  void *flow_table_prev;
>>>> +  uint32_t flow_table_prev_item_count;
>>>> +  void *flow_table_current;
>>>> +  uint32_t flow_table_current_item_count;
>>>> +} rte_flow_classify_config[RTE_MAX_ETHPORTS];
>>>> +
>>>> +#define RTE_FLOW_CLASSIFY_STAT_MAX UINT16_MAX
>>>> +
>>>> +/**
>>>> + * Classification stats data struct
>>>> + */
>>>> +struct rte_flow_classify_stat_generic {
>>>> +  struct rte_flow_classify_stat_generic *next;
>>>> +  uint32_t id;
>>>> +  uint64_t timestamp;
>>>> +
>>>> +  struct ether_addr src_mac;
>>>> +  struct ether_addr dst_mac;
>>>> +  uint32_t src_ipv4;
>>>> +  uint32_t dst_ipv4;
>>>> +  uint8_t l3_protocol_id;
>>>> +  uint16_t src_port;
>>>> +  uint16_t dst_port;
>>>> +
>>>> +  uint64_t packet_count;
>>>> +  uint64_t packet_size; /* bytes */
>>>> +};
>>>
>>> OK, so if I understood things right, for the generic type it will always
>>> classify all incoming packets by:
>>> <src_mac, dst_mac, src_ipv4, dst_ipv4, l3_protocol_id, src_port, dst_port>
>>> all by absolute values, and represent the results as a linked list.
>>> Is that correct, or did I misunderstand your intentions here?
>>
>> Correct.
>>
>>> If so, then I see several disadvantages here:
>>> 1) It is really hard to predict what kind of stats will be required for any
>>> particular case.
>>> Let's say some people would like to collect stats by <dst_mac, vlan>,
>>> others by <dst_ipv4, subnet_mask>, a third group by <l4 dst_port>, and so on.
>>> Having just one hardcoded filter doesn't seem very flexible/usable.
>>> I think you need to find a way to allow users to define what type of filter
>>> they want to apply.
>>
>> The flow type should be provided by applications, according to their needs,
>> and needs to be implemented in this library. The generic one will be the
>> only one implemented in the first version:
>> enum rte_flow_classify_type {
>>      RTE_FLOW_CLASSIFY_TYPE_GENERIC = (1 << 0),
>>      RTE_FLOW_CLASSIFY_TYPE_MAX,
>> };
>>
>>
>> App should set the type first via the API:
>> rte_flow_classify_type_set(uint8_t port_id, uint32_t type);
>>
>>
>> And the stats for this type will be returned. Because the returned struct
>> can differ per type, it is returned as void:
>> rte_flow_classify_stats_get(uint8_t port_id, void *stats);
> 
> I understand that, but it means that for every different filter a user wants
> to use, someone has to update the library: define a new type and write a new
> piece of code to handle it.
> That seems inflexible and not extensible at all from the user's perspective.
> Even HW allows some flexibility with RX filters.
> Why not allow the user to specify the classification filter he/she wants for
> that particular case, the way both rte_flow and rte_acl work?
> 
>>
>>> I think it was discussed already, but I still wonder why rte_flow_item
>>> can't be used for that approach?
>>
>>
>>> 2) Even one 10G port can produce ~14M rte_flow_classify_stat_generic
>>> entries in one second
>>> (if all packets have different ipv4/ports or so).
>>> Accessing/retrieving items over a linked list with 14M entries doesn't
>>> sound like a good idea.
>>> I'd say we need some better way to retrieve/present the collected data.
>>
>> This is to keep flows, so I expect the numbers to be lower than
>> the packet counts.
> 
> That was an extreme example to show how badly the selected approach could
> behave.
> What I am trying to say: we need a way to collect and retrieve stats in a
> quick and easy way.
> Let's say the user has just invoked stats_get(port=0, type=generic).
> Now he is interested in the stats for a particular dst_ip only.
> The only way to get them: walk over the whole list stats_get() returned and
> examine each entry one by one.
> 
> I think it would be much better to have something like:
> 
> struct rte_flow_stats {timestamp; packet_count; packet_bytes; ..};
> 
> <fill rte_flow_item (or something else) to define desired filter>
> 
> filter_id = rte_flow_stats_register(.., &rte_flow_item);
> ....
> struct rte_flow_stats stats;
> rte_flow_stats_get(..., filter_id, &stats);
> 
> That allows the user to define the flows to collect stats for.
> Again, in that case you don't need to worry about when/where to destroy the
> previous version of your stats.


Apart from using rte_flow, the above suggests, instead of:
- set key/filter
- poll collect()
- call stats_get() whenever the app wants

using:
- poll stats_get(key/filter);

Especially after switching from callbacks to polling, this makes sense,
because the application will already have to make continuous calls to this
library. Merging set filter/collect/stats_get into the same function saves
the library from storing/deleting stats until the app asks for them, as you
mentioned above.
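
A rough sketch of what such a merged call could look like. This is
illustration only, not the API the updated RFC will define: every name in it
(pkt_meta, flow_filter, flow_stats, flow_stats_poll) is hypothetical, the
filter is reduced to a masked destination IPv4 match, and a plain struct
stands in for the mbuf so the snippet builds without DPDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields a real implementation would parse
 * out of an mbuf; used here so the sketch builds without DPDK. */
struct pkt_meta {
	uint32_t dst_ipv4; /* host byte order, for simplicity */
	uint32_t len;      /* packet length in bytes */
};

/* Hypothetical filter: match destination IPv4 under a mask. */
struct flow_filter {
	uint32_t dst_ipv4;
	uint32_t dst_mask;
};

/* Caller-owned counters; the library keeps no table of its own. */
struct flow_stats {
	uint64_t packet_count;
	uint64_t packet_bytes;
};

/* Classify one burst against one filter and accumulate into *stats.
 * Returns the number of packets in this burst that matched. */
static inline uint16_t
flow_stats_poll(const struct flow_filter *filter,
		const struct pkt_meta *pkts, uint16_t nb_pkts,
		struct flow_stats *stats)
{
	uint16_t i, matched = 0;

	for (i = 0; i < nb_pkts; i++) {
		if ((pkts[i].dst_ipv4 & filter->dst_mask) !=
		    (filter->dst_ipv4 & filter->dst_mask))
			continue;
		stats->packet_count++;
		stats->packet_bytes += pkts[i].len;
		matched++;
	}
	return matched;
}
```

Since the counters live in the caller's struct, there is nothing for the
library to allocate, swap or free between calls, and each polling core can
keep its own flow_stats without extra synchronization.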

So, I will update the RFC accordingly.

> Of course the open question is how to treat packets that match more than
> one flow (priority/insertion order/something else?), but I suppose we'll
> need to deal with that question anyway.
>  
> Konstantin
> 
>> It is possible to use fixed size arrays for this. But I think it is easy
>> to make this switch later; I would like to see the performance effect
>> before doing it. Do you think it is OK to start like this and
>> make that decision during implementation?
