from:"Ivan Boule"

[dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists

2016-06-15 Thread Ivan Boule

On 06/15/2016 04:27 PM, Ananyev, Konstantin wrote:
>
>
>> -Original Message-
>> From: Richardson, Bruce
>> Sent: Wednesday, June 15, 2016 3:22 PM
>> To: Ananyev, Konstantin
>> Cc: Ivan Boule; Thomas Monjalon; Pattan, Reshma; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx 
>> callback lists
>>
>> On Wed, Jun 15, 2016 at 03:20:27PM +0100, Ananyev, Konstantin wrote:
>>>
>>>
>>>> -Original Message-
>>>> From: Ivan Boule [mailto:ivan.boule at 6wind.com]
>>>> Sent: Wednesday, June 15, 2016 3:07 PM
>>>> To: Richardson, Bruce; Ananyev, Konstantin
>>>> Cc: Thomas Monjalon; Pattan, Reshma; dev at dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx 
>>>> callback lists
>>>>
>>>> On 06/15/2016 03:29 PM, Bruce Richardson wrote:
>>>>> On Wed, Jun 15, 2016 at 12:40:16PM +, Ananyev, Konstantin wrote:
>>>>>> Hi Ivan,
>>>>>>
>>>>>>> -Original Message-
>>>>>>> From: Ivan Boule [mailto:ivan.boule at 6wind.com]
>>>>>>> Sent: Wednesday, June 15, 2016 1:15 PM
>>>>>>> To: Thomas Monjalon; Ananyev, Konstantin
>>>>>>> Cc: Pattan, Reshma; dev at dpdk.org
>>>>>>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect 
>>>>>>> Rx/Tx callback lists
>>>>>>>
>>>>>>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
>>>>>>>
>>>>>>>>>
>>>>>>>>>> I think the read access would need locking but we do not want it
>>>>>>>>>> in fast path.
>>>>>>>>>
>>>>>>>>> I don't think it would be needed.
>>>>>>>>> As I said - read/write interaction didn't change from what we have 
>>>>>>>>> right now.
>>>>>>>>> But if you have some particular scenario in mind that you believe 
>>>>>>>>> would cause
>>>>>>>>> a race condition - please speak up.
>>>>>>>>
>>>>>>>> If we add/remove a callback during a burst? Is it possible that the 
>>>>>>>> next
>>>>>>>> pointer would have a wrong value leading to a crash?
>>>>>>>> Maybe we need a comment to state that we should not alter burst
>>>>>>>> callbacks while running burst functions.
>>>>>>>>
>>>>>>>
>>>>>>> Hi Reshma,
>>>>>>>
>>>>>>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
>>>>>>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
>>>>>>> of RX callbacks associated with the polled RX queue to safely remove RX
>>>>>>> callback(s) in parallel.
>>>>>>> The problem is not [only] with the setting and the loading of "cb->next"
>>>>>>> that you assume to be atomic operations, which is certainly true on most
>>>>>>> CPUs.
>>>>>>> I see the 2 important following issues:
>>>>>>>
>>>>>>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
>>>>>>> RX callback could still be accessed in the callback parsing loop of the
>>>>>>> function "rte_eth_rx_burst()" after having been freed in parallel.
>>>>>>>
>>>>>>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
>>>>>>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
>>>>>>> does not free the "rte_eth_rxtx_callback" data structure associated with
>>>>>>> the removed callback !
>>>>>>
>>>>>> Yes, though it is documented behaviour, someone can probably
>>>>>> refer it as a feature, not a bug ;)
>>>>>>
>>>>>
>>>>> +1
>>>>> This is definitely not a bug, this is absolutely by design. One may argue 
>>>>> with
>>>>> the design, but it was done for a definite reason, so as to avoid paying 
>>>>> the
>>>>> penalty of

[dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists

2016-06-15 Thread Ivan Boule

On 06/15/2016 10:48 AM, Thomas Monjalon wrote:

>>
>>> I think the read access would need locking but we do not want it
>>> in fast path.
>>
>> I don't think it would be needed.
>> As I said - read/write interaction didn't change from what we have right now.
>> But if you have some particular scenario in mind that you believe would cause
>> a race condition - please speak up.
>
> If we add/remove a callback during a burst? Is it possible that the next
> pointer would have a wrong value leading to a crash?
> Maybe we need a comment to state that we should not alter burst
> callbacks while running burst functions.
>

Hi Reshma,

You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the 
function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list 
of RX callbacks associated with the polled RX queue to safely remove RX 
callback(s) in parallel.
The problem is not [only] with the setting and the loading of "cb->next" 
that you assume to be atomic operations, which is certainly true on most 
CPUs.
I see the 2 important following issues:

1) the "rte_eth_rxtx_callback" data structure associated with a removed 
RX callback could still be accessed in the callback parsing loop of the 
function "rte_eth_rx_burst()" after having been freed in parallel.

BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE 
MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that 
does not free the "rte_eth_rxtx_callback" data structure associated with 
the removed callback !

2) As a consequence of 1), RX callbacks can be invoked/executed 
while/after being removed.
If the application must free resources that it dynamically allocated to 
be used by the RX callback being removed, how to guarantee that the last 
invocation of that RX callback has been completed and that such a 
callback will never be invoked again, so that the resources can safely 
be freed?

This is an example of a well-known more generic object deletion problem 
which must arrange to guarantee that a deleted object is not used and 
not accessible for use anymore before being actually deleted (freed, for 
instance).

Note that a lock cannot be used in the execution path of the 
rte_eth_rx_burst() function to address this issue, as locks MUST NEVER 
be introduced in the RX/TX path of the DPDK framework.

Of course, the same issues stand for TX callbacks.

Regards,
Ivan

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in macswap loop

2016-05-03 Thread Ivan Boule

On 05/03/2016 11:45 AM, Bruce Richardson wrote:
> On Mon, May 02, 2016 at 05:29:37PM +0530, Jerin Jacob wrote:
>> prefetch the next packet data address in advance in macswap loop
>> for performance improvement.
>>
>> ...
>>  for (i = 0; i < nb_rx; i++) {
>> +if (likely(i < nb_rx - 1))
>> +rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
>> +   void *));
>
> At least on IA platforms, there is no issue with prefetching beyond the end of
> the array, since it's only a hint to the cpu. If this is true for other 
> platforms,
> then I suggest we just drop the conditional and just always prefetch.

This is an interesting point.
Bruce, are you suggesting that prefetching at an invalid [virtual] 
address won't trigger a CPU exception?

Ivan

>
> /Bruce
>
>>  mb = pkts_burst[i];
>>  eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *);

[dpdk-dev] ixgbe : query regarding your code changes for VF mac add

2016-04-26 Thread Ivan Boule

mplete
>>
>> [edit]
>> root at mx86-bgl-2-r1# show interfaces
>> ge-0/0/0 {
>>  unit 0 {
>>  family inet {
>>      address 10.1.1.102/24;
>>  }
>>  }
>> }
>>
>> [edit]
>> root at mx86-bgl-2-r1# run ping 10.1.1.101
>> PING 10.1.1.101 (10.1.1.101): 56 data bytes
>> ^C
>> --- 10.1.1.101 ping statistics ---
>> 3 packets transmitted, 0 packets received, 100% packet loss
>>
>> [edit]
>> root at mx86-bgl-2-r1#
>>
>> b.
>>   Logs at console for above cofig
>>   -
>>
>>
>> root at localhost:/home/pfe# ixgbevf_remove_mac_addr santosh portid=0
>> index=1 mac=00:50:56:A0:A0:C3
>>
>>
>>
>> Thanks
>> Santosh
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Fri, Apr 22, 2016 at 12:51 PM, Ivan Boule  wrote:
>>> Hi Santosh,
>>>
>>> My job at 6WIND does not consist in answering to DPDK questions. In general,
>>> I have other priorities, including vacations...
>>> In the meantime, nobody prevents you to add traces in the code to really
>>> understand what happens, as suggested in my last answer.
>>>
>>> Regards,
>>> Ivan
>>>
>>>
>>> On 04/21/2016 07:55 AM, santosh wrote:
>>>>
>>>> Hi Ivan and team,
>>>>
>>>> Please respond to my last mail and  let me know if there is any
>>>> alternate way to handle this.
>>>> Our release is in pending due to this issue.
>>>>
>>>>
>>>> Thanks & Regards
>>>> Santosh
>>>>
>>>> On Wed, Apr 20, 2016 at 2:35 PM, santosh  wrote:
>>>>>
>>>>> Hi Ivan,
>>>>>
>>>>> Thanks for your response.
>>>>>
>>>>> Let me explain you the issue that we are facing on our virtual router
>>>>> in VMware environment.
>>>>>
>>>>> 1. We are using ixgbe driver , SRIOV enabled .
>>>>>   root at localhost:~# lspci
>>>>>  "Intel Corporation 82599 Ethernet Controller Virtual
>>>>> Function"
>>>>>
>>>>> 2.  "mx86-bgl-1-r1"  is our router under testing and  R2 is a standard
>>>>> router.
>>>>>
>>>>> mx86-bgl-1-r1 is connected to R2
>>>>> mx86-bgl-1-r1 (10.1.1.103)-->R2(10.1.1.101)
>>>>>
>>>>> 2. This time ping test passes
>>>>>
>>>>> root at mx86-bgl-1-r1# show interfaces
>>>>> ge-0/0/0 {
>>>>>   unit 0 {
>>>>>   family inet {
>>>>>   address 10.1.1.103/24;
>>>>>   }
>>>>>   }
>>>>> }
>>>>>
>>>>> [edit]
>>>>> root at mx86-bgl-1-r1#
>>>>> root at mx86-bgl-1-r1# run ping 10.1.1.101
>>>>> PING 10.1.1.101 (10.1.1.101): 56 data bytes
>>>>> 64 bytes from 10.1.1.101: icmp_seq=0 ttl=64 time=0.980 ms
>>>>> 64 bytes from 10.1.1.101: icmp_seq=1 ttl=64 time=1.433 ms
>>>>>
>>>>>
>>>>> 3.  Default MAC address of interface ge-0/0/0  is as shown below:
>>>>>
>>>>> root at mx86-bgl-1-r1# run show interfaces ge-0/0/0 extensive | grep 
>>>>> current
>>>>> Current address: 02:06:0a:0e:ff:f0, Hardware address:
>>>>> 02:06:0a:0e:ff:f0
>>>>>
>>>>> [edit]
>>>>> root at mx86-bgl-1-r1#
>>>>>
>>>>>
>>>>> 4.  Now I am changing MAC. Ping works this time also
>>>>>
>>>>> root at mx86-bgl-1-r1# set interfaces ge-0/0/0 mac 02:06:0a:0a:ff:f0
>>>>>
>>>>> [edit]
>>>>> root at mx86-bgl-1-r1# commit
>>>>> commit complete
>>>>>
>>>>> [edit]
>>>>> root at mx86-bgl-1-r1# run show interfaces ge-0/0/0 extensive | grep 
>>>>> current
>>>>> Current address: 02:06:0a:0a:ff:f0, Hardware address:
>>>>> 02:06:0a:0e:ff:f0
>>>>>
>>>>> [edit]
>>>>> root at mx86-bgl-1-r1# show interfaces
>>>>> ge-0/0/0 {
>>>>>   mac 02:06:0a:0a:ff:f0;
>>>>>   unit 0 {
>>>>

[dpdk-dev] ixgbe : query regarding your code changes for VF mac add

2016-04-22 Thread Ivan Boule

ct ether_addr)) == 
>> 0) {
>>  PMD_DRV_LOG(DEBUG, "Existing MAC \n");
>>  printf("Santosh %s returning \n",__FUNCTION__);
>>  return;
>>  }
>>
>>
>> 8.  If I remove the above "return" statement, build the driver, loaded
>> in router and repeat the steps-2 to steps-5 of MAC add and MAC delete
>> on our router then ping passes.
>>
>> root at mx86-bgl-1-r1# run ping 10.1.1.101
>> PING 10.1.1.101 (10.1.1.101): 56 data bytes
>> 64 bytes from 10.1.1.101: icmp_seq=0 ttl=64 time=2.356 ms
>> 64 bytes from 10.1.1.101: icmp_seq=1 ttl=64 time=1.415 ms
>> 64 bytes from 10.1.1.101: icmp_seq=2 ttl=64 time=1.226 ms
>> ^C
>> --- 10.1.1.101 ping statistics ---
>> 3 packets transmitted, 3 packets received, 0% packet loss
>> round-trip min/avg/max/stddev = 1.226/1.666/2.356/0.494 ms
>>
>>
>> 9.  Please let me know whether is it  fine to remove that return
>> statement added by you in  "ixgbevf_add_mac_addr"  ?
>>
>>
>>
>> Thanks & Regards,
>> Santosh
>>
>> On Wed, Apr 20, 2016 at 3:31 AM, Ivan Boule  wrote:
>>> Hi Santosh,
>>>
>>> I do not get exactly what you attempt to do on a VF.
>>> Are you first deleting the so-called permanent MAC address by a call to the
>>> function ixgbevf_remove_mac_addr() ? This operation is not allowed.
>>> Can you explain exactly the sequence of operations that are done, so that I
>>> can understand how the test (memcmp(hw->mac.perm_addr, mac_addr,
>>> sizeof(struct ether_addr)) == 0) in the function ixgbevf_add_mac_addr()
>>> prevents them to be successfully performed.
>>>
>>> Ivan
>>>
>>> PS : please, can you CC your emails to dev at dpdk.org
>>>
>>>
>>> 2016-04-19 17:01 GMT+02:00 santosh :
>>>>
>>>> Hi Ivan,
>>>>
>>>> Your following code changes causing issue in Vmware environment.
>>>>
>>>> --- ---
>>>> --
>>>> + /*
>>>> + * On a 82599 VF, adding again the same MAC addr is not an idempotent
>>>> + * operation. Trap this case to avoid exhausting the [very limited]
>>>> + * set of PF resources used to store VF MAC addresses.
>>>> + */
>>>> + if (memcmp(hw->mac.perm_addr, mac_addr, sizeof(struct ether_addr)) == 0)
>>>> + return;
>>>>diag = ixgbevf_set_uc_addr_vf(hw, 2, mac_addr->addr_bytes);
>>>> 
>>>> - ---
>>>>
>>>>
>>>> Issue:
>>>> At CLI , we have provision to add /del MAC of an interface.
>>>> During MAC delete, existing MAC is deleted and default MAC is applied.
>>>> This default MAC is not being applied in VMware environment
>>>> successfully due to "return" statement
>>>> in your above code changes. As a result traffic is stopped completely.
>>>> If I remove above
>>>> "return" statement then traffic continues to flow after MAC delete.
>>>>
>>>> Please let me know your suggestion to handle this scenario .  If I
>>>> remove "return" what will be the consequences ?
>>>>
>>>> If removing "return" statement is not good idea then what are other
>>>> way to handle MAC delete scenario ?  we have only 1 VF per PF in our
>>>> setup as of now.
>>>>
>>>>
>>>> Thanks
>>>> Santosh
>>>
>>>
>>>
>>>
>>> --
>>> Ivan BOULE
>>> 6WIND
>>> Software Engineer
>>>
>>> Tel: +33 1 39 30 92 47
>>> Mob: + 33 6 77 25 26 38
>>> Fax: +33 1 39 30 92 11
>>> ivan.boule at 6wind.com
>>> www.6wind.com
>>> Join the Multicore Packet Processing Forum:
>>> www.multicorepacketprocessing.com
>>>
>>> Ce courriel ainsi que toutes les pi?ces jointes, est uniquement destin? ?
>>> son ou ses destinataires. Il contient des informations confidentielles qui
>>> sont la propri?t? de 6WIND. Toute r?v?lation, distribution ou copie des
>>> informations qu'il contient est strictement interdite. Si vous avez re?u ce
>>> message par erreur, veuillez imm?diatement le signaler ? l'?metteur et
>>> d?truire toutes les donn?es re?ues.
>>>
>>> This e-mail message, including any attachments, is for the sole use of the
>>> intended recipient(s) and contains information that is confidential and
>>> proprietary to 6WIND. All unauthorized review, use, disclosure or
>>> distribution is prohibited. If you are not the intended recipient, please
>>> contact the sender by reply e-mail and destroy all copies of the original
>>> message.
>>>


-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] ixgbe : query regarding your code changes for VF mac add

2016-04-22 Thread Ivan Boule

e not from
> NIC.
>Current address: 02:06:0a:0e:ff:f0, Hardware address: 02:06:0a:0e:ff:f0
>
> [edit]
> root at mx86-bgl-1-r1# run ping 10.1.1.101
> PING 10.1.1.101 (10.1.1.101): 56 data bytes
> ^C
> 2 packets transmitted, 0 packets received, 100% packet loss
>
> [edit]
> root at mx86-bgl-1-r1#
>
>
> 7. LOGs:(I have added a debug log just before the return statement
> that you place)
>
> Log o/p in failure case
> 
> Santosh ixgbevf_add_mac_addr returning
> 
>
> code changes:
> -
> ixgbevf_add_mac_addr()  {
> ...
>  if (memcmp(hw->mac.perm_addr, mac_addr, sizeof(struct ether_addr)) == 0) 
> {
>  PMD_DRV_LOG(DEBUG, "Existing MAC \n");
>  printf("Santosh %s returning \n",__FUNCTION__);
>  return;
>  }
>
>
> 8.  If I remove the above "return" statement, build the driver, loaded
> in router and repeat the steps-2 to steps-5 of MAC add and MAC delete
> on our router then ping passes.
>
> root at mx86-bgl-1-r1# run ping 10.1.1.101
> PING 10.1.1.101 (10.1.1.101): 56 data bytes
> 64 bytes from 10.1.1.101: icmp_seq=0 ttl=64 time=2.356 ms
> 64 bytes from 10.1.1.101: icmp_seq=1 ttl=64 time=1.415 ms
> 64 bytes from 10.1.1.101: icmp_seq=2 ttl=64 time=1.226 ms
> ^C
> --- 10.1.1.101 ping statistics ---
> 3 packets transmitted, 3 packets received, 0% packet loss
> round-trip min/avg/max/stddev = 1.226/1.666/2.356/0.494 ms
>
>
> 9.  Please let me know whether is it  fine to remove that return
> statement added by you in  "ixgbevf_add_mac_addr"  ?
>
>
>
> Thanks & Regards,
> Santosh
>
> On Wed, Apr 20, 2016 at 3:31 AM, Ivan Boule  wrote:
>> Hi Santosh,
>>
>> I do not get exactly what you attempt to do on a VF.
>> Are you first deleting the so-called permanent MAC address by a call to the
>> function ixgbevf_remove_mac_addr() ? This operation is not allowed.
>> Can you explain exactly the sequence of operations that are done, so that I
>> can understand how the test (memcmp(hw->mac.perm_addr, mac_addr,
>> sizeof(struct ether_addr)) == 0) in the function ixgbevf_add_mac_addr()
>> prevents them to be successfully performed.
>>
>> Ivan
>>
>> PS : please, can you CC your emails to dev at dpdk.org
>>
>>
>> 2016-04-19 17:01 GMT+02:00 santosh :
>>>
>>> Hi Ivan,
>>>
>>> Your following code changes causing issue in Vmware environment.
>>>
>>> --- ---
>>> --
>>> + /*
>>> + * On a 82599 VF, adding again the same MAC addr is not an idempotent
>>> + * operation. Trap this case to avoid exhausting the [very limited]
>>> + * set of PF resources used to store VF MAC addresses.
>>> + */
>>> + if (memcmp(hw->mac.perm_addr, mac_addr, sizeof(struct ether_addr)) == 0)
>>> + return;
>>>    diag = ixgbevf_set_uc_addr_vf(hw, 2, mac_addr->addr_bytes);
>>> 
>>> - ---
>>>
>>>
>>> Issue:
>>> At CLI , we have provision to add /del MAC of an interface.
>>> During MAC delete, existing MAC is deleted and default MAC is applied.
>>> This default MAC is not being applied in VMware environment
>>> successfully due to "return" statement
>>> in your above code changes. As a result traffic is stopped completely.
>>> If I remove above
>>> "return" statement then traffic continues to flow after MAC delete.
>>>
>>> Please let me know your suggestion to handle this scenario .  If I
>>> remove "return" what will be the consequences ?
>>>
>>> If removing "return" statement is not good idea then what are other
>>> way to handle MAC delete scenario ?  we have only 1 VF per PF in our
>>> setup as of now.
>>>
>>>
>>> Thanks
>>> Santosh
>>
>>
>>
>>
>> --
>> Ivan BOULE
>> 6WIND
>> Software Engineer
>>
>> Tel: +33 1 39 30 92 47
>> Mob: + 33 6 77 25 26 38
>> Fax: +33 1 39 30 92 11
>> ivan.boule at 6wind.com
>> www.6wind.com
>> Join the Multicore Packet Processing Forum:
>> www.multicorepacketprocessing.com
>>
>> Ce courriel ainsi que toutes les pi?ces jointes, est uniquement destin? ?
>> son ou ses destinataires. Il contient des informations confidentielles qui
>> sont la propri?t? de 6WIND. Toute r?v?lation, distribution ou copie des
>> informations qu'il contient est strictement interdite. Si vous avez re?u ce
>> message par erreur, veuillez imm?diatement le signaler ? l'?metteur et
>> d?truire toutes les donn?es re?ues.
>>
>> This e-mail message, including any attachments, is for the sole use of the
>> intended recipient(s) and contains information that is confidential and
>> proprietary to 6WIND. All unauthorized review, use, disclosure or
>> distribution is prohibited. If you are not the intended recipient, please
>> contact the sender by reply e-mail and destroy all copies of the original
>> message.
>>


-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] ixgbe : query regarding your code changes for VF mac add

2016-04-20 Thread Ivan Boule

Hi Santosh,

I do not get exactly what you attempt to do on a VF.
Are you first deleting the so-called permanent MAC address by a call to the
function ixgbevf_remove_mac_addr() ? This operation is not allowed.
Can you explain exactly the sequence of operations that are done, so that I
can understand how the test (memcmp(hw->mac.perm_addr, mac_addr,
sizeof(struct ether_addr)) == 0) in the function ixgbevf_add_mac_addr()
prevents them to be successfully performed.

Ivan

PS : please, can you CC your emails to dev at dpdk.org


2016-04-19 17:01 GMT+02:00 santosh :

> Hi Ivan,
>
> Your following code changes causing issue in Vmware environment.
>
> --- ---
> --
> + /*
> + * On a 82599 VF, adding again the same MAC addr is not an idempotent
> + * operation. Trap this case to avoid exhausting the [very limited]
> + * set of PF resources used to store VF MAC addresses.
> + */
> + if (memcmp(hw->mac.perm_addr, mac_addr, sizeof(struct ether_addr)) == 0)
> + return;
>   diag = ixgbevf_set_uc_addr_vf(hw, 2, mac_addr->addr_bytes);
> 
> - ---
>
>
> Issue:
> At CLI , we have provision to add /del MAC of an interface.
> During MAC delete, existing MAC is deleted and default MAC is applied.
> This default MAC is not being applied in VMware environment
> successfully due to "return" statement
> in your above code changes. As a result traffic is stopped completely.
> If I remove above
> "return" statement then traffic continues to flow after MAC delete.
>
> Please let me know your suggestion to handle this scenario .  If I
> remove "return" what will be the consequences ?
>
> If removing "return" statement is not good idea then what are other
> way to handle MAC delete scenario ?  we have only 1 VF per PF in our
> setup as of now.
>
>
> Thanks
> Santosh
>



-- 
Ivan BOULE
6WIND
Software Engineer

Tel: +33 1 39 30 92 47
Mob: + 33 6 77 25 26 38
Fax: +33 1 39 30 92 11
ivan.boule at 6wind.com
www.6wind.com
Join the Multicore Packet Processing Forum:
www.multicorepacketprocessing.com

Ce courriel ainsi que toutes les pi?ces jointes, est uniquement destin? ?
son ou ses destinataires. Il contient des informations confidentielles qui
sont la propri?t? de 6WIND. Toute r?v?lation, distribution ou copie des
informations qu'il contient est strictement interdite. Si vous avez re?u ce
message par erreur, veuillez imm?diatement le signaler ? l'?metteur et
d?truire toutes les donn?es re?ues.

This e-mail message, including any attachments, is for the sole use of the
intended recipient(s) and contains information that is confidential and
proprietary to 6WIND. All unauthorized review, use, disclosure or
distribution is prohibited. If you are not the intended recipient, please
contact the sender by reply e-mail and destroy all copies of the original
message.

[dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api

2016-02-12 Thread Ivan Boule

On 02/12/2016 12:44 PM, Ananyev, Konstantin wrote:
>
>>
>>> -Original Message-
...
>>
>> In that case we don't need to make any changes at rte_ethdev.[h,c] to 
>> alloc/free/maintain tx_buffer inside each queue...
>> It all will be upper layer responsibility.
>> So no need to modify existing rte_ethdev structures/code.
>> Again, no need for error callback - caller would check return value and 
>> decide what to do with unsent packets in the tx_buffer.
>>
>
> Just to summarise why I think it is better to have tx buffering managed on 
> the app level:
>
> 1. avoid any ABI change.
> 2. Avoid extra changes in rte_ethdev.c: tx_queue_setup/tx_queue_stop.
> 3. Provides much more flexibility to the user:
> a) where to allocate space for tx_buffer (stack, heap, hugepages, etc).
> b) user can mix and match plain tx_burst() and   
> tx_buffer/tx_buffer_flush()
>  in any way he fills it appropriate.
> c) user can change the size of tx_buffer without stop/re-config/start 
> queue:
>  just allocate new larger(smaller) tx_buffer & copy contents to the 
> new one.
> d) user can preserve buffered packets through device restart circle:
>  i.e if let say TX hang happened, and user has to do 
> dev_stop/dev_start -
>  contents of tx_buffer will stay unchanged and its contents could be
>  (re-)transmitted after device is up again, or  through different 
> port/queue if needed.
>
> As a drawbacks mentioned - tx error handling becomes less transparent...
> But we can add error handling routine and it's user provided parameter
> into struct rte_eth_dev_tx_buffer', something like that:
>
> +struct rte_eth_dev_tx_buffer {
> + buffer_tx_error_fn cbfn;
> + void *userdata;
> + unsigned nb_pkts;
> + uint64_t errors;
> + /**< Total number of queue packets to sent that are dropped. */
> + struct rte_mbuf *pkts[];
> +};
>
> Konstantin
>

Just to enforce Konstantin's comments.
As a very basic - not to say fundamental - rule, one should avoid adding 
in the PMD RX/TX API any extra processing that can be handled at a 
higher level.
The only and self-sufficient reason is that we must avoid impacting 
performances on the critical path, in particular for those - usually the 
majority of - applications that do not need such extra operations, or 
better implement them at upper level.

Maybe in a not so far future will come a proposal for forking a new open 
source fast-dpdk project aiming at providing API simplicity, 
zero-overhead, modular design, and all those nice properties that every 
one claims to seek :-)

Ivan

[dpdk-dev] [PATCH v2 9/9] pci: implement automatic bind/unbind

2016-02-03 Thread Ivan Boule

On 01/29/2016 03:49 PM, David Marchand wrote:
> Reuse pci hook to implement automatic bind / unbind.
> The more I look at this, the more I think this should go to the PMDs
> themselves (with options per devices to control this), with EAL offering
> helpers to achieve this.

This sounds good to me.
As a basic software design rule, the decisions of using a given resource 
and/or performing a given operation (the intelligent knowledge) should 
systematically be taken at the maximum possible software level.
Conversely, such high-level decisions should be offered the most 
appropriate low-level services to facilitate their implementation.
In the case we are considering here - the need to bind a PCI device - 
the Poll Mode Driver of that device is the only component which actually 
knows.
Conversely, the EAL layer is the best place where to implement the
OS-dependent and/or architecture-dependent device binding services to be
invoked by PMDs at PCI device probing time.

My 2 cents,
Ivan

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] where to find ethernet CRC when stripping is off

2016-01-21 Thread Ivan Boule

Hi Francesco,

On 01/20/2016 06:15 PM, Montorsi, Francesco wrote:
> Hi Ivan,
>
>
>> -Original Message-
>> You would be right... if the PMDs did not transparently strip the CRC in
>> software when hardware CRC stripping is disabled at port configuration (as
>> described above).
>> See for instance how the function ixgbe_recv_pkts_lro() in file
>> drivers/net/ixgbe/ixgbe_rxtx.c deals with crc_len.
>
> Yeah, I see. However, I wonder what's the utility of the hw_strip_crc feature 
> if finally it is completely masked to the mbuf user.
> However, to my understanding, looking at that ixgbe code, I think that what I 
> wrote before:
>
> uint32_t crc = *(rte_pktmbuf_mtod_offset (mymbuf, uint32_t*, 
> mymbuf->pkt_len)) ;
>
> should work, since the pkt_len and data_len has the "crc_len" removed, but 
> the CRC itself should be there.

In most cases, yes. But not in the [rare] case where the CRC would have 
been the only data stored into the last segment of a scattered input 
packet. See how the function ixgbe_recv_pkts_lro() checks this 
particular case, and free the associated mbuf.

> I know it is kind of an hack, but at least for ixgbe that sounds like a 
> possible (temporary) solution for me
>
>
>> Considering your need, I think now that PMDs should keep the CRC that are
>> stored in received packets when hardware CRC stripping is disabled by the
>> application, so that the application can access it as needed.
>>
> Yes, that would be very useful.
>
>> Note that this would impose that the input packet processing of such DPDK
>> applications be aware of the CRC presence (+4 in the packet length , for
>> instance).
> Or perhaps, to maintain backward compatibility, just a flag inside the mbuf 
> could be set that informs the user that at the end of the mbuf packet, you 
> can find 4 bytes with the CRC.
>
>>
>> Let's see what others, if any, that might care think about such a change into
>> the CRC stripping semantics.
>
> Thanks!
> Francesco
>


-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] where to find ethernet CRC when stripping is off

2016-01-20 Thread Ivan Boule

On 01/20/2016 04:02 PM, Montorsi, Francesco wrote:
> Hi all,
>
> I need to get access to the Ethernet CRC of received packets.
> To do this, I'm configuring:
>
> port_conf.rxmode.hw_strip_crc = 0;
>
> Now my question is: how am I supposed to access the Ethernet CRC from a DPDK 
> mbuf?
> Is the CRC just the 4 final bytes of the packets?
>
> Is this correct:
>
> uint32_t crc = rte_pktmbuf_mtod_offset (mymbuf, uint32_t*, 
> mymbuf->pkt_len) ;
>
> ?
>
> Thanks,
> Francesco Montorsi
>
Hi Francesco,

You would be right... if the PMDs did not transparently strip the CRC in 
software when hardware CRC stripping is disabled at port configuration 
(as described above).
See for instance how the function ixgbe_recv_pkts_lro() in file 
drivers/net/ixgbe/ixgbe_rxtx.c deals with crc_len.

Considering your need, I think now that PMDs should keep the CRC that 
are stored in received packets when hardware CRC stripping is disabled 
by the application, so that the application can access it as needed.

Note that this would impose that the input packet processing of such 
DPDK applications be aware of the CRC presence (+4 in the packet length 
, for instance).

Let's see what others, if any, that might care think about such a change 
into the CRC stripping semantics.

Ivan

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH v3] doc: announce ABI change for struct rte_eth_conf

2015-12-24 Thread Ivan Boule

Hi Jijiang,

See my comments inline below prefixewd with IB>
Ivan

On 12/18/2015 03:00 AM, Liu, Jijiang wrote:
> Hi Boule,
>
>> -Original Message-
>> From: Ivan Boule [mailto:ivan.boule at 6wind.com]
>> Sent: Tuesday, December 15, 2015 4:50 PM
>> To: Liu, Jijiang
>> Cc: dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v3] doc: announce ABI change for struct
>> rte_eth_conf
>>
>> On 12/14/2015 08:48 AM, Jijiang Liu wrote:
>>> In current codes, tunnel configuration information is not stored in a device
>> configuration, and it will get nothing if application want to retrieve tunnel
>> config, so I think it is necessary to add rte_eth_dev_tunnel_configure()
>> function is to do the configurations including flow and classification
>> information and store it in a device configuration.
>>>
>>> And tunneling packet encapsulation operation will benifit from the change.
>>>
>>> There are more descriptions for the ABI changes below,
>>>
>>> The struct 'rte_eth_tunnel_conf' is a new, its defination like below,
>>> struct rte_eth_tunnel_conf {
>>>  uint16_t tx_queue;
>>>  uint16_t filter_type;
>>>  struct rte_eth_tunnel_flow flow_tnl; };
>>>
>>> The ABI change announcement of struct 'rte_eth_tunnel_flow' have
>> already sent out, refer to [1].
>>>
>>> [1]http://dpdk.org/ml/archives/dev/2015-December/029837.html.
>>>
>>> The change of struct 'rte_eth_conf' like below, but it have not finalized 
>>> yet.
>>> struct rte_eth_conf {
>>> ...
>>> uint32_t dcb_capability_en;
>>> struct rte_fdir_conf fdir_conf; /**< FDIR configuration. */
>>> struct rte_intr_conf intr_conf; /**< Interrupt mode configuration. */
>>> struct rte_eth_tunnel_conf
>> *tunnel_conf[RTE_MAX_QUEUES_PER_PORT];
>>> /**< Tunnel configuration. */
>>> };
>>>
>>> v2 change:
>>> Add more description for the change.
>>>
>>> v3 change:
>>> Change ABI announcement description.
>>>
>>> Signed-off-by: Jijiang Liu  ---cmdline.c
>>>doc/guides/rel_notes/deprecation.rst |6 ++
>>>1 files changed, 6 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/doc/guides/rel_notes/deprecation.rst
>>> b/doc/guides/rel_notes/deprecation.rst
>>> index 5c458f2..9dbe89e 100644
>>> --- a/doc/guides/rel_notes/deprecation.rst
>>> +++ b/doc/guides/rel_notes/deprecation.rst
>>> @@ -23,3 +23,9 @@ Deprecation Notices
>>>* ABI changes are planned for struct rte_eth_tunnel_flow in order to
>> extend new fileds to support
>>>  tunneling packet configuration in unified tunneling APIs. The release 
>>> 2.2
>> does not contain these ABI
>>>  changes, but release 2.3 will, and no backwards compatibility is 
>>> planned.
>>> +
>>> +* ABI changes are planned for the struct rte_eth_conf in order to add
>>> +'tunnel_conf' variable
>>> +  in the struct to store tunnel configuration when using new API
>>> +rte_eth_dev_tunnel_configure
>>> +  (uint8_t port_id, uint16_t rx_queue, struct rte_eth_tunnel_conf *
>>> +tunnel_conf) to configure
>>> +  tunnel flow and classification information. The release 2.2 does
>>> +not contain these ABI change,
>>> +  but release 2.3 will, and no backward compatibility is planned.
>>>
>> Hi Jijiang,
>>
>> Can you provide a real use case - I mean an example of a real network
>> application - that really needs to save tunnel configurations in a data
>> structure associated with a NIC port?
>
> I'm trying to provide a tunneling packet solution in DPDK, which would 
> accelerate de/encapsulation operation of tunneling packet.
IB> I was asking for an example of an application that needs to SAVE in 
the DPDK structure associated with a port a tunnel configuration that it 
applies to that port.
Where does that saved tunnel configuration will participate to the 
acceleration of decap/encap ops?

>
> It was described at [1],
> [1] http://dpdk.org/ml/archives/dev/2015-December/030283.html
>
>
> Let me provide more details on this, these data structure definition have not 
> fully finalized yet, just for your reference.
> We are talking about why tunnel configuration need to be stored.
IB? yes :-)

>
> For NIC A RX process,
> VM 0--->VTEP A---> VXLAN network--->VTEP B---NIC A (Rx queue 1 with info [1] 
> )--->SW decapsulation--->vSwitch--->VM 0
>
> For NI

[dpdk-dev] [PATCH v3] doc: announce ABI change for struct rte_eth_conf

2015-12-15 Thread Ivan Boule

On 12/14/2015 08:48 AM, Jijiang Liu wrote:
> In current codes, tunnel configuration information is not stored in a device 
> configuration, and it will get nothing if application want to retrieve tunnel 
> config, so I think it is necessary to add rte_eth_dev_tunnel_configure() 
> function is to do the configurations including flow and classification 
> information and store it in a device configuration.
>
> And tunneling packet encapsulation operation will benifit from the change.
>
> There are more descriptions for the ABI changes below,
>
> The struct 'rte_eth_tunnel_conf' is a new, its defination like below,
> struct rte_eth_tunnel_conf {
> uint16_t tx_queue;
> uint16_t filter_type;
> struct rte_eth_tunnel_flow flow_tnl;
> };
>
> The ABI change announcement of struct 'rte_eth_tunnel_flow' have already sent 
> out, refer to [1].
>
> [1]http://dpdk.org/ml/archives/dev/2015-December/029837.html.
>
> The change of struct 'rte_eth_conf' like below, but it have not finalized yet.
> struct rte_eth_conf {
>   ...
>   uint32_t dcb_capability_en;
>   struct rte_fdir_conf fdir_conf; /**< FDIR configuration. */
>   struct rte_intr_conf intr_conf; /**< Interrupt mode configuration. */
>   struct rte_eth_tunnel_conf *tunnel_conf[RTE_MAX_QUEUES_PER_PORT];
>   /**< Tunnel configuration. */
> };
>
> v2 change:
>Add more description for the change.
>
> v3 change:
>Change ABI announcement description.
>
> Signed-off-by: Jijiang Liu 
> ---cmdline.c
>   doc/guides/rel_notes/deprecation.rst |6 ++
>   1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index 5c458f2..9dbe89e 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -23,3 +23,9 @@ Deprecation Notices
>   * ABI changes are planned for struct rte_eth_tunnel_flow in order to extend 
> new fileds to support
> tunneling packet configuration in unified tunneling APIs. The release 2.2 
> does not contain these ABI
> changes, but release 2.3 will, and no backwards compatibility is planned.
> +
> +* ABI changes are planned for the struct rte_eth_conf in order to add 
> 'tunnel_conf' variable
> +  in the struct to store tunnel configuration when using new API 
> rte_eth_dev_tunnel_configure
> +  (uint8_t port_id, uint16_t rx_queue, struct rte_eth_tunnel_conf * 
> tunnel_conf) to configure
> +  tunnel flow and classification information. The release 2.2 does not 
> contain these ABI change,
> +  but release 2.3 will, and no backward compatibility is planned.
>
Hi Jijiang,

Can you provide a real use case - I mean an example of a real network 
application - that really needs to save tunnel configurations in a data 
structure associated with a NIC port?

Firstly, if the only usage is to enable applications to retrieve tunnel
configurations, then you are simply growing the size of the per-port
structure with tunnel configurations, and imposing it to every DPDK 
application.
You impose it to those applications that don't care about tunneling, but 
also to those applications which do care, but which prefer to have their 
own representation of ports where they store everything they need to.

If the tunnel configuration is also used for other purposes, then it
must be precisely described what happens with the saved tunnel 
configuration when the application changes the state of a port.
This is the case for instance when the application reconfigures the 
number of RX queues of a port.
Who is responsible for checking that some tunnels won't be matched anymore?
Who is responsible for dropping/invalidating the saved tunnel 
configuration, if such operations must be performed?
This list is likely to be not exhaustive, of course.

More globally, all side-effects of saving the tunnel configuration must 
be considered and addressed in a coherent way and in an easy-to-use 
approach.

By the way, as far as I know, the Linux kernel does not [need to] save 
tunnel data or ARP entries behind "netdevice" structures.

PS : in the "rte_eth_tunnel_conf" data structure, I think that the first 
field should be named "rx_queue" instead of "tx_queue".

Regards,
Ivan

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH 0/5] multicast address filtering

2015-05-29 Thread Ivan Boule

On 05/28/2015 06:21 PM, Stephen Hemminger wrote:
> On Thu, 28 May 2015 17:05:18 +0200
> Ivan Boule  wrote:
>
>> Introduce PMD API to set the list of multicast MAC addresses filtered
>> by a port.
>> Implemented in the following PMDs: igb, igbvf, em, ixgbe, and ixgbevf.
>> Implementation for physical PMDs i40e, i40evf, enic, and fm10k left
>> to their respective maintainers.
>>
>> Ivan Boule (5):
>>ethdev: add multicast address filtering
>>app/testpmd: new command to add/remove multicast MAC addresses
>>e1000: add multicast MAC address filtering
>>ixgbe: add multicast MAC address filtering
>>app/testpmd: fix reply to a multicast ICMP request
>>
>>   app/test-pmd/cmdline.c   |   52 ++
>>   app/test-pmd/config.c|  142 
>> ++
>>   app/test-pmd/icmpecho.c  |   65 +++--
>>   app/test-pmd/testpmd.h   |6 ++
>>   drivers/net/e1000/em_ethdev.c|   17 +
>>   drivers/net/e1000/igb_ethdev.c   |   18 +
>>   drivers/net/ixgbe/ixgbe_ethdev.c |   32 +
>>   lib/librte_ether/rte_ethdev.c|   17 +
>>   lib/librte_ether/rte_ethdev.h|   26 +++
>>   9 files changed, 369 insertions(+), 6 deletions(-)
>>
>
> Looks good, could you also add support for virtio and vmxnet3?
>
As for physical NICs (i40e, etc.) listed above, I let the maintainers of 
the remaining NICs where this function is relevant to implement and to 
test it.
By the way, I supposed that Guest front-end vNICs were always in 
promiscuous mode by construction.
Said differently: that all packets supplied "from the outside" to the 
Host vNIC back-end driver were systematically delivered to the Guest 
vNIC front-end, whatever their destination MAC address, VLAN id., etc.
Did I missed something ?

[dpdk-dev] [PATCH v2 1/5] ethdev: add multicast address filtering

2015-05-29 Thread Ivan Boule

With the current PMD API, the receipt of multicast packets on a given
port can only be enabled by invoking the "rte_eth_allmulticast_enable"
function.
This method may not work on Virtual Functions in SR-IOV architectures
when the host PF driver does not allow such operation on VFs.
In such cases, joined multicast addresses must be individually added
in the set of multicast addresses that are filtered by the [VF] port.

For this purpose, a new function "set_mc_addr_list" is introduced
into the set of functions that are exported by a Poll Mode Driver.

Signed-off-by: Ivan Boule 
---
v2:
Use the dedicated function "rte_eth_dev_is_valid_port" to check
the "port_id" parameter in "rte_eth_dev_set_mc_addr_list".

 lib/librte_ether/rte_ethdev.c |   17 +
 lib/librte_ether/rte_ethdev.h |   26 ++
 2 files changed, 43 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 024fe8b..8001b16 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3628,3 +3628,20 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t 
queue_id,
/* Callback wasn't found. */
return -EINVAL;
 }
+
+int
+rte_eth_dev_set_mc_addr_list(uint8_t port_id,
+struct ether_addr *mc_addr_set,
+uint32_t nb_mc_addr)
+{
+   struct rte_eth_dev *dev;
+
+   if (!rte_eth_dev_is_valid_port(port_id)) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -ENODEV;
+   }
+
+   dev = _eth_devices[port_id];
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_mc_addr_list, -ENOTSUP);
+   return dev->dev_ops->set_mc_addr_list(dev, mc_addr_set, nb_mc_addr);
+}
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 16dbe00..04c192d 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1228,6 +1228,10 @@ typedef int (*eth_udp_tunnel_del_t)(struct rte_eth_dev 
*dev,
struct rte_eth_udp_tunnel *tunnel_udp);
 /**< @internal Delete tunneling UDP info */

+typedef int (*eth_set_mc_addr_list_t)(struct rte_eth_dev *dev,
+ struct ether_addr *mc_addr_set,
+ uint32_t nb_mc_addr);
+/**< @internal set the list of multicast addresses on an Ethernet device */

 #ifdef RTE_NIC_BYPASS

@@ -1386,6 +1390,7 @@ struct eth_dev_ops {
/** Get current RSS hash configuration. */
rss_hash_conf_get_t rss_hash_conf_get;
eth_filter_ctrl_t  filter_ctrl;  /**< common filter 
control*/
+   eth_set_mc_addr_list_t set_mc_addr_list; /**< set list of mcast addrs */
 };

 /**
@@ -3615,4 +3620,25 @@ int rte_eth_remove_tx_callback(uint8_t port_id, uint16_t 
queue_id,
 }
 #endif

+/**
+ * Set the list of multicast addresses to filter on an Ethernet device.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param mc_addr_set
+ *   The array of multicast addresses to set. Equal to NULL when the function
+ *   is invoked to flush the set of filtered addresses.
+ * @param nb_mc_addr
+ *   The number of multicast addresses in the *mc_addr_set* array. Equal to 0
+ *   when the function is invoked to flush the set of filtered addresses.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENODEV) if *port_id* invalid.
+ *   - (-ENOTSUP) if PMD of *port_id* doesn't support multicast filtering.
+ *   - (-ENOSPC) if *port_id* has not enough multicast filtering resources.
+ */
+int rte_eth_dev_set_mc_addr_list(uint8_t port_id,
+struct ether_addr *mc_addr_set,
+uint32_t nb_mc_addr);
+
 #endif /* _RTE_ETHDEV_H_ */
-- 
1.7.10.4

[dpdk-dev] [PATCH 1/5] ethdev: add multicast address filtering

2015-05-29 Thread Ivan Boule

On 05/28/2015 06:22 PM, Stephen Hemminger wrote:
> On Thu, 28 May 2015 17:05:19 +0200
> Ivan Boule  wrote:
>
>> +if (port_id >= nb_ports) {
>> +PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
>> +return -ENODEV;
>> +}
>> +
>
> Use rte_eth_dev_is_valid_port() function instead.
>
I missed that.
Thanks.

[dpdk-dev] [PATCH 5/5] app/testpmd: fix reply to a multicast ICMP request

2015-05-28 Thread Ivan Boule

Set the IP source and destination addresses in the IP header of the
ICMP reply as follows:
  - Use the request IP source address as the reply IP destination address
  - If the request IP destination address is a multicast IP address
  - choose a reply IP source address different from the request IP
source address,
  - re-compute the IP header checksum.
Otherwise
  - switch the request IP source and destination addresses in the
reply,
  - keep the IP header checksum unchanged.

Signed-off-by: Ivan Boule 
---
 app/test-pmd/icmpecho.c |   65 ++-
 1 file changed, 59 insertions(+), 6 deletions(-)

diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index 010c5a9..9e6f5e9 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -272,6 +272,30 @@ ipv4_addr_dump(const char *what, uint32_t be_ipv4_addr)
printf("%s", buf);
 }

+static uint16_t
+ipv4_hdr_cksum(struct ipv4_hdr *ip_h)
+{
+   uint16_t *v16_h;
+   uint32_t ip_cksum;
+
+   /*
+* Compute the sum of successive 16-bit words of the IPv4 header,
+* skipping the checksum field of the header.
+*/
+   v16_h = (uint16_t *) ip_h;
+   ip_cksum = v16_h[0] + v16_h[1] + v16_h[2] + v16_h[3] +
+   v16_h[4] + v16_h[6] + v16_h[7] + v16_h[8] + v16_h[9];
+
+   /* reduce 32 bit checksum to 16 bits and complement it */
+   ip_cksum = (ip_cksum & 0x) + (ip_cksum >> 16);
+   ip_cksum = (ip_cksum & 0x) + (ip_cksum >> 16);
+   ip_cksum = (~ip_cksum) & 0x;
+   return (ip_cksum == 0) ? 0x : (uint16_t) ip_cksum;
+}
+
+#define is_multicast_ipv4_addr(ipv4_addr) \
+   (((rte_be_to_cpu_32((ipv4_addr)) >> 24) & 0x00FF) == 0xE0)
+
 /*
  * Receive a burst of packets, lookup for ICMP echo requets, and, if any,
  * send back ICMP echo replies.
@@ -295,6 +319,7 @@ reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
uint16_t vlan_id;
uint16_t arp_op;
uint16_t arp_pro;
+   uint32_t cksum;
uint8_t  i;
int l2_len;
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
@@ -442,19 +467,47 @@ reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
/*
 * Prepare ICMP echo reply to be sent back.
 * - switch ethernet source and destinations addresses,
-* - switch IPv4 source and destinations addresses,
+* - use the request IP source address as the reply IP
+*destination address,
+* - if the request IP destination address is a multicast
+*   address:
+* - choose a reply IP source address different from the
+*   request IP source address,
+* - re-compute the IP header checksum.
+*   Otherwise:
+* - switch the request IP source and destination
+*   addresses in the reply IP header,
+* - keep the IP header checksum unchanged.
 * - set IP_ICMP_ECHO_REPLY in ICMP header.
-* No need to re-compute the IP header checksum.
-* Reset ICMP checksum.
+* ICMP checksum is computed by assuming it is valid in the
+* echo request and not verified.
 */
ether_addr_copy(_h->s_addr, _addr);
ether_addr_copy(_h->d_addr, _h->s_addr);
ether_addr_copy(_addr, _h->d_addr);
ip_addr = ip_h->src_addr;
-   ip_h->src_addr = ip_h->dst_addr;
-   ip_h->dst_addr = ip_addr;
+   if (is_multicast_ipv4_addr(ip_h->dst_addr)) {
+   uint32_t ip_src;
+
+   ip_src = rte_be_to_cpu_32(ip_addr);
+   if ((ip_src & 0x0003) == 1)
+   ip_src = (ip_src & 0xFFFC) | 0x0002;
+   else
+   ip_src = (ip_src & 0xFFFC) | 0x0001;
+   ip_h->src_addr = rte_cpu_to_be_32(ip_src);
+   ip_h->dst_addr = ip_addr;
+   ip_h->hdr_checksum = ipv4_hdr_cksum(ip_h);
+   } else {
+   ip_h->src_addr = ip_h->dst_addr;
+   ip_h->dst_addr = ip_addr;
+   }
icmp_h->icmp_type = IP_ICMP_ECHO_REPLY;
-   icmp_h->icmp_cksum = 0;
+   cksum = ~icmp_h->icmp_cksum & 0x;
+   cksum += ~htons(IP_ICMP_ECHO_REQUEST << 8) & 0x;
+   cksum += htons(IP_ICMP_ECHO_REPLY << 8);
+   cksum = (cksum & 0x) + (cksum >> 16);
+   cksum = (cksum & 0x) + (cksum >> 16);
+   icmp_h->icmp_cksum = ~cksum;
pkts_burst[nb_replies++] = pkt;
}

-- 
1.7.10.4

[dpdk-dev] [PATCH 4/5] ixgbe: add multicast MAC address filtering

2015-05-28 Thread Ivan Boule

Support the function "set_mc_addr_list" in the "ixgbe" and in the
"ixgbe-vf" Poll Mode Drivers.

Signed-off-by: Ivan Boule 
---
 drivers/net/ixgbe/ixgbe_ethdev.c |   32 
 1 file changed, 32 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 0d9f9b2..885ed8f 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -257,6 +257,10 @@ static int ixgbe_dev_filter_ctrl(struct rte_eth_dev *dev,
 void *arg);
 static int ixgbevf_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu);

+static int ixgbe_dev_set_mc_addr_list(struct rte_eth_dev *dev,
+ struct ether_addr *mc_addr_set,
+ uint32_t nb_mc_addr);
+
 /*
  * Define VF Stats MACRO for Non "cleared on read" register
  */
@@ -381,6 +385,7 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
.rss_hash_update  = ixgbe_dev_rss_hash_update,
.rss_hash_conf_get= ixgbe_dev_rss_hash_conf_get,
.filter_ctrl  = ixgbe_dev_filter_ctrl,
+   .set_mc_addr_list = ixgbe_dev_set_mc_addr_list,
 };

 /*
@@ -406,6 +411,7 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
.tx_queue_release = ixgbe_dev_tx_queue_release,
.mac_addr_add = ixgbevf_add_mac_addr,
.mac_addr_remove  = ixgbevf_remove_mac_addr,
+   .set_mc_addr_list = ixgbe_dev_set_mc_addr_list,
 };

 /**
@@ -4439,6 +4445,32 @@ ixgbe_dev_filter_ctrl(struct rte_eth_dev *dev,
return ret;
 }

+static u8 *
+ixgbe_dev_addr_list_itr(__attribute__((unused)) struct ixgbe_hw *hw,
+   u8 **mc_addr_ptr, u32 *vmdq)
+{
+   u8 *mc_addr;
+
+   *vmdq = 0;
+   mc_addr = *mc_addr_ptr;
+   *mc_addr_ptr = (mc_addr + sizeof(struct ether_addr));
+   return mc_addr;
+}
+
+static int
+ixgbe_dev_set_mc_addr_list(struct rte_eth_dev *dev,
+  struct ether_addr *mc_addr_set,
+  uint32_t nb_mc_addr)
+{
+   struct ixgbe_hw *hw;
+   u8 *mc_addr_list;
+
+   hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   mc_addr_list = (u8 *)mc_addr_set;
+   return ixgbe_update_mc_addr_list(hw, mc_addr_list, nb_mc_addr,
+ixgbe_dev_addr_list_itr, TRUE);
+}
+
 static struct rte_driver rte_ixgbe_driver = {
.type = PMD_PDEV,
.init = rte_ixgbe_pmd_init,
-- 
1.7.10.4

[dpdk-dev] [PATCH 3/5] e1000: add multicast MAC address filtering

2015-05-28 Thread Ivan Boule

Support the PMD function "set_mc_addr_list" in the "igb", "igb-vf",
and "em" Poll Mode Drivers.

Signed-off-by: Ivan Boule 
---
 drivers/net/e1000/em_ethdev.c  |   17 +
 drivers/net/e1000/igb_ethdev.c |   18 ++
 2 files changed, 35 insertions(+)

diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index d28030e..2392942 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -116,6 +116,10 @@ static void eth_em_rar_set(struct rte_eth_dev *dev, struct 
ether_addr *mac_addr,
uint32_t index, uint32_t pool);
 static void eth_em_rar_clear(struct rte_eth_dev *dev, uint32_t index);

+static int eth_em_set_mc_addr_list(struct rte_eth_dev *dev,
+  struct ether_addr *mc_addr_set,
+  uint32_t nb_mc_addr);
+
 #define EM_FC_PAUSE_TIME 0x0680
 #define EM_LINK_UPDATE_CHECK_TIMEOUT  90  /* 9s */
 #define EM_LINK_UPDATE_CHECK_INTERVAL 100 /* ms */
@@ -161,6 +165,7 @@ static const struct eth_dev_ops eth_em_ops = {
.flow_ctrl_set= eth_em_flow_ctrl_set,
.mac_addr_add = eth_em_rar_set,
.mac_addr_remove  = eth_em_rar_clear,
+   .set_mc_addr_list = eth_em_set_mc_addr_list,
 };

 /**
@@ -1522,6 +1527,18 @@ eth_em_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
return 0;
 }

+static int
+eth_em_set_mc_addr_list(struct rte_eth_dev *dev,
+   struct ether_addr *mc_addr_set,
+   uint32_t nb_mc_addr)
+{
+   struct e1000_hw *hw;
+
+   hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   e1000_update_mc_addr_list(hw, (u8 *)mc_addr_set, nb_mc_addr);
+   return 0;
+}
+
 struct rte_driver em_pmd_drv = {
.type = PMD_PDEV,
.init = rte_em_pmd_init,
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index e4b370d..1c24edc 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -194,6 +194,10 @@ static int eth_igb_filter_ctrl(struct rte_eth_dev *dev,
 enum rte_filter_op filter_op,
 void *arg);

+static int eth_igb_set_mc_addr_list(struct rte_eth_dev *dev,
+   struct ether_addr *mc_addr_set,
+   uint32_t nb_mc_addr);
+
 /*
  * Define VF Stats MACRO for Non "cleared on read" register
  */
@@ -269,6 +273,7 @@ static const struct eth_dev_ops eth_igb_ops = {
.rss_hash_update  = eth_igb_rss_hash_update,
.rss_hash_conf_get= eth_igb_rss_hash_conf_get,
.filter_ctrl  = eth_igb_filter_ctrl,
+   .set_mc_addr_list = eth_igb_set_mc_addr_list,
 };

 /*
@@ -289,6 +294,7 @@ static const struct eth_dev_ops igbvf_eth_dev_ops = {
.rx_queue_release = eth_igb_rx_queue_release,
.tx_queue_setup   = eth_igb_tx_queue_setup,
.tx_queue_release = eth_igb_tx_queue_release,
+   .set_mc_addr_list = eth_igb_set_mc_addr_list,
 };

 /**
@@ -3642,6 +3648,18 @@ eth_igb_filter_ctrl(struct rte_eth_dev *dev,
return ret;
 }

+static int
+eth_igb_set_mc_addr_list(struct rte_eth_dev *dev,
+struct ether_addr *mc_addr_set,
+uint32_t nb_mc_addr)
+{
+   struct e1000_hw *hw;
+
+   hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   e1000_update_mc_addr_list(hw, (u8 *)mc_addr_set, nb_mc_addr);
+   return 0;
+}
+
 static struct rte_driver pmd_igb_drv = {
.type = PMD_PDEV,
.init = rte_igb_pmd_init,
-- 
1.7.10.4

[dpdk-dev] [PATCH 2/5] app/testpmd: new command to add/remove multicast MAC addresses

2015-05-28 Thread Ivan Boule

Add the new interactive command:
mcast_addr add|remove X 
to add/remove the multicast MAC address  to/from the set of
multicast addresses filtered by port .
Command used to test the function "rte_eth_dev_set_mc_addr_list"
that has been added to the API of PMDs.

Signed-off-by: Ivan Boule 
---
 app/test-pmd/cmdline.c |   52 ++
 app/test-pmd/config.c  |  142 
 app/test-pmd/testpmd.h |6 ++
 3 files changed, 200 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index f01db2a..952a9df 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -8733,6 +8733,57 @@ cmdline_parse_inst_t cmd_set_hash_global_config = {
},
 };

+/* *** ADD/REMOVE A MULTICAST MAC ADDRESS TO/FROM A PORT *** */
+struct cmd_mcast_addr_result {
+   cmdline_fixed_string_t mcast_addr_cmd;
+   cmdline_fixed_string_t what;
+   uint8_t port_num;
+   struct ether_addr mc_addr;
+};
+
+static void cmd_mcast_addr_parsed(void *parsed_result,
+   __attribute__((unused)) struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_mcast_addr_result *res = parsed_result;
+
+   if (!is_multicast_ether_addr(>mc_addr)) {
+   printf("Invalid multicast addr %02X:%02X:%02X:%02X:%02X:%02X\n",
+  res->mc_addr.addr_bytes[0], res->mc_addr.addr_bytes[1],
+  res->mc_addr.addr_bytes[2], res->mc_addr.addr_bytes[3],
+  res->mc_addr.addr_bytes[4], res->mc_addr.addr_bytes[5]);
+   return;
+   }
+   if (strcmp(res->what, "add") == 0)
+   mcast_addr_add(res->port_num, >mc_addr);
+   else
+   mcast_addr_remove(res->port_num, >mc_addr);
+}
+
+cmdline_parse_token_string_t cmd_mcast_addr_cmd =
+   TOKEN_STRING_INITIALIZER(struct cmd_mcast_addr_result,
+mcast_addr_cmd, "mcast_addr");
+cmdline_parse_token_string_t cmd_mcast_addr_what =
+   TOKEN_STRING_INITIALIZER(struct cmd_mcast_addr_result, what,
+"add#remove");
+cmdline_parse_token_num_t cmd_mcast_addr_portnum =
+   TOKEN_NUM_INITIALIZER(struct cmd_mcast_addr_result, port_num, UINT8);
+cmdline_parse_token_etheraddr_t cmd_mcast_addr_addr =
+   TOKEN_ETHERADDR_INITIALIZER(struct cmd_mac_addr_result, address);
+
+cmdline_parse_inst_t cmd_mcast_addr = {
+   .f = cmd_mcast_addr_parsed,
+   .data = (void *)0,
+   .help_str = "mcast_addr add|remove X : add/remove multicast 
MAC address on port X",
+   .tokens = {
+   (void *)_mcast_addr_cmd,
+   (void *)_mcast_addr_what,
+   (void *)_mcast_addr_portnum,
+   (void *)_mcast_addr_addr,
+   NULL,
+   },
+};
+
 /* 

 */

 /* list of instructions */
@@ -8862,6 +8913,7 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)_set_sym_hash_ena_per_port,
(cmdline_parse_inst_t *)_get_hash_global_config,
(cmdline_parse_inst_t *)_set_hash_global_config,
+   (cmdline_parse_inst_t *)_mcast_addr,
NULL,
 };

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index f788ed5..52917c7 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2130,3 +2130,145 @@ set_vf_rate_limit(portid_t port_id, uint16_t vf, 
uint16_t rate, uint64_t q_msk)
port_id, diag);
return diag;
 }
+
+/*
+ * Functions to manage the set of filtered Multicast MAC addresses.
+ *
+ * A pool of filtered multicast MAC addresses is associated with each port.
+ * The pool is allocated in chunks of MCAST_POOL_INC multicast addresses.
+ * The address of the pool and the number of valid multicast MAC addresses
+ * recorded in the pool are stored in the fields "mc_addr_pool" and
+ * "mc_addr_nb" of the "rte_port" data structure.
+ *
+ * The function "rte_eth_dev_set_mc_addr_list" of the PMDs API imposes
+ * to be supplied a contiguous array of multicast MAC addresses.
+ * To comply with this constraint, the set of multicast addresses recorded
+ * into the pool are systematically compacted at the beginning of the pool.
+ * Hence, when a multicast address is removed from the pool, all following
+ * addresses, if any, are copied back to keep the set contiguous.
+ */
+#define MCAST_POOL_INC 32
+
+static int
+mcast_addr_pool_extend(struct rte_port *port)
+{
+   struct ether_addr *mc_pool;
+   size_t mc_pool_size;
+
+   /*
+* If a free entry is available at the end of the pool, just
+* increment the number of recorded multicast addresses.
+*/
+   if ((port->mc_addr_nb % MCAST_POOL_INC) != 0) {
+   port->mc_addr_nb++;
+   retur

[dpdk-dev] [PATCH 1/5] ethdev: add multicast address filtering

2015-05-28 Thread Ivan Boule

With the current PMD API, the receipt of multicast packets on a given
port can only be enabled by invoking the "rte_eth_allmulticast_enable"
function.
This method may not work on Virtual Functions in SR-IOV architectures
when the host PF driver does not allow such operation on VFs.
In such cases, joined multicast addresses must be individually added
in the set of multicast addresses that are filtered by the [VF] port.

For this purpose, a new function "set_mc_addr_list" is introduced
into the set of functions that are exported by a Poll Mode Driver.

Signed-off-by: Ivan Boule 
---
 lib/librte_ether/rte_ethdev.c |   17 +
 lib/librte_ether/rte_ethdev.h |   26 ++
 2 files changed, 43 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 024fe8b..f5784de 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3628,3 +3628,20 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t 
queue_id,
/* Callback wasn't found. */
return -EINVAL;
 }
+
+int
+rte_eth_dev_set_mc_addr_list(uint8_t port_id,
+struct ether_addr *mc_addr_set,
+uint32_t nb_mc_addr)
+{
+   struct rte_eth_dev *dev;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -ENODEV;
+   }
+
+   dev = _eth_devices[port_id];
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_mc_addr_list, -ENOTSUP);
+   return dev->dev_ops->set_mc_addr_list(dev, mc_addr_set, nb_mc_addr);
+}
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 16dbe00..04c192d 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1228,6 +1228,10 @@ typedef int (*eth_udp_tunnel_del_t)(struct rte_eth_dev 
*dev,
struct rte_eth_udp_tunnel *tunnel_udp);
 /**< @internal Delete tunneling UDP info */

+typedef int (*eth_set_mc_addr_list_t)(struct rte_eth_dev *dev,
+ struct ether_addr *mc_addr_set,
+ uint32_t nb_mc_addr);
+/**< @internal set the list of multicast addresses on an Ethernet device */

 #ifdef RTE_NIC_BYPASS

@@ -1386,6 +1390,7 @@ struct eth_dev_ops {
/** Get current RSS hash configuration. */
rss_hash_conf_get_t rss_hash_conf_get;
eth_filter_ctrl_t  filter_ctrl;  /**< common filter 
control*/
+   eth_set_mc_addr_list_t set_mc_addr_list; /**< set list of mcast addrs */
 };

 /**
@@ -3615,4 +3620,25 @@ int rte_eth_remove_tx_callback(uint8_t port_id, uint16_t 
queue_id,
 }
 #endif

+/**
+ * Set the list of multicast addresses to filter on an Ethernet device.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param mc_addr_set
+ *   The array of multicast addresses to set. Equal to NULL when the function
+ *   is invoked to flush the set of filtered addresses.
+ * @param nb_mc_addr
+ *   The number of multicast addresses in the *mc_addr_set* array. Equal to 0
+ *   when the function is invoked to flush the set of filtered addresses.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENODEV) if *port_id* invalid.
+ *   - (-ENOTSUP) if PMD of *port_id* doesn't support multicast filtering.
+ *   - (-ENOSPC) if *port_id* has not enough multicast filtering resources.
+ */
+int rte_eth_dev_set_mc_addr_list(uint8_t port_id,
+struct ether_addr *mc_addr_set,
+uint32_t nb_mc_addr);
+
 #endif /* _RTE_ETHDEV_H_ */
-- 
1.7.10.4

[dpdk-dev] [PATCH 0/5] multicast address filtering

2015-05-28 Thread Ivan Boule

Introduce PMD API to set the list of multicast MAC addresses filtered
by a port.
Implemented in the following PMDs: igb, igbvf, em, ixgbe, and ixgbevf.
Implementation for physical PMDs i40e, i40evf, enic, and fm10k left
to their respective maintainers.

Ivan Boule (5):
  ethdev: add multicast address filtering
  app/testpmd: new command to add/remove multicast MAC addresses
  e1000: add multicast MAC address filtering
  ixgbe: add multicast MAC address filtering
  app/testpmd: fix reply to a multicast ICMP request

 app/test-pmd/cmdline.c   |   52 ++
 app/test-pmd/config.c|  142 ++
 app/test-pmd/icmpecho.c  |   65 +++--
 app/test-pmd/testpmd.h   |6 ++
 drivers/net/e1000/em_ethdev.c|   17 +
 drivers/net/e1000/igb_ethdev.c   |   18 +
 drivers/net/ixgbe/ixgbe_ethdev.c |   32 +
 lib/librte_ether/rte_ethdev.c|   17 +
 lib/librte_ether/rte_ethdev.h|   26 +++
 9 files changed, 369 insertions(+), 6 deletions(-)

-- 
1.7.10.4

[dpdk-dev] Beyond DPDK 2.0

2015-05-07 Thread Ivan Boule

Hi Avi,

On 05/07/2015 04:02 PM, Avi Kivity wrote:
> On Wed, Apr 22, 2015 at 6:11 PM, O'Driscoll, Tim 
> wrote:
>
>> Does anybody have any input or comments on this?
>>
>>
>>> -Original Message-
>>> From: O'Driscoll, Tim
>>> Sent: Thursday, April 16, 2015 11:39 AM
>>> To: dev at dpdk.org
>>> Subject: Beyond DPDK 2.0
>>>
>>> Following the launch of DPDK by Intel as an internal development
>>> project, the launch of dpdk.org by 6WIND in 2013, and the first DPDK RPM
>>> packages for Fedora in 2014, 6WIND, Red Hat and Intel would like to
>>> prepare for future releases after DPDK 2.0 by starting a discussion on
>>> its evolution. Anyone is welcome to join this initiative.
>>>
>>> Since then, the project has grown significantly:
>>> -The number of commits and mailing list posts has increased
>>> steadily.
>>> -Support has been added for a wide range of new NICs (Mellanox
>>> support submitted by 6WIND, Cisco VIC, Intel i40e and fm10k etc.).
>>> -DPDK is now supported on multiple architectures (IBM Power support
>>> in DPDK 1.8, Tile support submitted by EZchip but not yet reviewed or
>>> applied).
>>>
>>> While this is great progress, we need to make sure that the project is
>>> structured in a way that enables it to continue to grow. To achieve
>>> this, 6WIND, Red Hat and Intel would like to start a discussion about
>>> the future of the project, so that we can agree and establish processes
>>> that satisfy the needs of the current and future DPDK community.
>>>
>>> We're very interested in hearing the views of everybody in the
>>> community. In addition to debate on the mailing list, we'll also
>>> schedule community calls to discuss this.
>>>
>>>
>>> Project Goals
>>> -
>>>
>>> Some topics to be considered for the DPDK project include:
>>> -Project Charter: The charter of the DPDK project should be clearly
>>> defined, and should explain the limits of DPDK (what it does and does
>>> not cover). This does not mean that we would be stuck with a singular
>>> charter for all time, but the direction and intent of the project should
>>> be well understood.
>>
>
>
> One problem we've seen with dpdk is that it is a framework, not a library:
> it wants to create threads, manage memory, and generally take over.  This
> is a problem for us, as we are writing a framework (seastar, [1]) and need
> to create threads, manage memory, and generally take over ourselves.
>
> Perhaps dpdk can be split into two layers, a library layer that only
> provides mechanisms, and a framework layer that glues together those
> mechanisms and applies a policy, trading in generality for ease of use.
>
> [1] http://seastar-project.org
>
I fully agree with this analysis/proposal.
And I think that:
- the associated modifications should be done ASAP,
- the underlying design rules that this proposal refers to should drive 
the design and review of new DPDK features.

Regards,
Ivan

[dpdk-dev] [RFC] ethdev function to update MAC multicast addresses filtered by a port

2015-04-08 Thread Ivan Boule

This RFC describes a proposed change in the ethdev API for updating
the set of multicast MAC addresses that are filtered by a port.

The change consists in adding a new function "update_mc_addr_list" that
takes the whole set of multicast MAC addresses to be filtered by a port,
if any.

In case the whole set of multicast addresses is too large for the device
resources used to record them, the ethdev function "update_mc_addr_list" 
must transparently enable the allmulticast feature of the port, if such 
a feature is supported/enabled by the device.

Reason for Change
-

With the current ethdev API, the receipt of multicast packets on a given 
port can only be enabled by invoking the "rte_eth_allmulticast_enable()" 
function.
This approach may not work on Virtual Functions in SR-IOV architectures
when the host PF driver does not allow this operation on VFs.
In such a case, joined multicast addresses must be individually added 
into the set of multicast MAC addresses that are filtered by the [VF] port.

For this purpose, a new function "update_mc_addr_list()" must be 
introduced into the set of functions exported by a Poll Mode Driver.
Each time a DPDK application joins (respectively leaves) an IP multicast
group, it must add (respectively remove) the associated multicast MAC 
address from the set of multicast address to be filtered, and invoke the 
function "update_mc_addr_list()" with the updated list of multicast 
addresses on each relevant port.

Proposed API extension
--

The new function "update_mc_addr_list" is added into the "eth_dev_ops"
data structure:

typedef int (*eth_update_mc_addr_list_t)(struct rte_eth_dev *dev,
 struct ether_addr *mc_addr_set,
 uint32_t nb_mc_addr);

It is exported through the following new function:

/**
  * Update the set of multicast addresses to filter on an Ethernet device.
  *
  * @param port_id
  *   The port identifier of the Ethernet device.
  * @param mc_addr_set
  *   The array of multicast addresses to filter. Equal to NULL when the 
function
  *   is invoked to flush all multicast MAC addresses filtered by the port.
  * @param nb_mc_addr
  *   The number of multicast addresses in the *mc_addr_set* array. 
Equal to 0
  *   when the function is invoked to flush the set of multicast MAC 
addresses.
  * @return
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support multicast filtering.
  *   - (-ENODEV) if *port_id* invalid.
  */
int rte_eth_dev_update_mc_addr_list(uint8_t port_id,
struct ether_addr *mc_addr_set,
    uint32_t nb_mc_addr);

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH] net: fix conflict with libc

2014-11-27 Thread Ivan Boule

On 11/27/2014 12:48 PM, Thomas Monjalon wrote:
> It was impossible to include netinet/in.h and rte_ip.h
> because the IP protocols were redefined.
> It is removed because useless.
>
> Signed-off-by: Thomas Monjalon 

Acked by Ivan Boule 

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [RFC] librte_pmd_null: Add null PMD

2014-09-22 Thread Ivan Boule

ode = numa_node;
> +
> + data->dev_private = internals;
> + data->port_id = eth_dev->data->port_id;
> + data->nb_rx_queues = (uint16_t)nb_rx_queues;
> + data->nb_tx_queues = (uint16_t)nb_tx_queues;
> + data->dev_link = pmd_link;
> + data->mac_addrs = _addr;
> +
> + eth_dev->data = data;
> + eth_dev->dev_ops = 
> + eth_dev->pci_dev = pci_dev;
> +
> + /* finally assign rx and tx ops */
> + if (packet_copy) {
> + eth_dev->rx_pkt_burst = eth_null_copy_rx;
> + eth_dev->tx_pkt_burst = eth_null_copy_tx;
> + } else {
> + eth_dev->rx_pkt_burst = eth_null_rx;
> + eth_dev->tx_pkt_burst = eth_null_tx;
> + }
> +
> + return 0;
> +
> +error:
> + if (data)
> + rte_free(data);
> + if (pci_dev)
> + rte_free(pci_dev);
> + if (internals)
> + rte_free(internals);
> + return -1;
> +}
> +
> +static inline int
> +get_packet_size_arg(const char *key __rte_unused,
> + const char *value, void *extra_args)
> +{
> + const char *a = value;
> + unsigned *packet_size = extra_args;
> +
> + *packet_size = (unsigned)strtoul(a, NULL, 0);
> + if (*packet_size == UINT_MAX)
> + return -1;
> +
> + return 0;
> +}
> +
> +static inline int
> +get_packet_copy_arg(const char *key __rte_unused,
> + const char *value, void *extra_args)
> +{
> + const char *a = value;
> + unsigned *packet_copy = extra_args;
> +
> + *packet_copy = (unsigned)strtoul(a, NULL, 0);
> + if (*packet_copy == UINT_MAX)
> + return -1;
> +
> + return 0;
> +}
> +
> +static int
> +rte_pmd_null_devinit(const char *name, const char *params)
> +{
> + unsigned numa_node;
> + unsigned packet_size = default_packet_size;
> + unsigned packet_copy = default_packet_copy;
> + struct rte_kvargs *kvlist;
> + int ret;
> +
> + RTE_LOG(INFO, PMD, "Initializing pmd_null for %s\n", name);
> +
> + numa_node = rte_socket_id();
> +
> + kvlist = rte_kvargs_parse(params, valid_arguments);
> + if (kvlist == NULL)
> + return -1;
> +
> + if (rte_kvargs_count(kvlist, ETH_NULL_PACKET_SIZE_ARG) == 1) {
> +
> + ret = rte_kvargs_process(kvlist, ETH_NULL_PACKET_SIZE_ARG,
> + _packet_size_arg, _size);
> + if (ret < 0)
> + return -1;
> + }
> +
> + if (rte_kvargs_count(kvlist, ETH_NULL_PACKET_COPY_ARG) == 1) {
> +
> + ret = rte_kvargs_process(kvlist, ETH_NULL_PACKET_COPY_ARG,
> + _packet_copy_arg, _copy);
> + if (ret < 0)
> + return -1;
> + }
> +
> + RTE_LOG(INFO, PMD, "Configure pmd_null: packet size is %d, "
> + "packet copy is %s\n", packet_size,
> + packet_copy ? "enabled" : "disabled");
> +
> + return eth_dev_null_create(name, numa_node, packet_size, packet_copy);
> +}
> +
> +static struct rte_driver pmd_null_drv = {
> + .name = "eth_null",
> + .type = PMD_VDEV,
> + .init = rte_pmd_null_devinit,
> +};
> +
> +PMD_REGISTER_DRIVER(pmd_null_drv);
>


-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] Making space in mbuf data-structure

2014-07-08 Thread Ivan Boule

tions, and could be 
>> included in a
>> sort of non-regression test?
>>
>> Regards,
>> Olivier
>>
>> [1] http://dpdk.org/ml/archives/dev/2014-May/002537.html
>> [2] http://dpdk.org/ml/archives/dev/2014-May/002322.html
>
> Hi Olivier
>
> I am trying to convince you on the new field of "filter status".
> It is for matched Flow Director Filter ID, and might be reused for HASH 
> signature if it matches hash filter, or others.
> It is quite useful for Flow Director, and not a flag. I guess there should 
> have the similar feature even in non-Intel NICs.
>
By construction, since a packet cannot match more than 1 filter with an 
associated identifier, this is typically the kind of field that should 
be put in an union with the standard 32-bit RSS id.

Regards,
Ivan

> Regards,
> Helin
>


-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [v2 00/23] Packet Framework

2014-06-05 Thread Ivan Boule

On 06/04/2014 08:08 PM, Cristian Dumitrescu wrote:
> (Version 2 changes are exclusively style changes (checkpatch.pl) and patch 
> consolidation, no functional change)
>
> Intel DPDK Packet Framework provides a standard methodology (logically 
> similar to OpenFlow) for rapid development of complex packet processing 
> pipelines out of ports, tables and actions.
>
> A pipeline is constructed by connecting its input ports to its output ports 
> through a chain of lookup tables. As result of lookup operation into the 
> current table, one of the table entries (or the default table entry, in case 
> of lookup miss) is identified to provide the actions to be executed on the 
> current packet and the associated action meta-data. The behavior of user 
> actions is defined through the configurable table action handler, while the 
> reserved actions define the next hop for the current packet (either another 
> table, an output port or packet drop) and are handled transparently by the 
> framework.
>
> Three new Intel DPDK libraries are introduced for Packet Framework: 
> librte_port, librte_table, librte_pipeline. Please check the Intel DPDK 
> Programmer's Guide for full description of the Packet Framework design.
>
> Two sample applications are provided for Packet Framework: app/test-pipeline 
> and examples/ip_pipeline. Please check the Intel Sample Apps Guide for a 
> detailed description of how these sample apps.
>
> Cristian Dumitrescu (23):
>librte_lpm: rule_is_present
>mbuf: meta-data
>Packet Framework librte_port: Port API
>Packet Framework librte_port: ethdev ports
>Packet Framework librte_port: ring ports
>Packet Framework librte_port: IPv4 frag port
>Packet Framework librte_port: IPv4 reassembly
>Packet Framework librte_port: hierarchical scheduler port
>Packet Framework librte_port: Source/Sink ports
>Packet Framework librte_port: Build infrastructure
>Packet Framework librte_table: Table API
>Packet Framework librte_table: LPM IPv4 table
>Packet Framework librte_table: LPM IPv6 table
>Packet Framework librte_table: ACL table
>Packet Framework librte_table: Hash tables
>Packet Framework librte_table: array table
>Packet Framework librte_table: Stub table
>Packet Framework librte_table: Build infrastructure
>Packet Framework librte_pipeline: Pipeline
>librte_cfgfile: interpret config files
>    Packet Framework performance application
>Packet Framework IPv4 pipeline sample app
>Packet Framework unit tests
>

Acked by: Ivan Boule 

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH 04/29] mbuf: added offset of packet meta-data in the packet buffer just after mbuf

2014-06-02 Thread Ivan Boule

ther, but this problem 
> is not at all new, nor introduced by the meta-data field: even currently, the 
> application has to make sure the headroom is dimensioned correctly, so that 
> even in the worst case scenario (application-specific), the packet bytes do 
> not run into the mbuf fields, right?
>
> - For indirect mbufs, the packet meta-data is associated with the indirect 
> mbuf rather than the direct mbuf (the one the indirect mbuf attaches to, as 
> the direct mbuf contains a different packet that has different meta-data), so 
> it makes sense that meta-data of the indirect mbuf is stored in the same 
> buffer as the indirect mbuf (rather than the buffer of the attached direct 
> mbuf). So this means that the buffer size used to store the indirect mbuf is 
> sized appropriately (mbuf size, currently 64 bytes, plus the max size for 
> additional meta-data). This is illustrated in the code of the IPv4 
> fragmentation port, where for every child packet (output fragment) we copy 
> the parent meta-data in the child buffer (which stores an indirect buffer 
> attached to the direct buffer of the input jumbo, plus now additional 
> meta-data).
>
> - We are also considering having a user_data field in the mbuf itself to 
> point to meta-data in the same buffer or any other buffer. The Packet 
> Framework functionally works with both approaches, but there is a performance 
> problem associated with the mbuf->user_data approach that we are not 
> addressing for this 1.7 release timeframe. The issue is the data dependency 
> that is created, as in order to find out the location of the meta-data, we 
> need to first read the mbuf  and then read meta-data from mbuf->user_data. 
> The cost of the additional memory access is high, due to data dependency 
> preventing efficient prefetch, therefore (at least for the time being) we 
> need to use a compile-time known meta-data offset. The application can fill 
> the meta-data on packet RX and then all additional CPU cores processing the 
> packet can leave of the meta-data for a long time with no need to access the 
> mbuf or the packet (efficiency). For the time being, if somebody needs more 
> yet meta-data in yet another 
buffer, they can add (for their application) a user_data pointer as part of 
their application meta-data (instead of standard mbuf).
>
> - As said, the mbuf->metadata points to the first location where meta-data 
> _can_ be placed, if somebody needs a different offset, they can add it on top 
> of the mbuf->metadata (e.g. by having a reserved field in their struct 
> app_pkt_metadata). We have demonstrated the use of meta-data in the 
> examples/ip_pipeline sample app (see struct app_pkt_metadata in "main.h").
>
> Please let me know if this makes sense?
>
> Regards,
> Cristian
>
> PS: This is where a white board would help a lot ...
>
>
> -Original Message-
> From: Ivan Boule [mailto:ivan.boule at 6wind.com]
> Sent: Wednesday, May 28, 2014 1:03 PM
> To: Dumitrescu, Cristian; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 04/29] mbuf: added offset of packet meta-data 
> in the packet buffer just after mbuf
>
> Hi Cristian,
>
> Currently, the DPDK framework does not make any assumption on the actual
> layout of a mbuf.
> More precisely, the DPDK does not impose any constraint on the actual
> location of additional metadata, if any, or on the actual location and
> size of its associated payload data buffer.
> This is coherent with the fact that mbuf pools are not created by the
> DPDK itself, but by network applications that are free to choose
> whatever packet mbuf layout that fits their particular needs.
>
> There is one exception to this basic DPDK rule: the mbuf cloning feature
> available through the RTE_MBUF_SCATTER_GATHER configuration option
> assumes that the payload data buffer of the mbuf immediately follows the
> rte_mbuf data structure (see the macros RTE_MBUF_FROM_BADDR,
> RTE_MBUF_TO_BADDR, RTE_MBUF_INDIRECT, and RTE_MBUF_DIRECT in the file
> lib/librte_mbuf/rte_mbuf.h).
>
> The cloning feature prevents to build packet mbufs with their metadata
> located immediately after the rte_mbuf data structure, which is exactly
> what your patch introduces.
>
> At least, a comment that clearly warns the user of this incompatibility
> might be worth adding into both the code and your patch log.
>
> Regards,
> Ivan
>
> On 05/27/2014 07:09 PM, Cristian Dumitrescu wrote:
>> Added zero-size field (offset in data structure) to specify the beginning of 
>> packet meta-data in the packet buffer just after the mbuf.
>>
>> The size of the packet meta-data is application specific and the packet 
>> meta-data is managed by the application.
>>
>

[dpdk-dev] [PATCH v2 0/3] Support setting link up and link down

2014-05-28 Thread Ivan Boule

On 05/28/2014 09:14 AM, Ouyang Changchun wrote:
> Please ignore the previous patch series with subject: "Support administrative 
> link up and link down"
> This v2 patch series will replace the previous patch series.
>
> This patch series contain the following 3 items:
> 1. Add API to support setting link up and down, it can be used to repeatedly 
> stop and restart
> RX/TX of a port without re-allocating resources for the port and 
> re-configuring the port.
> 2. Implement the functionality of setting link up and down in IXGBE PMD.
> 3. Add command in testpmd to test the functionality of setting link up and 
> down of PMD.
>
> Ouyang Changchun (3):
>Add API to support set link up and link down.
>Implement the functionality of setting link up and link down in IXGBE
>  PMD.
>Add command line to test the functionality of setting link up and link
>  down in testpmd.
>

Acked by: Ivan Boule 

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH 04/29] mbuf: added offset of packet meta-data in the packet buffer just after mbuf

2014-05-28 Thread Ivan Boule

Hi Cristian,

Currently, the DPDK framework does not make any assumption on the actual
layout of a mbuf.
More precisely, the DPDK does not impose any constraint on the actual
location of additional metadata, if any, or on the actual location and
size of its associated payload data buffer.
This is coherent with the fact that mbuf pools are not created by the
DPDK itself, but by network applications that are free to choose
whatever packet mbuf layout that fits their particular needs.

There is one exception to this basic DPDK rule: the mbuf cloning feature 
available through the RTE_MBUF_SCATTER_GATHER configuration option 
assumes that the payload data buffer of the mbuf immediately follows the 
rte_mbuf data structure (see the macros RTE_MBUF_FROM_BADDR, 
RTE_MBUF_TO_BADDR, RTE_MBUF_INDIRECT, and RTE_MBUF_DIRECT in the file 
lib/librte_mbuf/rte_mbuf.h).

The cloning feature prevents to build packet mbufs with their metadata 
located immediately after the rte_mbuf data structure, which is exactly 
what your patch introduces.

At least, a comment that clearly warns the user of this incompatibility
might be worth adding into both the code and your patch log.

Regards,
Ivan

On 05/27/2014 07:09 PM, Cristian Dumitrescu wrote:
> Added zero-size field (offset in data structure) to specify the beginning of 
> packet meta-data in the packet buffer just after the mbuf.
>
> The size of the packet meta-data is application specific and the packet 
> meta-data is managed by the application.
>
> The packet meta-data should always be accessed through the provided macros.
>
> This is used by the Packet Framework libraries (port, table, pipeline).
>
> There is absolutely no performance impact due to this mbuf field, as it does 
> not take any space in the mbuf structure (zero-size field).
>
> Signed-off-by: Cristian Dumitrescu 
> ---
>   lib/librte_mbuf/rte_mbuf.h |   17 +
>   1 files changed, 17 insertions(+), 0 deletions(-)
>
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 4a9ab41..bf09618 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -201,8 +201,25 @@ struct rte_mbuf {
>   struct rte_ctrlmbuf ctrl;
>   struct rte_pktmbuf pkt;
>   };
> + 
> + union {
> + uint8_t metadata[0];
> + uint16_t metadata16[0];
> + uint32_t metadata32[0];
> + uint64_t metadata64[0];
> + };
>   } __rte_cache_aligned;
>
> +#define RTE_MBUF_METADATA_UINT8(mbuf, offset)   (mbuf->metadata[offset])
> +#define RTE_MBUF_METADATA_UINT16(mbuf, offset)  
> (mbuf->metadata16[offset/sizeof(uint16_t)])
> +#define RTE_MBUF_METADATA_UINT32(mbuf, offset)  
> (mbuf->metadata32[offset/sizeof(uint32_t)])
> +#define RTE_MBUF_METADATA_UINT64(mbuf, offset)  
> (mbuf->metadata64[offset/sizeof(uint64_t)])
> +
> +#define RTE_MBUF_METADATA_UINT8_PTR(mbuf, offset)   (>metadata[offset])
> +#define RTE_MBUF_METADATA_UINT16_PTR(mbuf, offset)  
> (>metadata16[offset/sizeof(uint16_t)])
> +#define RTE_MBUF_METADATA_UINT32_PTR(mbuf, offset)  
> (>metadata32[offset/sizeof(uint32_t)])
> +#define RTE_MBUF_METADATA_UINT64_PTR(mbuf, offset)  
> (>metadata64[offset/sizeof(uint64_t)])
> +
>   /**
>* Given the buf_addr returns the pointer to corresponding mbuf.
>*/
>

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH 0/3] Support administrative link up and link down

2014-05-23 Thread Ivan Boule

On 05/23/2014 04:08 AM, Ouyang, Changchun wrote:
> Hi Ivan
>
> To some extent, I also agree with you.
> But customer hope DPDK can provide an interface like "ifconfig up" and 
> "ifconfig down" in linux,
> They can invoke such an interface in user application to repeated stop and 
> start dev frequently, and
> Make sure RX and TX work fine after each start, I think it is not necessary 
> to do really device start and stop at
> Each time, just need start and stop RX and TX function, so the 
> straightforward method is to enable and disable
> tx lazer in ixgbe.
> But in the ether level we need a more generic api name, here is 
> rte_eth_dev_admin_link_up/down, while enable_tx_laser is not suitable,
> Enable and disable tx laser is a way in ixgbe to fulfill the administrative 
> link up and link down.
> maybe Fortville and future generation NIC will use other ways to fulfill the 
> admin_link_up/down.
>

Hi Changchun,

I do not understand what your customer effectively needs.
First of all, if I understand well, your customer's application does not 
really need to invoke the DPDK functions "eth_dev_stop" and 
"eth_dev_start" for addressing its problem, for instance to reconfigure 
RX/TX queues of the port.
When considering the implementation in the ixgbe PMD of the function
"rte_eth_dev_admin_link_down", its only visible effects from the DPDK
application perspective is that no input packet can be received anymore, 
and output packets cannot be transmitted (once having filled the TX queues).

Conversely, the only visible effect of the "rte_eth_dev_admin_link_up"
function is that input packets are received again, and that output 
packets can be successfully transmitted.

In fact, by disabling the TX laser on a ixgbe port, the only interesting
effect of the function "rte_eth_dev_admin_link_down" is that it notifies
the peer system of a hardware link DOWN event (with no physical link
unplug on the peer side).
Conversely, by enabling the TX laser on a ixgbe port, the only 
interesting effect of the function "rte_eth_dev_admin_link_up" is that 
it notifies the peer system of a hardware link UP event.

Is that the actions that your customer's application actually needs to
perform? If so, then this certainly deserves a real operational use case
that it is worth describing in the patch log.
This would help DPDK PMD implementors to understand what such functions 
can be used for, and to decide whether they actually need to be 
supported by the PMD.

Assuming that these 2 functions need to be provided to address the issue
described above, I do not think that the word "admin" brings anything 
for understanding their role. In fact, the word "admin" rather suggests 
a pure "software" down/up setting, instead of a physical one.
Naming these 2 functions "rte_eth_dev_set_link_down"
and "rte_eth_dev_set_link_up" better describes their expected effect.

Regards,
Ivan

>
> On 05/22/2014 04:44 PM, Ouyang, Changchun wrote:
>> Hi Ivan
>> For this one, it seems long story for that...
>> In short,
>> Some customer have such kind of requirement, they want to repeatedly
>> start(rte_dev_start) and stop(rte_dev_stop) the port for RX and TX,
>> but they find after several times start and stop, the RX and TX can't work 
>> well even the port starts,  and the packets error number increase.
>>
>> To resolve this error number increase issue, and let port work fine
>> even after repeatedly start and stop, We need a new API to do it, after 
>> discussing, we have these 2 API, admin link up and admin link down.
>
> If I understand well, this "feature" is not needed by itself, but only as a 
> work-around to address issues when repeatedly invoking the functions 
> ixgbe_dev_stop and ixgbe_dev_start.
> Do such issues appear when performing the same operations with the Linux 
> kernel driver?
>
> Anyway, I suppose that such functions have to be automatically invoked by the 
> same code of the network application that invokes the functions 
> ixgbe_dev_stop and ixgbe_dev_start (said differently, there is no need for a 
> manual assistance !)
>
> In that case, would not it be possible - and highly preferable - to directly 
> invoke the functions ixgbe_disable_tx_laser and, then, ixgbe_enable_tx_laser 
> from the appropriate step during the execution of the function 
> ixgbe_dev_start(), waiting for some appropriate delays between the two 
> operations, if so needed?
>
> Regards,
> Ivan
>
>
>>
>> Any difference if use " dev_link_start/stop" or " dev_link_up/down"?
>> to me, admin_link_up/down is better than dev_link_start/stop,
>>
>> If most people think we need change

[dpdk-dev] [PATCH 0/3] Support administrative link up and link down

2014-05-22 Thread Ivan Boule

Hi Changchun,

On 05/22/2014 04:44 PM, Ouyang, Changchun wrote:
> Hi Ivan
> For this one, it seems long story for that...
> In short,
> Some customer have such kind of requirement,
> they want to repeatedly start(rte_dev_start) and stop(rte_dev_stop) the port 
> for RX and TX, but they find
> after several times start and stop, the RX and TX can't work well even the 
> port starts,  and the packets error number increase.
>
> To resolve this error number increase issue, and let port work fine even 
> after repeatedly start and stop,
> We need a new API to do it, after discussing, we have these 2 API, admin link 
> up and admin link down.

If I understand well, this "feature" is not needed by itself, but only 
as a work-around to address issues when repeatedly invoking the 
functions ixgbe_dev_stop and ixgbe_dev_start.
Do such issues appear when performing the same operations with the Linux 
kernel driver?

Anyway, I suppose that such functions have to be automatically invoked 
by the same code of the network application that invokes the functions 
ixgbe_dev_stop and ixgbe_dev_start (said differently, there is no need 
for a manual assistance !)

In that case, would not it be possible - and highly preferable - to 
directly invoke the functions ixgbe_disable_tx_laser and, then, 
ixgbe_enable_tx_laser from the appropriate step during the execution of 
the function ixgbe_dev_start(), waiting for some appropriate delays 
between the two operations, if so needed?

Regards,
Ivan


>
> Any difference if use " dev_link_start/stop" or " dev_link_up/down"? to me, 
> admin_link_up/down is better than dev_link_start/stop,
>
> If most people think we need change the name, it is ok to rename it.
>
> I don't think we need it in non-physical PMDs. So no implementation in virtio 
> PMD.
>
> Thanks
> Changchun
>
>
> -Original Message-
> From: Ivan Boule [mailto:ivan.boule at 6wind.com]
> Sent: Thursday, May 22, 2014 9:17 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/3] Support administrative link up and link 
> down
>
> On 05/22/2014 08:11 AM, Ouyang Changchun wrote:
>> This patch series contain the following 3 items:
>> 1. Add API to support administrative link up and down.
>> 2. Implement the functionality of administrative link up and down in IXGBE 
>> PMD.
>> 3. Add command in testpmd to test the functionality of administrative link 
>> up and down of PMD.
>>
...

> Hi Changchun,
>
> The 2 functions "rte_eth_dev_admin_link_up" and "rte_eth_dev_admin_link_down"
> don't have an equivalent in the Linux kernel, thus I am wondering what is 
> their effective usage from a network application perspective.
> Could you briefly explain in which use case these functions can be used for?
>
> By the way, it's not completely evident to infer the exact semantics of these 
> 2 functions from their name.
> In particular, I do not see what the term "admin" brings to the understanding 
> of their role. If it is to suggest that these functions are intended to force 
> the link to a different state of its initial [self-detected] state, then the 
> term "force" would be more appropriate.
>
> Otherwise, if eventually these functions appear to be mandatory, I suggest to 
> rename them "rte_eth_dev_link_start" and "rte_eth_dev_link_stop" 
> respectively, and to apply the same naming conventions in the 2 other patches.
>
> It might also be worth documenting in the comment section of the prototype of 
> these 2 functions whether it makes sense or not to support a notion of link 
> that can be dynamically started or stopped in non-physical PMDs (vmxnet3, 
> virtio, etc).


-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH 5/6] ether: allow setting mac address

2014-05-21 Thread Ivan Boule

Hi Stephen,

Looking more precisely to your patch, I think that your requirement can 
beaddressed without adding
a new `mac_addr_set` function into the [already overloaded] set of 
operations exported by a PMD.

In fact, changing the default MAC address used by a port consists in 
executingthe following sequence of operations:
 1- rte_eth_macaddr_get(port_id, _mac_addr) to obtain the 
default MACaddress currently used by the port,
 2- rte_eth_dev_mac_addr_remove(port_id, _mac_addr) to remove 
thedefault MAC address currently used by the port,
 3- rte_eth_dev_mac_addr_add(port_id, _mac_addr) to add the new 
defaultMAC address to be used by the port.

This method takes advantage of the fact that the allocation of a 
freeentry into the array `data->mac_addrs` that
contains the MAC addresses of a port searches the pool from the entry at 
index 0, which contains the default MAC
address. This implementation simply needs to be mandatory from now.

The actual implementation might look like this.

Add the prototype of the function `rte_eth_dev_macaddr_set` in the 
file`rte_ethdev.h`:

 extern int rte_eth_dev_macaddr_set((uint8_t port_id, struct 
ether_addr *addr);

And add the following generic implementation of the function 
`rte_eth_dev_macaddr_set`in the file `rte_ethdev.c`:

int
rte_eth_dev_macaddr_set(uint8_t port_id, struct ether_addr *addr)
{
struct rte_eth_dev *dev;
struct ether_addr cur_addr;
int diag;

if (port_id >= nb_ports) {
PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
return (-ENODEV);
}

if (is_zero_ether_addr(addr)) {
PMD_DEBUG_TRACE("port %d: Cannot add NULL MAC address\n",
port_id);
return (-EINVAL);
}

dev = _eth_devices[port_id];

/*
 * Check that both mac_addr_remove and mac_addr_add operations are
 * supported by the PMD of the port.
 */
FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mac_addr_remove, -ENOTSUP);
FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mac_addr_add, -ENOTSUP);

rte_eth_macaddr_get(port_id, _addr);
diag = rte_eth_dev_mac_addr_remove(port_id, _addr);
if (diag < 0)
return diag;
return rte_eth_dev_mac_addr_add(port_id, addr);
}

Regards,
Ivan

On 05/15/2014 11:04 AM, Ivan Boule wrote:
> Hi Stephen,
>
> By default, changing the [current] default MAC address of a device is 
> not equivalent to adding a MAC address to it ! At least, the minimum 
> to do should consist in:
>1. checking that the PMD exports both mac_addr_remove() 
> and mac_addr_add() functions, returning a NON_SUPPORTED error code 
> otherwise,
>2. removing the current default MAC address by invoking the PMD's 
> mac_addr_remove() function,
>3. adding the new MAC address by invoking the PMD's mac_addr_add() 
> function.
>
> Regards,
> Ivan
>
> 2014-05-14 20:55 GMT+02:00 Stephen Hemminger 
> mailto:stephen at networkplumber.org>>:
>
> Allow setting the default Ethernet address used on device.
> The underlying drivers allow it but the DPDK was blocking any
> attempts to change Ethernet address on device.
>
> For most devices, this is just the same as filling in address
> in mac address table entry 0, but for some devices this requires
> special override.
>
> Signed-off-by: Stephen Hemminger  <mailto:shemming at brocade.com>>
>
>
> --- a/lib/librte_ether/rte_ethdev.h 2014-05-14
> 11:44:27.373014227 -0700
> +++ b/lib/librte_ether/rte_ethdev.h 2014-05-14
> 11:44:27.373014227 -0700
> @@ -982,6 +982,11 @@ typedef void (*eth_mac_addr_add_t)(struc
>   struct ether_addr *mac_addr,
>   uint32_t index,
>   uint32_t vmdq);
> +/**< @internal Change default MAC address */
> +
> +typedef void (*eth_mac_addr_set_t)(struct rte_eth_dev *dev,
> +  struct ether_addr *mac_addr);
> +
>  /**< @internal Set a MAC address into Receive Address Address
> Register */
>
>  typedef int (*eth_uc_hash_table_set_t)(struct rte_eth_dev *dev,
> @@ -1114,6 +1119,7 @@ struct eth_dev_ops {
> priority_flow_ctrl_set_t priority_flow_ctrl_set; /**<
> Setup priority flow control.*/
> eth_mac_addr_remove_t  mac_addr_remove; /**< Remove
> MAC address */
> eth_mac_addr_add_t mac_addr_add;  /**< Add a MAC
> address */
> +   eth_mac_addr_set_t mac_addr_set;  /**< Change
> default MAC address */
> eth_uc_hash_table_set_tuc_hash_table_set;  /**< Set
>

[dpdk-dev] [PATCH v2 5/5] app/testpmd: allow to configure RSS hash key

2014-05-16 Thread Ivan Boule

Add the command "port config X rss-hash-key key" in the 'testpmd'
application to configure the RSS hash key used to compute the RSS
hash of input [IP] packets received on port X.

Signed-off-by: Ivan Boule 
---
 app/test-pmd/cmdline.c |   96 +++-
 app/test-pmd/config.c  |   28 ++
 app/test-pmd/testpmd.h |1 +
 3 files changed, 124 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0b6749c..7b4f090 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1192,6 +1192,99 @@ cmdline_parse_inst_t cmd_config_rss = {
},
 };

+/* *** configure rss hash key *** */
+struct cmd_config_rss_hash_key {
+   cmdline_fixed_string_t port;
+   cmdline_fixed_string_t config;
+   uint8_t port_id;
+   cmdline_fixed_string_t rss_hash_key;
+   cmdline_fixed_string_t key;
+};
+
+#define RSS_HASH_KEY_LENGTH 40
+static uint8_t
+hexa_digit_to_value(char hexa_digit)
+{
+   if ((hexa_digit >= '0') && (hexa_digit <= '9'))
+   return (uint8_t) (hexa_digit - '0');
+   if ((hexa_digit >= 'a') && (hexa_digit <= 'f'))
+   return (uint8_t) ((hexa_digit - 'a') + 10);
+   if ((hexa_digit >= 'A') && (hexa_digit <= 'F'))
+   return (uint8_t) ((hexa_digit - 'A') + 10);
+   /* Invalid hexa digit */
+   return 0xFF;
+}
+
+static uint8_t
+parse_and_check_key_hexa_digit(char *key, int idx)
+{
+   uint8_t hexa_v;
+
+   hexa_v = hexa_digit_to_value(key[idx]);
+   if (hexa_v == 0xFF)
+   printf("invalid key: character %c at position %d is not a "
+  "valid hexa digit\n", key[idx], idx);
+   return hexa_v;
+}
+
+static void
+cmd_config_rss_hash_key_parsed(void *parsed_result,
+  __attribute__((unused)) struct cmdline *cl,
+  __attribute__((unused)) void *data)
+{
+   struct cmd_config_rss_hash_key *res = parsed_result;
+   uint8_t hash_key[RSS_HASH_KEY_LENGTH];
+   uint8_t xdgt0;
+   uint8_t xdgt1;
+   int i;
+
+   /* Check the length of the RSS hash key */
+   if (strlen(res->key) != (RSS_HASH_KEY_LENGTH * 2)) {
+   printf("key length: %d invalid - key must be a string of %d"
+  "hexa-decimal numbers\n", (int) strlen(res->key),
+  RSS_HASH_KEY_LENGTH * 2);
+   return;
+   }
+   /* Translate RSS hash key into binary representation */
+   for (i = 0; i < RSS_HASH_KEY_LENGTH; i++) {
+   xdgt0 = parse_and_check_key_hexa_digit(res->key, (i * 2));
+   if (xdgt0 == 0xFF)
+   return;
+   xdgt1 = parse_and_check_key_hexa_digit(res->key, (i * 2) + 1);
+   if (xdgt1 == 0xFF)
+   return;
+   hash_key[i]= (uint8_t) ((xdgt0 * 16) + xdgt1);
+   }
+   port_rss_hash_key_update(res->port_id, hash_key);
+}
+
+cmdline_parse_token_string_t cmd_config_rss_hash_key_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_config_rss_hash_key, port, "port");
+cmdline_parse_token_string_t cmd_config_rss_hash_key_config =
+   TOKEN_STRING_INITIALIZER(struct cmd_config_rss_hash_key, config,
+"config");
+cmdline_parse_token_string_t cmd_config_rss_hash_key_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_config_rss_hash_key, port_id, UINT8);
+cmdline_parse_token_string_t cmd_config_rss_hash_key_rss_hash_key =
+   TOKEN_STRING_INITIALIZER(struct cmd_config_rss_hash_key,
+rss_hash_key, "rss-hash-key");
+cmdline_parse_token_string_t cmd_config_rss_hash_key_value =
+   TOKEN_STRING_INITIALIZER(struct cmd_config_rss_hash_key, key, NULL);
+
+cmdline_parse_inst_t cmd_config_rss_hash_key = {
+   .f = cmd_config_rss_hash_key_parsed,
+   .data = NULL,
+   .help_str = "port config X rss-hash-key 80 hexa digits",
+   .tokens = {
+   (void *)_config_rss_hash_key_port,
+   (void *)_config_rss_hash_key_config,
+   (void *)_config_rss_hash_key_port_id,
+   (void *)_config_rss_hash_key_rss_hash_key,
+   (void *)_config_rss_hash_key_value,
+   NULL,
+   },
+};
+
 /* *** Configure RSS RETA *** */
 struct cmd_config_rss_reta {
cmdline_fixed_string_t port;
@@ -1413,7 +1506,7 @@ cmdline_parse_inst_t cmd_showport_rss_hash = {

 cmdline_parse_inst_t cmd_showport_rss_hash_key = {
.f = cmd_showport_rss_hash_parsed,
-   .data = "show_rss_key",
+   .data = (void *)1,
.help_str = "show port X rss-hash key (X = port number)\n",
.tokens = {
(void *)_showport_rss_hash_show,
@@ -5322,6 +5415,7 @@ cmdline_parse_ctx_t mai

[dpdk-dev] [PATCH v2 4/5] ethdev: allow to get RSS hash functions and key

2014-05-16 Thread Ivan Boule

1) Add a new function "rss_hash_conf_get" in the PMD API to retrieve the
   current configuration of the RSS functions and/or of the RSS key used
   by a NIC to compute the RSS hash of input packets.
   The new function uses the existing data structure "rte_eth_rss_conf" for
   returning the RSS hash configuration.

2) Add the ixgbe-specific function "ixgbe_dev_rss_hash_conf_get" and the
   igb-specific function "eth_igb_rss_hash_conf_get" to retrieve the RSS
   hash configuration of ixgbe and igb controllers respectively.

3) Add the command "show port X rss-hash [key]" in the testpmd application
   to display the RSS hash configuration of port X.

Signed-off-by: Ivan Boule 
---
 app/test-pmd/cmdline.c  |   63 +
 app/test-pmd/config.c   |   65 +++
 app/test-pmd/testpmd.h  |2 ++
 lib/librte_ether/rte_ethdev.c   |   15 
 lib/librte_ether/rte_ethdev.h   |   23 +
 lib/librte_pmd_e1000/e1000_ethdev.h |3 ++
 lib/librte_pmd_e1000/igb_ethdev.c   |1 +
 lib/librte_pmd_e1000/igb_rxtx.c |   52 
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |3 +-
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |3 ++
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |   53 
 11 files changed, 282 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 407a2b9..0b6749c 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -182,6 +182,10 @@ static void cmd_help_long_parsed(void *parsed_result,
"show port (info|stats|fdir|stat_qmap) (port_id|all)\n"
"Display information for port_id, or all.\n\n"

+   "show port rss-hash [key]\n"
+   "Display the RSS hash functions and RSS hash key"
+   " of port X\n\n"
+
"clear port (info|stats|fdir|stat_qmap) (port_id|all)\n"
"Clear information for port_id, or all.\n\n"

@@ -1364,6 +1368,63 @@ cmdline_parse_inst_t cmd_showport_reta = {
},
 };

+/* *** Show RSS hash configuration *** */
+struct cmd_showport_rss_hash {
+   cmdline_fixed_string_t show;
+   cmdline_fixed_string_t port;
+   uint8_t port_id;
+   cmdline_fixed_string_t rss_hash;
+   cmdline_fixed_string_t key; /* optional argument */
+};
+
+static void cmd_showport_rss_hash_parsed(void *parsed_result,
+   __attribute__((unused)) struct cmdline *cl,
+   void *show_rss_key)
+{
+   struct cmd_showport_rss_hash *res = parsed_result;
+
+   port_rss_hash_conf_show(res->port_id, show_rss_key != NULL);
+}
+
+cmdline_parse_token_string_t cmd_showport_rss_hash_show =
+   TOKEN_STRING_INITIALIZER(struct cmd_showport_rss_hash, show, "show");
+cmdline_parse_token_string_t cmd_showport_rss_hash_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_showport_rss_hash, port, "port");
+cmdline_parse_token_num_t cmd_showport_rss_hash_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_showport_rss_hash, port_id, UINT8);
+cmdline_parse_token_string_t cmd_showport_rss_hash_rss_hash =
+   TOKEN_STRING_INITIALIZER(struct cmd_showport_rss_hash, rss_hash,
+"rss-hash");
+cmdline_parse_token_string_t cmd_showport_rss_hash_rss_key =
+   TOKEN_STRING_INITIALIZER(struct cmd_showport_rss_hash, key, "key");
+
+cmdline_parse_inst_t cmd_showport_rss_hash = {
+   .f = cmd_showport_rss_hash_parsed,
+   .data = NULL,
+   .help_str = "show port X rss-hash (X = port number)\n",
+   .tokens = {
+   (void *)_showport_rss_hash_show,
+   (void *)_showport_rss_hash_port,
+   (void *)_showport_rss_hash_port_id,
+   (void *)_showport_rss_hash_rss_hash,
+   NULL,
+   },
+};
+
+cmdline_parse_inst_t cmd_showport_rss_hash_key = {
+   .f = cmd_showport_rss_hash_parsed,
+   .data = "show_rss_key",
+   .help_str = "show port X rss-hash key (X = port number)\n",
+   .tokens = {
+   (void *)_showport_rss_hash_show,
+   (void *)_showport_rss_hash_port,
+   (void *)_showport_rss_hash_port_id,
+   (void *)_showport_rss_hash_rss_hash,
+   (void *)_showport_rss_hash_rss_key,
+   NULL,
+   },
+};
+
 /* *** Configure DCB *** */
 struct cmd_config_dcb {
cmdline_fixed_string_t port;
@@ -5259,6 +5320,8 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)_set_mirror_mask,
(cmdline_parse_inst_t *)_set_mirror_link,
(cmdline_parse_inst_t *)_reset_mirror_rule,
+   (cmdline_parse_inst_t *)

[dpdk-dev] [PATCH v2 3/5] app/testpmd: configure RSS without restart

2014-05-16 Thread Ivan Boule

The function cmd_config_rss_parsed() associated with the command
"port config rss all" required to first stop all ports, in order to
then entirely re-configure all ports with the new RSS hash computation
parameters.
Use now the new function rte_eth_dev_rss_hash_conf_update() that dynamically
only changes the RSS hash computation parameters of a port, without needing
to previously stop the port.

Signed-off-by: Ivan Boule 
---
 app/test-pmd/cmdline.c |   20 
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 9cc680f..407a2b9 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1145,26 +1145,22 @@ cmd_config_rss_parsed(void *parsed_result,
__attribute__((unused)) void *data)
 {
struct cmd_config_rss *res = parsed_result;
-
-   if (!all_ports_stopped()) {
-   printf("Please stop all ports first\n");
-   return;
-   }
+   struct rte_eth_rss_conf rss_conf;
+   uint8_t i;

if (!strcmp(res->value, "ip"))
-   rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6;
+   rss_conf.rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6;
else if (!strcmp(res->value, "udp"))
-   rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6 | ETH_RSS_IPV4_UDP;
+   rss_conf.rss_hf = ETH_RSS_IPV4_UDP | ETH_RSS_IPV6_UDP;
else if (!strcmp(res->value, "none"))
-   rss_hf = 0;
+   rss_conf.rss_hf = 0;
else {
printf("Unknown parameter\n");
return;
}
-
-   init_port_config();
-
-   cmd_reconfig_device_queue(RTE_PORT_ALL, 1, 1);
+   rss_conf.rss_key = NULL;
+   for (i = 0; i < rte_eth_dev_count(); i++)
+   rte_eth_dev_rss_hash_update(i, _conf);
 }

 cmdline_parse_token_string_t cmd_config_rss_port =
-- 
1.7.10.4

[dpdk-dev] [PATCH v2 2/5] ethdev: allow to set RSS hash computation flags and/or key

2014-05-16 Thread Ivan Boule

1) Add a new function "rss_hash_update" in the PMD API to dynamically
   update the RSS flags and/or the RSS key used by a NIC to compute the RSS
   hash of input packets.
   The new function uses the existing data structure "rte_eth_rss_conf" for
   the argument that contains the new hash flags and/or the new hash key to
   use.

2) Add the ixgbe-specific function "ixgbe_dev_rss_hash_update" and the
   igb-specific function "eth_igb_rss_hash_update" to update the RSS
   hash configuration of ixgbe and igb controllers respectively.
   Before changing anything, these 2 functions check that the update RSS
   operation does not attempt to disable RSS, if RSS was enabled at port
   initialization time, or does not attempt to enable RSS, if RSS was
   disabled at port initialization time.

Note:
   Configuring the RSS hash flags and the RSS key used by a NIC consists in
   updating appropriate PCI registers of the NIC.
   These operations have been manually tested with the interactive commands
   "write reg" and "write regbit" of the testpmd application.

Signed-off-by: Ivan Boule 
---
 lib/librte_ether/rte_ethdev.c   |   22 ++
 lib/librte_ether/rte_ethdev.h   |   24 +++
 lib/librte_pmd_e1000/e1000_ethdev.h |3 +
 lib/librte_pmd_e1000/igb_ethdev.c   |1 +
 lib/librte_pmd_e1000/igb_rxtx.c |  123 +++---
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |1 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |3 +
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |  125 +--
 8 files changed, 227 insertions(+), 75 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 473c98b..91375a1 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1572,6 +1572,28 @@ rte_eth_dev_rss_reta_query(uint8_t port_id, struct 
rte_eth_rss_reta *reta_conf)
 }

 int
+rte_eth_dev_rss_hash_update(uint8_t port_id, struct rte_eth_rss_conf *rss_conf)
+{
+   struct rte_eth_dev *dev;
+   uint16_t rss_hash_protos;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return (-ENODEV);
+   }
+   rss_hash_protos = rss_conf->rss_hf;
+   if ((rss_hash_protos != 0) &&
+   ((rss_hash_protos & ETH_RSS_PROTO_MASK) == 0)) {
+   PMD_DEBUG_TRACE("Invalid rss_hash_protos=0x%x\n",
+   rss_hash_protos);
+   return (-EINVAL);
+   }
+   dev = _eth_devices[port_id];
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rss_hash_update, -ENOTSUP);
+   return (*dev->dev_ops->rss_hash_update)(dev, rss_conf);
+}
+
+int
 rte_eth_led_on(uint8_t port_id)
 {
struct rte_eth_dev *dev;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index dea7471..efb421a 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -331,6 +331,8 @@ struct rte_eth_rss_conf {
 #define ETH_RSS_IPV4_UDP0x0040 /**< IPv4/UDP packet. */
 #define ETH_RSS_IPV6_UDP0x0080 /**< IPv6/UDP packet. */
 #define ETH_RSS_IPV6_UDP_EX 0x0100 /**< IPv6/UDP with extension headers. */
+
+#define ETH_RSS_PROTO_MASK  0x01FF /**< Mask of valid RSS hash protocols */
 /* Definitions used for redirection table entry size */
 #define ETH_RSS_RETA_NUM_ENTRIES 128
 #define ETH_RSS_RETA_MAX_QUEUE   16  
@@ -966,6 +968,10 @@ typedef int (*reta_query_t)(struct rte_eth_dev *dev,
struct rte_eth_rss_reta *reta_conf);
 /**< @internal Query RSS redirection table on an Ethernet device */

+typedef int (*rss_hash_update_t)(struct rte_eth_dev *dev,
+struct rte_eth_rss_conf *rss_conf);
+/**< @internal Update RSS hash configuration of an Ethernet device */
+
 typedef int (*eth_dev_led_on_t)(struct rte_eth_dev *dev);
 /**< @internal Turn on SW controllable LED on an Ethernet device */

@@ -1153,6 +1159,8 @@ struct eth_dev_ops {
   bypass_wd_reset_t bypass_wd_reset;
 #endif

+   /** Configure RSS hash protocols. */
+   rss_hash_update_t rss_hash_update;
 };

 /**
@@ -2859,6 +2867,22 @@ int rte_eth_dev_bypass_wd_timeout_show(uint8_t port, 
uint32_t *wd_timeout);
  */
 int rte_eth_dev_bypass_wd_reset(uint8_t port);

+ /**
+ * Configuration of Receive Side Scaling hash computation of Ethernet device.
+ *
+ * @param port
+ *   The port identifier of the Ethernet device.
+ * @param rss_conf
+ *   The new configuration to use for RSS hash computation on the port.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENODEV) if port identifier is invalid.
+ *   - (-ENOTSUP) if hardware doesn't support.
+ *   - (-EINVAL) if bad parameter.
+ */
+int rte_eth_dev_rss_hash_update(uint8_t port_id,
+   struct rte_eth_rss_conf *rss_conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_pmd_

[dpdk-dev] [PATCH v2 1/5] ethdev: check RX queue indices in RETA config against number of queues

2014-05-16 Thread Ivan Boule

Each entry of the RSS redirection table (RETA) of igb and ixgbe ports
contains a 4-bit RX queue index, thus imposing RSS RX queue indices to
be strictly lower than 16.
In addition, if a RETA entry is configured with a RX queue index that is
strictly lower than 16, but is greater or equal to the number of RX queues
of the port, then all input packets whose RSS hash value indexes that RETA
entry are silently dropped by the NIC.

Make the function rte_eth_dev_rss_reta_update() check that RX queue indices
that are supplied in the reta_conf argument are strictly lower than
ETH_RSS_RETA_MAX_QUEUE (16) and are strictly lower than the number of
RX queues of the port.

Signed-off-by: Ivan Boule 
---
 lib/librte_ether/rte_ethdev.c |   10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a5727dd..473c98b 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1501,6 +1501,7 @@ int
 rte_eth_dev_rss_reta_update(uint8_t port_id, struct rte_eth_rss_reta 
*reta_conf)
 {
struct rte_eth_dev *dev;
+   uint16_t max_rxq;
uint8_t i,j;

if (port_id >= nb_ports) {
@@ -1514,10 +1515,13 @@ rte_eth_dev_rss_reta_update(uint8_t port_id, struct 
rte_eth_rss_reta *reta_conf)
return (-EINVAL);
}

+   dev = _eth_devices[port_id];
+   max_rxq = (dev->data->nb_rx_queues <= ETH_RSS_RETA_MAX_QUEUE) ?
+   dev->data->nb_rx_queues : ETH_RSS_RETA_MAX_QUEUE;
if (reta_conf->mask_lo != 0) {
for (i = 0; i < ETH_RSS_RETA_NUM_ENTRIES/2; i++) {
if ((reta_conf->mask_lo & (1ULL << i)) &&
-   (reta_conf->reta[i] >= ETH_RSS_RETA_MAX_QUEUE)) 
{
+   (reta_conf->reta[i] >= max_rxq)) {
PMD_DEBUG_TRACE("RETA hash index output"
"configration for port=%d,invalid"

"queue=%d\n",port_id,reta_conf->reta[i]);
@@ -1533,7 +1537,7 @@ rte_eth_dev_rss_reta_update(uint8_t port_id, struct 
rte_eth_rss_reta *reta_conf)

/* Check if the max entry >= 128 */
if ((reta_conf->mask_hi & (1ULL << i)) && 
-   (reta_conf->reta[j] >= ETH_RSS_RETA_MAX_QUEUE)) 
{
+   (reta_conf->reta[j] >= max_rxq)) {
PMD_DEBUG_TRACE("RETA hash index output"
"configration for port=%d,invalid"

"queue=%d\n",port_id,reta_conf->reta[j]);
@@ -1543,8 +1547,6 @@ rte_eth_dev_rss_reta_update(uint8_t port_id, struct 
rte_eth_rss_reta *reta_conf)
}
}

-   dev = _eth_devices[port_id];
-
FUNC_PTR_OR_ERR_RET(*dev->dev_ops->reta_update, -ENOTSUP);
return (*dev->dev_ops->reta_update)(dev, reta_conf);
 }
-- 
1.7.10.4

[dpdk-dev] [PATCH v2 0/5] allow to dynamically change RSS configuration

2014-05-16 Thread Ivan Boule

This set of patches allows to dynamically get and set the RSS configuration
of a port:
- rss functions (IPv4/IPv6//UDP/TCP ...)
- rss hash key

Changes included in v2:
- Rename functions "rss_hash_conf_update" to "rss_hash_update"
- In RSS hash update functions of igb and ixgbe PMDs, add tests
  that do not allow RSS to be dynamically enabled or disabled.

--
Ivan Boule

Ivan Boule (5):
  ethdev: check RX queue indices in RETA config against number of queues
  ethdev: allow to set RSS hash computation flags and/or key
  app/testpmd: configure RSS without restart
  ethdev: allow to get RSS hash functions and key
  app/testpmd: allow to configure RSS hash key

 app/test-pmd/cmdline.c  |  177 +++---
 app/test-pmd/config.c   |   93 ++
 app/test-pmd/testpmd.h  |3 +
 lib/librte_ether/rte_ethdev.c   |   47 -
 lib/librte_ether/rte_ethdev.h   |   47 +
 lib/librte_pmd_e1000/e1000_ethdev.h |6 ++
 lib/librte_pmd_e1000/igb_ethdev.c   |2 +
 lib/librte_pmd_e1000/igb_rxtx.c |  175 ++
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |2 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |6 ++
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |  178 ---
 11 files changed, 645 insertions(+), 91 deletions(-)

-- 
1.7.10.4

[dpdk-dev] [PATCH v3] app/testpmd: list forwarding engines

2014-05-16 Thread Ivan Boule

pmd/config.c b/app/test-pmd/config.c
> index d0e21b7..41326fe 100644
> --- a/app/test-pmd/config.c
> +++ b/app/test-pmd/config.c
> @@ -1231,6 +1231,25 @@ set_tx_pkt_segments(unsigned *seg_lengths, unsigned 
> nb_segs)
>   tx_pkt_nb_segs = (uint8_t) nb_segs;
>   }
>   
> +char*
> +list_pkt_forwarding_modes(void)
> +{
> + static char fwd_modes[128] = "";
> + const char *separator = "|";
> + struct fwd_engine *fwd_eng;
> + unsigned i = 0;
> +
> + if (strlen (fwd_modes) == 0) {
> + while ((fwd_eng = fwd_engines[i++]) != NULL) {
> + strcat(fwd_modes, fwd_eng->fwd_mode_name);
> + strcat(fwd_modes, separator);
> + }
> + fwd_modes[strlen(fwd_modes) - strlen(separator)] = '\0';
> + }
> +
> + return fwd_modes;
> +}
> +
>   void
>   set_pkt_forwarding_mode(const char *fwd_mode_name)
>   {
> diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
> index 67081d7..8a3ce2c 100644
> --- a/app/test-pmd/parameters.c
> +++ b/app/test-pmd/parameters.c
> @@ -158,7 +158,8 @@ usage(char* progname)
>   printf("  --disable-rss: disable rss.\n");
>   printf("  --port-topology=N: set port topology (N: paired (default) or "
>  "chained).\n");
> - printf("  --forward-mode=N: set forwarding mode.\n");
> + printf("  --forward-mode=N: set forwarding mode (N: %s).\n",
> +list_pkt_forwarding_modes());
>   printf("  --rss-ip: set RSS functions to IPv4/IPv6 only .\n");
>   printf("  --rss-udp: set RSS functions to IPv4/IPv6 + UDP.\n");
>   printf("  --rxq=N: set the number of RX queues per port to N.\n");
> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
> index 52d3543..0e4a35e 100644
> --- a/app/test-pmd/testpmd.h
> +++ b/app/test-pmd/testpmd.h
> @@ -491,6 +491,7 @@ void tx_cksum_set(portid_t port_id, uint8_t cksum_mask);
>   void set_verbose_level(uint16_t vb_level);
>   void set_tx_pkt_segments(unsigned *seg_lengths, unsigned nb_segs);
>   void set_nb_pkt_per_burst(uint16_t pkt_burst);
> +char *list_pkt_forwarding_modes(void);
>   void set_pkt_forwarding_mode(const char *fwd_mode);
>   void start_packet_forwarding(int with_tx_first);
>   void stop_packet_forwarding(void);

Acked by: Ivan Boule

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH 5/6] ether: allow setting mac address

2014-05-15 Thread Ivan Boule

Hi Stephen,

By default, changing the [current] default MAC address of a device is not
equivalent to adding a MAC address to it ! At least, the minimum to do
should consist in:
   1. checking that the PMD exports both mac_addr_remove()
and mac_addr_add() functions, returning a NON_SUPPORTED error code
otherwise,
   2. removing the current default MAC address by invoking the PMD's
mac_addr_remove() function,
   3. adding the new MAC address by invoking the PMD's mac_addr_add()
function.

Regards,
Ivan

2014-05-14 20:55 GMT+02:00 Stephen Hemminger :

> Allow setting the default Ethernet address used on device.
> The underlying drivers allow it but the DPDK was blocking any
> attempts to change Ethernet address on device.
>
> For most devices, this is just the same as filling in address
> in mac address table entry 0, but for some devices this requires
> special override.
>
> Signed-off-by: Stephen Hemminger 
>
>
> --- a/lib/librte_ether/rte_ethdev.h 2014-05-14 11:44:27.373014227 -0700
> +++ b/lib/librte_ether/rte_ethdev.h 2014-05-14 11:44:27.373014227 -0700
> @@ -982,6 +982,11 @@ typedef void (*eth_mac_addr_add_t)(struc
>   struct ether_addr *mac_addr,
>   uint32_t index,
>   uint32_t vmdq);
> +/**< @internal Change default MAC address */
> +
> +typedef void (*eth_mac_addr_set_t)(struct rte_eth_dev *dev,
> +  struct ether_addr *mac_addr);
> +
>  /**< @internal Set a MAC address into Receive Address Address Register */
>
>  typedef int (*eth_uc_hash_table_set_t)(struct rte_eth_dev *dev,
> @@ -1114,6 +1119,7 @@ struct eth_dev_ops {
> priority_flow_ctrl_set_t   priority_flow_ctrl_set; /**< Setup
> priority flow control.*/
> eth_mac_addr_remove_t  mac_addr_remove; /**< Remove MAC
> address */
> eth_mac_addr_add_t mac_addr_add;  /**< Add a MAC address */
> +   eth_mac_addr_set_t mac_addr_set;  /**< Change default MAC
> address */
> eth_uc_hash_table_set_tuc_hash_table_set;  /**< Set Unicast
> Table Array */
> eth_uc_all_hash_table_set_t uc_all_hash_table_set;  /**< Set
> Unicast hash bitmap */
> eth_mirror_rule_set_t  mirror_rule_set;  /**< Add a traffic
> mirror rule.*/
> @@ -2538,6 +2544,21 @@ int rte_eth_dev_mac_addr_add(uint8_t por
>  int rte_eth_dev_mac_addr_remove(uint8_t port, struct ether_addr
> *mac_addr);
>
>  /**
> + * Change the default MAC address
> + *
> + * @param port
> + *   The port identifier of the Ethernet device.
> + * @param mac_addr
> + *   MAC address to remove.
> + * @return
> + *   - (0) if successfully added or *mac_addr" was already added.
> + *   - (-ENOTSUP) if hardware doesn't support this feature.
> + *   - (-ENODEV) if *port* is invalid.
> + *   - (-EINVAL) if MAC address is invalid.
> + */
> +int rte_eth_dev_macaddr_set(uint8_t port, struct ether_addr *mac_addr);
> +
> +/**
>   * Update Redirection Table(RETA) of Receive Side Scaling of Ethernet
> device.
>   *
>   * @param port
> --- a/lib/librte_ether/rte_ethdev.c 2014-05-14 11:44:27.373014227 -0700
> +++ b/lib/librte_ether/rte_ethdev.c 2014-05-14 11:44:27.373014227 -0700
> @@ -1059,6 +1059,33 @@ rte_eth_macaddr_get(uint8_t port_id, str
>  }
>
>  int
> +rte_eth_dev_macaddr_set(uint8_t port_id, struct ether_addr *addr)
> +{
> +   struct rte_eth_dev *dev;
> +
> +   if (port_id >= nb_ports) {
> +   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> +   return (-ENODEV);
> +   }
> +   if (is_zero_ether_addr(addr)) {
> +   PMD_DEBUG_TRACE("port %d: Cannot add NULL MAC address\n",
> port_id);
> +   return (-EINVAL);
> +   }
> +   dev = _eth_devices[port_id];
> +
> +   if (*dev->dev_ops->mac_addr_set)
> +   (*dev->dev_ops->mac_addr_set)(dev, addr);
> +   else {
> +   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->mac_addr_add, -ENOTSUP);
> +   (*dev->dev_ops->mac_addr_add)(dev, addr, 0, 0);
> +   }
> +
> +   ether_addr_copy(addr, >data->mac_addrs[0]);
> +
> +   return 0;
> +}
> +
> +int
>  rte_eth_dev_vlan_filter(uint8_t port_id, uint16_t vlan_id, int on)
>  {
> struct rte_eth_dev *dev;
>
>

-- 
Ivan BOULE
6WIND
Software Engineer

[dpdk-dev] [PATCH 2/6] Subjet: ethdev: add common code to atomicly access link

2014-05-15 Thread Ivan Boule

Hi Stephen,

The new function rte_eth_dev_set_link() below is declared as void.
Thus, the comments about its return value are wrong, and should be removed,
unless they still refer to a desired behaviour, in which case it is the
function itself that should be reworked to invoke rte_atomic64_cmpset and
to return its result.

Regards,
Ivan

2014-05-14 20:55 GMT+02:00 Stephen Hemminger :

> Many drivers copy/paste same code to atomicly access link information.
> Add functions to do this instead.
>
> Signed-off-by: Stephen Hemminger 
>
> --- a/lib/librte_ether/rte_ethdev.h 2014-05-14 11:27:07.409461720 -0700
> +++ b/lib/librte_ether/rte_ethdev.h 2014-05-14 11:27:07.409461720 -0700
> @@ -1704,6 +1704,45 @@ extern void rte_eth_link_get_nowait(uint
> struct rte_eth_link *link);
>
> +
> +/**
> + * @internal
> + * Set the link status from device in atomic fashion.
> + * Returns non-zero on success; 0 on failure
> + * Callers should retry on failure
> + */
> +static inline void rte_eth_dev_set_link(struct rte_eth_dev *dev,
> +   struct rte_eth_link *link)
> +{
> +   rte_atomic64_t *dst = (rte_atomic64_t *)>data->dev_link;
> +
> +   rte_atomic64_set(dst, *(int64_t *)link);
> +}
> +
>
>

-- 
Ivan BOULE
6WIND
Software Engineer

[dpdk-dev] [PATCH 2/5] ethdev: allow to set RSS hash computation flags and/or key

2014-05-14 Thread Ivan Boule

Hi Konstantin,

I apologize for my late answer.
I had also read the Niantic documentation that stated to RSS should not 
be enabled/disabled on the fly.
In fact, RSS configuration patches are intended to enable the dynamic 
changement of the RSS configuration of a port (RETA, hash key, hash 
flags), rather than to only disable/enable RSS.
In my opinion, initializing a port with multiple RX queues and RSS 
disableddoes not make sense, and thus there is not reason for allowing 
RSS to be then dynamically enabled in such a case.
For the same reason, once having initialized a port with multiple RX 
queuesand RSS enabled, it does not make sense to then disable RSS on 
that port.

Conversely, creating a single RX queue on a port with RSS enabled does 
makesense to provide packet RSS hash values to the DPDK network stack, 
but, again,it does not make sense to then disable RSS on that port.

According to these considerations, I propose to update the dynamic RSS 
configuration with tests that prevent disabling RSS once initially 
enabled, or enabling RSS once initially disabled.

Regards,
Ivan

On 05/02/2014 01:21 AM, Ananyev, Konstantin wrote:
> Hi Ivan,
>
> About enabling/disabling RSS dynamically (without dev_stop/dev_start):
> As I know 82599 spec explicitly states that RSS shouldn't be enabled/disabled 
> dynamically.
>  From section 7.1.2.8 Receive-Side Scaling (RSS):
> "...
> Enabling rules:
> * RSS is enabled in the MRQC register.
> * RSS enabling cannot be done dynamically while it must be preceded by a 
> software
> reset.
> ...
> Disabling rules:
> * Disabling RSS on the fly is not allowed, and the 82599 must be reset after 
> RSS is
> disabled.
> ..."
>
> RETA table from other side can be updated on the fly.
> As I remember that was the reason why we allow only dev_rss_reta_update() to 
> be used dynamically (without dev_start/dev_stop).
>
> Konstantin
>
>
>
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ivan Boule
> Sent: Wednesday, April 30, 2014 2:55 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 2/5] ethdev: allow to set RSS hash computation 
> flags and/or key
>
> 1) Add a new function "rss_hash_conf_update" in the PMD API to dynamically
> update the RSS flags and/or the RSS key used by a NIC to compute the RSS
> hash of input packets.
> The new function uses the existing data structure "rte_eth_rss_conf" for
> the argument that contains the new hash flags and/or the new hash key to
> use.
>
> 2) Add the ixgbe-specific function "ixgbe_dev_rss_hash_conf_update" and the
> igb-specific function "eth_igb_rss_hash_conf_update" to update the RSS
> hash configuration of ixgbe and igb controllers respectively.
>
> Note:
> Configuring the RSS hash flags and the RSS key used by a NIC consists in
> updating appropriate PCI registers of the NIC.
> These operations have been manually tested with the interactive commands
> "write reg" and "write regbit" of the testpmd application.
>
> Signed-off-by: Ivan Boule 
> ---
>   lib/librte_ether/rte_ethdev.c   |   23 +
>   lib/librte_ether/rte_ethdev.h   |   25 ++
>   lib/librte_pmd_e1000/e1000_ethdev.h |3 ++
>   lib/librte_pmd_e1000/igb_ethdev.c   |1 +
>   lib/librte_pmd_e1000/igb_rxtx.c |   93 
> +--
>   lib/librte_pmd_ixgbe/ixgbe_ethdev.c |1 +
>   lib/librte_pmd_ixgbe/ixgbe_ethdev.h |3 ++
>   lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |   90 -
>   8 files changed, 169 insertions(+), 70 deletions(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 473c98b..c00dff0 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -1572,6 +1572,29 @@ rte_eth_dev_rss_reta_query(uint8_t port_id, struct 
> rte_eth_rss_reta *reta_conf)
>   }
>   
>   int
> +rte_eth_dev_rss_hash_conf_update(uint8_t port_id,
> +  struct rte_eth_rss_conf *rss_conf)
> +{
> + struct rte_eth_dev *dev;
> + uint16_t rss_hash_protos;
> +
> + if (port_id >= nb_ports) {
> + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> + return (-ENODEV);
> + }
> + rss_hash_protos = rss_conf->rss_hf;
> + if ((rss_hash_protos != 0) &&
> + ((rss_hash_protos & ETH_RSS_PROTO_MASK) == 0)) {
> + PMD_DEBUG_TRACE("Invalid rss_hash_protos=0x%x\n",
> + rss_hash_protos);
> + return (-EINVAL);
> + }
> + dev = _eth_devices[port_id];
> + FUNC_PTR_OR_ERR

[dpdk-dev] [PATCH 5/5] ixgbe: assign a default MAC address to a VF

2014-05-12 Thread Ivan Boule

When initializing a VF with no initial MAC address assigned by
the underlying Host PF driver, assign a default MAC address.

Signed-off-by: Ivan Boule 
---
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |   54 +--
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
index 84d4701..a408161 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
@@ -58,6 +58,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "ixgbe_logs.h"
 #include "ixgbe/ixgbe_api.h"
@@ -843,6 +844,22 @@ ixgbevf_negotiate_api(struct ixgbe_hw *hw)
;
 }

+static void
+generate_random_mac_addr(struct ether_addr *mac_addr)
+{
+   uint64_t random;
+
+   /* Set Organizationally Unique Identifier (OUI) prefix. */
+   mac_addr->addr_bytes[0] = 0x00;
+   mac_addr->addr_bytes[1] = 0x09;
+   mac_addr->addr_bytes[2] = 0xC0;
+   /* Force indication of locally assigned MAC address. */
+   mac_addr->addr_bytes[0] |= ETHER_LOCAL_ADMIN_ADDR;
+   /* Generate the last 3 bytes of the MAC address with a random number. */
+   random = rte_rand();
+   memcpy(_addr->addr_bytes[3], , 3);
+}
+
 /*
  * Virtual Function device init
  */
@@ -859,6 +876,7 @@ eth_ixgbevf_dev_init(__attribute__((unused)) struct 
eth_driver *eth_drv,
IXGBE_DEV_PRIVATE_TO_VFTA(eth_dev->data->dev_private);
struct ixgbe_hwstrip *hwstrip = 
IXGBE_DEV_PRIVATE_TO_HWSTRIP_BITMAP(eth_dev->data->dev_private);
+   struct ether_addr *perm_addr = (struct ether_addr *) hw->mac.perm_addr;

PMD_INIT_LOG(DEBUG, "eth_ixgbevf_dev_init");

@@ -903,13 +921,13 @@ eth_ixgbevf_dev_init(__attribute__((unused)) struct 
eth_driver *eth_drv,
hw->mac.num_rar_entries = 128; /* The MAX of the underlying PF */
diag = hw->mac.ops.reset_hw(hw);

-   if (diag != IXGBE_SUCCESS) {
+   /*
+* The VF reset operation returns the IXGBE_ERR_INVALID_MAC_ADDR when
+* the underlying PF driver has not assigned a MAC address to the VF.
+* In this case, assign a random MAC address.
+*/
+   if ((diag != IXGBE_SUCCESS) && (diag != IXGBE_ERR_INVALID_MAC_ADDR)) {
PMD_INIT_LOG(ERR, "VF Initialization Failure: %d", diag);
-   RTE_LOG(ERR, PMD, "\tThe MAC address is not valid.\n"
-   "\tThe most likely cause of this error 
is that the VM host\n"
-   "\thas not assigned a valid MAC address 
to this VF device.\n"
-   "\tPlease consult the DPDK Release 
Notes (FAQ section) for\n"
-   "\ta possible solution to this 
problem.\n");
return (diag);
}

@@ -930,9 +948,29 @@ eth_ixgbevf_dev_init(__attribute__((unused)) struct 
eth_driver *eth_drv,
return -ENOMEM;
}

+   /* Generate a random MAC address, if none was assigned by PF. */
+   if (is_zero_ether_addr(perm_addr)) {
+   generate_random_mac_addr(perm_addr);
+   diag = ixgbe_set_rar_vf(hw, 1, perm_addr->addr_bytes, 0, 1);
+   if (diag) {
+   rte_free(eth_dev->data->mac_addrs);
+   eth_dev->data->mac_addrs = NULL;
+   return diag;
+   }
+   RTE_LOG(INFO, PMD,
+   "\tVF MAC address not assigned by Host PF\n"
+   "\tAssign randomly generated MAC address "
+   "%02x:%02x:%02x:%02x:%02x:%02x\n",
+   perm_addr->addr_bytes[0],
+   perm_addr->addr_bytes[1],
+   perm_addr->addr_bytes[2],
+   perm_addr->addr_bytes[3],
+   perm_addr->addr_bytes[4],
+   perm_addr->addr_bytes[5]);
+   }
+
/* Copy the permanent MAC address */
-   ether_addr_copy((struct ether_addr *) hw->mac.perm_addr,
-   _dev->data->mac_addrs[0]);
+   ether_addr_copy(perm_addr, _dev->data->mac_addrs[0]);

/* reset the hardware with the new settings */
diag = hw->mac.ops.start_hw(hw);
-- 
1.7.10.4

[dpdk-dev] [PATCH 4/5] ixgbe: reset unused VF mailbox data registers

2014-05-12 Thread Ivan Boule

The VF_RESET message of the 82599 PF/VF communication protocol issued by a
a Guest VF driver may include an optional permanent MAC address assigned to
the VF by the Guest OS, in order to make it recorded into the 82599 RAR
registers by the Host PF driver.

To indicate the absence of this optional MAC address, the VF_RESET command
assumes that a NULL MAC address is sent, instead of using a dedicated bit
for this purpose. However, when sending a VF_RESET command with no permanent
MAC address, the function ixgbe_reset_hw_vf() of the 82599 VF driver
directly invokes the function ixgbe_write_mbx_vf() with a message that does
not include a NULL MAC address, wrongly assuming that this function fills in
with zero all unused mailbox data registers.

More globally, it is safer to explicitely reset to zero all remaining mailbox
data registers that are not used to store the content of a message, in order
to reset the data sent in a previous VF/PF exchange (in either side),
including the last exchange performed by another Guest OS to which that VF
was previously assigned.

Signed-off-by: Ivan Boule 
---
 lib/librte_pmd_ixgbe/ixgbe/ixgbe_mbx.c |   22 ++
 1 file changed, 22 insertions(+)

diff --git a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_mbx.c 
b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_mbx.c
index 1dc13dc..d0ab8b3 100644
--- a/lib/librte_pmd_ixgbe/ixgbe/ixgbe_mbx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe/ixgbe_mbx.c
@@ -431,6 +431,17 @@ STATIC s32 ixgbe_write_mbx_vf(struct ixgbe_hw *hw, u32 
*msg, u16 size,
for (i = 0; i < size; i++)
IXGBE_WRITE_REG_ARRAY(hw, IXGBE_VFMBMEM, i, msg[i]);

+   /*
+* Complete the remaining mailbox data registers with zero to reset
+* the data sent in a previous exchange (in either side) with the PF,
+* including exchanges performed by another Guest OS to which that VF
+* was previously assigned.
+*/
+   while (i < hw->mbx.size) {
+   IXGBE_WRITE_REG_ARRAY(hw, IXGBE_VFMBMEM, i, 0);
+   i++;
+   }
+
/* update stats */
hw->mbx.stats.msgs_tx++;

@@ -661,6 +672,17 @@ STATIC s32 ixgbe_write_mbx_pf(struct ixgbe_hw *hw, u32 
*msg, u16 size,
for (i = 0; i < size; i++)
IXGBE_WRITE_REG_ARRAY(hw, IXGBE_PFMBMEM(vf_number), i, msg[i]);

+   /*
+* Complete the remaining mailbox data registers with zero to reset
+* the data sent in a previous exchange (in either side) with the VF,
+* including exchanges performed by another Guest OS to which that VF
+* was previously assigned.
+*/
+   while (i < hw->mbx.size) {
+   IXGBE_WRITE_REG_ARRAY(hw, IXGBE_PFMBMEM(vf_number), i, 0);
+   i++;
+   }
+
/* Interrupt VF to tell it a message has been sent and release buffer*/
IXGBE_WRITE_REG(hw, IXGBE_PFMAILBOX(vf_number), IXGBE_PFMAILBOX_STS);

-- 
1.7.10.4

[dpdk-dev] [PATCH 0/5] ixgbe : support to add/remove a MAC address to/from a VF

2014-05-12 Thread Ivan Boule

The first patch add 2 functions to dynamically add/remove a MAC address to/from
a ixgbe VF device.
The 2 next patches fix the behaviour of the ixgbe_vf functions used in the
first patch.
The fourth patch fix the communication framework used by the VF/PF protocol.
The last patch assigns a default MAC address to a VF device, when none is
assigned by the underlying PF driver.

The last patch requires that the most recent PF/VF API version could be
negociated, and thus assumes that the following patch of Konstantin Ananyev
has been previously applied: "fix for jumbo frame issue with DPDK VF".

Ivan Boule (5):
  [PATCH 1/5] ixgbe: add/remove a MAC address to/from a VF
  [PATCH 2/5] ixgbe: avoid adding twice the permanent MAC address of a VF
  [PATCH 3/5] ixgbe: skip NULL & permanent MAC addresses of a VF
  [PATCH 4/5] ixgbe: reset unused VF mailbox data registers
  [PATCH 5/5] ixgbe: assign a default MAC address to a VF

 lib/librte_pmd_ixgbe/ixgbe/ixgbe_mbx.c |   22 ++
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c|  132 +---
 2 files changed, 145 insertions(+), 9 deletions(-)

-- 
1.7.10.4

[dpdk-dev] [PATCH] Use proper mac type for 82576 VF e1000_vfadapt type corresponds to 82576 VF devices, check e1000_set_mac_type() for more details.

2014-05-07 Thread Ivan Boule

On 05/06/2014 04:33 PM, Konstantin Ananyev wrote:
> Signed-off-by: Konstantin Ananyev 
> ---
>   lib/librte_pmd_e1000/igb_rxtx.c |2 +-
>   1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
> index 6b454a5..7fe1780 100644
> --- a/lib/librte_pmd_e1000/igb_rxtx.c
> +++ b/lib/librte_pmd_e1000/igb_rxtx.c
> @@ -2154,7 +2154,7 @@ eth_igbvf_rx_init(struct rte_eth_dev *dev)
>   rxdctl &= 0xFFF0;
>   rxdctl |= (rxq->pthresh & 0x1F);
>   rxdctl |= ((rxq->hthresh & 0x1F) << 8);
> - if (hw->mac.type == e1000_82576) {
> + if (hw->mac.type == e1000_vfadapt) {
>   /*
>* Workaround of 82576 VF Erratum
>* force set WTHRESH to 1

Acked.

Thanks.

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH] fix for jumbo frame issue with DPDK VF

2014-05-07 Thread Ivan Boule

,6 +2534,9 @@ ixgbevf_dev_start(struct rte_eth_dev *dev)
>   
>   hw->mac.ops.reset_hw(hw);
>   
> + /* negotiate mailbox API version to use with the PF. */
> + ixgbevf_negotiate_api(hw);
> +
>   ixgbevf_dev_tx_init(dev);
>   
>   /* This can fail when allocating mbufs for descriptor rings */
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
> b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> index 55414b9..7930dbd 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> @@ -3594,6 +3594,10 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
>   PMD_INIT_FUNC_TRACE();
>   hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
>   
> + /* setup MTU */
> + ixgbevf_rlpml_set_vf(hw,
> + (uint16_t)dev->data->dev_conf.rxmode.max_rx_pkt_len);
> +
>   /* Setup RX queues */
>   dev->rx_pkt_burst = ixgbe_recv_pkts;
>   for (i = 0; i < dev->data->nb_rx_queues; i++) {
Acked.

Thanks.

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH] ixgbevf jumbo frame issue with DPDK VF

2014-05-05 Thread Ivan Boule

On 05/01/2014 07:24 PM, Ananyev, Konstantin wrote:
> Hi Ivan,
> So do I need to split that patch into 2 and resubmit, or can it be applied 
> like that?
Yes, split the patch into 2 independent patches and resubmit them, please.
Thanks.

Regards,
Ivan

> Thanks
> Konstantin
>
> -Original Message-----
> From: Ivan Boule [mailto:ivan.boule at 6wind.com]
> Sent: Wednesday, April 30, 2014 10:49 AM
> To: Ananyev, Konstantin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] ixgbevf jumbo frame issue with DPDK VF
>
> Hi Konstantin,
>
> The content's of your patch is OK.
> However, the following fix in the function `eth_igbvf_rx_init`:
>
> - if (hw->mac.type == e1000_82576) {
> + if (hw->mac.type ==  e1000_vfadapt) {
>
> should be sent in a different patch (without the spurious blank
> character ) ,
> as this fix is not related to the jumbo frame issue that is addressed here.
>
> Regards,
> Ivan
>
>On 04/24/2014 06:45 PM, Ananyev, Konstantin wrote:
>> When latest Linux ixgbe PF is used, and DPDK VF is used in DPDK application,
>> jumbo frames are not received.
>> Also - if Linux ixgbe PF has MTU set to 1500 (default),
>> then normal sized packets can be received by DPDK VF.
>> However, if Linux PF has MTU > 1500, then DPDK VF receives no packets
>> (normal or jumbo).
>> With ixgbe_mbox_api_10 ixgbe simply didn't allow set VF MTU > 1514 for 82599.
>> With ixgbe_mbox_ajpi_11 it does, though now, if PF uses jumbo frames,
>> it simply disables RX for all VFs.
>> So to work with PF ithat using jumbo frames, at startup each VF has to:
>> 1. negotiate with PF mbox_api_11.
>> 2. Send to PF SET_LPE message with desired MTU.
>> Note, that if PF already uses MTU bigger then asked by the VF,
>> then PF wouldn't take any action.
>>
>> Signed-off-by: Konstantin Ananyev 
>> ---
>>lib/librte_pmd_e1000/igb_rxtx.c |7 -
>>lib/librte_pmd_ixgbe/ixgbe_ethdev.c |   47 
>> --
>>lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |4 +++
>>3 files changed, 43 insertions(+), 15 deletions(-)
>>
>> diff --git a/lib/librte_pmd_e1000/igb_rxtx.c 
>> b/lib/librte_pmd_e1000/igb_rxtx.c
>> index 4608595..1ebe2f5 100644
>> --- a/lib/librte_pmd_e1000/igb_rxtx.c
>> +++ b/lib/librte_pmd_e1000/igb_rxtx.c
>> @@ -2077,6 +2077,11 @@ eth_igbvf_rx_init(struct rte_eth_dev *dev)
>>
>>  hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
>>
>> +/* setup MTU */
>> +e1000_rlpml_set_vf(hw,
>> +(uint16_t)(dev->data->dev_conf.rxmode.max_rx_pkt_len +
>> +VLAN_TAG_SIZE));
>> +
>>  /* Configure and enable each RX queue. */
>>  rctl_bsize = 0;
>>  dev->rx_pkt_burst = eth_igb_recv_pkts;
>> @@ -2149,7 +2154,7 @@ eth_igbvf_rx_init(struct rte_eth_dev *dev)
>>  rxdctl &= 0xFFF0;
>>  rxdctl |= (rxq->pthresh & 0x1F);
>>  rxdctl |= ((rxq->hthresh & 0x1F) << 8);
>> -if (hw->mac.type == e1000_82576) {
>> +if (hw->mac.type ==  e1000_vfadapt) {
>>  /*
>>   * Workaround of 82576 VF Erratum
>>   * force set WTHRESH to 1
>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
>> b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
>> index a5a7f9a..6dd52d7 100644
>> --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
>> +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
>> @@ -808,19 +808,30 @@ eth_ixgbe_dev_init(__attribute__((unused)) struct 
>> eth_driver *eth_drv,
>>  return 0;
>>}
>>
>> -static void ixgbevf_get_queue_num(struct ixgbe_hw *hw)
>> +
>> +/*
>> + * Negotiate mailbox API version with the PF.
>> + * After reset API version is always set to the basic one 
>> (ixgbe_mbox_api_10).
>> + * Then we try to negotiate starting with the most recent one.
>> + * If all negotiation attempts fail, then we will proceed with
>> + * the default one (ixgbe_mbox_api_10).
>> + */
>> +static void
>> +ixgbevf_negotiate_api(struct ixgbe_hw *hw)
>>{
>> -/* Traffic classes are not supported by now */
>> -unsigned int tcs, tc;
>> +int32_t i;
>>
>> -/*
>> - * Must let PF know we are at mailbox API version 1.1.
>> - * Otherwise PF won't answer properly.
>> - * In case that PF fails to provide Rx/Tx queue number,
>> - * max_tx_queues and max_rx_queues remain to be 1.
>> - */
>>

[dpdk-dev] [PATCH 5/5] app/testpmd: allow to configure RSS hash key

2014-04-30 Thread Ivan Boule

Add the command "port config X rss-hash-key key" in the 'testpmd'
application to configure the RSS hash key used to compute the RSS
hash of input [IP] packets received on port X.

Signed-off-by: Ivan Boule 
---
 app/test-pmd/cmdline.c |   96 +++-
 app/test-pmd/config.c  |   28 ++
 app/test-pmd/testpmd.h |1 +
 3 files changed, 124 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 3454c0b..d561bf3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1192,6 +1192,99 @@ cmdline_parse_inst_t cmd_config_rss = {
},
 };

+/* *** configure rss hash key *** */
+struct cmd_config_rss_hash_key {
+   cmdline_fixed_string_t port;
+   cmdline_fixed_string_t config;
+   uint8_t port_id;
+   cmdline_fixed_string_t rss_hash_key;
+   cmdline_fixed_string_t key;
+};
+
+#define RSS_HASH_KEY_LENGTH 40
+static uint8_t
+hexa_digit_to_value(char hexa_digit)
+{
+   if ((hexa_digit >= '0') && (hexa_digit <= '9'))
+   return (uint8_t) (hexa_digit - '0');
+   if ((hexa_digit >= 'a') && (hexa_digit <= 'f'))
+   return (uint8_t) ((hexa_digit - 'a') + 10);
+   if ((hexa_digit >= 'A') && (hexa_digit <= 'F'))
+   return (uint8_t) ((hexa_digit - 'A') + 10);
+   /* Invalid hexa digit */
+   return 0xFF;
+}
+
+static uint8_t
+parse_and_check_key_hexa_digit(char *key, int idx)
+{
+   uint8_t hexa_v;
+
+   hexa_v = hexa_digit_to_value(key[idx]);
+   if (hexa_v == 0xFF)
+   printf("invalid key: character %c at position %d is not a "
+  "valid hexa digit\n", key[idx], idx);
+   return hexa_v;
+}
+
+static void
+cmd_config_rss_hash_key_parsed(void *parsed_result,
+  __attribute__((unused)) struct cmdline *cl,
+  __attribute__((unused)) void *data)
+{
+   struct cmd_config_rss_hash_key *res = parsed_result;
+   uint8_t hash_key[RSS_HASH_KEY_LENGTH];
+   uint8_t xdgt0;
+   uint8_t xdgt1;
+   int i;
+
+   /* Check the length of the RSS hash key */
+   if (strlen(res->key) != (RSS_HASH_KEY_LENGTH * 2)) {
+   printf("key length: %d invalid - key must be a string of %d"
+  "hexa-decimal numbers\n", (int) strlen(res->key),
+  RSS_HASH_KEY_LENGTH * 2);
+   return;
+   }
+   /* Translate RSS hash key into binary representation */
+   for (i = 0; i < RSS_HASH_KEY_LENGTH; i++) {
+   xdgt0 = parse_and_check_key_hexa_digit(res->key, (i * 2));
+   if (xdgt0 == 0xFF)
+   return;
+   xdgt1 = parse_and_check_key_hexa_digit(res->key, (i * 2) + 1);
+   if (xdgt1 == 0xFF)
+   return;
+   hash_key[i]= (uint8_t) ((xdgt0 * 16) + xdgt1);
+   }
+   port_rss_hash_key_update(res->port_id, hash_key);
+}
+
+cmdline_parse_token_string_t cmd_config_rss_hash_key_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_config_rss_hash_key, port, "port");
+cmdline_parse_token_string_t cmd_config_rss_hash_key_config =
+   TOKEN_STRING_INITIALIZER(struct cmd_config_rss_hash_key, config,
+"config");
+cmdline_parse_token_string_t cmd_config_rss_hash_key_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_config_rss_hash_key, port_id, UINT8);
+cmdline_parse_token_string_t cmd_config_rss_hash_key_rss_hash_key =
+   TOKEN_STRING_INITIALIZER(struct cmd_config_rss_hash_key,
+rss_hash_key, "rss-hash-key");
+cmdline_parse_token_string_t cmd_config_rss_hash_key_value =
+   TOKEN_STRING_INITIALIZER(struct cmd_config_rss_hash_key, key, NULL);
+
+cmdline_parse_inst_t cmd_config_rss_hash_key = {
+   .f = cmd_config_rss_hash_key_parsed,
+   .data = NULL,
+   .help_str = "port config X rss-hash-key 80 hexa digits",
+   .tokens = {
+   (void *)_config_rss_hash_key_port,
+   (void *)_config_rss_hash_key_config,
+   (void *)_config_rss_hash_key_port_id,
+   (void *)_config_rss_hash_key_rss_hash_key,
+   (void *)_config_rss_hash_key_value,
+   NULL,
+   },
+};
+
 /* *** Configure RSS RETA *** */
 struct cmd_config_rss_reta {
cmdline_fixed_string_t port;
@@ -1413,7 +1506,7 @@ cmdline_parse_inst_t cmd_showport_rss_hash = {

 cmdline_parse_inst_t cmd_showport_rss_hash_key = {
.f = cmd_showport_rss_hash_parsed,
-   .data = "show_rss_key",
+   .data = (void *)1,
.help_str = "show port X rss-hash key (X = port number)\n",
.tokens = {
(void *)_showport_rss_hash_show,
@@ -5322,6 +5415,7 @@ cmdline_parse_ctx_t mai

[dpdk-dev] [PATCH 4/5] ethdev: allow to get RSS hash functions and key

2014-04-30 Thread Ivan Boule

1) Add a new function "rss_hash_conf_get" in the PMD API to retrieve the
   current configuration of the RSS functions and/or of the RSS key used
   by a NIC to compute the RSS hash of input packets.
   The new function uses the existing data structure "rte_eth_rss_conf" for
   returning the RSS hash configuration.

2) Add the ixgbe-specific function "ixgbe_dev_rss_hash_conf_get" and the
   igb-specific function "eth_igb_rss_hash_conf_get" to retrieve the RSS
   hash configuration of ixgbe and igb controllers respectively.

3) Add the command "show port X rss-hash [key]" in the testpmd application
   to display the RSS hash configuration of port X.

Signed-off-by: Ivan Boule 
---
 app/test-pmd/cmdline.c  |   63 +
 app/test-pmd/config.c   |   65 +++
 app/test-pmd/testpmd.h  |2 ++
 lib/librte_ether/rte_ethdev.c   |   15 
 lib/librte_ether/rte_ethdev.h   |   23 +
 lib/librte_pmd_e1000/e1000_ethdev.h |3 ++
 lib/librte_pmd_e1000/igb_ethdev.c   |1 +
 lib/librte_pmd_e1000/igb_rxtx.c |   52 
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |1 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |3 ++
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |   53 
 11 files changed, 281 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index e06379c..3454c0b 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -182,6 +182,10 @@ static void cmd_help_long_parsed(void *parsed_result,
"show port (info|stats|fdir|stat_qmap) (port_id|all)\n"
"Display information for port_id, or all.\n\n"

+   "show port rss-hash [key]\n"
+   "Display the RSS hash functions and RSS hash key"
+   " of port X\n\n"
+
"clear port (info|stats|fdir|stat_qmap) (port_id|all)\n"
"Clear information for port_id, or all.\n\n"

@@ -1364,6 +1368,63 @@ cmdline_parse_inst_t cmd_showport_reta = {
},
 };

+/* *** Show RSS hash configuration *** */
+struct cmd_showport_rss_hash {
+   cmdline_fixed_string_t show;
+   cmdline_fixed_string_t port;
+   uint8_t port_id;
+   cmdline_fixed_string_t rss_hash;
+   cmdline_fixed_string_t key; /* optional argument */
+};
+
+static void cmd_showport_rss_hash_parsed(void *parsed_result,
+   __attribute__((unused)) struct cmdline *cl,
+   void *show_rss_key)
+{
+   struct cmd_showport_rss_hash *res = parsed_result;
+
+   port_rss_hash_conf_show(res->port_id, show_rss_key != NULL);
+}
+
+cmdline_parse_token_string_t cmd_showport_rss_hash_show =
+   TOKEN_STRING_INITIALIZER(struct cmd_showport_rss_hash, show, "show");
+cmdline_parse_token_string_t cmd_showport_rss_hash_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_showport_rss_hash, port, "port");
+cmdline_parse_token_num_t cmd_showport_rss_hash_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_showport_rss_hash, port_id, UINT8);
+cmdline_parse_token_string_t cmd_showport_rss_hash_rss_hash =
+   TOKEN_STRING_INITIALIZER(struct cmd_showport_rss_hash, rss_hash,
+"rss-hash");
+cmdline_parse_token_string_t cmd_showport_rss_hash_rss_key =
+   TOKEN_STRING_INITIALIZER(struct cmd_showport_rss_hash, key, "key");
+
+cmdline_parse_inst_t cmd_showport_rss_hash = {
+   .f = cmd_showport_rss_hash_parsed,
+   .data = NULL,
+   .help_str = "show port X rss-hash (X = port number)\n",
+   .tokens = {
+   (void *)_showport_rss_hash_show,
+   (void *)_showport_rss_hash_port,
+   (void *)_showport_rss_hash_port_id,
+   (void *)_showport_rss_hash_rss_hash,
+   NULL,
+   },
+};
+
+cmdline_parse_inst_t cmd_showport_rss_hash_key = {
+   .f = cmd_showport_rss_hash_parsed,
+   .data = "show_rss_key",
+   .help_str = "show port X rss-hash key (X = port number)\n",
+   .tokens = {
+   (void *)_showport_rss_hash_show,
+   (void *)_showport_rss_hash_port,
+   (void *)_showport_rss_hash_port_id,
+   (void *)_showport_rss_hash_rss_hash,
+   (void *)_showport_rss_hash_rss_key,
+   NULL,
+   },
+};
+
 /* *** Configure DCB *** */
 struct cmd_config_dcb {
cmdline_fixed_string_t port;
@@ -5259,6 +5320,8 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)_set_mirror_mask,
(cmdline_parse_inst_t *)_set_mirror_link,
(cmdline_parse_inst_t *)_reset_mirror_rule,
+   (cmdline_parse_inst_t *)_showport_rss_h

[dpdk-dev] [PATCH 3/5] app/testpmd: configure RSS without restart

2014-04-30 Thread Ivan Boule

The function cmd_config_rss_parsed() associated with the command
"port config rss all" required to first stop all ports, in order to
then entirely re-configure all ports with the new RSS hash computation
parameters.
Use now the new function rte_eth_dev_rss_hash_conf_update() that dynamically
only changes the RSS hash computation parameters of a port, without needing
to previously stop the port.

Signed-off-by: Ivan Boule 
---
 app/test-pmd/cmdline.c |   20 
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 9cc680f..e06379c 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1145,26 +1145,22 @@ cmd_config_rss_parsed(void *parsed_result,
__attribute__((unused)) void *data)
 {
struct cmd_config_rss *res = parsed_result;
-
-   if (!all_ports_stopped()) {
-   printf("Please stop all ports first\n");
-   return;
-   }
+   struct rte_eth_rss_conf rss_conf;
+   uint8_t i;

if (!strcmp(res->value, "ip"))
-   rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6;
+   rss_conf.rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6;
else if (!strcmp(res->value, "udp"))
-   rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6 | ETH_RSS_IPV4_UDP;
+   rss_conf.rss_hf = ETH_RSS_IPV4_UDP | ETH_RSS_IPV6_UDP;
else if (!strcmp(res->value, "none"))
-   rss_hf = 0;
+   rss_conf.rss_hf = 0;
else {
printf("Unknown parameter\n");
return;
}
-
-   init_port_config();
-
-   cmd_reconfig_device_queue(RTE_PORT_ALL, 1, 1);
+   rss_conf.rss_key = NULL;
+   for (i = 0; i < rte_eth_dev_count(); i++)
+   rte_eth_dev_rss_hash_conf_update(i, _conf);
 }

 cmdline_parse_token_string_t cmd_config_rss_port =
-- 
1.7.10.4

[dpdk-dev] [PATCH 2/5] ethdev: allow to set RSS hash computation flags and/or key

2014-04-30 Thread Ivan Boule

1) Add a new function "rss_hash_conf_update" in the PMD API to dynamically
   update the RSS flags and/or the RSS key used by a NIC to compute the RSS
   hash of input packets.
   The new function uses the existing data structure "rte_eth_rss_conf" for
   the argument that contains the new hash flags and/or the new hash key to
   use.

2) Add the ixgbe-specific function "ixgbe_dev_rss_hash_conf_update" and the
   igb-specific function "eth_igb_rss_hash_conf_update" to update the RSS
   hash configuration of ixgbe and igb controllers respectively.

Note:
   Configuring the RSS hash flags and the RSS key used by a NIC consists in
   updating appropriate PCI registers of the NIC.
   These operations have been manually tested with the interactive commands
   "write reg" and "write regbit" of the testpmd application.

Signed-off-by: Ivan Boule 
---
 lib/librte_ether/rte_ethdev.c   |   23 +
 lib/librte_ether/rte_ethdev.h   |   25 ++
 lib/librte_pmd_e1000/e1000_ethdev.h |3 ++
 lib/librte_pmd_e1000/igb_ethdev.c   |1 +
 lib/librte_pmd_e1000/igb_rxtx.c |   93 +--
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |1 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |3 ++
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |   90 -
 8 files changed, 169 insertions(+), 70 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 473c98b..c00dff0 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1572,6 +1572,29 @@ rte_eth_dev_rss_reta_query(uint8_t port_id, struct 
rte_eth_rss_reta *reta_conf)
 }

 int
+rte_eth_dev_rss_hash_conf_update(uint8_t port_id,
+struct rte_eth_rss_conf *rss_conf)
+{
+   struct rte_eth_dev *dev;
+   uint16_t rss_hash_protos;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return (-ENODEV);
+   }
+   rss_hash_protos = rss_conf->rss_hf;
+   if ((rss_hash_protos != 0) &&
+   ((rss_hash_protos & ETH_RSS_PROTO_MASK) == 0)) {
+   PMD_DEBUG_TRACE("Invalid rss_hash_protos=0x%x\n",
+   rss_hash_protos);
+   return (-EINVAL);
+   }
+   dev = _eth_devices[port_id];
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rss_hash_conf_update, -ENOTSUP);
+   return (*dev->dev_ops->rss_hash_conf_update)(dev, rss_conf);
+}
+
+int
 rte_eth_led_on(uint8_t port_id)
 {
struct rte_eth_dev *dev;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index dea7471..a970761 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -331,6 +331,8 @@ struct rte_eth_rss_conf {
 #define ETH_RSS_IPV4_UDP0x0040 /**< IPv4/UDP packet. */
 #define ETH_RSS_IPV6_UDP0x0080 /**< IPv6/UDP packet. */
 #define ETH_RSS_IPV6_UDP_EX 0x0100 /**< IPv6/UDP with extension headers. */
+
+#define ETH_RSS_PROTO_MASK  0x01FF /**< Mask of valid RSS hash protocols */
 /* Definitions used for redirection table entry size */
 #define ETH_RSS_RETA_NUM_ENTRIES 128
 #define ETH_RSS_RETA_MAX_QUEUE   16  
@@ -966,6 +968,10 @@ typedef int (*reta_query_t)(struct rte_eth_dev *dev,
struct rte_eth_rss_reta *reta_conf);
 /**< @internal Query RSS redirection table on an Ethernet device */

+typedef int (*rss_hash_conf_update_t)(struct rte_eth_dev *dev,
+ struct rte_eth_rss_conf *rss_conf);
+/**< @internal Update RSS hash configuration of an Ethernet device */
+
 typedef int (*eth_dev_led_on_t)(struct rte_eth_dev *dev);
 /**< @internal Turn on SW controllable LED on an Ethernet device */

@@ -1153,6 +1159,8 @@ struct eth_dev_ops {
   bypass_wd_reset_t bypass_wd_reset;
 #endif

+   /** Configure RSS hash protocols. */
+   rss_hash_conf_update_t rss_hash_conf_update;
 };

 /**
@@ -2859,6 +2867,23 @@ int rte_eth_dev_bypass_wd_timeout_show(uint8_t port, 
uint32_t *wd_timeout);
  */
 int rte_eth_dev_bypass_wd_reset(uint8_t port);

+ /**
+ * Configuration of Receive Side Scaling hash computation of Ethernet device.
+ *
+ * @param port
+ *   The port identifier of the Ethernet device.
+ * @param rss_conf
+ *   The new configuration to use for RSS hash computation on the port.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENODEV) if port identifier is invalid.
+ *   - (-ENOTSUP) if hardware doesn't support.
+ *   - (-EINVAL) if bad parameter.
+ */
+int
+rte_eth_dev_rss_hash_conf_update(uint8_t port_id,
+struct rte_eth_rss_conf *rss_conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_pmd_e1000/e1000_ethdev.h 
b/lib/librte_pmd_e1000/e1000_ethdev.h
index d09064e..e228c53 100644
--- a/lib/librte_pmd_e1000/e1000_ethdev.h
+++ b/

[dpdk-dev] [PATCH 1/5] ethdev: check RX queue indices in RETA config against number of queues

2014-04-30 Thread Ivan Boule

Each entry of the RSS redirection table (RETA) of igb and ixgbe ports
contains a 4-bit RX queue index, thus imposing RSS RX queue indices to
be strictly lower than 16.
In addition, if a RETA entry is configured with a RX queue index that is
strictly lower than 16, but is greater or equal to the number of RX queues
of the port, then all input packets whose RSS hash value indexes that RETA
entry are silently dropped by the NIC.

Make the function rte_eth_dev_rss_reta_update() check that RX queue indices
that are supplied in the reta_conf argument are strictly lower than
ETH_RSS_RETA_MAX_QUEUE (16) and are strictly lower than the number of
RX queues of the port.

Signed-off-by: Ivan Boule 
---
 lib/librte_ether/rte_ethdev.c |   10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a5727dd..473c98b 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1501,6 +1501,7 @@ int
 rte_eth_dev_rss_reta_update(uint8_t port_id, struct rte_eth_rss_reta 
*reta_conf)
 {
struct rte_eth_dev *dev;
+   uint16_t max_rxq;
uint8_t i,j;

if (port_id >= nb_ports) {
@@ -1514,10 +1515,13 @@ rte_eth_dev_rss_reta_update(uint8_t port_id, struct 
rte_eth_rss_reta *reta_conf)
return (-EINVAL);
}

+   dev = _eth_devices[port_id];
+   max_rxq = (dev->data->nb_rx_queues <= ETH_RSS_RETA_MAX_QUEUE) ?
+   dev->data->nb_rx_queues : ETH_RSS_RETA_MAX_QUEUE;
if (reta_conf->mask_lo != 0) {
for (i = 0; i < ETH_RSS_RETA_NUM_ENTRIES/2; i++) {
if ((reta_conf->mask_lo & (1ULL << i)) &&
-   (reta_conf->reta[i] >= ETH_RSS_RETA_MAX_QUEUE)) 
{
+   (reta_conf->reta[i] >= max_rxq)) {
PMD_DEBUG_TRACE("RETA hash index output"
"configration for port=%d,invalid"

"queue=%d\n",port_id,reta_conf->reta[i]);
@@ -1533,7 +1537,7 @@ rte_eth_dev_rss_reta_update(uint8_t port_id, struct 
rte_eth_rss_reta *reta_conf)

/* Check if the max entry >= 128 */
if ((reta_conf->mask_hi & (1ULL << i)) && 
-   (reta_conf->reta[j] >= ETH_RSS_RETA_MAX_QUEUE)) 
{
+   (reta_conf->reta[j] >= max_rxq)) {
PMD_DEBUG_TRACE("RETA hash index output"
"configration for port=%d,invalid"

"queue=%d\n",port_id,reta_conf->reta[j]);
@@ -1543,8 +1547,6 @@ rte_eth_dev_rss_reta_update(uint8_t port_id, struct 
rte_eth_rss_reta *reta_conf)
}
}

-   dev = _eth_devices[port_id];
-
FUNC_PTR_OR_ERR_RET(*dev->dev_ops->reta_update, -ENOTSUP);
return (*dev->dev_ops->reta_update)(dev, reta_conf);
 }
-- 
1.7.10.4

[dpdk-dev] [PATCH 0/5] allow to dynamically change RSS configuration

2014-04-30 Thread Ivan Boule

This set of patches allows to dynamically get and set the RSS configuration of a
port:
- rss functions (IP/UDP/TCP ...)
- rss hash key

-- 
Ivan Boule

Ivan Boule (5):
  ethdev: check RX queue indices in RETA config against number of
queues
  ethdev: allow to set RSS hash computation flags and/or key
  app/testpmd: configure RSS without restart
  ethdev: allow to get RSS hash functions and key
  app/testpmd: allow to configure RSS hash key

 app/test-pmd/cmdline.c  |  177 ---
 app/test-pmd/config.c   |   93 ++
 app/test-pmd/testpmd.h  |3 +
 lib/librte_ether/rte_ethdev.c   |   48 +-
 lib/librte_ether/rte_ethdev.h   |   48 ++
 lib/librte_pmd_e1000/e1000_ethdev.h |6 ++
 lib/librte_pmd_e1000/igb_ethdev.c   |2 +
 lib/librte_pmd_e1000/igb_rxtx.c |  145 
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |2 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |6 ++
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |  143 +---
 11 files changed, 587 insertions(+), 86 deletions(-)

-- 
1.7.10.4

[dpdk-dev] [PATCH] ixgbevf jumbo frame issue with DPDK VF

2014-04-30 Thread Ivan Boule

> + struct rte_pci_device *pci_dev;
> + struct ixgbe_hw *hw =
> + IXGBE_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
>   struct ixgbe_vfta * shadow_vfta =
>   IXGBE_DEV_PRIVATE_TO_VFTA(eth_dev->data->dev_private);
>   struct ixgbe_hwstrip *hwstrip =
> @@ -891,8 +904,11 @@ eth_ixgbevf_dev_init(__attribute__((unused)) struct 
> eth_driver *eth_drv,
>   return (diag);
>   }
>   
> + /* negotiate mailbox API version to use with the PF. */
> + ixgbevf_negotiate_api(hw);
> +
>   /* Get Rx/Tx queue count via mailbox, which is ready after reset_hw */
> - ixgbevf_get_queue_num(hw);
> + ixgbevf_get_queues(hw, , );
>   
>   /* Allocate memory for storing MAC addresses */
>   eth_dev->data->mac_addrs = rte_zmalloc("ixgbevf", ETHER_ADDR_LEN *
> @@ -2518,6 +2534,9 @@ ixgbevf_dev_start(struct rte_eth_dev *dev)
>   
>   hw->mac.ops.reset_hw(hw);
>   
> + /* negotiate mailbox API version to use with the PF. */
> + ixgbevf_negotiate_api(hw);
> +
>   ixgbevf_dev_tx_init(dev);
>   
>   /* This can fail when allocating mbufs for descriptor rings */
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
> b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> index 55414b9..7930dbd 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> @@ -3594,6 +3594,10 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
>   PMD_INIT_FUNC_TRACE();
>   hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
>   
> + /* setup MTU */
> + ixgbevf_rlpml_set_vf(hw,
> + (uint16_t)dev->data->dev_conf.rxmode.max_rx_pkt_len);
> +
>   /* Setup RX queues */
>   dev->rx_pkt_burst = ixgbe_recv_pkts;
>   for (i = 0; i < dev->data->nb_rx_queues; i++) {


-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] Redirection Table

2014-01-07 Thread Ivan Boule

On 01/06/2014 05:52 PM, Michael Quicquaro wrote:
> Thanks for the details.  Can the hash function be modified so that I 
> can provide my own RSS function?  i.e.  my ultimate goal is to provide 
> RSS that is not dependent on packet contents.
No, the RSS function is "hard-wired" and only works on IPv4/IPv6 
packets. All other packets are stored in the same queue (0 by default).
You can change the RSS key used by the RSS function to compute the hash 
value.
See the following testpmd command:

port config X rss-hash-key <80 hexa digits>

to set the 320-bit RSS key of port X.

Best regards,
Ivan

> You may have seen my thread "generic load balancing".  At this point, 
> I'm realizing that the only way to accomplish this is to let the 
> packets land where they may (the queue where the NIC places the 
> packet) and distribute them (to other queues) by having some of the 
> CPU processing devoted to this task.  Can you verify this?
>
> Regards,
> - Michael.
>
>
> On Mon, Jan 6, 2014 at 10:21 AM, Ivan Boule  <mailto:ivan.boule at 6wind.com>> wrote:
>
> On 12/31/2013 08:45 PM, Michael Quicquaro wrote:
>
> Has anyone used the "port config all reta (hash,queue)"
> command of testpmd
> with any success?
>
> I haven't found much documentation on it.
>
> Can someone provide an example on why and how it was used.
>
> Regards and Happy New Year,
> Michael Quicquaro
>
> Hi Michael,
>
> "RETA" stands for Redirection Table.
> It is a per-port configurable table of 128 entries that is used by the
> RSS filtering feature of Intel 1GbE and 10GbE controllers to
> select the
> RX queue into which to store a received IP packet.
> When receiving an IPv4/IPv6 packet, the controller computes a 32-bit
> hash on:
>
>   * the source address and the destination address of the IP header of
> the packet,
>   * the source port and the destination port of the UDP/TCP
> header, if any.
>
> Then, the controller takes the 7 lower bits of the RSS hash as an
> index
> into the RETA table to get the RX queue number where to store the
> packet.
>
> The API of the DPDK includes a function that is exported by Poll Mode
> Drivers to configure RETA entries of a given port.
>
> For test purposes, the testpmd application includes the following
> command
>
> "port config X rss reta (hash,queue)[,(hash,queue)]"
>
> to configure RETA entries of a port X, with each couple (hash,queue)
> contains the index of a RETA entry (between 0 and 127 included)
> and the
> RX queue number (between 0 and 15) to be stored into that RETA entry.
>
> Best regards
> Ivan
>
> -- 
> Ivan Boule
> 6WIND Development Engineer
>
>


-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] Redirection Table

2014-01-06 Thread Ivan Boule

On 12/31/2013 08:45 PM, Michael Quicquaro wrote:
> Has anyone used the "port config all reta (hash,queue)" command of testpmd
> with any success?
>
> I haven't found much documentation on it.
>
> Can someone provide an example on why and how it was used.
>
> Regards and Happy New Year,
> Michael Quicquaro
Hi Michael,

"RETA" stands for Redirection Table.
It is a per-port configurable table of 128 entries that is used by the
RSS filtering feature of Intel 1GbE and 10GbE controllers to select the
RX queue into which to store a received IP packet.
When receiving an IPv4/IPv6 packet, the controller computes a 32-bit
hash on:

   * the source address and the destination address of the IP header of
 the packet,
   * the source port and the destination port of the UDP/TCP header, if any.

Then, the controller takes the 7 lower bits of the RSS hash as an index
into the RETA table to get the RX queue number where to store the packet.

The API of the DPDK includes a function that is exported by Poll Mode
Drivers to configure RETA entries of a given port.

For test purposes, the testpmd application includes the following command

 "port config X rss reta (hash,queue)[,(hash,queue)]"

to configure RETA entries of a port X, with each couple (hash,queue)
contains the index of a RETA entry (between 0 and 127 included) and the
RX queue number (between 0 and 15) to be stored into that RETA entry.

Best regards
Ivan

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH 1/2] igb/ixgbe: ETH_MQ_RX_NONE should disable RSS

2013-11-20 Thread Ivan Boule

On 11/19/2013 02:03 PM, Maxime Leroy wrote:
> As explained in rte_ethdev.h, ETH_MQ_RX_NONE allows to not choose 
> between RSS, DCB
> or VMDQ modes for the selection of a rx queue.
>
> But the igb/ixgbe code always silently selects the RSS mode with 
> ETH_MQ_RX_NONE. This patch
> fixes this incoherence between the API and the implementation.
>
> Signed-off-by: Maxime Leroy 
> ---
>   lib/librte_pmd_e1000/igb_rxtx.c   |4 ++--
>   lib/librte_pmd_ixgbe/ixgbe_rxtx.c |4 ++--
>   2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/lib/librte_pmd_e1000/igb_rxtx.c 
> b/lib/librte_pmd_e1000/igb_rxtx.c
> index f785d9f..641ceea 100644
> --- a/lib/librte_pmd_e1000/igb_rxtx.c
> +++ b/lib/librte_pmd_e1000/igb_rxtx.c
> @@ -1745,8 +1745,6 @@ igb_dev_mq_rx_configure(struct rte_eth_dev *dev)
>*/
>   if (dev->data->nb_rx_queues > 1)
>   switch (dev->data->dev_conf.rxmode.mq_mode) {
> -case ETH_MQ_RX_NONE:
> -/* if mq_mode not assign, we use rss mode.*/
>   case ETH_MQ_RX_RSS:
>   igb_rss_configure(dev);
>   break;
> @@ -1754,6 +1752,8 @@ igb_dev_mq_rx_configure(struct rte_eth_dev *dev)
>   /*Configure general VMDQ only RX parameters*/
>   igb_vmdq_rx_hw_configure(dev);
>   break;
> +case ETH_MQ_RX_NONE:
> +/* if mq_mode is none, disable rss mode.*/
>   default:
>   igb_rss_disable(dev);
>   break;
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
> b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> index 0f7be95..e1b90f9 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> @@ -3217,8 +3217,6 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
>*/
>   if (dev->data->nb_rx_queues > 1)
>   switch (dev->data->dev_conf.rxmode.mq_mode) {
> -case ETH_MQ_RX_NONE:
> -/* if mq_mode not assign, we use rss mode.*/
>   case ETH_MQ_RX_RSS:
>   ixgbe_rss_configure(dev);
>   break;
> @@ -3231,6 +3229,8 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
>   ixgbe_vmdq_rx_hw_configure(dev);
>   break;
>
> +    case ETH_MQ_RX_NONE:
> +/* if mq_mode is none, disable rss mode.*/
>   default: ixgbe_rss_disable(dev);
>   }
>   else

Acked by Ivan Boule 

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH 2/2] igb/ixgbe: allow RSS with only one Rx queue

2013-11-20 Thread Ivan Boule

On 11/19/2013 02:03 PM, Maxime Leroy wrote:
> It should be possible to enable RSS with a single Rx queue.
> The RSS hash can be useful independently of the number of RX queues.
> Hence, DPDK applications can use the hardware-computed RSS hash to 
> manage in software different IP packet flows.
>
> Signed-off-by: Maxime Leroy 
> ---
>   lib/librte_pmd_e1000/igb_rxtx.c   |7 ++-
>   lib/librte_pmd_ixgbe/ixgbe_rxtx.c |7 ++-
>   2 files changed, 4 insertions(+), 10 deletions(-)
>
> diff --git a/lib/librte_pmd_e1000/igb_rxtx.c 
> b/lib/librte_pmd_e1000/igb_rxtx.c
> index 641ceea..8c1e2cc 100644
> --- a/lib/librte_pmd_e1000/igb_rxtx.c
> +++ b/lib/librte_pmd_e1000/igb_rxtx.c
> @@ -1743,8 +1743,7 @@ igb_dev_mq_rx_configure(struct rte_eth_dev *dev)
>   /*
>* SRIOV inactive scheme
>*/
> -if (dev->data->nb_rx_queues > 1)
> -switch (dev->data->dev_conf.rxmode.mq_mode) {
> +switch (dev->data->dev_conf.rxmode.mq_mode) {
>   case ETH_MQ_RX_RSS:
>   igb_rss_configure(dev);
>   break;
> @@ -1757,9 +1756,7 @@ igb_dev_mq_rx_configure(struct rte_eth_dev *dev)
>   default:
>   igb_rss_disable(dev);
>   break;
> -}
> -else
> -igb_rss_disable(dev);
> +}
>   }
>  return 0;
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
> b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> index e1b90f9..ae9eda8 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> @@ -3215,8 +3215,7 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
> * SRIOV inactive scheme
>* any DCB/RSS w/o VMDq multi-queue setting
>*/
> -if (dev->data->nb_rx_queues > 1)
> -switch (dev->data->dev_conf.rxmode.mq_mode) {
> +switch (dev->data->dev_conf.rxmode.mq_mode) {
>   case ETH_MQ_RX_RSS:
>   ixgbe_rss_configure(dev);
>   break;
> @@ -3232,9 +3231,7 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
>   case ETH_MQ_RX_NONE:
>   /* if mq_mode is none, disable rss mode.*/
>   default: ixgbe_rss_disable(dev);
> -}
> -else
> -ixgbe_rss_disable(dev);
> +}
>   } else {
>   switch (RTE_ETH_DEV_SRIOV(dev).active) {
>   /*

Acked by Ivan Boule 

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH] igb/ixgbe: fix index overflow when resetting 4096 queue rings

2013-11-15 Thread Ivan Boule

On 11/15/2013 02:19 PM, Thomas Monjalon wrote:
> Rings are resetted with a loop because memset cannot be used without
> issuing a warning about volatile casting.
> The index of the loop was a 16-bit variable which is is sufficient for
> ring entries number but not for the byte size of the whole ring.
> The overflow happens when rings are configured for 4096 entries
> (descriptor size is 16 bytes). The result is an endless loop.
>
> It is fixed by indexing ring entries and resetting all bytes of the entry
> with a simple assignment.
> The descriptor initializer is zeroed thanks to its static declaration.
>
> Signed-off-by: Thomas Monjalon 
> ---
>   lib/librte_pmd_e1000/igb_rxtx.c   |   14 ++
>   lib/librte_pmd_ixgbe/ixgbe_rxtx.c |   10 ++
>   2 files changed, 12 insertions(+), 12 deletions(-)
>
>
Acked-by: Ivan Boule 

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH] ethdev: fix non-reconfigurable pmd init

2013-09-16 Thread Ivan Boule

On 09/13/2013 03:38 PM, Thomas Monjalon wrote:
> Some Poll-Mode Drivers (PMD) are not reconfigurable and,
> thus, do not implement (rx|tx)_queue_release functions.
> For these drivers, the functions rte_eth_dev_(rx|tx)_queue_config
> must return an ENOTSUP error only when reconfiguring,
> but not at initial configuration.
>
> Move the FUNC_PTR_OR_ERR_RET check into the case of reconfiguration.
>
> Signed-off-by: Thomas Monjalon 
>
Acked-by: Ivan Boule 

-- 
Ivan Boule
6WIND Development Engineer

[dpdk-dev] [PATCH] ethdev: add pause frame counters for em/igb/ixgbe

2013-09-13 Thread Ivan Boule

Add into the `rte_eth_stats` data structure 4 (64-bit) counters
of XOFF/XON pause frames received and sent on a given port.

Update em, igb, and ixgbe drivers to return the value of the 4 XOFF/XON
counters through the `rte_eth_stats_get` function exported by the DPDK
API.

Display the value of the 4 XOFF/XON counters in the `testpmd` application.

Signed-off-by: Ivan Boule 
---
 app/test-pmd/config.c   |8 
 app/test-pmd/testpmd.c  |   10 ++
 lib/librte_ether/rte_ethdev.h   |4 
 lib/librte_pmd_e1000/em_ethdev.c|6 ++
 lib/librte_pmd_e1000/igb_ethdev.c   |6 ++
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 ++
 6 files changed, 40 insertions(+)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 0b66671..0be0db8 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -167,6 +167,14 @@ nic_stats_display(portid_t port_id)
}
}

+   /* Display statistics of XON/XOFF pause frames, if any. */
+   if ((stats.tx_pause_xon  | stats.rx_pause_xon |
+stats.tx_pause_xoff | stats.rx_pause_xoff) > 0) {
+   printf("  RX-XOFF:%-10"PRIu64" RX-XON:%-10"PRIu64"\n",
+  stats.rx_pause_xoff, stats.rx_pause_xon);
+   printf("  TX-XOFF:%-10"PRIu64" TX-XON:%-10"PRIu64"\n",
+  stats.tx_pause_xoff, stats.tx_pause_xon);
+   }
printf("  %s%s\n",
   nic_stats_border, nic_stats_border);
 }
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 58833ac..69eb26c 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -674,6 +674,16 @@ fwd_port_stats_display(portid_t port_id, struct 
rte_eth_stats *stats)
if (stats->rx_nombuf > 0)
printf("  RX-nombufs:%14"PRIu64"\n", stats->rx_nombuf);
}
+
+   /* Display statistics of XON/XOFF pause frames, if any. */
+   if ((stats->tx_pause_xon  | stats->rx_pause_xon |
+stats->tx_pause_xoff | stats->rx_pause_xoff) > 0) {
+   printf("  RX-XOFF:%-14"PRIu64" RX-XON: %-14"PRIu64"\n",
+  stats->rx_pause_xoff, stats->rx_pause_xon);
+   printf("  TX-XOFF:%-14"PRIu64" TX-XON: %-14"PRIu64"\n",
+  stats->tx_pause_xoff, stats->tx_pause_xon);
+   }
+
 #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
if (port->rx_stream)
pkt_burst_stats_display("RX",
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 86eb295..12ecad1 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -193,6 +193,10 @@ struct rte_eth_stats {
uint64_t rx_nombuf; /**< Total number of RX mbuf allocation failures. */
uint64_t fdirmatch; /**< Total number of RX packets matching a filter. 
*/
uint64_t fdirmiss;  /**< Total number of RX packets not matching any 
filter. */
+   uint64_t tx_pause_xon;  /**< Total nb. of XON pause frame sent. */
+   uint64_t rx_pause_xon;  /**< Total nb. of XON pause frame received. */
+   uint64_t tx_pause_xoff; /**< Total nb. of XOFF pause frame sent. */
+   uint64_t rx_pause_xoff; /**< Total nb. of XOFF pause frame received. */
uint64_t q_ipackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];
/**< Total number of queue RX packets. */
uint64_t q_opackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];
diff --git a/lib/librte_pmd_e1000/em_ethdev.c b/lib/librte_pmd_e1000/em_ethdev.c
index d0d2d79..5e00b98 100644
--- a/lib/librte_pmd_e1000/em_ethdev.c
+++ b/lib/librte_pmd_e1000/em_ethdev.c
@@ -801,6 +801,12 @@ eth_em_stats_get(struct rte_eth_dev *dev, struct 
rte_eth_stats *rte_stats)
rte_stats->opackets = stats->gptc;
rte_stats->ibytes   = stats->gorc;
rte_stats->obytes   = stats->gotc;
+
+   /* XON/XOFF pause frames stats registers */
+   rte_stats->tx_pause_xon  = stats->xontxc;
+   rte_stats->rx_pause_xon  = stats->xonrxc;
+   rte_stats->tx_pause_xoff = stats->xofftxc;
+   rte_stats->rx_pause_xoff = stats->xoffrxc;
 }

 static void
diff --git a/lib/librte_pmd_e1000/igb_ethdev.c 
b/lib/librte_pmd_e1000/igb_ethdev.c
index 9936583..f62426e 100644
--- a/lib/librte_pmd_e1000/igb_ethdev.c
+++ b/lib/librte_pmd_e1000/igb_ethdev.c
@@ -911,6 +911,12 @@ eth_igb_stats_get(struct rte_eth_dev *dev, struct 
rte_eth_stats *rte_stats)
/* Tx Errors */
rte_stats->oerrors = stats->ecol + stats->latecol;

+   /* XON/XOFF pause frames */
+   rte_stats->tx_pause_xon  = stats->xontxc;
+   rte_stats->rx_pause_xon  = stats->xonrxc;
+   rte_stat

[dpdk-dev] PEG6I Six Port Gigabit

2013-05-30 Thread Ivan Boule

On 05/30/2013 03:31 AM, Nulik Nol wrote:
> Hi,
> has anyone tested dpdk on this card?
>
> PEG6I Six Port Gigabit
> http://www.servethehome.com/dell-silicom-peg6i-port-gigabit-gige-intel-82571eb-network-adapter/
>
> It is an 82571EB chip.
>
> Regards
>
Hi,

The igb/e1000 driver in the current DPDK version on dpdk.org does not 
yet support the family of Intel 82571 controllers.

Note:
to check if a PCI ethernet controller is supported by the DPDK, find its 
PCI vendor and device identifiers
(for instance with the 'lspci' command) and check them against the list 
of supported PCI device identifiers in the file:
  lib/librte_eal/common/include/rte_pci_dev_ids.h

Best regards,
Ivan

-- 
Ivan Boule
6WIND Development Engineer

65 matches

Mail list logo