[dpdk-dev] [PATCH 00/24] Refactor mlx5 to improve performance

2016-06-20 Thread Nélio Laranjeiro
On Mon, Jun 20, 2016 at 04:03:00PM +0100, Ferruh Yigit wrote:
> On 6/20/2016 8:38 AM, Nélio Laranjeiro wrote:
> > On Fri, Jun 17, 2016 at 05:09:43PM +0100, Ferruh Yigit wrote:
> >> On 6/8/2016 10:47 AM, Nelio Laranjeiro wrote:
> >>> Enhance mlx5 with a data path that bypasses Verbs.
> >>>
> >>> The first half of this patchset removes support for functionality completely
> >>> rewritten in the second half (scatter/gather, inline send), while the data
> >>> path is refactored without Verbs.
> >>>
> >>> The PMD remains usable during the transition.
> >>>
> >>> This patchset must be applied after "Miscellaneous fixes for mlx4 and mlx5".
> >>>
> >>> Adrien Mazarguil (8):
> >>>   mlx5: replace countdown with threshold for TX completions
> >>>   mlx5: add debugging information about TX queues capabilities
> >>>   mlx5: check remaining space while processing TX burst
> >>>   mlx5: resurrect TX gather support
> >>>   mlx5: work around spurious compilation errors
> >>>   mlx5: remove redundant RX queue initialization code
> >>>   mlx5: make RX queue reinitialization safer
> >>>   mlx5: resurrect RX scatter support
> >>>
> >>> Nelio Laranjeiro (15):
> >>>   mlx5: split memory registration function for better performance
> >>>   mlx5: remove TX gather support
> >>>   mlx5: remove RX scatter support
> >>>   mlx5: remove configuration variable for maximum number of segments
> >>>   mlx5: remove inline TX support
> >>>   mlx5: split TX queue structure
> >>>   mlx5: split RX queue structure
> >>>   mlx5: update prerequisites for upcoming enhancements
> >>>   mlx5: add definitions for data path without Verbs
> >>>   mlx5: add support for configuration through kvargs
> >>>   mlx5: add TX/RX burst function selection wrapper
> >>>   mlx5: refactor RX data path
> >>>   mlx5: refactor TX data path
> >>>   mlx5: handle RX CQE compression
> >>>   mlx5: add support for multi-packet send
> >>>
> >>> Yaacov Hazan (1):
> >>>   mlx5: add support for inline send
> >>>
> >>
> >> I ran basic checks on the patchset:
> >>
> >> There are various checkpatch warnings, all at warning or check level.
> >>
> >> Patches 8 and 13 failed to apply via git; it looks like line numbers
> >> shifted a little. This is not a problem since they eventually apply, but
> >> just for your information.
> >>
> >> check-git-log gives the following errors, mainly a case issue in Rx/Tx:
> >> Wrong headline lowercase:
> >> mlx5: resurrect RX scatter support
> >> mlx5: make RX queue reinitialization safer
> >> mlx5: remove redundant RX queue initialization code
> >> mlx5: resurrect TX gather support
> >> mlx5: check remaining space while processing TX burst
> >> mlx5: add debugging information about TX queues capabilities
> >> mlx5: replace countdown with threshold for TX completions
> >> mlx5: handle RX CQE compression
> >> mlx5: refactor RX data path
> >> mlx5: add TX/RX burst function selection wrapper
> >> mlx5: split RX queue structure
> >> mlx5: split TX queue structure
> >> mlx5: remove inline TX support
> >> mlx5: remove RX scatter support
> >> mlx5: remove TX gather support
> >> Headline too long:
> >> mlx5: remove configuration variable for maximum number of segments
> >> mlx5: split memory registration function for better performance
> >>
> >>
> >> It compiles fine.
> >>
> >> Regards,
> >> ferruh
> > 
> > Hi ferruh,
> > 
> > In fact, it does not apply well on top of the current DPDK master branch.
> > 
> 
> I was able to apply it on top of the rel_16_07 branch using the "patch"
> binary, but if you think it doesn't apply well, is there any plan to send
> a new version?
> 
> Thanks,
> ferruh

I am finishing the V2, with some small fixes (they will be detailed
in the cover letter).

It will be sent in a few minutes, the time needed to run the check-*
scripts on it.

Thanks,

-- 
Nélio Laranjeiro
6WIND


[dpdk-dev] [PATCH 00/24] Refactor mlx5 to improve performance

2016-06-20 Thread Ferruh Yigit
On 6/20/2016 8:38 AM, Nélio Laranjeiro wrote:
> On Fri, Jun 17, 2016 at 05:09:43PM +0100, Ferruh Yigit wrote:
>> On 6/8/2016 10:47 AM, Nelio Laranjeiro wrote:
>>> Enhance mlx5 with a data path that bypasses Verbs.
>>>
>>> The first half of this patchset removes support for functionality completely
>>> rewritten in the second half (scatter/gather, inline send), while the data
>>> path is refactored without Verbs.
>>>
>>> The PMD remains usable during the transition.
>>>
>>> This patchset must be applied after "Miscellaneous fixes for mlx4 and mlx5".
>>>
>>> Adrien Mazarguil (8):
>>>   mlx5: replace countdown with threshold for TX completions
>>>   mlx5: add debugging information about TX queues capabilities
>>>   mlx5: check remaining space while processing TX burst
>>>   mlx5: resurrect TX gather support
>>>   mlx5: work around spurious compilation errors
>>>   mlx5: remove redundant RX queue initialization code
>>>   mlx5: make RX queue reinitialization safer
>>>   mlx5: resurrect RX scatter support
>>>
>>> Nelio Laranjeiro (15):
>>>   mlx5: split memory registration function for better performance
>>>   mlx5: remove TX gather support
>>>   mlx5: remove RX scatter support
>>>   mlx5: remove configuration variable for maximum number of segments
>>>   mlx5: remove inline TX support
>>>   mlx5: split TX queue structure
>>>   mlx5: split RX queue structure
>>>   mlx5: update prerequisites for upcoming enhancements
>>>   mlx5: add definitions for data path without Verbs
>>>   mlx5: add support for configuration through kvargs
>>>   mlx5: add TX/RX burst function selection wrapper
>>>   mlx5: refactor RX data path
>>>   mlx5: refactor TX data path
>>>   mlx5: handle RX CQE compression
>>>   mlx5: add support for multi-packet send
>>>
>>> Yaacov Hazan (1):
>>>   mlx5: add support for inline send
>>>
>>
>> I ran basic checks on the patchset:
>>
>> There are various checkpatch warnings, all at warning or check level.
>>
>> Patches 8 and 13 failed to apply via git; it looks like line numbers
>> shifted a little. This is not a problem since they eventually apply, but
>> just for your information.
>>
>> check-git-log gives the following errors, mainly a case issue in Rx/Tx:
>> Wrong headline lowercase:
>> mlx5: resurrect RX scatter support
>> mlx5: make RX queue reinitialization safer
>> mlx5: remove redundant RX queue initialization code
>> mlx5: resurrect TX gather support
>> mlx5: check remaining space while processing TX burst
>> mlx5: add debugging information about TX queues capabilities
>> mlx5: replace countdown with threshold for TX completions
>> mlx5: handle RX CQE compression
>> mlx5: refactor RX data path
>> mlx5: add TX/RX burst function selection wrapper
>> mlx5: split RX queue structure
>> mlx5: split TX queue structure
>> mlx5: remove inline TX support
>> mlx5: remove RX scatter support
>> mlx5: remove TX gather support
>> Headline too long:
>> mlx5: remove configuration variable for maximum number of segments
>> mlx5: split memory registration function for better performance
>>
>>
>> It compiles fine.
>>
>> Regards,
>> ferruh
> 
> Hi ferruh,
> 
> In fact, it does not apply well on top of the current DPDK master branch.
> 

I was able to apply it on top of the rel_16_07 branch using the "patch"
binary, but if you think it doesn't apply well, is there any plan to send
a new version?

Thanks,
ferruh



[dpdk-dev] [PATCH 00/24] Refactor mlx5 to improve performance

2016-06-20 Thread Nélio Laranjeiro
On Fri, Jun 17, 2016 at 05:09:43PM +0100, Ferruh Yigit wrote:
> On 6/8/2016 10:47 AM, Nelio Laranjeiro wrote:
> > Enhance mlx5 with a data path that bypasses Verbs.
> > 
> > The first half of this patchset removes support for functionality completely
> > rewritten in the second half (scatter/gather, inline send), while the data
> > path is refactored without Verbs.
> > 
> > The PMD remains usable during the transition.
> > 
> > This patchset must be applied after "Miscellaneous fixes for mlx4 and mlx5".
> > 
> > Adrien Mazarguil (8):
> >   mlx5: replace countdown with threshold for TX completions
> >   mlx5: add debugging information about TX queues capabilities
> >   mlx5: check remaining space while processing TX burst
> >   mlx5: resurrect TX gather support
> >   mlx5: work around spurious compilation errors
> >   mlx5: remove redundant RX queue initialization code
> >   mlx5: make RX queue reinitialization safer
> >   mlx5: resurrect RX scatter support
> > 
> > Nelio Laranjeiro (15):
> >   mlx5: split memory registration function for better performance
> >   mlx5: remove TX gather support
> >   mlx5: remove RX scatter support
> >   mlx5: remove configuration variable for maximum number of segments
> >   mlx5: remove inline TX support
> >   mlx5: split TX queue structure
> >   mlx5: split RX queue structure
> >   mlx5: update prerequisites for upcoming enhancements
> >   mlx5: add definitions for data path without Verbs
> >   mlx5: add support for configuration through kvargs
> >   mlx5: add TX/RX burst function selection wrapper
> >   mlx5: refactor RX data path
> >   mlx5: refactor TX data path
> >   mlx5: handle RX CQE compression
> >   mlx5: add support for multi-packet send
> > 
> > Yaacov Hazan (1):
> >   mlx5: add support for inline send
> > 
> 
> I ran basic checks on the patchset:
> 
> There are various checkpatch warnings, all at warning or check level.
> 
> Patches 8 and 13 failed to apply via git; it looks like line numbers
> shifted a little. This is not a problem since they eventually apply, but
> just for your information.
> 
> check-git-log gives the following errors, mainly a case issue in Rx/Tx:
> Wrong headline lowercase:
> mlx5: resurrect RX scatter support
> mlx5: make RX queue reinitialization safer
> mlx5: remove redundant RX queue initialization code
> mlx5: resurrect TX gather support
> mlx5: check remaining space while processing TX burst
> mlx5: add debugging information about TX queues capabilities
> mlx5: replace countdown with threshold for TX completions
> mlx5: handle RX CQE compression
> mlx5: refactor RX data path
> mlx5: add TX/RX burst function selection wrapper
> mlx5: split RX queue structure
> mlx5: split TX queue structure
> mlx5: remove inline TX support
> mlx5: remove RX scatter support
> mlx5: remove TX gather support
> Headline too long:
> mlx5: remove configuration variable for maximum number of segments
> mlx5: split memory registration function for better performance
> 
> 
> It compiles fine.
> 
> Regards,
> ferruh

Hi ferruh,

In fact, it does not apply well on top of the current DPDK master branch.

Thanks.

-- 
Nélio Laranjeiro
6WIND


[dpdk-dev] [PATCH 00/24] Refactor mlx5 to improve performance

2016-06-17 Thread Ferruh Yigit
On 6/8/2016 10:47 AM, Nelio Laranjeiro wrote:
> Enhance mlx5 with a data path that bypasses Verbs.
> 
> The first half of this patchset removes support for functionality completely
> rewritten in the second half (scatter/gather, inline send), while the data
> path is refactored without Verbs.
> 
> The PMD remains usable during the transition.
> 
> This patchset must be applied after "Miscellaneous fixes for mlx4 and mlx5".
> 
> Adrien Mazarguil (8):
>   mlx5: replace countdown with threshold for TX completions
>   mlx5: add debugging information about TX queues capabilities
>   mlx5: check remaining space while processing TX burst
>   mlx5: resurrect TX gather support
>   mlx5: work around spurious compilation errors
>   mlx5: remove redundant RX queue initialization code
>   mlx5: make RX queue reinitialization safer
>   mlx5: resurrect RX scatter support
> 
> Nelio Laranjeiro (15):
>   mlx5: split memory registration function for better performance
>   mlx5: remove TX gather support
>   mlx5: remove RX scatter support
>   mlx5: remove configuration variable for maximum number of segments
>   mlx5: remove inline TX support
>   mlx5: split TX queue structure
>   mlx5: split RX queue structure
>   mlx5: update prerequisites for upcoming enhancements
>   mlx5: add definitions for data path without Verbs
>   mlx5: add support for configuration through kvargs
>   mlx5: add TX/RX burst function selection wrapper
>   mlx5: refactor RX data path
>   mlx5: refactor TX data path
>   mlx5: handle RX CQE compression
>   mlx5: add support for multi-packet send
> 
> Yaacov Hazan (1):
>   mlx5: add support for inline send
> 

I ran basic checks on the patchset:

There are various checkpatch warnings, all at warning or check level.

Patches 8 and 13 failed to apply via git; it looks like line numbers
shifted a little. This is not a problem since they eventually apply, but
just for your information.

check-git-log gives the following errors, mainly a case issue in Rx/Tx:
Wrong headline lowercase:
mlx5: resurrect RX scatter support
mlx5: make RX queue reinitialization safer
mlx5: remove redundant RX queue initialization code
mlx5: resurrect TX gather support
mlx5: check remaining space while processing TX burst
mlx5: add debugging information about TX queues capabilities
mlx5: replace countdown with threshold for TX completions
mlx5: handle RX CQE compression
mlx5: refactor RX data path
mlx5: add TX/RX burst function selection wrapper
mlx5: split RX queue structure
mlx5: split TX queue structure
mlx5: remove inline TX support
mlx5: remove RX scatter support
mlx5: remove TX gather support
Headline too long:
mlx5: remove configuration variable for maximum number of segments
mlx5: split memory registration function for better performance
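
For illustration only, the two kinds of complaint above can be approximated
in C as follows. This is a rough sketch, not how check-git-log is
implemented: the function names are hypothetical, and the 60-character limit
is an assumption that merely agrees with the two headlines reported as too
long (66 and 63 characters).

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

#define HEADLINE_MAX 60 /* assumed limit, inferred from the results above */

/* True when the headline exceeds the assumed length limit. */
static bool
headline_too_long(const char *headline)
{
	return strlen(headline) > HEADLINE_MAX;
}

/* True when the headline contains "RX" or "TX" as a standalone upper-case
 * word; the expected spelling in headlines is "Rx"/"Tx". */
static bool
headline_wrong_case(const char *headline)
{
	const char *p = headline;

	while ((p = strpbrk(p, "RT")) != NULL) {
		bool starts_word = (p == headline) ||
				   !isalnum((unsigned char)p[-1]);

		if (starts_word && p[1] == 'X' &&
		    !isalnum((unsigned char)p[2]))
			return true;
		p++;
	}
	return false;
}
```

With these helpers, "mlx5: resurrect RX scatter support" is flagged for case,
while "mlx5: work around spurious compilation errors" passes both checks.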


It compiles fine.

Regards,
ferruh


[dpdk-dev] [PATCH 00/24] Refactor mlx5 to improve performance

2016-06-14 Thread Nélio Laranjeiro
On Mon, Jun 13, 2016 at 11:50:48AM -0700, Javier Blazquez wrote:
>[...] 
> This is a very exciting patch. I applied it and reran some microbenchmarks
> of mine that test the TX and RX paths separately. These are the results I
> got:
> 
> TX path (burst = 64 packets)
> 
> 1 thread - 2 ports - 4 queues per port: 39Mpps => 48Mpps
> 2 threads - 2 ports - 2 queues per port: 60Mpps => 60Mpps (hardware
> limitation?)

To reach higher values you will need to configure the inline feature
with the device argument txq_inline, and only activate it with more
than 1 queue; this can be done with the txqs_min_inline argument.

This feature helps the NIC by reducing the PCI back-pressure; in
return, it consumes more CPU cycles.

You can take a look at the NIC documentation (doc/guides/nics/mlx5.rst)
updated in this patchset, which explains both the txq_inline and
txqs_min_inline device arguments.
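
As a rough illustration of how the two arguments interact, here is a minimal
C sketch. The structure and function names are hypothetical and are not taken
from the PMD; it assumes txq_inline is a byte count where 0 means disabled,
and that inlining applies only when the port has at least txqs_min_inline TX
queues.

```c
/* Hypothetical per-port settings mirroring the two device arguments. */
struct port_cfg {
	unsigned int txq_inline;      /* bytes to inline, 0 = disabled */
	unsigned int txqs_min_inline; /* minimum TX queue count for inlining */
};

/* Return the inline size a TX queue would use: nonzero only when inlining
 * is requested and the port has enough TX queues configured. */
static unsigned int
effective_inline(const struct port_cfg *cfg, unsigned int txqs_n)
{
	if (cfg->txq_inline && txqs_n >= cfg->txqs_min_inline)
		return cfg->txq_inline;
	return 0;
}
```

For example, with txq_inline=128 and txqs_min_inline=4 (illustrative values
only), a port with a single TX queue would not inline, while a port with 4
queues would inline up to 128 bytes per packet.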

> RX path (burst = 32 packets)
> 
> 1 thread - 2 ports - 4 queues per port: 38Mpps => 46Mpps
> 2 threads - 2 ports - 2 queues per port: 43Mpps => 50Mpps
> 
> The tests were run on the following hardware, using DPDK master with this
> patch and the "Miscellaneous fixes for mlx4 and mlx5" patch applied:
> 
> 2x Intel Xeon E5-2680 v3 2.5GHz
> 64GB DDR4-2133
> 1x Mellanox ConnectX-4 EN, 40/56GbE dual-port, PCIe3.0 x8 (MCX414A-BCAT)
> 
> I haven't tested it extensively outside of these microbenchmarks, but so far
> this patch has been working great on my end, so:
> 
> tested-by: Javier Blazquez 

Regards,

-- 
Nélio Laranjeiro
6WIND


[dpdk-dev] [PATCH 00/24] Refactor mlx5 to improve performance

2016-06-13 Thread Javier Blazquez
> Enhance mlx5 with a data path that bypasses Verbs.
>
> The first half of this patchset removes support for functionality completely
> rewritten in the second half (scatter/gather, inline send), while the data
> path is refactored without Verbs.
>
> The PMD remains usable during the transition.
>
> This patchset must be applied after "Miscellaneous fixes for mlx4 and mlx5".
>
> Adrien Mazarguil (8):
>   mlx5: replace countdown with threshold for TX completions
>   mlx5: add debugging information about TX queues capabilities
>   mlx5: check remaining space while processing TX burst
>   mlx5: resurrect TX gather support
>   mlx5: work around spurious compilation errors
>   mlx5: remove redundant RX queue initialization code
>   mlx5: make RX queue reinitialization safer
>   mlx5: resurrect RX scatter support
>
> Nelio Laranjeiro (15):
>   mlx5: split memory registration function for better performance
>   mlx5: remove TX gather support
>   mlx5: remove RX scatter support
>   mlx5: remove configuration variable for maximum number of segments
>   mlx5: remove inline TX support
>   mlx5: split TX queue structure
>   mlx5: split RX queue structure
>   mlx5: update prerequisites for upcoming enhancements
>   mlx5: add definitions for data path without Verbs
>   mlx5: add support for configuration through kvargs
>   mlx5: add TX/RX burst function selection wrapper
>   mlx5: refactor RX data path
>   mlx5: refactor TX data path
>   mlx5: handle RX CQE compression
>   mlx5: add support for multi-packet send
>
> Yaacov Hazan (1):
>   mlx5: add support for inline send
>
>  config/common_base             |    2 -
>  doc/guides/nics/mlx5.rst       |   94 +-
>  drivers/net/mlx5/Makefile      |   49 +-
>  drivers/net/mlx5/mlx5.c        |  158 ++-
>  drivers/net/mlx5/mlx5.h        |   10 +
>  drivers/net/mlx5/mlx5_defs.h   |   26 +-
>  drivers/net/mlx5/mlx5_ethdev.c |  188 +++-
>  drivers/net/mlx5/mlx5_fdir.c   |   20 +-
>  drivers/net/mlx5/mlx5_mr.c     |  280 +
>  drivers/net/mlx5/mlx5_prm.h    |  155 +++
>  drivers/net/mlx5/mlx5_rxmode.c |    8 -
>  drivers/net/mlx5/mlx5_rxq.c    |  757 +-
>  drivers/net/mlx5/mlx5_rxtx.c   | 2206 +++-
>  drivers/net/mlx5/mlx5_rxtx.h   |  176 ++--
>  drivers/net/mlx5/mlx5_txq.c    |  362 ---
>  drivers/net/mlx5/mlx5_vlan.c   |    6 +-
>  16 files changed, 2578 insertions(+), 1919 deletions(-)
>  create mode 100644 drivers/net/mlx5/mlx5_mr.c
>  create mode 100644 drivers/net/mlx5/mlx5_prm.h
>
> --
> 2.1.4

This is a very exciting patch. I applied it and reran some microbenchmarks
of mine that test the TX and RX paths separately. These are the results I
got:

TX path (burst = 64 packets)

1 thread - 2 ports - 4 queues per port: 39Mpps => 48Mpps
2 threads - 2 ports - 2 queues per port: 60Mpps => 60Mpps (hardware
limitation?)

RX path (burst = 32 packets)

1 thread - 2 ports - 4 queues per port: 38Mpps => 46Mpps
2 threads - 2 ports - 2 queues per port: 43Mpps => 50Mpps

The tests were run on the following hardware, using DPDK master with this
patch and the "Miscellaneous fixes for mlx4 and mlx5" patch applied:

2x Intel Xeon E5-2680 v3 2.5GHz
64GB DDR4-2133
1x Mellanox ConnectX-4 EN, 40/56GbE dual-port, PCIe3.0 x8 (MCX414A-BCAT)

I haven't tested it extensively outside of these microbenchmarks, but so far
this patch has been working great on my end, so:

tested-by: Javier Blazquez 


[dpdk-dev] [PATCH 00/24] Refactor mlx5 to improve performance

2016-06-08 Thread Nelio Laranjeiro
Enhance mlx5 with a data path that bypasses Verbs.

The first half of this patchset removes support for functionality completely
rewritten in the second half (scatter/gather, inline send), while the data
path is refactored without Verbs.

The PMD remains usable during the transition.

This patchset must be applied after "Miscellaneous fixes for mlx4 and mlx5".

Adrien Mazarguil (8):
  mlx5: replace countdown with threshold for TX completions
  mlx5: add debugging information about TX queues capabilities
  mlx5: check remaining space while processing TX burst
  mlx5: resurrect TX gather support
  mlx5: work around spurious compilation errors
  mlx5: remove redundant RX queue initialization code
  mlx5: make RX queue reinitialization safer
  mlx5: resurrect RX scatter support

Nelio Laranjeiro (15):
  mlx5: split memory registration function for better performance
  mlx5: remove TX gather support
  mlx5: remove RX scatter support
  mlx5: remove configuration variable for maximum number of segments
  mlx5: remove inline TX support
  mlx5: split TX queue structure
  mlx5: split RX queue structure
  mlx5: update prerequisites for upcoming enhancements
  mlx5: add definitions for data path without Verbs
  mlx5: add support for configuration through kvargs
  mlx5: add TX/RX burst function selection wrapper
  mlx5: refactor RX data path
  mlx5: refactor TX data path
  mlx5: handle RX CQE compression
  mlx5: add support for multi-packet send

Yaacov Hazan (1):
  mlx5: add support for inline send

 config/common_base             |    2 -
 doc/guides/nics/mlx5.rst       |   94 +-
 drivers/net/mlx5/Makefile      |   49 +-
 drivers/net/mlx5/mlx5.c        |  158 ++-
 drivers/net/mlx5/mlx5.h        |   10 +
 drivers/net/mlx5/mlx5_defs.h   |   26 +-
 drivers/net/mlx5/mlx5_ethdev.c |  188 +++-
 drivers/net/mlx5/mlx5_fdir.c   |   20 +-
 drivers/net/mlx5/mlx5_mr.c     |  280 +
 drivers/net/mlx5/mlx5_prm.h    |  155 +++
 drivers/net/mlx5/mlx5_rxmode.c |    8 -
 drivers/net/mlx5/mlx5_rxq.c    |  757 +-
 drivers/net/mlx5/mlx5_rxtx.c   | 2206 +++-
 drivers/net/mlx5/mlx5_rxtx.h   |  176 ++--
 drivers/net/mlx5/mlx5_txq.c    |  362 ---
 drivers/net/mlx5/mlx5_vlan.c   |    6 +-
 16 files changed, 2578 insertions(+), 1919 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mr.c
 create mode 100644 drivers/net/mlx5/mlx5_prm.h

-- 
2.1.4