[dpdk-dev] [PATCH 4/4] lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX platforms

2015-01-27 Thread Wang, Zhihong
> -----Original Message-----
> From: Wodkowski, PawelX
> Sent: Monday, January 26, 2015 10:43 PM
> To: Wang, Zhihong; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 4/4] lib/librte_eal: Optimized memcpy in
> arch/x86/rte_memcpy.h for both SSE and AVX platforms
>
> Hi,
>
> I must say: great

[dpdk-dev] [PATCH 4/4] lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX platforms

2015-01-26 Thread Wodkowski, PawelX
Hi,

I must say: great work.

I have some small comments:

> +/**
> + * Macro for copying unaligned block from one location to another,
> + * 47 bytes leftover maximum,
> + * locations should not overlap.
> + * Requirements:
> + * - Store is aligned
> + * - Load offset is , which must be

[dpdk-dev] [PATCH 4/4] lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX platforms

2015-01-25 Thread Jim Thompson
> On Jan 20, 2015, at 11:15 AM, Stephen Hemminger networkplumber.org> wrote:
>
> On Mon, 19 Jan 2015 09:53:34 +0800
> zhihong.wang at intel.com wrote:
>
>> Main code changes:
>>
>> 1. Differentiate architectural features based on CPU flags
>>
>>    a. Implement separated move functions

[dpdk-dev] [PATCH 4/4] lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX platforms

2015-01-21 Thread Wang, Zhihong
> -----Original Message-----
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Wednesday, January 21, 2015 3:16 AM
> To: Stephen Hemminger
> Cc: Wang, Zhihong; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 4/4] lib/librte_eal: Optimized memcpy in
> arch/x86/rte_memcpy.h for both

[dpdk-dev] [PATCH 4/4] lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX platforms

2015-01-20 Thread Neil Horman
On Tue, Jan 20, 2015 at 09:15:38AM -0800, Stephen Hemminger wrote:
> On Mon, 19 Jan 2015 09:53:34 +0800
> zhihong.wang at intel.com wrote:
>
> > Main code changes:
> >
> > 1. Differentiate architectural features based on CPU flags
> >
> >    a. Implement separated move functions for

[dpdk-dev] [PATCH 4/4] lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX platforms

2015-01-20 Thread Stephen Hemminger
On Mon, 19 Jan 2015 09:53:34 +0800
zhihong.wang at intel.com wrote:

> Main code changes:
>
> 1. Differentiate architectural features based on CPU flags
>
>    a. Implement separated move functions for SSE/AVX/AVX2 to make full
>       utilization of cache bandwidth
>
>    b. Implement separated

[dpdk-dev] [PATCH 4/4] lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX platforms

2015-01-19 Thread zhihong.w...@intel.com
Main code changes:

1. Differentiate architectural features based on CPU flags

   a. Implement separated move functions for SSE/AVX/AVX2 to make full
      utilization of cache bandwidth

   b. Implement separated copy flow specifically optimized for target
      architecture

2. Rewrite the memcpy
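
For readers skimming the thread, item 1a (separate move functions selected by CPU flags) can be illustrated roughly as below. This is only a sketch under stated assumptions, not the code from the patch: the names sketch_mov32 and sketch_memcpy are hypothetical, and the selection is assumed to happen at compile time through the compiler's predefined __AVX__ and __SSE2__ macros. The intrinsics used are the standard unaligned load/store ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#endif

/*
 * Hypothetical helper (not taken from the patch): copy exactly 32 bytes.
 * Source and destination may be unaligned but must not overlap.
 * The variant is picked at compile time from the CPU flags the compiler
 * was told to target (e.g. -mavx defines __AVX__).
 */
static inline void
sketch_mov32(uint8_t *dst, const uint8_t *src)
{
#if defined(__AVX__)
	/* One 256-bit unaligned load and store. */
	__m256i v = _mm256_loadu_si256((const __m256i *)src);
	_mm256_storeu_si256((__m256i *)dst, v);
#elif defined(__SSE2__)
	/* Two 128-bit unaligned loads and stores. */
	__m128i lo = _mm_loadu_si128((const __m128i *)src);
	__m128i hi = _mm_loadu_si128((const __m128i *)(src + 16));
	_mm_storeu_si128((__m128i *)dst, lo);
	_mm_storeu_si128((__m128i *)(dst + 16), hi);
#else
	/* Portable fallback when no SIMD flag is available. */
	memcpy(dst, src, 32);
#endif
}

/*
 * Hypothetical copy flow: move 32-byte blocks with the selected helper,
 * then hand the remaining 0..31 bytes to the C library memcpy.
 */
static inline void *
sketch_memcpy(void *dst, const void *src, size_t n)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	assert(n == 0 || (d != NULL && s != NULL));

	while (n >= 32) {
		sketch_mov32(d, s);
		d += 32;
		s += 32;
		n -= 32;
	}
	if (n > 0)
		memcpy(d, s, n);
	return dst;
}
```

The tail is deliberately left to memcpy to keep the sketch short; the leftover handling discussed in the thread is more involved and is not reproduced here.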