Re: SIMD implementation of dot-product. Benchmarks

2013-08-25 Thread Manu
On 25 August 2013 01:01, Ilya Yaroshenko wrote:
> On Sunday, 18 August 2013 at 05:26:00 UTC, Manu wrote:
>> movups is not good. It'll be a lot faster (and portable) if you use movaps.
>>
>> Process looks something like:
>> * do the first few from a[0] until a's alignment interval as scalar

Re: SIMD implementation of dot-product. Benchmarks

2013-08-24 Thread Ilya Yaroshenko
On Sunday, 18 August 2013 at 05:26:00 UTC, Manu wrote: movups is not good. It'll be a lot faster (and portable) if you use movaps. Process looks something like: * do the first few from a[0] until a's alignment interval as scalar * load the left of b's aligned pair * loop for each aligne

Re: SIMD implementation of dot-product. Benchmarks

2013-08-21 Thread Andrei Alexandrescu
On 8/17/13 11:50 AM, Ilya Yaroshenko wrote: http://spiceandmath.blogspot.ru/2013/08/simd-implementation-of-dot-product_17.html http://www.reddit.com/r/programming/comments/1ktue0/benchmarking_a_simd_implementation_of_dot_product/ Andrei

Re: SIMD implementation of dot-product. Benchmarks

2013-08-18 Thread Andrei Alexandrescu
On 8/18/13 10:24 AM, Ilya Yaroshenko wrote: On Sunday, 18 August 2013 at 16:32:33 UTC, Andrei Alexandrescu wrote: On 8/17/13 11:50 AM, Ilya Yaroshenko wrote: http://spiceandmath.blogspot.ru/2013/08/simd-implementation-of-dot-product_17.html Ilya The images never load for me, all I see is s

Re: SIMD implementation of dot-product. Benchmarks

2013-08-18 Thread Iain Buclaw
On 17 August 2013 19:50, Ilya Yaroshenko wrote:
> http://spiceandmath.blogspot.ru/2013/08/simd-implementation-of-dot-product_17.html
>
> Ilya

Having a quick flick through the simd.d source, I see LDC's and GDC's implementation couldn't be any more wildly different... (LDC's doesn't even look

Re: SIMD implementation of dot-product. Benchmarks

2013-08-18 Thread Ilya Yaroshenko
On Sunday, 18 August 2013 at 16:32:33 UTC, Andrei Alexandrescu wrote: On 8/17/13 11:50 AM, Ilya Yaroshenko wrote: http://spiceandmath.blogspot.ru/2013/08/simd-implementation-of-dot-product_17.html Ilya The images never load for me, all I see is some "Request timed out" stripes after the tex

Re: SIMD implementation of dot-product. Benchmarks

2013-08-18 Thread Andrei Alexandrescu
On 8/17/13 11:50 AM, Ilya Yaroshenko wrote: http://spiceandmath.blogspot.ru/2013/08/simd-implementation-of-dot-product_17.html Ilya The images never load for me, all I see is some "Request timed out" stripes after the text. Typo: Ununtu Andrei

Re: SIMD implementation of dot-product. Benchmarks

2013-08-17 Thread Ilya Yaroshenko
On Sunday, 18 August 2013 at 05:26:00 UTC, Manu wrote: movups is not good. It'll be a lot faster (and portable) if you use movaps. Process looks something like: * do the first few from a[0] until a's alignment interval as scalar * load the left of b's aligned pair * loop for each aligne

Re: SIMD implementation of dot-product. Benchmarks

2013-08-17 Thread Manu
movups is not good. It'll be a lot faster (and portable) if you use movaps.

Process looks something like:
* do the first few from a[0] until a's alignment interval as scalar
* load the left of b's aligned pair
* loop for each aligned vector in a
  - load a[n..n+4] aligned
  - load the ri
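
A minimal C sketch of the first part of that process, for illustration only. The function name `dot_aligned_a` is hypothetical and does not come from the thread. It peels scalar elements until `a` reaches a 16-byte boundary, then uses aligned loads (`_mm_load_ps`, which compiles to movaps) for `a`. The message is cut off before the handling of `b` is fully described, so this sketch simply loads `b` unaligned (`_mm_loadu_ps`); that is a simplification, not the aligned-pair scheme the message begins to outline. On targets without SSE the vector loop is compiled out and the scalar loops do all the work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper (not from the thread): dot product with a scalar
   prologue that runs until `a` is 16-byte aligned. */
float dot_aligned_a(const float *a, const float *b, size_t n)
{
    assert((a != NULL && b != NULL) || n == 0);
    float sum = 0.0f;
    size_t i = 0;

    /* Scalar prologue: advance until a + i sits on a 16-byte boundary. */
    while (i < n && ((uintptr_t)(a + i) & 15u) != 0) {
        sum += a[i] * b[i];
        ++i;
    }

#if defined(__SSE__)
    /* Main loop: aligned load for a, unaligned load for b (assumption). */
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_load_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }
    /* Horizontal sum of the four accumulator lanes. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    sum += (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
#endif

    /* Scalar epilogue for the remaining elements. */
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Calling `dot_aligned_a(a + 1, b, n - 1)` exercises the prologue, since `a + 1` cannot share the alignment of `a`. Note that the summation order differs from a plain scalar loop, so results on general floating-point data may differ in the last bits.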

Re: SIMD implementation of dot-product. Benchmarks

2013-08-17 Thread Ilya Yaroshenko
On Sunday, 18 August 2013 at 05:07:12 UTC, Manu wrote: On 18 August 2013 14:39, Ilya Yaroshenko wrote: On Sunday, 18 August 2013 at 01:53:53 UTC, Manu wrote: It doesn't look like you account for alignment. This is basically not-portable (I doubt unaligned loads in this context are faster

Re: SIMD implementation of dot-product. Benchmarks

2013-08-17 Thread Manu
On 18 August 2013 14:39, Ilya Yaroshenko wrote:
> On Sunday, 18 August 2013 at 01:53:53 UTC, Manu wrote:
>> It doesn't look like you account for alignment.
>> This is basically not-portable (I doubt unaligned loads in this context are
>> faster than performing scalar operations), and possibl

Re: SIMD implementation of dot-product. Benchmarks

2013-08-17 Thread Ilya Yaroshenko
On Saturday, 17 August 2013 at 19:38:52 UTC, John Colvin wrote: On Saturday, 17 August 2013 at 19:24:52 UTC, Ilya Yaroshenko wrote: BTW: -march=native automatically implies -mtune=native Thanks, I'll remove mtune) It would be really interesting if you could try writing the same code in C, b

Re: SIMD implementation of dot-product. Benchmarks

2013-08-17 Thread Ilya Yaroshenko
On Sunday, 18 August 2013 at 01:53:53 UTC, Manu wrote: It doesn't look like you account for alignment. This is basically not-portable (I doubt unaligned loads in this context are faster than performing scalar operations), and possibly inefficient on x86 too. dotProduct uses unaligned loads (

Re: SIMD implementation of dot-product. Benchmarks

2013-08-17 Thread Manu
It doesn't look like you account for alignment. This is basically not-portable (I doubt unaligned loads in this context are faster than performing scalar operations), and possibly inefficient on x86 too. To make it account for potentially random alignment will be awkward, but it might be possible t

Re: SIMD implementation of dot-product. Benchmarks

2013-08-17 Thread bearophile
Ilya Yaroshenko: http://spiceandmath.blogspot.ru/2013/08/simd-implementation-of-dot-product_17.html From the blog post: Compile fast_math code from other program separately and then link it. This is easy solution. However this is a step back to C.< To introduce a @fast_math attribute. This i

Re: SIMD implementation of dot-product. Benchmarks

2013-08-17 Thread John Colvin
On Saturday, 17 August 2013 at 19:24:52 UTC, Ilya Yaroshenko wrote: BTW: -march=native automatically implies -mtune=native Thanks, I'll remove mtune) It would be really interesting if you could try writing the same code in C, both a scalar version and a version using gcc's vector intrinsic
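
As a rough idea of what such a comparison could look like, here is a hedged C sketch with a scalar version and a version using GCC's vector extensions (`__attribute__((vector_size(16)))`, also accepted by Clang). The names `dot_scalar`, `dot_vector`, and `v4sf` are illustrative choices, not code from the thread or the blog post, and this makes no claim about how either version would benchmark.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Four packed floats, using the GCC/Clang vector extension. */
typedef float v4sf __attribute__((vector_size(16)));

/* Plain scalar dot product. */
float dot_scalar(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Vector-extension dot product, four floats per iteration. */
float dot_vector(const float *a, const float *b, size_t n)
{
    assert((a != NULL && b != NULL) || n == 0);
    v4sf acc = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        v4sf va, vb;
        /* memcpy makes no alignment assumption about a or b. */
        memcpy(&va, a + i, sizeof va);
        memcpy(&vb, b + i, sizeof vb);
        acc += va * vb; /* element-wise multiply and add */
    }
    /* Reduce the four lanes, then handle the tail in scalar code. */
    float sum = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

The two functions add terms in a different order, so on general floating-point inputs they can differ in the last bits; a fair benchmark would also need to fix the compiler flags for both.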

Re: SIMD implementation of dot-product. Benchmarks

2013-08-17 Thread Ilya Yaroshenko
> BTW: -march=native automatically implies -mtune=native

Thanks, I'll remove mtune)

Re: SIMD implementation of dot-product. Benchmarks

2013-08-17 Thread John Colvin
On Saturday, 17 August 2013 at 18:50:15 UTC, Ilya Yaroshenko wrote: http://spiceandmath.blogspot.ru/2013/08/simd-implementation-of-dot-product_17.html Ilya Nice, that's a good speedup. BTW: -march=native automatically implies -mtune=native