Re: dmd codegen improvements

via Digitalmars-d Wed, 19 Aug 2015 03:20:57 -0700

On Wednesday, 19 August 2015 at 10:08:48 UTC, ponce wrote:

Even in video codecs, AVX2 is not that useful and barely brings a 10% improvement over SSE, and you have to be extra careful with the SSE-AVX transition penalty. And to reap this benefit you would have to write in intrinsics/assembly.

Masked AVX instructions are turned into NOPs, so you can remove conditionals from inner loops. Performance of new instructions tends to improve generation by generation.
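To make the "remove conditionals" point concrete, here is a rough scalar sketch in plain C (not D, and not actual intrinsics). The function names and the threshold/scale parameters are made up for illustration. The second loop computes every element unconditionally and then selects per element, which is the scalar analogue of a masked store or blend. Whether a given compiler actually turns it into masked vector instructions depends on the target and the flags, so take it as an illustration of the shape of the code rather than a guarantee.

```c
#include <stddef.h>

/* Branchy version: the store happens only when the condition holds,
   so the inner loop carries a conditional jump. */
void scale_above_branchy(const float *x, float *out, size_t n,
                         float threshold, float scale)
{
    for (size_t i = 0; i < n; ++i) {
        if (x[i] > threshold)
            out[i] = x[i] * scale;
    }
}

/* Masked version: compute the result for every element, then select.
   Elements that fail the test keep their old value, which is what a
   masked-off lane does in a masked vector store. */
void scale_above_masked(const float *x, float *out, size_t n,
                        float threshold, float scale)
{
    for (size_t i = 0; i < n; ++i) {
        float scaled = x[i] * scale;
        out[i] = (x[i] > threshold) ? scaled : out[i];
    }
}
```

Both functions produce the same output. The second one just has no data-dependent control flow inside the loop body.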

For AVX-512 I can't even imagine what to use such large registers for. Larger registers => more spilling because of calling conventions, and more fiddling around with complicated shuffle instructions. There are steeply diminishing returns with increasing register size.

You have to plan your data layout, which is why libraries should target it, so end users don't have to think too much about it. If your computations are trivial, then you are essentially memory I/O limited. SoA processing isn't really limited by shuffling: think of stuff like mapping a pure function over a collection of arrays.
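As a minimal sketch of what "mapping a pure function over a collection of arrays" could look like with an SoA layout, again in plain C: the Vec3SoA type and both function names are hypothetical, not from any library. Each field lives in its own contiguous array, so element i of every array lines up and the loop needs no shuffles to gather its operands.

```c
#include <stddef.h>

/* Hypothetical structure-of-arrays layout: one contiguous array per field. */
typedef struct {
    float *x;
    float *y;
    float *z;
    size_t n;   /* number of elements in each array */
} Vec3SoA;

/* Pure function: the result depends only on the arguments. */
static inline float length_sq(float x, float y, float z)
{
    return x * x + y * y + z * z;
}

/* Map the pure function over the three arrays in lockstep.
   All loads and the store are unit-stride. */
void map_length_sq(const Vec3SoA *v, float *out)
{
    for (size_t i = 0; i < v->n; ++i)
        out[i] = length_sq(v->x[i], v->y[i], v->z[i]);
}
```

With an array-of-structs layout the same loop would read x, y, z interleaved in memory, and that is where the shuffling would come from.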
