Re: dmd codegen improvements

ponce via Digitalmars-d Wed, 19 Aug 2015 03:32:12 -0700

On Wednesday, 19 August 2015 at 10:16:18 UTC, Ola Fosheim Grøstad wrote:

On Wednesday, 19 August 2015 at 10:08:48 UTC, ponce wrote:
Even in video codecs, AVX2 is not that useful and barely brings a 10% improvement over SSE, and only if you are extra careful with the SSE-AVX transition penalty. And to reap this benefit you would have to write in intrinsics/assembly.
Masked AVX instructions are turned into NOPs. So you can remove conditionals from inner loops. Performance of new instructions tends to improve generation by generation.

Loops in video coding already have no conditionals. And for the ones that do, the conditionals were already removable with existing instructions.
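To make the point about removable conditionals concrete, here is a minimal scalar sketch in C of the compare-and-blend pattern that SSE-style code uses in place of a branch. It is only an illustration and not taken from any codec; the names `select_i32` and `clamp_samples` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Blend two values through a mask that is either all zeros or all ones:
   (a AND mask) OR (b AND NOT mask). This is the scalar analogue of the
   and/andnot/or sequence used after a SIMD compare. */
static inline int32_t select_i32(int32_t mask, int32_t a, int32_t b)
{
    return (a & mask) | (b & ~mask);
}

/* Clamp every sample to [0, limit] with no if-statement in the loop body.
   Each comparison yields 0 or 1; negating it gives 0 or -1 (all ones). */
void clamp_samples(int32_t *px, size_t n, int32_t limit)
{
    for (size_t i = 0; i < n; ++i) {
        int32_t v = px[i];
        int32_t too_low  = -(int32_t)(v < 0);
        int32_t too_high = -(int32_t)(v > limit);
        v = select_i32(too_low, 0, v);       /* v < 0     -> 0     */
        v = select_i32(too_high, limit, v);  /* v > limit -> limit */
        px[i] = v;
    }
}
```

With real SSE code the same shape is expressed with a packed compare followed by a blend, or more simply with packed min/max, so no masked AVX instructions are needed for this kind of loop.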

For AVX-512 I can't even imagine what to use such large registers for. Larger registers => more spilling because of calling conventions, and more fiddling around with complicated shuffle instructions. There are steep diminishing returns with increasing register size.
You have to plan your data layout. Which is why libraries should target it, so end users don't have to think too much about it. If your computations are trivial, then you are essentially memory I/O limited. SOA processing isn't really limited by shuffling. Stuff like mapping a pure function over a collection of arrays.
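For readers unfamiliar with the SOA (structure of arrays) layout mentioned in the quote, a minimal sketch in C of "mapping a pure function over a collection of arrays" could look like the following. The struct `points_soa` and the functions are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

/* Structure-of-arrays layout: one contiguous array per field, instead of
   one array of {x, y} structs. */
struct points_soa {
    const float *x;   /* input: x coordinates          */
    const float *y;   /* input: y coordinates          */
    float *len2;      /* output: squared lengths       */
    size_t count;     /* number of elements per array  */
};

/* A pure function: the result depends only on its arguments. */
static inline float squared_length(float x, float y)
{
    return x * x + y * y;
}

/* Map the pure function over the arrays. Element i of every field sits at
   the same index, so a vectorizer can load consecutive lanes directly
   without any shuffles. */
void map_squared_length(const struct points_soa *p)
{
    for (size_t i = 0; i < p->count; ++i)
        p->len2[i] = squared_length(p->x[i], p->y[i]);
}
```

A loop this simple does very little arithmetic per byte loaded, which is the case the quote describes as essentially memory I/O limited.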

I stand by what I know and measured: precious few things are sped up by AVX-xxx. It is almost always better to invest this time optimizing somewhere else.
