Re: Taking pipeline processing to the next level

finalpatch via Digitalmars-d Tue, 06 Sep 2016 07:41:27 -0700

On Tuesday, 6 September 2016 at 14:26:22 UTC, finalpatch wrote:

On Tuesday, 6 September 2016 at 14:21:01 UTC, finalpatch wrote:
Then some template magic will figure out that the LCM of the 2 kernels' pixel widths is 3*4=12, and therefore they are fused together into a composite kernel of pixel width 12. The above line compiles down into a single function invocation, with a main loop that reads the source buffer in 4-pixel steps, calls MySimpleKernel 3 times, then calls AnotherKernel 4 times.
Correction:
with a main loop that reads the source buffer in *12*-pixel steps, calls MySimpleKernel 3 times, then calls AnotherKernel 4 times.

And of course the key to the speed is that all function calls get inlined by the compiler.
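
To make the idea concrete, here is a rough sketch in D of how such a fusion could look. It is not the actual library code: the kernel bodies, the `width` member, and the helper names `lcm`, `Fused` and `process` are all made up for illustration, and I am assuming MySimpleKernel works on 4 pixels per call and AnotherKernel on 3 (which matches the 3 calls and 4 calls above). Whether the calls really get inlined is up to the optimizer.

```d
// Hypothetical sketch: fuse two fixed-width kernels into one of LCM width.

// Least common multiple, usable at compile time via CTFE.
size_t lcm(size_t a, size_t b) pure nothrow @nogc @safe
{
    size_t x = a, y = b;
    while (y != 0)          // Euclid's algorithm: x ends up as gcd(a, b)
    {
        immutable t = x % y;
        x = y;
        y = t;
    }
    return a / x * b;
}

// Assumed kernel: processes 4 pixels per call.
struct MySimpleKernel
{
    enum size_t width = 4;
    static void run(const(float)[] src, float[] dst)
    {
        foreach (i; 0 .. width)
            dst[i] = src[i] * 2.0f;
    }
}

// Assumed kernel: processes 3 pixels per call.
struct AnotherKernel
{
    enum size_t width = 3;
    static void run(const(float)[] src, float[] dst)
    {
        foreach (i; 0 .. width)
            dst[i] = src[i] + 1.0f;
    }
}

// Composite kernel: its width is the LCM of both widths (12 here).
struct Fused(K1, K2)
{
    enum size_t width = lcm(K1.width, K2.width);

    static void run(const(float)[] src, float[] dst)
    {
        float[width] tmp;   // intermediate pixels between the two stages
        // Trip counts are compile-time constants (3 and 4 here).
        foreach (i; 0 .. width / K1.width)
            K1.run(src[i * K1.width .. (i + 1) * K1.width],
                   tmp[i * K1.width .. (i + 1) * K1.width]);
        foreach (i; 0 .. width / K2.width)
            K2.run(tmp[i * K2.width .. (i + 1) * K2.width],
                   dst[i * K2.width .. (i + 1) * K2.width]);
    }
}

// Main loop: walks the buffers in steps of K.width pixels.
void process(K)(const(float)[] src, float[] dst)
{
    assert(src.length == dst.length && src.length % K.width == 0,
           "buffer length must be a multiple of the kernel width");
    for (size_t i = 0; i < src.length; i += K.width)
        K.run(src[i .. i + K.width], dst[i .. i + K.width]);
}

void main()
{
    alias Pipeline = Fused!(MySimpleKernel, AnotherKernel);
    static assert(Pipeline.width == 12);

    auto src = new float[24];
    foreach (i, ref v; src)
        v = cast(float) i;
    auto dst = new float[24];

    process!Pipeline(src, dst);
    assert(dst[5] == 11.0f);    // 5 * 2 + 1
}
```

Because the widths and trip counts are all known at compile time, an optimizing build has what it needs to flatten the inner calls into the 12-pixel loop body, but that depends on the compiler and flags.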
