On Wednesday, 1 June 2016 at 23:23:49 UTC, ZILtoid1991 wrote:
Here's the assembly code for my alpha-blending routine:
ubyte[4] src = *cast(ubyte[4]*)(palette.ptr + 4 * *c);
ubyte[4]* p = cast(ubyte[4]*)(workpad + (offsetX + x)*4 + offsetY);
asm{//moving the values to their destinations
On Thursday, 2 June 2016 at 07:17:23 UTC, Johan Engelen wrote:
On Wednesday, 1 June 2016 at 23:23:49 UTC, ZILtoid1991 wrote:
Here's the assembly code for my alpha-blending routine:
Could you also paste the D version of your code? Perhaps the
compiler (LDC, GDC) will generate similarly
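For reference, a plain D per-pixel blend could look roughly like the sketch below. This is not the original poster's code: the function name blendPixel, the byte layout (alpha in byte 3) and the blend formula (src*a + dst*(255-a))/255 are all assumptions made for illustration.

```d
/// Blends one 4-byte source pixel over a destination pixel.
/// Assumptions (not from the original post): byte 3 holds alpha,
/// and the blend is (src*a + dst*(255-a)) / 255 with rounding.
ubyte[4] blendPixel(ubyte[4] src, ubyte[4] dst) pure nothrow @nogc @safe
{
    immutable uint a = src[3];
    ubyte[4] result;
    foreach (i; 0 .. 3)
    {
        // Operands are widened to 32 bits, so the products cannot overflow.
        immutable uint mixed = src[i] * a + dst[i] * (255 - a);
        result[i] = cast(ubyte)((mixed + 127) / 255);
    }
    // Keeping the destination alpha is an arbitrary choice for this sketch.
    result[3] = dst[3];
    return result;
}
```

Calling blendPixel in a loop over a scanline gives LDC or GDC a simple scalar version to optimise, which can then be compared against the hand-written MMX routine.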
On Thursday, 2 June 2016 at 00:51:15 UTC, ZILtoid1991 wrote:
On Wednesday, 1 June 2016 at 23:35:40 UTC, Era Scarecrow wrote:
On Wednesday, 1 June 2016 at 23:23:49 UTC, ZILtoid1991 wrote:
I could get the code working with a bug after replacing pmulhuw
with pmullw, but due to integer overflow
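To make the difference between the two instructions concrete, here is a small scalar sketch of what each one does to a single 16-bit lane. The helper names mulLow16 and mulHigh16 are made up for illustration; the real instructions apply the same operation to every 16-bit lane of the register at once.

```d
/// One 16-bit lane of PMULLW: keeps the LOW 16 bits of the product.
ushort mulLow16(ushort a, ushort b) pure nothrow @nogc @safe
{
    return cast(ushort)((cast(uint) a * b) & 0xFFFF);
}

/// One 16-bit lane of PMULHUW: keeps the HIGH 16 bits of the
/// unsigned product.
ushort mulHigh16(ushort a, ushort b) pure nothrow @nogc @safe
{
    return cast(ushort)((cast(uint) a * b) >> 16);
}
```

With 8-bit colour and alpha values the product is at most 255 * 255 = 65025, which still fits in 16 bits, so mulHigh16 returns 0 for every such pair while mulLow16 returns the full product. Once the operands are larger, mulLow16 silently drops the upper bits: mulLow16(300, 300) is 24464 rather than 90000.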
On Wednesday, 1 June 2016 at 23:35:40 UTC, Era Scarecrow wrote:
On Wednesday, 1 June 2016 at 23:23:49 UTC, ZILtoid1991 wrote:
After some debugging, I found out that the p pointer becomes
null at the end instead of pointing to a value. I have no
experience with using in-line assemblers
Here's the assembly code for my alpha-blending routine:
ubyte[4] src = *cast(ubyte[4]*)(palette.ptr + 4 * *c);
ubyte[4]* p = cast(ubyte[4]*)(workpad + (offsetX + x)*4 + offsetY);
asm{//moving the values to their destinations
movd MM0, p;
movd MM1, src;
movq MM5, alpha;
movq MM7,
I have technically finished version 0.2 of my graphics
engine, which runs at a reasonable speed at low internal resolutions
and with only a couple of sprites, but it still gets bottlenecked
a lot. First I'll throw out the top-down determination
algorithm, as it requires constant memory paging