Using inline assembler

2014-10-09 Thread Etienne via Digitalmars-d-learn
I'm a bit new to the inline assembler. I'm trying to use the `movdqu` 
instruction to move a 128-bit double quadword from one pointer location 
to another, like this:


align(16) union __m128i { ubyte[16] data };

void store(__m128i* src, __m128i* dst) {
asm { movdqu [dst], src; }
}


The compiler complains about a bad type/size of operands 'movdqu', but 
these two data segments are 16-byte aligned, so they should be in an XMM# 
register? Is there something I'm missing here?


Re: Using inline assembler

2014-10-09 Thread anonymous via Digitalmars-d-learn

On Thursday, 9 October 2014 at 12:37:20 UTC, Etienne wrote:
I'm a bit new to the inline assembler. I'm trying to use the 
`movdqu` instruction to move a 128-bit double quadword from one 
pointer location to another, like this:


align(16) union __m128i { ubyte[16] data };

void store(__m128i* src, __m128i* dst) {
asm { movdqu [dst], src; }
}


The compiler complains about a bad type/size of operands 
'movdqu', but these two data segments are 16-byte aligned, so 
they should be in an XMM# register? Is there something I'm 
missing here?


I know virtually nothing about SSE, but you can't move directly
from memory to memory, can you? You need to go through a register,
no?

This compiles:

align(16) union __m128i { ubyte[16] data; } /* note the position
of the semicolon */

void store(__m128i* src, __m128i* dst) {
 asm
 {
 movdqu XMM0, [src]; /* note: [src] */
 movdqu [dst], XMM0;
 }
}


Re: Using inline assembler

2014-10-09 Thread Etienne via Digitalmars-d-learn

On 2014-10-09 8:54 AM, anonymous wrote:

This compiles:

align(16) union __m128i { ubyte[16] data; } /* note the position
of the semicolon */

void store(__m128i* src, __m128i* dst) {
  asm
  {
  movdqu XMM0, [src]; /* note: [src] */
  movdqu [dst], XMM0;
  }
}


Yes, this does compile, but the value from src never ends up stored in dst.

void main() {
    __m128i src;
    src.data[0] = 255;
    __m128i dst;
    writeln(src.data); // shows 255 at offset 0
    store(&src, &dst); // store takes pointers
    writeln(dst.data); // remains set as the initial array
}

http://x86.renejeschke.de/html/file_module_x86_id_184.html

Is this how it's meant to be used?


Re: Using inline assembler

2014-10-09 Thread Etienne via Digitalmars-d-learn
Maybe someone can help with the more specific problem. I'm translating a 
crypto engine here:


https://github.com/etcimon/botan/blob/master/source/botan/block/aes_ni/aes_ni.d

But I need this to work on DMD, LDC and GDC. I decided to write the 
assembler code directly for the functions in this module:


https://github.com/etcimon/botan/blob/master/source/botan/utils/simd/xmmintrin.d

If there's anything someone can tell me about this, I'd be thankful. I'm 
very experienced in every aspect of programming, but still at my first 
baby steps in assembler.


Re: Using inline assembler

2014-10-09 Thread anonymous via Digitalmars-d-learn

On Thursday, 9 October 2014 at 13:29:27 UTC, Etienne wrote:

On 2014-10-09 8:54 AM, anonymous wrote:

This compiles:

align(16) union __m128i { ubyte[16] data; } /* note the position
of the semicolon */

void store(__m128i* src, __m128i* dst) {
 asm
 {
 movdqu XMM0, [src]; /* note: [src] */
 movdqu [dst], XMM0;
 }
}


Yes, this does compile, but the value from src never ends up 
stored in dst.


void main() {
    __m128i src;
    src.data[0] = 255;
    __m128i dst;
    writeln(src.data); // shows 255 at offset 0
    store(&src, &dst); // store takes pointers
    writeln(dst.data); // remains set as the initial array
}

http://x86.renejeschke.de/html/file_module_x86_id_184.html

Is this how it's meant to be used?


I'm out of my knowledge zone here, but it seems to work when you
move the pointers to registers first:

void store(__m128i* src, __m128i* dst) {
 asm
 {
 mov RAX, src;
 mov RBX, dst;
 movdqu XMM0, [RAX];
 movdqu [RBX], XMM0;
 }
}
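
My guess at why the first version compiled but never copied anything (treat
this as an assumption, I have not inspected the generated code): in DMD's
inline assembler a parameter name such as src stands for the stack slot that
holds the pointer, so [src] reads 16 bytes starting at the pointer variable
itself rather than at the data it points to. Loading the pointer into a
general-purpose register first gives the extra dereference. Below is a minimal
self-contained sketch of that idea. The use of RCX in place of RBX, the
version block with a plain D fallback, and the small main are my own additions
for illustration.

```d
import std.stdio : writeln;

align(16) union __m128i { ubyte[16] data; }

// Copies the 16 bytes at *src to *dst, going through XMM0.
void store(__m128i* src, __m128i* dst)
{
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, src;        // RAX = the address stored in src
            mov RCX, dst;        // RCX = the address stored in dst
            movdqu XMM0, [RAX];  // load the 16 bytes src points to
            movdqu [RCX], XMM0;  // store them where dst points
        }
    }
    else
    {
        // Assumed fallback for targets without DMD-style x86-64 inline asm.
        *dst = *src;
    }
}

void main()
{
    __m128i src, dst;
    src.data[0] = 255;
    store(&src, &dst);            // pass addresses, store takes pointers
    writeln(dst.data);
    assert(dst.data == src.data); // the copy should now be visible in dst
}
```

Note that the call site passes &src and &dst, since store takes pointers.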


Re: Using inline assembler

2014-10-09 Thread Etienne via Digitalmars-d-learn

On 2014-10-09 9:46 AM, anonymous wrote:

I'm out of my knowledge zone here, but it seems to work when you
move the pointers to registers first:

void store(__m128i* src, __m128i* dst) {
  asm
  {
  mov RAX, src;
  mov RBX, dst;
  movdqu XMM0, [RAX];
  movdqu [RBX], XMM0;
  }
}


Absolutely incredible! My first useful working assembler code. You saved 
the day. Now I can probably write a whole SIMD library ;)