Re: LDC2 win64 calling convention

2018-12-01 Thread Johan Engelen via Digitalmars-d-learn
On Thursday, 29 November 2018 at 15:10:41 UTC, realhet wrote:
> In conclusion: Maybe LDC2 generates a lot of extra code, but I always make longer asm routines, so it's not a problem for me at all while it helps me a lot.

An extra note: I recommend you look into using `ldc.llvmasm.__asm` to
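For readers who have not seen `ldc.llvmasm.__asm`, here is a minimal sketch of what such an inline asm expression can look like. The function name `addInts`, the version identifier `UseLlvmAsm`, and the plain-D fallback are made up for illustration. The asm string is AT&T syntax with LLVM-style `$N` operand placeholders, and the constraint string is an assumption about a typical x86-64 setup, not something taken from this thread.

```d
import std.stdio : writeln;

// Hypothetical switch: use the LLVM asm expression only with LDC on x86-64.
version (LDC) version (X86_64) version = UseLlvmAsm;

version (UseLlvmAsm)
{
    import ldc.llvmasm;

    int addInts(int a, int b)
    {
        // Constraints: "=r" puts the result in a register ($0),
        // "0" ties the first input (a) to that same register,
        // "r" puts the second input (b) in another register ($2).
        return __asm!int("addl $2, $0", "=r,0,r", a, b);
    }
}
else
{
    // Fallback so the sketch still builds with other compilers or targets.
    int addInts(int a, int b)
    {
        return a + b;
    }
}

void main()
{
    writeln(addInts(2, 3));
}
```

Because the operands are declared through constraints, the compiler chooses the registers itself, so no parameter has to be spilled to the stack just to make it addressable from the asm.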

Re: LDC2 win64 calling convention

2018-11-29 Thread realhet via Digitalmars-d-learn
On Wednesday, 28 November 2018 at 21:58:16 UTC, kinke wrote:
> You're not using naked asm; this entails a prologue (spilling the params to stack etc.). Additionally, LDC doesn't really like accessing params and locals in DMD-style inline asm, see

Re: LDC2 win64 calling convention

2018-11-28 Thread kinke via Digitalmars-d-learn
You're not using naked asm; this entails a prologue (spilling the params to stack etc.). Additionally, LDC doesn't really like accessing params and locals in DMD-style inline asm, see https://github.com/ldc-developers/ldc/issues/2854. You can check the final asm trivially online, e.g.,
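To illustrate the difference that naked asm makes, here is a minimal sketch of a naked function. It assumes that an `extern (C)` function on Win64 follows the Microsoft x64 convention, with the first two integer arguments in RCX and RDX and the result in RAX. The name `addTwo`, the version identifier `Win64NakedAsm`, and the non-Windows fallback are hypothetical.

```d
import std.stdio : writeln;

// Hypothetical switch: only use the naked version on 64-bit x86 Windows.
version (Win64) version (X86_64) version = Win64NakedAsm;

version (Win64NakedAsm)
{
    extern (C) size_t addTwo(size_t a, size_t b)
    {
        asm
        {
            naked;          // no compiler-generated prologue or epilogue
            mov RAX, RCX;   // a arrives in RCX (assumed Microsoft x64 convention)
            add RAX, RDX;   // b arrives in RDX
            ret;            // a naked function must return by itself
        }
    }
}
else
{
    // Plain D fallback so the sketch builds on other platforms.
    extern (C) size_t addTwo(size_t a, size_t b)
    {
        return a + b;
    }
}

void main()
{
    writeln(addTwo(40, 2));
}
```

The naked body reads the parameters straight from their registers instead of by name, which avoids the prologue that spills them to the stack.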

Re: LDC2 win64 calling convention

2018-11-28 Thread realhet via Digitalmars-d-learn
Thank you for the explanation! But my tests have different results:

void* SSE_sobelRow(ubyte* src, ubyte* dst, size_t srcStride){
  asm{
    push RDI;
    mov RAX, 0; mov RDX, 0; mov RCX, 0; //clear 'parameter' registers
    mov RAX, src;
    mov RDI, dst;
    //gen
    movups XMM0,[RAX];
    movaps

Re: LDC2 win64 calling convention

2018-11-28 Thread kinke via Digitalmars-d-learn
On Wednesday, 28 November 2018 at 20:17:53 UTC, kinke wrote:
> The stack isn't used at all

To prevent confusion: it's used of course, e.g., if there are more than 4 total parameters. Just not in the classical sense, i.e., a 16-byte struct isn't pushed directly onto the stack, but the caller

Re: LDC2 win64 calling convention

2018-11-28 Thread kinke via Digitalmars-d-learn
On Wednesday, 28 November 2018 at 18:56:14 UTC, realhet wrote:
> 1. Are there register parameters? (I think not)

Of course, e.g., POD structs of power-of-2 sizes <= 8 bytes and integral scalars as well as float/double/vectors. The stack isn't used at all; aggregates > 8 bytes are passed by ref
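The size rule described in this reply can be sketched with two hypothetical structs. The type and function names below are made up for illustration. The comments restate the behaviour described for Win64 and are not checked by the code itself, which only verifies the struct sizes.

```d
import std.stdio : writeln;

// 8 bytes, a power of 2: per the rule above, expected to travel in a register.
struct Small
{
    int x;
    int y;
}

// 16 bytes: per the rule above, expected to be passed by reference,
// i.e. the callee receives an address instead of the bytes themselves.
struct Big
{
    long x;
    long y;
}

static assert(Small.sizeof == 8);
static assert(Big.sizeof == 16);

long sumSmall(Small s)
{
    return s.x + s.y;
}

long sumBig(Big b)
{
    return b.x + b.y;
}

// Integral and floating-point scalars are register parameters as well.
double scale(double v, int n)
{
    return v * n;
}

void main()
{
    writeln(sumSmall(Small(1, 2)));
    writeln(sumBig(Big(10, 20)));
    writeln(scale(1.5, 4));
}
```

Compiling such a file and reading the generated asm for `sumSmall` and `sumBig` is a direct way to check how each parameter is actually passed by a given compiler version.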

LDC2 win64 calling convention

2018-11-28 Thread realhet via Digitalmars-d-learn
Hi, is there documentation about the Win64 calling convention used by the LDC2 compiler? So far I have tried to use the Microsoft x64 calling convention, but I'm not sure that's the one I have to use. It doesn't seem quite accurate, because I think it uses the stack only.