On Jan 12, 9:38 pm, Bill Woessner <woess...@nospicedham.gmail.com> wrote:
> I'm a 100% total newbie at writing assembly.  But I figured it would
> be a good exercise.  And besides, this tiny chunk of code is
> definitely in the critical path of something I'm working on.  Any and
> all advice would be appreciated.
>
> I'm trying to rewrite the following function in x86 assembly:
>
> inline double DiffAngle(double theta1, double theta2)
> {
>     double delta(theta1 - theta2);
>
>     return std::abs(delta) <= M_PI ? delta : delta - copysign(2 * M_PI,
> delta);
>
> }
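For comparing against any assembly version, it may help to have the quoted function as a stand-alone, non-inline routine. What follows is only a sketch: the name DiffAngleRef is made up for illustration, and the fallback definition of M_PI is an assumption for toolchains whose <cmath> does not supply it (glibc provides M_PI as an extension rather than as part of standard C++). Note that the function removes at most one full turn, so it presumes both angles are already within one revolution of each other.

```cpp
#include <cmath>

#ifndef M_PI
#define M_PI 3.14159265358979323846  // assumed fallback; glibc's <cmath> normally defines it
#endif

// Reference version of the quoted function. DiffAngleRef is a hypothetical
// name, chosen so it cannot clash with the assembly routine DiffAngle.
double DiffAngleRef(double theta1, double theta2)
{
    const double delta = theta1 - theta2;

    // |delta| <= pi: return it unchanged. Otherwise pull it back by one
    // full turn in the direction of delta's own sign.
    return std::fabs(delta) <= M_PI
               ? delta
               : delta - std::copysign(2 * M_PI, delta);
}
```

For example, DiffAngleRef(1.0, 0.5) returns 0.5 unchanged, while DiffAngleRef(3.0, -3.0) has delta = 6, which exceeds pi, so the result is 6 - 2*pi (a small negative angle).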
The gas syntax is ghastly, so I converted your code to NASM, creating
the file DiffAngle.nasm shown below.  I think it's easier to read and,
despite not being inlined, it seemed to be faster.

Time for 100 million calls:

  C function: 2.8 seconds
  asm func:   1.2 seconds

I've not written floating-point assembly code or linked C++ with
assembly before, so there's a chance something is not right.  I'll list
the steps I took so they can be recreated/challenged/corrected.
Hopefully this will help with the things you asked.

First the assembly code.

;
;DiffAngle
;
;Build with something like
;  nasm -f elf32 DiffAngle.nasm -l DiffAngle.list
;

bits 32
cpu ppro

%define PI 3.1415926535897932384626433832795
%define TWO_PI 6.283185307179586476925286766559
%define NEG_TWO_PI -6.283185307179586476925286766559

global DiffAngle

section .text                ; executable code section for ELF

DiffAngle:
  fld qword [esp + 4]        ; st0 = theta1
  fsub qword [esp + 12]      ; st0 = delta = theta1 - theta2
  fxam                       ; C1 = sign of delta
  fnstsw ax                  ; status word to ax (C1 is bit 1 of ah)
  fld qword [two_pi]
  test ah, 2                 ; ZF clear if delta is negative
  fld qword [neg_two_pi]
  fcmovne st0, st1           ; delta negative: take the value at two_pi
  fstp st1                   ; stack: chosen constant, delta
  fsubr st0, st1             ; st0 = delta - chosen constant
  fldpi
  fld st2                    ; copy of delta
  fabs
  fcomip st0, st1            ; compare |delta| with pi, pop
  fstp st0                   ; drop pi
  fcmovbe st0, st1           ; |delta| <= pi: result is delta itself
  fstp st1                   ; leave only the result in st0
  ret 0

section .data

two_pi: dq NEG_TWO_PI        ;NB wrong value (swapped, see below)
neg_two_pi: dq TWO_PI        ;NB wrong value (swapped, see below)

> double DiffAngle(double theta1, double theta2)
> {
>     asm(
>         "fldl 4(%esp);"
>         "fsubl 12(%esp);"
>         "fxam;"
>         "fnstsw %ax;"
>         "fldl TWO_PI;"
>         "testb $2, %ah;"
>         "fldl NEG_TWO_PI;"
>         "fcmovne %st(1), %st;"
>         "fstp %st(1);"
>         "fsubr %st(1), %st;"
>         "fldpi;"
>         "fld %st(2);"
>         "fabs;"
>         "fcomip %st(1), %st;"
>         "fstp %st(0);"
>         "fcmovbe %st(1), %st;"
>         "fstp %st(1);"
>         "rep;"
>         "ret;"
>         "NEG_TWO_PI:;"
>         ".long 1413754136;"
>         ".long 1075388923;"
>         "TWO_PI:;"
>         ".long 1413754136;"
>         ".long -1072094725;"
>     );
>
> }
>
> This compiles, runs and produces the correct answers.  But I have a
> few issues with it:

I'm not sure your code is right.  The constants seem to be the other
way round from what their names suggest.  I had to swap them over in
my version to get the same results as your C program, but it is late....
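One way to check which value sits under which label is to decode the two .long pairs from the quoted code back into doubles. The sketch below does that in C++. The helper name from_longs is made up for illustration, and it assumes a little-endian target with IEEE 754 doubles (which holds for x86), so that the first .long of each pair is the low 32 bits and the second is the high 32 bits.

```cpp
#include <cstdint>
#include <cstring>

// Rebuild a double from a `.long lo; .long hi` pair as laid out in memory
// on a little-endian machine. from_longs is a hypothetical helper name.
double from_longs(std::int32_t lo, std::int32_t hi)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");

    // Reinterpret each signed word as unsigned before combining, so a
    // negative high word (sign bit set) is not sign-extended.
    const std::uint64_t bits =
        (static_cast<std::uint64_t>(static_cast<std::uint32_t>(hi)) << 32) |
        static_cast<std::uint32_t>(lo);

    double value;
    std::memcpy(&value, &bits, sizeof value);  // copy the bit pattern as is
    return value;
}
```

With this, from_longs(1413754136, 1075388923), the pair under the label NEG_TWO_PI, comes out as +2*pi (6.283185307179586), and from_longs(1413754136, -1072094725), the pair under TWO_PI, comes out as -2*pi. So the labels in the quoted code really are the opposite of the values they hold, and the selection logic only gives the right answer because of that.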
If someone points out some faults I'll respond another day.

> 1) If I declare this function inline, it gives me garbage (like
> 10^-304)

To try it out I made a test routine called dtest1.c.  Build steps on
Linux for the whole thing were

  nasm -f elf32 DiffAngle.nasm -l DiffAngle.list
  g++ dtest1.c -c
  g++ dtest1.o DiffAngle.o -o dtest1

> 2) If I compile with -Wall, I get a warning that the function doesn't
> return a value, which is absolutely true, but I don't know how to fix
> it.

In dtest1.c I had to include the following prototype

  extern "C" {
    double DiffAngle(double, double);
  }

so that g++ didn't expect a mangled routine name.

> 3) I don't like how TWO_PI and NEG_TWO_PI are defined.  I had to steal
> it from some generated assembly.

These can be defined much more easily in the assembly code.  The "dq"
directive defines what the assembler calls a quadword, 8 bytes, here
placed in the .data section.  For example,

  two_pi: dq 6.283185307179586476925286766559

> It would be nice to use M_PI,
> 4*atan(1) or something like that.

I know 4*atan(1) is pi but I don't know what M_PI is supposed to be.

I've made no attempt to understand the maths of your solution; I just
copied your code.  So both the blame for faults and the credit for
increased performance go to you. :-)

James
_______________________________________________
help-gplusplus mailing list
help-gplusplus@gnu.org
https://lists.gnu.org/mailman/listinfo/help-gplusplus