On 12.01.2012 22:38, Bill Woessner wrote:
> I'm a 100% total newbie at writing assembly. But I figured it would
> be a good exercise. And besides, this tiny chunk of code is
> definitely in the critical path of something I'm working on. Any and
> all advice would be appreciated.
>
> I'm trying to rewrite the following function in x86 assembly:
>
> inline double DiffAngle(double theta1, double theta2)
> {
>     double delta(theta1 - theta2);
>
>     return std::abs(delta) <= M_PI ? delta : delta - copysign(2 * M_PI,
> delta);
> }
>
> To my great surprise, I've actually been somewhat successful. Here's
> what I have so far:
>
> double DiffAngle(double theta1, double theta2)
> {
>     asm(
>         "fldl 4(%esp);"
>         "fsubl 12(%esp);"
>         "fxam;"
>         "fnstsw %ax;"
>         "fldl TWO_PI;"
>         "testb $2, %ah;"
>         "fldl NEG_TWO_PI;"
>         "fcmovne %st(1), %st;"
>         "fstp %st(1);"
>         "fsubr %st(1), %st;"
>         "fldpi;"
>         "fld %st(2);"
>         "fabs;"
>         "fcomip %st(1), %st;"
>         "fstp %st(0);"
>         "fcmovbe %st(1), %st;"
>         "fstp %st(1);"
>         "rep;"
>         "ret;"
>         "NEG_TWO_PI:;"
>         ".long 1413754136;"
>         ".long 1075388923;"
>         "TWO_PI:;"
>         ".long 1413754136;"
>         ".long -1072094725;"
>     );
> }
>
> This compiles, runs and produces the correct answers. But I have a
> few issues with it:
>
> 1) If I declare this function inline, it gives me garbage (like
> 10^-304)

That is because your code requires a real call to the function: it reads
its arguments from fixed stack offsets (4(%esp) and 12(%esp)) and ends
with its own "ret". Once the function is inlined, there is no call frame
at those offsets, and the compiler is never told where the input and
output variables live.

I'm rewriting your C++ first, so I can put it into assembly more easily:

double DiffAngle(double theta1, double theta2)
{
    double diff = theta1 - theta2;
    if (fabs(diff) <= M_PI)
        return diff;
    else if (diff < 0)
        return diff + 2 * M_PI;
    else
        return diff - 2 * M_PI;
}

Or, in a more SSE-like manner:

double DiffAngle(double theta1, double theta2)
{
    double diff = theta1 - theta2;
    double subtract = copysign(2 * M_PI, diff);
    if (fabs(diff) <= M_PI)
        subtract = 0;
    return diff - subtract;
}

Because you might want to rewrite the stuff in SSE2 anyway, I'd change it
to something like this (one asm statement, so the compiler cannot touch
the registers in between, with the xmm registers listed as clobbers):

#include <stdint.h>
#include <cmath>

double DiffAngle(double theta1, double theta2)
{
    double res;
    const uint64_t no_sign_mask = 0x7fffffffffffffffULL;
    const uint64_t sign_mask = ~no_sign_mask;
    const double pi = M_PI;
    const double two_pi = 2 * M_PI;

    asm("movsd    %1, %%xmm0\n\t"      // xmm0 = theta1
        "subsd    %2, %%xmm0\n\t"      // xmm0 = diff
        "movapd   %%xmm0, %%xmm1\n\t"  // xmm1 = diff
        "movq     %3, %%xmm2\n\t"      // xmm2 = no_sign_mask
        "andpd    %%xmm2, %%xmm0\n\t"  // xmm0 = abs(diff)
        "cmpnlesd %5, %%xmm0\n\t"      // if abs(diff) <= pi: xmm0 = 0,
                                       // else xmm0 = 0xffff...
        "movsd    %6, %%xmm3\n\t"      // xmm3 = 2 * pi
        "movq     %4, %%xmm2\n\t"      // xmm2 = sign_mask
        "andpd    %%xmm1, %%xmm2\n\t"  // xmm2 = sign bit of diff
        "orpd     %%xmm2, %%xmm3\n\t"  // xmm3 = copysign(2 * pi, diff)
        "andpd    %%xmm0, %%xmm3\n\t"  // xmm3 = 0 if abs(diff) <= pi
        "subsd    %%xmm3, %%xmm1\n\t"  // xmm1 = diff - xmm3
        "movsd    %%xmm1, %0"          // res = xmm1
        : "=m" (res)
        : "m" (theta1), "m" (theta2), "m" (no_sign_mask), "m" (sign_mask),
          "m" (pi), "m" (two_pi)
        : "xmm0", "xmm1", "xmm2", "xmm3");
    return res;
}

Does that work for you? It's untested!

> 2) If I compile with -Wall, I get a warning that the function doesn't
> return a value, which is absolutely true, but I don't know how to fix
> it.

Declare a result variable, make it an output operand (outputs come first,
then inputs) and return it:

double ret;
asm("fldl %1; fldl %2; blablabla; fstpl %0"
    : "=m" (ret)
    : "m" (theta1), "m" (theta2));
return ret;

This should also clear up your previous question.

> 3) I don't like how TWO_PI and NEG_TWO_PI are defined. I had to steal
> it from some generated assembly. It would be nice to use M_PI,
> 4*atan(1) or something like that.
>

Just define them as new inputs and let the compiler worry. Like:

const double two_pi = 2 * M_PI;
const double neg_two_pi = -2 * M_PI;
double ret;
asm("fldl %1; fldl %2; blabla; fldl %3; blabli; fldl %4; bla; fstpl %0"
    : "=m" (ret)
    : "m" (theta1), "m" (theta2), "m" (two_pi), "m" (neg_two_pi));
return ret;

The "m" means "memory operand" (let the compiler worry about the
addresses!), the "=" means "write-only operand".

> Thanks in advance,
> Bill

HTH,
Markus

_______________________________________________
help-gplusplus mailing list
help-gplusplus@gnu.org
https://lists.gnu.org/mailman/listinfo/help-gplusplus
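
P.S. Since the assembly in this thread is untested, it may help to keep a plain C++ reference around to compare against. The following is only a sketch: the names kPi, kTwoPi, DiffAngleRef, DiffAngleSelect and CheckDiffAngleForms are made up for illustration and do not come from the original post, and the constant is spelled out because M_PI is not guaranteed by the C++ standard. It shows the branchy form next to the "compute the correction, then zero it" form that the SSE2 version follows.

```cpp
#include <cassert>
#include <cmath>

// Illustrative constants (assumption: spelled out instead of relying on M_PI).
constexpr double kPi = 3.14159265358979323846;
constexpr double kTwoPi = 2.0 * kPi;

// Branchy reference: same logic as the DiffAngle from the original post.
inline double DiffAngleRef(double theta1, double theta2) {
    const double delta = theta1 - theta2;
    return std::fabs(delta) <= kPi ? delta
                                   : delta - std::copysign(kTwoPi, delta);
}

// Select-style form: always compute the correction, then zero it when
// the difference is already in range. This is the shape that maps onto
// a compare mask plus AND in SSE2.
inline double DiffAngleSelect(double theta1, double theta2) {
    const double delta = theta1 - theta2;
    double subtract = std::copysign(kTwoPi, delta);
    if (std::fabs(delta) <= kPi)
        subtract = 0.0;
    return delta - subtract;
}

// Spot check: both forms must agree on in-range and wrapped inputs.
inline void CheckDiffAngleForms() {
    const double samples[] = {0.5, -0.5, 3.0, -3.0, 6.0, -6.0};
    for (double a : samples)
        for (double b : samples)
            assert(DiffAngleRef(a, b) == DiffAngleSelect(a, b));
}
```

Calling CheckDiffAngleForms() once at startup, and comparing an assembly version against DiffAngleRef on the same samples, should catch a wrong sign or a swapped operand quickly.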