woess...@gmail.com wrote:
> On Monday, March 19, 2012 4:41:21 PM UTC-4, Jan Seiffert wrote:
>> a) filter the const with an #ifndef __cplusplus
> 
> In terms of performance, I suppose making TWO_PI static is sufficient.  
> However, that's very unsatisfying - it really should be const.
> 

I know what you mean.

>> b) ask the compiler to put the const into a xmm register for you, beware of
>> clobbering it (you already did so with the "x" constrain for the other
>> constants, so why not for this?)
> 
> This really shouldn't be necessary.  The andpd instruction can operate on a 
> memory location.  So I'd rather not waste the register (and cycles) by 
> loading a value that doesn't need to be loaded.
> 

Hmmm, yes, most x86 instructions can take a memory operand, it's a CISC arch,
but that does not mean it comes for free.
The CPU still has to fetch the data from memory: the instruction gets broken
down into a load and the op. Often it is not a win (depending on the
microarchitecture) to bundle the load with the op, because the load creates a
severe pipeline stall "right there" and can not be scheduled independently.
Accessing anything but registers is slow, even a hit in some "fetch buffer"
costs several clocks. It is only because the x86-32 integer part is so register
starved that this bundling is desirable there, to prevent even more costly
spill code.
Even if this bundling were a win, in your case you use the operand two times,
so you "fetch" it two times (maybe one time cache warm, the other at least with
a fetch buffer stall, if the µarch can not play some clever tricks with
register renaming).
And registers, pffff, you are using 3, then it would be 4, far away from the 8
SSE registers you have on x86-32, and 16 on x86-64. (Yes, even on Windows 32 bit
can slowly be seen as deprecated. The code should not be horrible for 32 bit,
those machines are here to stay, but I would not lose an arm and a leg over
32 bit.)
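
To illustrate b), here is a minimal sketch, assuming GCC-style inline asm and
SSE2. The names clear_sign, sign_mask and abs_pair and the sign-bit mask are
made up for illustration, they are not taken from your code. The "x" constraint
asks the compiler to place the constant in an SSE register, so both uses read a
register instead of memory. Without SSE2 the sketch falls back to plain integer
code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: builds a double whose bit pattern has every bit set
// except the sign bit.
static inline double sign_mask()
{
    const std::uint64_t bits = 0x7fffffffffffffffULL;
    double mask;
    std::memcpy(&mask, &bits, sizeof mask);
    return mask;
}

// Hypothetical helper: returns v AND mask, bit for bit.
static inline double clear_sign(double v, double mask)
{
#if defined(__GNUC__) && defined(__SSE2__)
    // "+x": v lives in an SSE register and is read and written.
    // "x" : mask is an input-only SSE register operand, it is not clobbered.
    __asm__("andpd %1, %0" : "+x"(v) : "x"(mask));
    return v;
#else
    // Portable fallback: the same AND done on the integer bit patterns.
    std::uint64_t vb, mb;
    std::memcpy(&vb, &v, sizeof vb);
    std::memcpy(&mb, &mask, sizeof mb);
    vb &= mb;
    std::memcpy(&v, &vb, sizeof v);
    return v;
#endif
}

// The mask is used two times, but it is only materialized once.
static inline void abs_pair(double &a, double &b)
{
    const double mask = sign_mask();
    a = clear_sign(a, mask);
    b = clear_sign(b, mask);
}

// Usage sketch.
static inline void demo_abs_pair()
{
    double a = -3.5, b = 2.25;
    abs_pair(a, b);
    assert(a == 3.5 && b == 2.25);
}
```

Whether the register form is actually faster than the memory operand depends on
the µarch and the surrounding code, so measure it.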

>> c) does it help if you use the old extern-"C"-trick around the function?
> 
> No.  I suppose I could put the function in a separate file, compile it with 
> gcc and link to it.  But that omits the possibility of inline-ing the 
> function.
> 

Since you speak of inline-ing, there is a chance the compiler can hold your
constants in registers over the course of a loop, if you do not clobber them
(only inputs). At least it may schedule the loads far away from the first use.
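
A minimal sketch of that situation, again assuming GCC-style inline asm and
SSE2. The functions scale_by_two_pi and scale_all and the mulsd operation are
hypothetical, only the name TWO_PI comes from this thread. The constant stays
const and is passed as an input-only "x" operand, so no const_cast is needed.
Because the asm is not volatile and does not write to the constant, the
compiler is allowed (not forced) to load it once before the loop after
inlining.

```cpp
#include <cassert>
#include <cstddef>

const double TWO_PI = 6.283185307179586;

// Hypothetical helper: multiplies v by TWO_PI.
static inline double scale_by_two_pi(double v)
{
#if defined(__GNUC__) && defined(__SSE2__)
    // TWO_PI is an input-only register operand; only v is modified.
    __asm__("mulsd %1, %0" : "+x"(v) : "x"(TWO_PI));
    return v;
#else
    // Portable fallback when SSE2 inline asm is not available.
    return v * TWO_PI;
#endif
}

// Hypothetical caller: once scale_by_two_pi is inlined here, the load of
// TWO_PI may be hoisted out of the loop by the compiler.
static inline void scale_all(double *p, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        p[i] = scale_by_two_pi(p[i]);
}
```

Looking at the output of g++ -O2 -S shows whether the load really ended up
outside the loop for your compiler version.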

> I can also use a const_cast to do away with the const-ness, but it generates 
> the following warning: "use of memory input without lvalue in asm operand 3 
> is deprecated".  But at least it works.
> 

WARNING!
It may look like it works.
The last time I got this warning, with several inline asm statements, the
compiler removed some of the input operands in certain cases.
I would not let this warning stand, no matter how innocent it may sound.

> Thanks,
> Bill

Greetings
        Jan