Martin v. Löwis <[email protected]> added the comment:
Looking at this further, it seems that the rdtsc code has been miscompiled
on x64 for some time already. Consider this code:
typedef unsigned long long uint64;
uint64 f(uint64 b)
{
    uint64 a;
    __asm__ __volatile__("rdtsc" : "=A" (a));
    return a+b;
}
My Apple gcc 4.0.1 compiles that into
_f:
        pushq   %rbp
        movq    %rsp, %rbp
        rdtsc
        addq    %rdi, %rax
        leave
        ret
Here, %rdi is the incoming parameter; %rdx, which receives the upper 32 bits
of the counter from rdtsc, is not considered at all.
This seems to come from DImode (double integer) processing: gcc just
"knows" that a DImode variable lives in a single register on AMD64.
So even if your code is right in principle, I still think there is a gcc
bug here.
As for the specific code: I'm not sure whether it's guaranteed that you
can truncate output registers in an asm. If you can't, you should make
the output registers 64-bit integers on AMD64. If you can, I think you
can "simplify" the code by directly outputting to (int*)v and
((int*)v)[1]; this would be worthwhile only if the generated code
actually gets better by omitting the shift operation.
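
For illustration, here is a minimal sketch of the variant that names both
output registers explicitly and then combines them with a shift. The names
read_tsc and combine_halves are hypothetical, chosen only for this example,
and the non-x86 branch is a placeholder so that the snippet compiles
everywhere; it is not meant as the definitive fix.

```c
#include <stdint.h>

/* Combine the two 32-bit halves that rdtsc leaves in EDX:EAX. */
static uint64_t combine_halves(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: read the time stamp counter. */
static uint64_t read_tsc(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    uint32_t lo, hi;
    /* "=a" ties lo to EAX and "=d" ties hi to EDX, so the compiler
       cannot drop the upper half the way it does with "=A" above. */
    __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
    return combine_halves(lo, hi);
#else
    return 0;  /* placeholder: rdtsc is not available on this target */
#endif
}
```

This form costs one shift and one or after the rdtsc, which is the
operation that writing directly into the two halves of the result would
try to avoid.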
FWIW, I don't consider a bug that only occurs --with-tsc and only on
AMD64 critical.
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue6603>
_______________________________________