Martin v. Löwis <mar...@v.loewis.de> added the comment:

Looking at this further, it seems that the rdtsc code has been miscompiled on x64 for some time already. Consider this code:
    typedef unsigned long long uint64;

    uint64 f(uint64 b)
    {
        uint64 a;
        __asm__ __volatile__("rdtsc" : "=A" (a));
        return a+b;
    }

My Apple gcc 4.0.1 compiles that into

    _f:
        pushq   %rbp
        movq    %rsp, %rbp
        rdtsc
        addq    %rdi, %rax
        leave
        ret

Here, %rdi is the incoming parameter; %rdx is not considered at all. This seems to come from DImode (double integer) processing: gcc just "knows" that a DImode variable lives in a single register on AMD64. So even if your code is right in principle, I still think there is a gcc bug here.

As for the specific code: I'm not sure whether it's guaranteed that you can truncate output registers in an asm. If you can't, you should make the output registers 64-bit integers on AMD64. If you can, I think you can "simplify" the code by directly outputting to (int*)v and ((int*)v)[1]; this would be worthwhile only if the generated code actually gets better by omitting the shift operation.

FWIW, I don't consider a bug that only occurs --with-tsc and only on AMD64 critical.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6603>
_______________________________________
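
For illustration only, here is a minimal sketch of the "register-sized outputs" option mentioned above. It is not the patch for this issue. It assumes gcc-style inline asm on an x86 target, and the names read_tsc, combine_halves and f_fixed are hypothetical. Instead of relying on the "=A" constraint, each half of the counter gets its own output ("=a" for EAX/RAX, "=d" for EDX/RDX), and the two halves are combined with a shift. This works on AMD64 because rdtsc puts the low 32 bits in the a register and the high 32 bits in the d register, clearing the upper halves of RAX and RDX.

```c
#include <stdint.h>

/* Hypothetical helper: join the two 32-bit halves of the counter. */
static inline uint64_t combine_halves(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: read the time stamp counter using two explicit
   outputs rather than the "=A" constraint. */
static inline uint64_t read_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* unsigned long is register-sized on the usual i386 and AMD64 Unix
       ABIs, so no output register has to be truncated by the compiler. */
    unsigned long lo, hi;
    __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
    return combine_halves((uint32_t)lo, (uint32_t)hi);
#else
    return 0;  /* assumption: placeholder value on non-x86 targets */
#endif
}

/* Same shape as f() in the message, but using both halves of the counter. */
uint64_t f_fixed(uint64_t b)
{
    return read_tsc() + b;
}
```

With this form the compiler is told about %rdx explicitly, so the generated code should contain the shift and the or that were missing from the listing above. Whether that is better or worse than writing directly to the two ints would have to be checked in the generated assembly.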