Martin v. Löwis <mar...@v.loewis.de> added the comment:

Looking at this further, it seems that the rdtsc code has been 
miscompiled on x64 for some time already. Consider this code:

typedef unsigned long long uint64;
uint64 f(uint64 b)
{
   uint64 a;
   __asm__ __volatile__("rdtsc" : "=A" (a));
   return a+b;
}

My Apple gcc 4.0.1 compiles that into:

_f:
        pushq   %rbp
        movq    %rsp, %rbp
        rdtsc
        addq    %rdi, %rax
        leave
        ret

Here, %rdi is the incoming parameter; %rdx, where rdtsc leaves the high 
32 bits of the counter, is not considered at all. This seems to come 
from DImode (double integer) processing: gcc just "knows" that a DImode 
variable lives in a single register on AMD64.

So even if your code is right in principle, I still think there is a gcc 
bug here.

As for the specific code: I'm not sure whether it's guaranteed that you 
can truncate output registers in an asm. If you can't, you should make 
the output registers 64-bit integers on AMD64. If you can, I think you 
can "simplify" the code by directly outputting to (int*)v and 
((int*)v)[1]; this would be worthwhile only if the generated code 
actually gets better by omitting the shift operation.
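
To make the two options concrete, here is a minimal sketch, assuming an 
x86-64 build with gcc-style inline asm. The names read_tsc_wide and 
read_tsc_halves are made up for illustration and are not taken from the 
Python source. The second variant uses a union instead of the (int*)v 
cast mentioned above, to stay clear of strict-aliasing problems; it is 
meant as the same idea, not as the exact code under discussion.

```c
#include <stdint.h>

/* Option 1: 64-bit output operands, one per register, combined with a
   shift. "=a" pins the operand to %rax and "=d" to %rdx, so the compiler
   can no longer forget about the high half. AMD64 only as written. */
static inline uint64_t read_tsc_wide(void)
{
    uint64_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
    return (hi << 32) | lo;
}

/* Option 2: 32-bit ("truncated") outputs written straight into the two
   halves of the result, with no explicit shift. Relies on x86 being
   little-endian: half[0] is the low word, half[1] the high word. */
static inline uint64_t read_tsc_halves(void)
{
    union { uint64_t whole; uint32_t half[2]; } v;
    __asm__ __volatile__("rdtsc" : "=a" (v.half[0]), "=d" (v.half[1]));
    return v.whole;
}
```

Whether option 2 is actually cheaper would have to be checked in the 
generated assembly (gcc -S), since the compiler may well store both 
halves to memory and reload them.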

FWIW, I don't consider a bug that only occurs --with-tsc and only on 
AMD64 critical.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6603>
_______________________________________
_______________________________________________
Python-bugs-list mailing list