On Thu, Dec 27, 2012 at 9:08 AM, Uros Bizjak <[email protected]> wrote:
> On Wed, Dec 26, 2012 at 9:16 PM, Andi Kleen <[email protected]> wrote:
>>> Can you please post a real-world example, where using %r15 would break
>>> existing code?
>>
>> I used to run into problems like this when porting code to gcc from icc or
>> VC.
>> A lot of hyper optimized inline assembler snippets wants to use all registers
>> and icc/VC support that. With gcc usually had to add some manual
>> push/pops. In older gcc versions usually more than one because reload
>> tended to error out otherwise.
>>
>> So by try and error used the non fixed registers, but let the compiler know
>> about the others. This case would break.
>>
>> In 64bit it was less a problem than on 32bit, but could still happen.
>>
>> Admittedly medium is somewhat obscure and rarely used (and very slow),
>> but someone could be still using it.
>
> The alternative approach is changing cpuid definition in cpuid.h (as
> in attached patch) to preserve %rbx, but we can't detect various code
> model settings there. Since the change depends on the definition of
> __PIC__, we unnecessary preserve %rbx also for -mcmodel=small.
> However, code that involves cpuid is rarely performance critical, so
> perhaps we can live with this tradeoff.
>
> IMO, this patch can be used on 4.7 branch, too.
Now with the patch ...
Uros.
Index: i386/cpuid.h
===================================================================
--- i386/cpuid.h (revision 194723)
+++ i386/cpuid.h (working copy)
@@ -132,8 +132,9 @@
#define signature_VORTEX_ecx 0x436f5320
#define signature_VORTEX_edx 0x36387865
-#if defined(__i386__) && defined(__PIC__)
+#if defined(__PIC__)
/* %ebx may be the PIC register. */
+#if defined(__i386__)
#if __GNUC__ >= 3
#define __cpuid(level, a, b, c, d) \
__asm__ ("xchg{l}\t{%%}ebx, %1\n\t" \
@@ -165,6 +166,21 @@
: "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
: "0" (level), "2" (count))
#endif
+#elif defined(__x86_64__)
+#define __cpuid(level, a, b, c, d) \
+ __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t" \
+ "cpuid\n\t" \
+ "xchg{q}\t{%%}rbx, %q1\n\t" \
+ : "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
+ : "0" (level))
+
+#define __cpuid_count(level, count, a, b, c, d) \
+ __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t" \
+ "cpuid\n\t" \
+ "xchg{q}\t{%%}rbx, %q1\n\t" \
+ : "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
+ : "0" (level), "2" (count))
+#endif
#else
#define __cpuid(level, a, b, c, d) \
__asm__ ("cpuid\n\t" \