Re: [patch 14/24] Immediate Values - x86 Optimization (updated)

2007-12-21 Thread Mathieu Desnoyers
x86 optimization of the immediate values which uses a movl with code patching
to set/unset the value used to populate the register used as variable source.

Changelog:
- Use text_poke_early with cr0 WP save/restore to patch the bypass. We are doing
  non atomic writes to a code region only touched by us (nobody can execute it
  since we are protected by the imv_mutex).
- Put imv_set and _imv_set in the architecture independent header.
- Use $0 instead of %2 with (0) operand.
- Add x86_64 support, ready for i386+x86_64 -> x86 merge.
- Use asm-x86/asm.h.

Ok, so the most flexible solution that I see, that should fit for both
x86 and x86_64 would be :
1 byte  :   "=q" : "a", "b", "c", or "d" register for the i386.  For
   x86-64 it is equivalent to "r" class (for 8-bit
   instructions that do not use upper halves).
2, 4, 8 bytes : "=r" : A register operand is allowed provided that it is in a
   general register.
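
As a rough illustration of the two constraints listed above, the sketch below loads constants through a mov-immediate. The helper names and the constants are made up for this example and are not part of the patch; the non-x86 fallback exists only so the sketch builds elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1-byte load using the "=q" constraint. */
static inline uint8_t load_imm8(void)
{
	uint8_t value;
#if defined(__i386__) || defined(__x86_64__)
	/* "=q": a, b, c or d on i386; any byte-addressable register on x86-64 */
	__asm__("mov $0x2a,%0" : "=q" (value));
#else
	value = 0x2a;	/* fallback for non-x86 builds (assumption) */
#endif
	return value;
}

/* Hypothetical helper: 4-byte load using the "=r" constraint. */
static inline uint32_t load_imm32(void)
{
	uint32_t value;
#if defined(__i386__) || defined(__x86_64__)
	/* "=r": any general register; on x86-64 this may need a REX prefix */
	__asm__("mov $0x12345678,%0" : "=r" (value));
#else
	value = 0x12345678u;	/* fallback for non-x86 builds (assumption) */
#endif
	return value;
}
```

Calling load_imm8() and load_imm32() returns the constants embedded in the instruction stream; which register is used, and therefore whether a REX prefix appears, is left to the compiler.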

- "Redux" immediate values : no need to put a breakpoint, therefore, no
  need to know where the instruction starts. It's therefore OK to have a
  REX prefix.
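
A minimal sketch of why the prefix does not matter here, assuming the immediate is always the last bytes of the mov: it can then be located from the end of the instruction, as the "(3f)-%c2" expression in the patch suggests. The helpers below work on plain byte buffers and are hypothetical; they are not the kernel's update code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the immediate occupies the last imm_size bytes, so
 * its address depends only on the end of the instruction. */
static uint8_t *imm_addr(uint8_t *insn_end, size_t imm_size)
{
	return insn_end - imm_size;
}

/* Overwrite a 32-bit immediate without looking at opcode or prefix bytes. */
static void patch_imm32(uint8_t *insn_end, uint32_t new_value)
{
	memcpy(imm_addr(insn_end, sizeof(new_value)), &new_value,
	       sizeof(new_value));
}

/* Read the 32-bit immediate back the same way. */
static uint32_t read_imm32(uint8_t *insn_end)
{
	uint32_t v;

	memcpy(&v, imm_addr(insn_end, sizeof(v)), sizeof(v));
	return v;
}
```

The same call works for "b8 imm32" (mov to %eax) and for "41 b8 imm32" (mov to %r8d, REX prefixed), because neither the opcode nor the prefix is ever inspected.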

- Bugfix : 8 bytes 64 bits immediate value was declared as "4 bytes" in the
  immediate structure.
- Change the immediate.c update code to support variable length opcodes.
- Vastly simplified, using a busy looping IPI with interrupts disabled.
  Does not protect against NMI nor MCE.
- Pack the __imv section. Use smallest types required for size (char).
- Use imv_* instead of immediate_*.
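
For the "Pack the __imv section" item, here is a sketch of what one packed entry could look like. The layout is inferred from the asm directives in the patch (two _ASM_PTR words followed by one .byte); the struct name, field names and lookup helper are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of one __imv entry: two pointer-sized words, then a byte. */
struct imv_entry {
	uintptr_t var;		/* address of the backing variable */
	uintptr_t imv;		/* address of the immediate inside the mov */
	unsigned char size;	/* immediate size in bytes: 1, 2, 4 or 8 */
} __attribute__((packed));

/* Hypothetical lookup: count the patch sites that refer to one variable. */
static size_t imv_count_sites(const struct imv_entry *tab, size_t n,
			      uintptr_t var)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (tab[i].var == var)
			hits++;
	return hits;
}
```

Without the packed attribute the compiler would pad each entry up to pointer alignment, so the C view of the section would no longer match the bytes emitted by the assembler directives.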

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: "H. Peter Anvin" <[EMAIL PROTECTED]>
CC: Chuck Ebbert <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: Rusty Russell <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig|1 
 include/asm-x86/immediate.h |   77 
 2 files changed, 78 insertions(+)

Index: linux-2.6-lttng.mm/include/asm-x86/immediate.h
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6-lttng.mm/include/asm-x86/immediate.h  2007-12-20 18:55:00.0 -0500
@@ -0,0 +1,77 @@
+#ifndef _ASM_X86_IMMEDIATE_H
+#define _ASM_X86_IMMEDIATE_H
+
+/*
+ * Immediate values. x86 architecture optimizations.
+ *
+ * (C) Copyright 2006 Mathieu Desnoyers <[EMAIL PROTECTED]>
+ *
+ * This file is released under the GPLv2.
+ * See the file COPYING for more details.
+ */
+
+#include <asm/asm.h>
+
+/**
+ * imv_read - read immediate variable
+ * @name: immediate value name
+ *
+ * Reads the value of @name.
+ * Optimized version of the immediate.
+ * Do not use in __init and __exit functions. Use _imv_read() instead.
+ * If size is bigger than the architecture long size, fall back on a memory
+ * read.
+ *
+ * Make sure to populate the initial static 64 bits opcode with a value
+ * that will generate an instruction with 8 bytes immediate value (not the REX.W
+ * prefixed one that loads a sign extended 32 bits immediate value in a r64
+ * register).
+ */
+#define imv_read(name) \
+   ({  \
+   __typeof__(name##__imv) value;  \
+   BUILD_BUG_ON(sizeof(value) > 8);\
+   switch (sizeof(value)) {\
+   case 1: \
+   asm(".section __imv,\"a\",@progbits\n\t"\
+   _ASM_PTR "%c1, (3f)-%c2\n\t"\
+   ".byte %c2\n\t" \
+   ".previous\n\t" \
+   "mov $0,%0\n\t" \
+   "3:\n\t"\
+   : "=q" (value)  \
+   : "i" (&name##__imv),   \
+ "i" (sizeof(value))); \
+   break;  \
+   case 2: \
+   case 4: \
+   asm(".section __imv,\"a\",@progbits\n\t"\
+   _ASM_PTR "%c1, (3f)-%c2\n\t"\
+   ".byte %c2\n\t" \
+   ".previous\n\t" \
+   "mov $0,%0\n\t" \
+   "3:\n\t"\
+   : "=r" (value)  \
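
The kernel-doc comment in the patch above warns about the initial value of the 64-bit case. A small sketch of the underlying issue, assuming the assembler is free to pick the shorter encoding whenever the value fits a sign-extended 32-bit immediate; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* True when v can be encoded as a sign-extended 32-bit immediate, i.e. when
 * the 7-byte "REX.W C7 /0 imm32" form could represent it. */
static int fits_simm32(uint64_t v)
{
	return (int64_t)v >= INT32_MIN && (int64_t)v <= INT32_MAX;
}

/* Hypothetical helper: length in bytes of the shortest 64-bit register
 * mov-immediate an assembler may choose for v. */
static unsigned int mov_imm64_len(uint64_t v)
{
	/* REX.W C7 /0 imm32 is 7 bytes; REX.W B8+r imm64 is 10 bytes */
	return fits_simm32(v) ? 7 : 10;
}
```

A placeholder such as 0 fits in 32 bits, so the instruction might only reserve 4 immediate bytes; a placeholder with bits set in the upper half forces the 8-byte form that a later 8-byte update needs.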

Re: [patch 14/24] Immediate Values - x86 Optimization (updated)

2007-12-21 Thread Mathieu Desnoyers
* H. Peter Anvin ([EMAIL PROTECTED]) wrote:
> Mathieu Desnoyers wrote:
>> Argh.. Rusty asked to have a simplified version first, and then to
>> implement the "more complex" one on top of it. However, in order to get
>> the reentrancy I need for the markers, I need the complex version of the
>> immediate values. Therefore, you find, in this patchset, the simple
>> version first, and then, the more complex one implemented on top.
>> About this patch header, the initial idea was to use the "Q" and "R"
>> constraints, but, as stated just below, the "q" and "r" constraints are
>> used instead to make sure the REX prefixed opcodes for 1, 2, and 4 bytes
>> immediate values are never used. So the complete header follows the
>> source code, it's just that this paragraph could be clearer.
>
> Then you have it backwards.  "Q" and "R" avoid REX prefixes, "q" and "r" DO NOT.
>
>   -hpa

Right.. I did that a month ago, which is already far away in my memory.
Looking back at this, here is the real situation. I am attaching patches
that fix the comments accordingly, as replies to my original posts.

- "Redux" immediate values : no need to put a breakpoint, therefore, no
  need to know where the instruction starts. It's therefore OK to have a
  REX prefix.
- More reentrant immediate value : uses a breakpoint. Needs to know the
  instruction's first byte. This is why we keep the "instruction size"
  variable, so we can support the REX prefixed instructions too.

Therefore, the "q" and "r" constraints are OK : they _allow_ REX
prefixes.
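
To make the difference between the two variants concrete, here is a simulation on a plain byte buffer of why the breakpoint variant needs the instruction size. It assumes the recorded size is used to walk back from the end of the instruction to its first byte; the function name is hypothetical, and real code patching also needs a breakpoint handler and cross-CPU synchronisation, which are not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INT3_OPCODE 0xcc	/* x86 breakpoint instruction */

/* Hypothetical sketch: guard the instruction with a breakpoint on its first
 * byte while the immediate is rewritten, then restore the first byte. */
static void update_imm_with_breakpoint(uint8_t *insn_end, size_t insn_len,
				       const void *new_imm, size_t imm_size)
{
	uint8_t *first = insn_end - insn_len;	/* may be a REX prefix */
	uint8_t saved = *first;

	*first = INT3_OPCODE;				/* 1: divert executors */
	memcpy(insn_end - imm_size, new_imm, imm_size);	/* 2: update immediate */
	*first = saved;					/* 3: restore first byte */
}
```

With insn_len recorded per site, the same routine handles "b8 imm32" (5 bytes) and "41 b8 imm32" (6 bytes), which is the sense in which REX prefixes are allowed.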

Mathieu




-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 14/24] Immediate Values - x86 Optimization

2007-12-20 Thread H. Peter Anvin

Mathieu Desnoyers wrote:
> Argh.. Rusty asked to have a simplified version first, and then to
> implement the "more complex" one on top of it. However, in order to get
> the reentrancy I need for the markers, I need the complex version of the
> immediate values. Therefore, you find, in this patchset, the simple
> version first, and then, the more complex one implemented on top.
>
> About this patch header, the initial idea was to use the "Q" and "R"
> constraints, but, as stated just below, the "q" and "r" constraints are
> used instead to make sure the REX prefixed opcodes for 1, 2, and 4 bytes
> immediate values are never used. So the complete header follows the
> source code, it's just that this paragraph could be clearer.

Then you have it backwards.  "Q" and "R" avoid REX prefixes, "q" and "r" DO NOT.


-hpa


Re: [patch 14/24] Immediate Values - x86 Optimization

2007-12-20 Thread Mathieu Desnoyers
* H. Peter Anvin ([EMAIL PROTECTED]) wrote:
> This patch is modified by another patch in the sequence.  This feels 
> needlessly confusing when reviewing (especially since the comment doesn't 
> look to match the code, e.g. w.r.t to "Q" and "R" constraints); can you 
> reorder the patchset to avoid that?
>

Argh.. Rusty asked to have a simplified version first, and then to
implement the "more complex" one on top of it. However, in order to get
the reentrancy I need for the markers, I need the complex version of the
immediate values. Therefore, you find, in this patchset, the simple
version first, and then, the more complex one implemented on top.

About this patch header, the initial idea was to use the "Q" and "R"
constraints, but, as stated just below, the "q" and "r" constraints are
used instead to make sure the REX prefixed opcodes for 1, 2, and 4 bytes
immediate values are never used. So the complete header follows the
source code, it's just that this paragraph could be clearer.

Mathieu

>   -hpa

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


Re: [patch 14/24] Immediate Values - x86 Optimization

2007-12-20 Thread H. Peter Anvin
This patch is modified by another patch in the sequence.  This feels 
needlessly confusing when reviewing (especially since the comment 
doesn't look to match the code, e.g. w.r.t to "Q" and "R" constraints); 
can you reorder the patchset to avoid that?


-hpa


[patch 14/24] Immediate Values - x86 Optimization

2007-12-20 Thread Mathieu Desnoyers
x86 optimization of the immediate values which uses a movl with code patching
to set/unset the value used to populate the register used as variable source.

Changelog:
- Use text_poke_early with cr0 WP save/restore to patch the bypass. We are doing
  non atomic writes to a code region only touched by us (nobody can execute it
  since we are protected by the imv_mutex).
- Put imv_set and _imv_set in the architecture independent header.
- Use $0 instead of %2 with (0) operand.
- Add x86_64 support, ready for i386+x86_64 -> x86 merge.
- Use asm-x86/asm.h.

Ok, so the most flexible solution that I see, that should fit for both
i386 and x86_64 would be :
1 byte  : "=Q" : Any register accessible as rh: a, b, c, and d.
2, 4 bytes : "=R" : Legacy register: the eight integer registers available
 on all i386 processors (a, b, c, d, si, di, bp, sp).
8 bytes : (only for x86_64)
  "=r" : A register operand is allowed provided that it is in a
 general register.
That should make sure x86_64 won't try to use REX prefixed opcodes for
1, 2 and 4 bytes values.

- Create the instruction in a discarded section to calculate its size. This is
  how we can align the beginning of the instruction on an address that will
  permit atomic modification of the immediate value without knowing the size of
  the opcode used by the compiler.
- Bugfix : 8 bytes 64 bits immediate value was declared as "4 bytes" in the
  immediate structure.
- Change the immediate.c update code to support variable length opcodes.

- Vastly simplified, using a busy looping IPI with interrupts disabled.
  Does not protect against NMI nor MCE.
- Pack the __imv section. Use smallest types required for size (char).
- Use imv_* instead of immediate_*.
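
The alignment item in the changelog above ("Create the instruction in a discarded section to calculate its size") can be pictured with a little arithmetic. This sketch assumes the goal is for the immediate, taken to be the last bytes of the instruction, to land on a naturally aligned address; the function name and the addresses used with it are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of padding bytes to insert before an
 * instruction of insn_len bytes placed at addr, so that its trailing
 * imm_size-byte immediate starts on a multiple of imm_size. */
static size_t pad_for_aligned_imm(uintptr_t addr, size_t insn_len,
				  size_t imm_size)
{
	uintptr_t imm = addr + insn_len - imm_size;	/* unpadded position */

	return (imm_size - imm % imm_size) % imm_size;
}
```

For a 5-byte "b8 imm32" at 0x1000 the immediate would start at 0x1001, so 3 bytes of padding move it to 0x1004; with a REX prefix (6 bytes) only 2 bytes are needed. This is why the instruction length has to be known before the alignment can be computed.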

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: "H. Peter Anvin" <[EMAIL PROTECTED]>
CC: Chuck Ebbert <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: Rusty Russell <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig|1 
 include/asm-x86/immediate.h |   77 
 2 files changed, 78 insertions(+)

Index: linux-2.6-lttng/include/asm-x86/immediate.h
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6-lttng/include/asm-x86/immediate.h 2007-11-21 11:04:33.0 -0500
@@ -0,0 +1,77 @@
+#ifndef _ASM_X86_IMMEDIATE_H
+#define _ASM_X86_IMMEDIATE_H
+
+/*
+ * Immediate values. x86 architecture optimizations.
+ *
+ * (C) Copyright 2006 Mathieu Desnoyers <[EMAIL PROTECTED]>
+ *
+ * This file is released under the GPLv2.
+ * See the file COPYING for more details.
+ */
+
+#include <asm/asm.h>
+
+/**
+ * imv_read - read immediate variable
+ * @name: immediate value name
+ *
+ * Reads the value of @name.
+ * Optimized version of the immediate.
+ * Do not use in __init and __exit functions. Use _imv_read() instead.
+ * If size is bigger than the architecture long size, fall back on a memory
+ * read.
+ *
+ * Make sure to populate the initial static 64 bits opcode with a value
+ * that will generate an instruction with 8 bytes immediate value (not the REX.W
+ * prefixed one that loads a sign extended 32 bits immediate value in a r64
+ * register).
+ */
+#define imv_read(name) \
+   ({  \
+   __typeof__(name##__imv) value;  \
+   BUILD_BUG_ON(sizeof(value) > 8);\
+   switch (sizeof(value)) {\
+   case 1: \
+   asm(".section __imv,\"a\",@progbits\n\t"\
+   _ASM_PTR "%c1, (3f)-%c2\n\t"\
+   ".byte %c2\n\t" \
+   ".previous\n\t" \
+   "mov $0,%0\n\t" \
+   "3:\n\t"\
+   : "=q" (value)  \
+   : "i" (&name##__imv),   \
+ "i" (sizeof(value))); \
+   break;  \
+   case 2: \
+   case 4: \
+   asm(".section __imv,\"a\",@progbits\n\t"\
+   _ASM_PTR "%c1, (3f)-%c2\n\t"\
+   ".byte %c2\n\t" \
+
