Re: [patch 4/7] Immediate Values - i386 Optimization

2007-10-22 Thread H. Peter Anvin
Mathieu Desnoyers wrote: Will fix. I noticed that it was because I had too many "register" char variables declared and used at the same time. Putting a "g" constraint gave the same result. "g" is the same as "rmi", which is probably *NOT* what you want. Don't use "register" variables.

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-10-22 Thread Mathieu Desnoyers
* H. Peter Anvin ([EMAIL PROTECTED]) wrote: > Mathieu Desnoyers wrote: > > > >I have tried generating asm-to-"register" c variables for char, short > >and int on i386 and I do not see this happening. The char opcode is > >always 1 byte, short 2 bytes and int 1 byte. Result: > > > > The comment

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-10-22 Thread Andi Kleen
> - Either I use a "r" constraint and let gcc produce the instructions, > that I need to assume to have correct size so I can align their > immediate values (therefore, taking the offset from the end of the > instruction will not help). Here, if gas changes its behavior > dramatically for

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-10-22 Thread Andi Kleen
- Either I use a "r" constraint and let gcc produce the instructions, that I need to assume to have correct size so I can align their immediate values (therefore, taking the offset from the end of the instruction will not help). Here, if gas changes its behavior dramatically for a given

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-10-22 Thread Mathieu Desnoyers
* H. Peter Anvin ([EMAIL PROTECTED]) wrote: Mathieu Desnoyers wrote: I have tried generating asm-to-"register" c variables for char, short and int on i386 and I do not see this happening. The char opcode is always 1 byte, short 2 bytes and int 1 byte. Result: The comment was referring to

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-10-22 Thread H. Peter Anvin
Mathieu Desnoyers wrote: Will fix. I noticed that it was because I had too many "register" char variables declared and used at the same time. Putting a "g" constraint gave the same result. "g" is the same as "rmi", which is probably *NOT* what you want. Don't use "register" variables. That's a poor

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-10-20 Thread H. Peter Anvin
Mathieu Desnoyers wrote: I have tried generating asm-to-"register" c variables for char, short and int on i386 and I do not see this happening. The char opcode is always 1 byte, short 2 bytes and int 1 byte. Result: The comment was referring to x86-64, but I incorrectly remembered that

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-10-20 Thread Mathieu Desnoyers
* Jeremy Fitzhardinge ([EMAIL PROTECTED]) wrote: > H. Peter Anvin wrote: > > Allowing different registers should be doable, but if so, one would have > > to put 0: at the *end* of the instruction and use (0f)-4 instead, since > > the non-%eax forms are one byte longer. > > > > OK, that's

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-10-20 Thread Mathieu Desnoyers
* Jeremy Fitzhardinge ([EMAIL PROTECTED]) wrote: H. Peter Anvin wrote: Allowing different registers should be doable, but if so, one would have to put 0: at the *end* of the instruction and use (0f)-4 instead, since the non-%eax forms are one byte longer. OK, that's already a

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-10-20 Thread H. Peter Anvin
Mathieu Desnoyers wrote: I have tried generating asm-to-"register" c variables for char, short and int on i386 and I do not see this happening. The char opcode is always 1 byte, short 2 bytes and int 1 byte. Result: The comment was referring to x86-64, but I incorrectly remembered that

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-21 Thread Mathieu Desnoyers
* Denys Vlasenko ([EMAIL PROTECTED]) wrote: > On Tuesday 18 September 2007 22:07, Mathieu Desnoyers wrote: > > i386 optimization of the immediate values which uses a movl with code > > patching > > to set/unset the value used to populate the register used as variable > > source. > > > >

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-21 Thread Mathieu Desnoyers
* Denys Vlasenko ([EMAIL PROTECTED]) wrote: On Tuesday 18 September 2007 22:07, Mathieu Desnoyers wrote: i386 optimization of the immediate values which uses a movl with code patching to set/unset the value used to populate the register used as variable source. Changelog: - Use

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-20 Thread Denys Vlasenko
On Tuesday 18 September 2007 22:07, Mathieu Desnoyers wrote: > i386 optimization of the immediate values which uses a movl with code patching > to set/unset the value used to populate the register used as variable source. > > Changelog: > - Use text_poke_early with cr0 WP save/restore to patch

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-20 Thread Denys Vlasenko
On Tuesday 18 September 2007 22:07, Mathieu Desnoyers wrote: i386 optimization of the immediate values which uses a movl with code patching to set/unset the value used to populate the register used as variable source. Changelog: - Use text_poke_early with cr0 WP save/restore to patch the

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* H. Peter Anvin ([EMAIL PROTECTED]) wrote: > Jeremy Fitzhardinge wrote: > > > > Cross-cache-line, sure. But what about just not sizeof aligned? If its > > enough to avoid cross-cache-line, then that's simpler. > > > > Not really. It pretty much comes down to the same problem. > > > Which

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* Jeremy Fitzhardinge ([EMAIL PROTECTED]) wrote: > H. Peter Anvin wrote: > > Mathieu Desnoyers wrote: > > > >> Ok, let's have a good look at what we want: > >> > >> 1 - get a pointer to the beginning of the immediate value within the > >> instruction. > >> 2 - make sure that the immediate

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread H. Peter Anvin
Jeremy Fitzhardinge wrote: > > Cross-cache-line, sure. But what about just not sizeof aligned? If its > enough to avoid cross-cache-line, then that's simpler. > Not really. It pretty much comes down to the same problem. > Which is something I was going to comment on: Mathieu, you try to

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Jeremy Fitzhardinge
H. Peter Anvin wrote: > Mathieu Desnoyers wrote: > >> Ok, let's have a good look at what we want: >> >> 1 - get a pointer to the beginning of the immediate value within the >> instruction. >> 2 - make sure that the immediate value, within the instruction, is >> written to atomically wrt

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread H. Peter Anvin
Mathieu Desnoyers wrote: > > Ok, let's have a good look at what we want: > > 1 - get a pointer to the beginning of the immediate value within the > instruction. > 2 - make sure that the immediate value, within the instruction, is > written to atomically wrt all CPUs, even on older

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* Mathieu Desnoyers ([EMAIL PROTECTED]) wrote: > * Jeremy Fitzhardinge ([EMAIL PROTECTED]) wrote: > > H. Peter Anvin wrote: > > > Allowing different registers should be doable, but if so, one would have > > > to put 0: at the *end* of the instruction and use (0f)-4 instead, since > > > the

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* Jeremy Fitzhardinge ([EMAIL PROTECTED]) wrote: > H. Peter Anvin wrote: > > Allowing different registers should be doable, but if so, one would have > > to put 0: at the *end* of the instruction and use (0f)-4 instead, since > > the non-%eax forms are one byte longer. > > > > OK, that's

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Andi Kleen
> And yes, it's a pity there is no way to produce the long-nops there. :( To be honest I consider it extremely unlikely you would be able to measure a difference. -Andi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* Andi Kleen ([EMAIL PROTECTED]) wrote: > Jeremy Fitzhardinge <[EMAIL PROTECTED]> writes: > > > > It's a pity that gas seems to generate plain 0x90 nops rather than > > long-nop forms here. I thought it could do that. > > .p2align does it. > Sadly, p2align does not apply well to my context. I

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* H. Peter Anvin ([EMAIL PROTECTED]) wrote: > Jeremy Fitzhardinge wrote: > > Mathieu Desnoyers wrote: > > > >> +#define immediate_read(name) > >> \ > >> + ({ \ > >> +

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* Jeremy Fitzhardinge ([EMAIL PROTECTED]) wrote: > Mathieu Desnoyers wrote: > > > +#define immediate_read(name) > > \ > > + ({ \ > > + __typeof__(name##__immediate) value;

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* Jeremy Fitzhardinge ([EMAIL PROTECTED]) wrote: Mathieu Desnoyers wrote: +#define immediate_read(name) \ + ({ \ + __typeof__(name##__immediate) value;

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* H. Peter Anvin ([EMAIL PROTECTED]) wrote: Jeremy Fitzhardinge wrote: Mathieu Desnoyers wrote: +#define immediate_read(name) \ + ({ \ + __typeof__(name##__immediate)

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* Andi Kleen ([EMAIL PROTECTED]) wrote: Jeremy Fitzhardinge [EMAIL PROTECTED] writes: It's a pity that gas seems to generate plain 0x90 nops rather than long-nop forms here. I thought it could do that. .p2align does it. Sadly, p2align does not apply well to my context. I have to

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Andi Kleen
And yes, it's a pity there is no way to produce the long-nops there. :( To be honest I consider it extremely unlikely you would be able to measure a difference. -Andi

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* Jeremy Fitzhardinge ([EMAIL PROTECTED]) wrote: H. Peter Anvin wrote: Allowing different registers should be doable, but if so, one would have to put 0: at the *end* of the instruction and use (0f)-4 instead, since the non-%eax forms are one byte longer. OK, that's already a

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* Mathieu Desnoyers ([EMAIL PROTECTED]) wrote: * Jeremy Fitzhardinge ([EMAIL PROTECTED]) wrote: H. Peter Anvin wrote: Allowing different registers should be doable, but if so, one would have to put 0: at the *end* of the instruction and use (0f)-4 instead, since the non-%eax forms are

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread H. Peter Anvin
Mathieu Desnoyers wrote: Ok, let's have a good look at what we want: 1 - get a pointer to the beginning of the immediate value within the instruction. 2 - make sure that the immediate value, within the instruction, is written to atomically wrt all CPUs, even on older architectures

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Jeremy Fitzhardinge
H. Peter Anvin wrote: Mathieu Desnoyers wrote: Ok, let's have a good look at what we want: 1 - get a pointer to the beginning of the immediate value within the instruction. 2 - make sure that the immediate value, within the instruction, is written to atomically wrt all CPUs,

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread H. Peter Anvin
Jeremy Fitzhardinge wrote: Cross-cache-line, sure. But what about just not sizeof aligned? If it's enough to avoid cross-cache-line, then that's simpler. Not really. It pretty much comes down to the same problem. Which is something I was going to comment on: Mathieu, you try to align
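
The two conditions contrasted in the exchange above ("sizeof aligned" versus "not crossing a cache line") can be written down as two small predicates. This is only an illustrative sketch: the function names and the 64-byte line size are assumptions made for the example, not something taken from the patch, and the sketch does not settle which condition is sufficient for atomic patching.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration only: a 64-byte cache line. */
#define ASSUMED_CACHE_LINE 64u

/* True if addr is a multiple of size ("sizeof aligned").
 * size must be a non-zero power of two. */
bool is_size_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}

/* True if the byte range [addr, addr + size) spans two cache lines. */
bool crosses_cache_line(uintptr_t addr, size_t size)
{
    assert(size != 0);
    return (addr / ASSUMED_CACHE_LINE) !=
           ((addr + size - 1) / ASSUMED_CACHE_LINE);
}
```

With these definitions, a 4-byte immediate at 0x1001 is not sizeof aligned but stays within one line, while one at 0x103e is neither aligned nor contained in a single line. A sizeof-aligned immediate of 1, 2 or 4 bytes never crosses a line, so the first condition is the stricter one.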

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* Jeremy Fitzhardinge ([EMAIL PROTECTED]) wrote: H. Peter Anvin wrote: Mathieu Desnoyers wrote: Ok, let's have a good look at what we want: 1 - get a pointer to the beginning of the immediate value within the instruction. 2 - make sure that the immediate value, within the

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-19 Thread Mathieu Desnoyers
* H. Peter Anvin ([EMAIL PROTECTED]) wrote: Jeremy Fitzhardinge wrote: Cross-cache-line, sure. But what about just not sizeof aligned? If its enough to avoid cross-cache-line, then that's simpler. Not really. It pretty much comes down to the same problem. Which is something I

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Andi Kleen
On Tue, Sep 18, 2007 at 03:29:50PM -0700, Jeremy Fitzhardinge wrote: > Andi Kleen wrote: > > Jeremy Fitzhardinge <[EMAIL PROTECTED]> writes: > > > >> It's a pity that gas seems to generate plain 0x90 nops rather than > >> long-nop forms here. I thought it could do that. > >> > > > >

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread H. Peter Anvin
Jeremy Fitzhardinge wrote: > Andi Kleen wrote: >> Jeremy Fitzhardinge <[EMAIL PROTECTED]> writes: >> >>> It's a pity that gas seems to generate plain 0x90 nops rather than >>> long-nop forms here. I thought it could do that. >>> >> .p2align does it > > Just .p2align? Not align, balign,

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Jeremy Fitzhardinge
Andi Kleen wrote: > Jeremy Fitzhardinge <[EMAIL PROTECTED]> writes: > >> It's a pity that gas seems to generate plain 0x90 nops rather than >> long-nop forms here. I thought it could do that. >> > > .p2align does it Just .p2align? Not align, balign, org or skip? Seems... strange.

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Jeremy Fitzhardinge
H. Peter Anvin wrote: > Allowing different registers should be doable, but if so, one would have > to put 0: at the *end* of the instruction and use (0f)-4 instead, since > the non-%eax forms are one byte longer. > OK, that's already a problem since its using "=r" as the constraint. > This

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread H. Peter Anvin
Jeremy Fitzhardinge wrote: > Mathieu Desnoyers wrote: > >> +#define immediate_read(name) >> \ >> +({ \ >> +__typeof__(name##__immediate) value;\ >> +

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Andi Kleen
Jeremy Fitzhardinge <[EMAIL PROTECTED]> writes: > > It's a pity that gas seems to generate plain 0x90 nops rather than > long-nop forms here. I thought it could do that. .p2align does it. -Andi

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Jeremy Fitzhardinge
Mathieu Desnoyers wrote: > +#define immediate_read(name) \ > + ({ \ > + __typeof__(name##__immediate) value;\ > + switch (sizeof(value)) {

[patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Mathieu Desnoyers
i386 optimization of the immediate values which uses a movl with code patching to set/unset the value used to populate the register used as variable source. Changelog: - Use text_poke_early with cr0 WP save/restore to patch the bypass. We are doing non atomic writes to a code region only

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Borislav Petkov
On Mon, Sep 17, 2007 at 02:42:28PM -0400, Mathieu Desnoyers wrote: > i386 optimization of the immediate values which uses a movl with code patching > to set/unset the value used to populate the register used as variable source. > > Changelog: > - Use text_poke_early with cr0 WP save/restore to

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Borislav Petkov
On Mon, Sep 17, 2007 at 02:42:28PM -0400, Mathieu Desnoyers wrote: i386 optimization of the immediate values which uses a movl with code patching to set/unset the value used to populate the register used as variable source. Changelog: - Use text_poke_early with cr0 WP save/restore to patch

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Jeremy Fitzhardinge
Mathieu Desnoyers wrote: +#define immediate_read(name) \ + ({ \ + __typeof__(name##__immediate) value;\ + switch (sizeof(value)) {

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Andi Kleen
Jeremy Fitzhardinge [EMAIL PROTECTED] writes: It's a pity that gas seems to generate plain 0x90 nops rather than long-nop forms here. I thought it could do that. .p2align does it. -Andi

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread H. Peter Anvin
Jeremy Fitzhardinge wrote: Mathieu Desnoyers wrote: +#define immediate_read(name) \ +({ \ +__typeof__(name##__immediate) value;\ +

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Jeremy Fitzhardinge
H. Peter Anvin wrote: Allowing different registers should be doable, but if so, one would have to put 0: at the *end* of the instruction and use (0f)-4 instead, since the non-%eax forms are one byte longer. OK, that's already a problem since it's using "=r" as the constraint. This also
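
The "(0f)-4" idea quoted above amounts to locating the immediate by counting back from the end of the instruction rather than forward from its start, so the result does not depend on how many prefix and opcode bytes come first. A minimal sketch of that arithmetic on plain byte buffers follows. The helper names are hypothetical and this is not the patch's inline asm; it assumes only the standard i386 "mov $imm, %reg" encodings (B0+r ib, 66 B8+r iw, B8+r id), in which the immediate is the last field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode "mov $imm, %reg" (reg 0..7, 0 = %eax/%ax/%al)
 * into buf and return the instruction length. imm_size is 1, 2 or 4. */
size_t encode_mov_imm(uint8_t *buf, unsigned reg, uint32_t imm, size_t imm_size)
{
    size_t n = 0;

    assert(reg < 8);
    assert(imm_size == 1 || imm_size == 2 || imm_size == 4);
    if (imm_size == 2)
        buf[n++] = 0x66;                      /* operand-size prefix */
    buf[n++] = (uint8_t)((imm_size == 1 ? 0xB0 : 0xB8) + reg);
    for (size_t i = 0; i < imm_size; i++)
        buf[n++] = (uint8_t)(imm >> (8 * i)); /* little-endian immediate */
    return n;
}

/* The "(0f)-4" computation: the immediate starts imm_size bytes before
 * the end of the instruction, whatever precedes it. */
uint8_t *imm_from_end(uint8_t *insn_end, size_t imm_size)
{
    return insn_end - imm_size;
}

/* Read back a little-endian immediate of imm_size bytes. */
uint32_t read_imm(const uint8_t *p, size_t imm_size)
{
    uint32_t v = 0;

    for (size_t i = 0; i < imm_size; i++)
        v |= (uint32_t)p[i] << (8 * i);
    return v;
}
```

For the 16-bit form the immediate sits at offset 2 because of the 0x66 prefix, and for the 32-bit form at offset 1, yet imm_from_end(buf + len, imm_size) finds it in both cases without knowing the opcode length.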

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Jeremy Fitzhardinge
Andi Kleen wrote: Jeremy Fitzhardinge [EMAIL PROTECTED] writes: It's a pity that gas seems to generate plain 0x90 nops rather than long-nop forms here. I thought it could do that. .p2align does it Just .p2align? Not align, balign, org or skip? Seems... strange. J

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread H. Peter Anvin
Jeremy Fitzhardinge wrote: Andi Kleen wrote: Jeremy Fitzhardinge [EMAIL PROTECTED] writes: It's a pity that gas seems to generate plain 0x90 nops rather than long-nop forms here. I thought it could do that. .p2align does it Just .p2align? Not align, balign, org or skip?

Re: [patch 4/7] Immediate Values - i386 Optimization

2007-09-18 Thread Andi Kleen
On Tue, Sep 18, 2007 at 03:29:50PM -0700, Jeremy Fitzhardinge wrote: Andi Kleen wrote: Jeremy Fitzhardinge [EMAIL PROTECTED] writes: It's a pity that gas seems to generate plain 0x90 nops rather than long-nop forms here. I thought it could do that. .p2align does it Just

[patch 4/7] Immediate Values - i386 Optimization

2007-09-17 Thread Mathieu Desnoyers
i386 optimization of the immediate values which uses a movl with code patching to set/unset the value used to populate the register used as variable source. Changelog: - Use text_poke_early with cr0 WP save/restore to patch the bypass. We are doing non atomic writes to a code region only
