Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-31 Thread Andi Kleen
> >That is what the assembler generates, and should have generated, for
> >"movw %ds,(%eax)" since Nov. 4, 2004.
> 
> Could this be the reason for the reported slowdown in the last six months?

No.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread H. J. Lu
On Thu, Mar 31, 2005 at 02:57:57AM +0200, Pau Aliagas wrote:
> On Wed, 30 Mar 2005, H. J. Lu wrote:
> 
> >>>That is what the assembler generates, and should have generated, for
> >>>"movw %ds,(%eax)" since Nov. 4, 2004.
> >>
> >>Could this be the reason for the reported slowdown in the last six months?
> >
> >Can you elaborate?
> 
> There's an unexplained slowdown of kernel 2.6 detailed in this thread:
> http://kerneltrap.org/node/4940
> 

That thread is dated "November 13, 2002 - 13:58". The assembler change was
made on Nov. 4, 2004. I don't think they are related at all.

> I don't want at all to justify it with the change you talk about in gas, 
> but maybe it is worth checking whether it has anything to do with it. The 
> slowdown happened in the last six months.


H.J.


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread Pau Aliagas
On Wed, 30 Mar 2005, H. J. Lu wrote:

>>>That is what the assembler generates, and should have generated, for
>>>"movw %ds,(%eax)" since Nov. 4, 2004.
>>
>>Could this be the reason for the reported slowdown in the last six months?
>
>Can you elaborate?

There's an unexplained slowdown of kernel 2.6 detailed in this thread:
http://kerneltrap.org/node/4940

I don't want at all to justify it with the change you talk about in gas,
but maybe it is worth checking whether it has anything to do with it. The
slowdown happened in the last six months.

--
Pau


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread H. J. Lu
On Thu, Mar 31, 2005 at 12:18:55AM +0200, Pau Aliagas wrote:
> On Wed, 30 Mar 2005, H. J. Lu wrote:
> 
> >On Wed, Mar 30, 2005 at 07:57:28AM -0800, Linus Torvalds wrote:
> 
> >>>There is no such an instruction of "movl %ds,(%eax)". The old assembler
> >>>accepts it and turns it into "movw %ds,(%eax)".
> >>
> >>I disagree. Violently. As does the old assembler, which does not turn
> >>"mov" into "movw" as you say. AT ALL.
> >
> >I should have made myself clear. By "movw %ds,(%eax)", I meant:
> >
> > 8c 18   movw   %ds,(%eax)
> >
> >That is what the assembler generates, and should have generated, for
> >"movw %ds,(%eax)" since Nov. 4, 2004.
> 
> Could this be the reason for the reported slowdown in the last six months?
> 

Can you elaborate?


H.J.


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread Pau Aliagas
On Wed, 30 Mar 2005, H. J. Lu wrote:

>On Wed, Mar 30, 2005 at 07:57:28AM -0800, Linus Torvalds wrote:

>>>There is no such an instruction of "movl %ds,(%eax)". The old assembler
>>>accepts it and turns it into "movw %ds,(%eax)".
>>
>>I disagree. Violently. As does the old assembler, which does not turn
>>"mov" into "movw" as you say. AT ALL.
>
>I should have made myself clear. By "movw %ds,(%eax)", I meant:
>
> 8c 18   movw   %ds,(%eax)
>
>That is what the assembler generates, and should have generated, for
>"movw %ds,(%eax)" since Nov. 4, 2004.

Could this be the reason for the reported slowdown in the last six months?

--
Pau


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread H. J. Lu
On Wed, Mar 30, 2005 at 11:23:25AM -0500, linux-os wrote:
> 
> So if there are any "movw (mem), %ds" and
> "movw %ds, (mem)" in the code. The sizeof(mem)
> needs to be 32-bits and the 'w' needs to be removed.
> Otherwise, we are wasting CPU cycles and/or fooling
> ourselves. GAS needs to continue to generate whatever
> it was fed, with appropriate diagnostics if it
> is fed the wrong stuff.

FYI, gas hasn't generated 0x66 for "movw (%eax),%ds" for a long time,
and it stopped generating it for "movw %ds,(%eax)" on Nov. 4, 2004.


H.J.


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread H. J. Lu
On Wed, Mar 30, 2005 at 07:57:28AM -0800, Linus Torvalds wrote:
> 
> [ binutils and libc back in the discussion - I don't know why they got 
>   dropped ]

Removing glibc since it accesses segment registers with the proper
instructions.

> 
> On Tue, 29 Mar 2005, H. J. Lu wrote:
> > 
> > There is no such an instruction of "movl %ds,(%eax)". The old assembler
> > accepts it and turns it into "movw %ds,(%eax)".
> 
> I disagree. Violently. As does the old assembler, which does not turn 
> "mov" into "movw" as you say. AT ALL.

I should have made myself clear. By "movw %ds,(%eax)", I meant:

8c 18   movw   %ds,(%eax)

That is what the assembler generates, and should have generated, for
"movw %ds,(%eax)" since Nov. 4, 2004.

> 
> A "movw" has a 0x66 prefix. The assembler agrees with me. Plain logic 
> agrees with me. Being consistent _also_ agrees with me (it's the same damn 
> instruction to move to a register, for chrissake!)

This was a bug in the assembler, and it was fixed on Nov. 4, 2004. If
you want the 0x66 prefix for "movw %ds,(%eax)", you need to use
"word movw %ds,(%eax)" with the new assembler.

> 
> The fact is, every single "mov" instruction takes the size hint, and it
> HAS MEANING, even if the meaning is only about performance, not about
> semantics. In other words, yes, in the specific case of "mov segment to
> memory", it ends up being only a performance hit, but as such IT DOES HAVE
> MEANING. And in fact, even if it didn't end up having any meaning at all, 
> it's still a good idea as just a consistency issue.

Accessing a segment register is a very special case, and gas has long
treated it differently. Try "movw (%eax),%ds" with your gas: it doesn't
generate 0x66. The "movw %ds,(%eax)" bug was fixed last year.

> If you think people should use just "mov", then fine, let people use 

I only suggested "mov" for old assemblers.

> "mov". That's their choice - the same way you can write just "or $5,%eax" 
> and gas will pick the 32-bit version based on the register name, yes, you 
> should be able to write just "mov %fs,mem", and gas will pick whatever 
> version using its heuristics for the size (in this case the 32-bit, since 
> it does the same thing and is smaller and faster).
> 
> And "mov" has always worked. The kernel just doesn't use it much, because 
> the kernel - for good historical reasons - doesn't trust gas to pick sizes 
> of instructions automagically.
> 
> And the fact that it is obvious that gas _should_ pick the 32-bit format
> of the instruction when you do not specify a size does NOT MEAN that it's
> wrong to specify the size explicitly.
> 
> And your arguments that there is no semantic difference between the 16-bit 
> and the 32-bit version IS MEANINGLESS. An assembler shouldn't care. This 

For segment register access, there is neither a 16-bit nor a 32-bit
version. There is only one version.

> is not an argument about semantic difference. This is an argument over a 
> user wanting to make the size explicit, to DOCUMENT it.

Are you suggesting that gas should put back 0x66 for both
"movw %ds,(%eax)" and "movw (%eax),%ds"?

> 
> The fact is, if users use "movl" and "movw" explicitly (and the kernel has
> traditionally been _very_ careful to use all instruction sizes explicitly,
> partly exactly because gas itself has been very happy-go-lucky about
> them), then that is a GOOD THING. It means that the instruction is
> well-defined to somebody who knows the x86 instruction set, and he never
> needs to worry or use "objdump" to see if gas was being stupid and
> generated the 16-bit version.

Allowing "movl %ds,(%eax)" makes it possible for people to assume that it
will update a 32-bit memory location. That is how this issue was uncovered.
If you really don't like "mov %ds,(%eax)" and want to support the
old assembler, I can write a kernel patch that checks the assembler and
uses "movl" for the old assembler and "movw" for the new assembler.

BTW, to report problems with assembler, there is

http://www.sourceware.org/bugzilla/

Or I can be reached at [EMAIL PROTECTED]


H.J.
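
[ Archive note: the two encodings being argued about above can be written
  out byte by byte. The following is a minimal sketch in C, assuming 32-bit
  code and the "(%eax)" memory operand used in the thread. The helper name
  encode_mov_sreg_to_mem_eax and the enum are made up for illustration; they
  are not part of gas or the kernel. ]

```c
#include <stddef.h>
#include <stdint.h>

/* Segment register numbers as they appear in the ModRM reg field. */
enum sreg {
    SREG_ES = 0, SREG_CS = 1, SREG_SS = 2,
    SREG_DS = 3, SREG_FS = 4, SREG_GS = 5
};

/*
 * Hypothetical helper: encode "mov %sreg,(%eax)" for 32-bit code.
 * If with_prefix is non-zero, the 0x66 operand-size prefix is emitted
 * first. buf must have room for 3 bytes. Returns the number of bytes
 * written (2 without the prefix, 3 with it).
 */
size_t encode_mov_sreg_to_mem_eax(enum sreg sr, int with_prefix, uint8_t *buf)
{
    size_t n = 0;

    if (with_prefix)
        buf[n++] = 0x66;                 /* operand-size override prefix */
    buf[n++] = 0x8c;                     /* opcode: MOV r/m16, Sreg */
    /* ModRM: mod=00 (memory, no displacement), reg=sr, rm=000 (EAX) */
    buf[n++] = (uint8_t)(((unsigned)sr << 3) | 0x00u);
    return n;
}
```

With SREG_DS and no prefix this yields the bytes 8c 18 quoted above; with
the prefix it yields 66 8c 18. Both store the same 16-bit selector, which is
why the disagreement is about size and speed rather than semantics.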


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread linux-os
On Wed, 30 Mar 2005, Linus Torvalds wrote:

> [ binutils and libc back in the discussion - I don't know why they got
>   dropped ]
>
> On Tue, 29 Mar 2005, H. J. Lu wrote:
> >
> > There is no such an instruction of "movl %ds,(%eax)". The old assembler
> > accepts it and turns it into "movw %ds,(%eax)".
>
> I disagree. Violently. As does the old assembler, which does not turn
> "mov" into "movw" as you say. AT ALL.
>
> A "movw" has a 0x66 prefix. The assembler agrees with me. Plain logic
> agrees with me. Being consistent _also_ agrees with me (it's the same damn
> instruction to move to a register, for chrissake!)
>
> "movw" is totally different from "movl". They _act_ the same, but that's
> like saying that "orw $5,%ax" is the same as "orl $5,%eax". They also
> _act_ the same, but that IN NO WAY makes them the same.
>
> According to your logic, the assembler should disallow "orl $5,%eax" because
> it does the same thing as "or $5,%eax" and "orw $5,%ax", and thus to
> "protect" the user, the user should not be able to say the size
> explicitly.
>
> The fact is, every single "mov" instruction takes the size hint, and it
> HAS MEANING, even if the meaning is only about performance, not about
> semantics. In other words, yes, in the specific case of "mov segment to
> memory", it ends up being only a performance hit, but as such IT DOES HAVE
> MEANING. And in fact, even if it didn't end up having any meaning at all,
> it's still a good idea as just a consistency issue.
>
> Dammit, if I say "orl $5,%eax", I mean "orl $5,%eax", and if the assembler
> complains about it or claims it is the same as "orw $5,%ax", then the
> assembler is fundamentally BROKEN.
>
> None of your arguments have in any way responded to this fact.
>
> If you think people should use just "mov", then fine, let people use
> "mov". That's their choice - the same way you can write just "or $5,%eax"
> and gas will pick the 32-bit version based on the register name, yes, you
> should be able to write just "mov %fs,mem", and gas will pick whatever
> version using its heuristics for the size (in this case the 32-bit, since
> it does the same thing and is smaller and faster).
>
> And "mov" has always worked. The kernel just doesn't use it much, because
> the kernel - for good historical reasons - doesn't trust gas to pick sizes
> of instructions automagically.
>
> And the fact that it is obvious that gas _should_ pick the 32-bit format
> of the instruction when you do not specify a size does NOT MEAN that it's
> wrong to specify the size explicitly.
>
> And your arguments that there is no semantic difference between the 16-bit
> and the 32-bit version IS MEANINGLESS. An assembler shouldn't care. This
> is not an argument about semantic difference. This is an argument over a
> user wanting to make the size explicit, to DOCUMENT it.
>
> The fact is, if users use "movl" and "movw" explicitly (and the kernel has
> traditionally been _very_ careful to use all instruction sizes explicitly,
> partly exactly because gas itself has been very happy-go-lucky about
> them), then that is a GOOD THING. It means that the instruction is
> well-defined to somebody who knows the x86 instruction set, and he never
> needs to worry or use "objdump" to see if gas was being stupid and
> generated the 16-bit version.
>
> 		Linus

We went over this stuff when we first started using the
Intel 486. (Ref. Intel 486 Microprocessor Programmer's
Reference Manual, ISBN 1-55512-192-4)

Segment registers are really 32 bits in length. They
have a 'visible' part and an invisible part. The
visible part contains the 16-bit selector. The
invisible part contains the base address, limit,
etc., that was loaded from the GDT or the LDT.
(Ref. pp 5-9)

All access to these registers is 32 bits. If you
execute 'push ds' or 'pop ds' the stack-pointer
will move 4 bytes. An 0x66 override prefix is
ignored when accessing segment registers. It
should never be used. There is another override
prefix that can be used instead. The push ds
opcode is 0x1e and the pop ds opcode is 0x1f
if somebody wants to experiment.

Even a move from a CPU general purpose register
to a segment register is a 32-bit operation. If
you want to move the contents of a segment register
to memory or a register as a 16-bit action, for
instance not overwriting the high-word of a register,
the override prefix is 0x67, not 0x66. (Ref. pp 26-210)

This means that segment values stored in memory
should really be aligned on 32-bit boundaries
so that extra clock-cycles are not wasted
accessing these registers. This also means
that they should be treated as (Posix) uint32_t
not uint16_t, even though the value will never
exceed 8192.

So if there are any "movw (mem), %ds" and
"movw %ds, (mem)" in the code, the sizeof(mem)
needs to be 32 bits and the 'w' needs to be removed.
Otherwise, we are wasting CPU cycles and/or fooling
ourselves. GAS needs to continue to generate whatever
it was fed, with appropriate diagnostics if it
is fed the wrong stuff.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.11 on an i686 machine (5537.79 BogoMips).
 Notice : All mail here is now cached for review by Dictator Bush.
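
[ Archive note: for readers checking the numbers above, the visible part of
  a segment register is a 16-bit selector made of a 13-bit descriptor-table
  index, a table-indicator bit (GDT or LDT), and a 2-bit requested privilege
  level. So it is the index that tops out at 8191; the selector value itself
  can be as large as 0xffff. The following is a minimal sketch in C; the
  struct and function names are made up for illustration. ]

```c
#include <stdint.h>

/* Fields of an x86 segment selector (names are illustrative only). */
struct selector_fields {
    unsigned index;  /* bits 15..3: descriptor index, 0..8191 */
    unsigned ti;     /* bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;    /* bits 1..0: requested privilege level */
};

/* Split a 16-bit selector into its three fields. */
struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.rpl   = sel & 0x3u;
    f.ti    = (sel >> 2) & 0x1u;
    f.index = (unsigned)sel >> 3;
    return f;
}
```

For example, the selector 0x7b decodes to index 15 in the GDT with RPL 3.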

Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread Linus Torvalds

[ binutils and libc back in the discussion - I don't know why they got 
  dropped ]

On Tue, 29 Mar 2005, H. J. Lu wrote:
> 
> There is no such an instruction of "movl %ds,(%eax)". The old assembler
> accepts it and turns it into "movw %ds,(%eax)".

I disagree. Violently. As does the old assembler, which does not turn 
"mov" into "movw" as you say. AT ALL.

A "movw" has a 0x66 prefix. The assembler agrees with me. Plain logic 
agrees with me. Being consistent _also_ agrees with me (it's the same damn 
instruction to move to a register, for chrissake!)

"movw" is totally different from "movl". They _act_ the same, but that's 
like saying that "orw $5,%ax" is the same as "orl $5,%eax". They also 
_act_ the same, but that IN NO WAY makes them the same.

According to your logic, the assembler should disallow "orl $5,%eax" because
it does the same thing as "or $5,%eax" and "orw $5,%ax", and thus to
"protect" the user, the user should not be able to say the size
explicitly.

The fact is, every single "mov" instruction takes the size hint, and it
HAS MEANING, even if the meaning is only about performance, not about
semantics. In other words, yes, in the specific case of "mov segment to
memory", it ends up being only a performance hit, but as such IT DOES HAVE
MEANING. And in fact, even if it didn't end up having any meaning at all, 
it's still a good idea as just a consistency issue.

Dammit, if I say "orl $5,%eax", I mean "orl $5,%eax", and if the assembler 
complains about it or claims it is the same as "orw $5,%ax", then the 
assembler is fundamentally BROKEN.

None of your arguments have in any way responded to this fact. 

If you think people should use just "mov", then fine, let people use 
"mov". That's their choice - the same way you can write just "or $5,%eax" 
and gas will pick the 32-bit version based on the register name, yes, you 
should be able to write just "mov %fs,mem", and gas will pick whatever 
version using its heuristics for the size (in this case the 32-bit, since 
it does the same thing and is smaller and faster).

And "mov" has always worked. The kernel just doesn't use it much, because 
the kernel - for good historical reasons - doesn't trust gas to pick sizes 
of instructions automagically.

And the fact that it is obvious that gas _should_ pick the 32-bit format
of the instruction when you do not specify a size does NOT MEAN that it's
wrong to specify the size explicitly.

And your arguments that there is no semantic difference between the 16-bit 
and the 32-bit version IS MEANINGLESS. An assembler shouldn't care. This 
is not an argument about semantic difference. This is an argument over a 
user wanting to make the size explicit, to DOCUMENT it.

The fact is, if users use "movl" and "movw" explicitly (and the kernel has
traditionally been _very_ careful to use all instruction sizes explicitly,
partly exactly because gas itself has been very happy-go-lucky about
them), then that is a GOOD THING. It means that the instruction is
well-defined to somebody who knows the x86 instruction set, and he never
needs to worry or use "objdump" to see if gas was being stupid and
generated the 16-bit version.

Linus
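
[ Archive note: the "orw" versus "orl" comparison above can be made concrete
  by looking at the bytes. The following is a minimal sketch in C, assuming
  32-bit code and the sign-extended 8-bit immediate form of the instruction
  (opcode 0x83 /1), which is one valid encoding and not necessarily the only
  one an assembler could pick. The function name is made up for
  illustration. ]

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode "orl $imm,%eax" (use_16bit == 0) or
 * "orw $imm,%ax" (use_16bit != 0) for 32-bit code, using the
 * sign-extended imm8 form. buf must have room for 4 bytes.
 * Returns the number of bytes written.
 */
size_t encode_or_imm8_to_eax(int use_16bit, int8_t imm, uint8_t *buf)
{
    size_t n = 0;

    if (use_16bit)
        buf[n++] = 0x66;      /* operand-size override: 16-bit operand */
    buf[n++] = 0x83;          /* group-1 ALU op on r/m with imm8 */
    buf[n++] = 0xc8;          /* ModRM: mod=11, reg=001 (OR), rm=000 (AX/EAX) */
    buf[n++] = (uint8_t)imm;  /* the immediate itself */
    return n;
}
```

Under these assumptions "orl $5,%eax" comes out as 83 c8 05 and
"orw $5,%ax" as 66 83 c8 05: the same opcode, distinguished only by the
prefix, which is the sense in which the size suffix "has meaning".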


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread Andi Kleen
> > unsigned gsindex;
> > asm volatile("movl %%gs,%0" : "=g" (gsindex));
> 
> Ok, that's a real x86-64 bug, it seems. Andi, please fix, preferably by 
> just making the "g" be a "r".

Will do.

-Andi
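
[ Archive note: a minimal sketch in C of the change being agreed to above,
  assuming GCC-style extended asm on an x86 target. The wrapper function
  read_gs_selector and the non-x86 fallback are made up for illustration.
  The explanation in the comments is this archive's reading of why "g" is a
  problem, not something stated in the thread: "g" lets the compiler choose
  a memory operand, and a segment-register store to memory writes only 16
  bits, leaving the upper half of the 32-bit variable unset. ]

```c
/*
 * Hypothetical wrapper: return the current %gs selector.
 * On non-x86 targets (or compilers without GCC-style asm) it returns 0
 * so that the file still compiles.
 */
unsigned read_gs_selector(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned gsindex;

    /*
     * "=r" forces a general-purpose register as the destination.
     * With "=g" the compiler could pick a memory operand instead,
     * and only the low 16 bits of gsindex would then be written.
     */
    __asm__ volatile("movl %%gs,%0" : "=r" (gsindex));
    return gsindex;
#else
    return 0;
#endif
}
```

Intel's manuals describe the upper 16 bits of a 32-bit register destination
as zero-filled on P6-family and later processors, so on such machines the
returned value fits in 16 bits.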


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread H. J. Lu
On Wed, Mar 30, 2005 at 07:57:28AM -0800, Linus Torvalds wrote:
 
 [ binutils and libc back in the discussion - I don't know why they got 
   dropped ]

Removing glibc since it accesses segment register with proper
instructions.

 
 On Tue, 29 Mar 2005, H. J. Lu wrote:
  
  There is no such an instruction of movl %ds,(%eax). The old assembler
  accepts it and turns it into movw %ds,(%eax).
 
 I disagree. Violently. As does the old assembler, which does not turn 
 mov into movw as you say. AT ALL.

I should have made myself clear. By movw %ds,(%eax), I meant:

8c 18   movw   %ds,(%eax)

That is what the assembler generates, and should have generated, for
movw %ds,(%eax) since Nov. 4, 2004.

 
 A movw has a 0x66 prefix. The assembler agree with me. Plain logic 
 agrees with me. Being consistent _also_ agrees with me (it's the same damn 
 instruction to move to a register, for chrissake!)

This is a bug in asssembler and has been fixed on Nov. 4, 2004. If
you want the 0x66 prefix for movw %ds,(%eax), you need to use
word movw %ds,(%eax) with the new assembler.

 
 The fact is, every single mov instruction takes the size hint, and it
> HAS MEANING, even if the meaning is only about performance, not about
> semantics. In other words, yes, in the specific case of mov segment to
> memory, it ends up being only a performance hit, but as such IT DOES HAVE
> MEANING. And in fact, even if it didn't end up having any meaning at all,
> it's still a good idea as just a consistency issue.

Accessing a segment register is a very special case, and gas has always
treated it differently. Try "movw (%eax),%ds" with your gas: it doesn't
generate 0x66. The "movw %ds,(%eax)" bug was fixed last year.

> If you think people should use just "mov", then fine, let people use

I only suggested "mov" for old assemblers.

> "mov". That's their choice - the same way you can write just "or $5,%eax"
> and gas will pick the 32-bit version based on the register name, yes, you
> should be able to write just "mov %fs,mem", and gas will pick whatever
> version using its heuristics for the size (in this case the 32-bit, since
> it does the same thing and is smaller and faster).
>
> And "mov" has always worked. The kernel just doesn't use it much, because
> the kernel - for good historical reasons - doesn't trust gas to pick sizes
> of instructions automagically.
>
> And the fact that it is obvious that gas _should_ pick the 32-bit format
> of the instruction when you do not specify a size does NOT MEAN that it's
> wrong to specify the size explicitly.
>
> And your arguments that there is no semantic difference between the 16-bit
> and the 32-bit version IS MEANINGLESS. An assembler shouldn't care. This

For segment register access, there is neither a 16-bit nor a 32-bit
version. There is only one version.

> is not an argument about semantic difference. This is an argument over a
> user wanting to make the size explicit, to DOCUMENT it.

Are you suggesting that gas should put back 0x66 for both
"movw %ds,(%eax)" and "movw (%eax),%ds"?

>
> The fact is, if users use "movl" and "movw" explicitly (and the kernel has
> traditionally been _very_ careful to use all instruction sizes explicitly,
> partly exactly because gas itself has been very happy-go-lucky about
> them), then that is a GOOD THING. It means that the instruction is
> well-defined to somebody who knows the x86 instruction set, and he never
> needs to worry or use objdump to see if gas was being stupid and
> generated the 16-bit version.

Allowing "movl %ds,(%eax)" makes it possible for people to assume that it
updates a 32-bit memory location. That is how this issue was uncovered.
If you really don't like "mov %ds,(%eax)" and want to support the
old assembler, I can write a kernel patch that checks the assembler and
uses "movl" for the old assembler and "movw" for the new assembler.

BTW, to report problems with assembler, there is

http://www.sourceware.org/bugzilla/

Or I can be reached at [EMAIL PROTECTED]


H.J.


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread H. J. Lu
On Wed, Mar 30, 2005 at 11:23:25AM -0500, linux-os wrote:
 
> So if there are any "movw (mem), %ds" and
> "movw %ds, (mem)" in the code. The sizeof(mem)
> needs to be 32-bits and the 'w' needs to be removed.
> Otherwise, we are wasting CPU cycles and/or fooling
> ourselves. GAS needs to continue to generate whatever
> it was fed, with appropriate diagnostics if it
> is fed the wrong stuff.

FYI, gas hasn't generated 0x66 for "movw (%eax),%ds" for a long time,
and it stopped generating it for "movw %ds,(%eax)" on Nov. 4, 2004.


H.J.


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread Pau Aliagas
On Wed, 30 Mar 2005, H. J. Lu wrote:

>On Wed, Mar 30, 2005 at 07:57:28AM -0800, Linus Torvalds wrote:
>
>>>There is no such an instruction of "movl %ds,(%eax)". The old assembler
>>>accepts it and turns it into "movw %ds,(%eax)".
>>
>>I disagree. Violently. As does the old assembler, which does not turn
>>mov into movw as you say. AT ALL.
>
>I should have made myself clear. By "movw %ds,(%eax)", I meant:
>
>	8c 18   movw   %ds,(%eax)
>
>That is what the assembler generates, and should have generated, for
>"movw %ds,(%eax)" since Nov. 4, 2004.

Could this be the reason for the reported slowdown in the last six months?

--
Pau


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread H. J. Lu
On Thu, Mar 31, 2005 at 12:18:55AM +0200, Pau Aliagas wrote:
> On Wed, 30 Mar 2005, H. J. Lu wrote:
>
> >On Wed, Mar 30, 2005 at 07:57:28AM -0800, Linus Torvalds wrote:
> >
> >>>There is no such an instruction of "movl %ds,(%eax)". The old assembler
> >>>accepts it and turns it into "movw %ds,(%eax)".
> >>
> >>I disagree. Violently. As does the old assembler, which does not turn
> >>mov into movw as you say. AT ALL.
> >
> >I should have made myself clear. By "movw %ds,(%eax)", I meant:
> >
> >	8c 18   movw   %ds,(%eax)
> >
> >That is what the assembler generates, and should have generated, for
> >"movw %ds,(%eax)" since Nov. 4, 2004.
>
> Could this be the reason for the reported slowdown in the last six months?
>

Can you elaborate?


H.J.


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-30 Thread Pau Aliagas
On Wed, 30 Mar 2005, H. J. Lu wrote:

>>>That is what the assembler generates, and should have generated, for
>>>"movw %ds,(%eax)" since Nov. 4, 2004.
>>
>>Could this be the reason for the reported slowdown in the last six months?
>
>Can you elaborate?

There's an unexplained slowdown of kernel 2.6 detailed in this thread:
http://kerneltrap.org/node/4940

I don't want at all to justify it with the change you talk about in gas,
but maybe it is worth to check if it has anything to do with it. The
slowdown happened in this last six months.

--
Pau


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-29 Thread H. J. Lu
On Tue, Mar 29, 2005 at 06:44:18PM -0800, Linus Torvalds wrote:
> 
> 
> On Tue, 29 Mar 2005, H. J. Lu wrote:
> > 
> > > the smaller and faster version do not want to just rely on gas
> > > automatically getting it right, especially since gas has historically been
> > > very very bad at getting things right.
> > 
> > We are fixing those issues in assembler. If people run into problems
> > like that with gas, they can report them. They will be fixed.
> 
> It's fine if gas fixes things. It's not fine if gas breaks things that 
> used to work, for no really good reason.
> 
> > > What is the advantage of not allowing "movl %ds,mem"? Really? Especially
> > > since I suspect the kernel is pretty much the only one who does this, and
> > > the kernel really does do it on purpose. The kernel explicitly wants the
> > > 32-bit version, knowing that the upper bits are undefined.
> > > 
> > 
> > Kernel has
> > 
> > unsigned gsindex;
> > asm volatile("movl %%gs,%0" : "=g" (gsindex));
> 
> Ok, that's a real x86-64 bug, it seems. Andi, please fix, preferably by 
> just making the "g" be a "r".
> 
> However, your argument isn't very valid, since:
> 
> > The new assembler will make sure that it won't happen.
> 
> Not true, since the suggestion was just to change all segment "movl"  
> things to "mov", at which point the same old bug is still there, and the
> assembler didn't really help us at all.

The new assembler won't accept

movl %gs,128(%rsp)

It makes it harder to generate binary code the user doesn't intend. FWIW,
what I suggested is in

http://sourceware.org/ml/binutils/2005-03/msg00873.html

There are things like

-   asm volatile("movl %%fs,%0" : "=g" (fsindex)); 
+   asm volatile("movl %%fs,%0" : "=r" (fsindex)); 

> 
> See the problem? You're not actually protecting anything. The change just 
> makes it _harder_ to make sizes explicit, and suddenly we have to trust an 
> assembler to be clever about sizes, when that assembler historically has 
> definitely _not_ been very clever about them at all. 
> 

There is no such instruction as "movl %ds,(%eax)". The old assembler
accepts it and turns it into "movw %ds,(%eax)". It won't catch problems
like

unsigned fsindex;
asm volatile("movl %%fs,%0" : "=m" (fsindex)); 

The "movw %ds,(%eax)" bug was fixed in binutils 2.15.94.0.1. Gas no
longer generates 0x66 for it. If you find gas preventing you from doing
what the hardware supports, I will be happy to fix it.


H.J.


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-29 Thread Linus Torvalds


On Tue, 29 Mar 2005, H. J. Lu wrote:
> 
> > the smaller and faster version do not want to just rely on gas
> > automatically getting it right, especially since gas has historically been
> > very very bad at getting things right.
> 
> We are fixing those issues in assembler. If people run into problems
> like that with gas, they can report them. They will be fixed.

It's fine if gas fixes things. It's not fine if gas breaks things that 
used to work, for no really good reason.

> > What is the advantage of not allowing "movl %ds,mem"? Really? Especially
> > since I suspect the kernel is pretty much the only one who does this, and
> > the kernel really does do it on purpose. The kernel explicitly wants the
> > 32-bit version, knowing that the upper bits are undefined.
> > 
> 
> Kernel has
> 
>   unsigned gsindex;
>   asm volatile("movl %%gs,%0" : "=g" (gsindex));

Ok, that's a real x86-64 bug, it seems. Andi, please fix, preferably by 
just making the "g" be a "r".
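
A minimal sketch of what the "g" to "r" change amounts to, assuming
GCC-style inline asm. The function name read_gsindex is made up for
illustration, and the #else branch exists only so the sketch compiles on
non-x86 targets. With "r" the compiler has to pick a register, so the
16-bit memory form of the instruction can never be chosen. On processors
that zero-fill the upper half of a 32-bit register destination (Intel
documents this for the P6 family and later), a test like "if (gsindex)"
then sees only the selector.

```c
#include <stdint.h>

/* Read the %gs selector into a 32-bit value via a register operand. */
static uint32_t read_gsindex(void)
{
    uint32_t gsindex;
#if defined(__i386__) || defined(__x86_64__)
    /* "=r" forces a register destination; "=g" would also allow memory,
     * and the memory form writes only 16 bits of the variable. */
    __asm__ volatile("movl %%gs,%0" : "=r" (gsindex));
#else
    gsindex = 0;    /* stand-in: no segment registers on this target */
#endif
    return gsindex;
}
```

On such hardware read_gsindex() returns a value that fits in 16 bits.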

However, your argument isn't very valid, since:

> The new assembler will make sure that it won't happen.

Not true, since the suggestion was just to change all segment "movl"  
things to "mov", at which point the same old bug is still there, and the
assembler didn't really help us at all.

See the problem? You're not actually protecting anything. The change just 
makes it _harder_ to make sizes explicit, and suddenly we have to trust an 
assembler to be clever about sizes, when that assembler historically has 
definitely _not_ been very clever about them at all. 

Linus


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-29 Thread H. J. Lu
On Tue, Mar 29, 2005 at 04:30:01PM -0800, Linus Torvalds wrote:
> 
> 
> On Mon, 28 Mar 2005, Andi Kleen wrote:
> >
> > "H. J. Lu" <[EMAIL PROTECTED]> writes:
> > > The new assembler will disallow them since those instructions with
> > > memory operand will only use the first 16bits. If the memory operand
> > > is 16bit, you won't see any problems. But if the memory destinatin
> > > is 32bit, the upper 16bits may have random values. The new assembler
> > 
> > Does it really have random values on existing x86 hardware?
> 
> The upper bits are not written at all, so it's not random.
> 
> > If it is a only a "theoretical" problem that does not happen
> > in practice I would advise to not do the change.
> 
> My preference too. The reason we use "movl" is because we really do want 
> the 32-bit versions, since they are faster. It's a conscious choice. In 
> contrast "movw" generates bigger and slower code on all assemblers out 
> there, and "mov" doesn't make it clear which one it is. Is it the slow 
> one, or the fast one? 

"mov" shouldn't generate the 0x66 prefix, at least with the assembler
since binutils 2.14.90.0.4 20030523. The assembler in CVS won't generate
0x66 for "movw" either.

> Now, those versions of gas may be so old that nobody cares, but the
> explicit size still is a GOOD THING. The size DOES MATTER. People who want

I suggested "mov" instead of "movw" for the sake of existing assemblers.
Alternatively, the kernel can check the assembler version to decide whether
"movw" should be used. I can find out which Linux assembler was the first
one that doesn't generate 0x66 for "movw".

> the smaller and faster version do not want to just rely on gas
> automatically getting it right, especially since gas has historically been
> very very bad at getting things right.

We are fixing those issues in assembler. If people run into problems
like that with gas, they can report them. They will be fixed.

> 
> What is the advantage of not allowing "movl %ds,mem"? Really? Especially
> since I suspect the kernel is pretty much the only one who does this, and
> the kernel really does do it on purpose. The kernel explicitly wants the
> 32-bit version, knowing that the upper bits are undefined.
> 

Kernel has

unsigned gsindex;
asm volatile("movl %%gs,%0" : "=g" (gsindex));
...
if (gsindex)


It is OK if gcc never generates memory access like

movl %gs,0x128(%rsp)

Otherwise, the upper bits in gsindex are undefined. The new
assembler will make sure that it won't happen.


H.J.


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-29 Thread Linus Torvalds


On Mon, 28 Mar 2005, Andi Kleen wrote:
>
> "H. J. Lu" <[EMAIL PROTECTED]> writes:
> > The new assembler will disallow them since those instructions with
> > memory operand will only use the first 16bits. If the memory operand
> > is 16bit, you won't see any problems. But if the memory destinatin
> > is 32bit, the upper 16bits may have random values. The new assembler
> 
> Does it really have random values on existing x86 hardware?

The upper bits are not written at all, so it's not random.

> If it is a only a "theoretical" problem that does not happen
> in practice I would advise to not do the change.

My preference too. The reason we use "movl" is because we really do want 
the 32-bit versions, since they are faster. It's a conscious choice. In 
contrast "movw" generates bigger and slower code on all assemblers out 
there, and "mov" doesn't make it clear which one it is. Is it the slow 
one, or the fast one? 

For example, "mov %ds,%eax" does seem to generate the (faster) 32-bit code
on modern assemblers, while "mov %ds,%ax" generates (slower) 16-bit code 
that leaves the high bits of %eax untouched. Sometimes you may want the 
slower one, sometimes the faster one. I have this pretty strong memory of 
old versions of gas not making any difference between %ax and %eax as a 
target, and that you really needed to set the size explicitly.
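
The difference between the two register forms can be sketched as below,
with the size made explicit in both cases. This is an illustration under
assumptions, not code from the kernel: the names read_ds_16 and
read_ds_32 are invented, GCC-style inline asm is assumed (the %w0
modifier prints the 16-bit name of the chosen register), and the #else
branches are stand-ins so the sketch compiles on non-x86 targets.

```c
#include <stdint.h>

/* 16-bit form ("movw %ds,%ax" style, 0x66 prefix in 32-bit code):
 * only the low half of the register is written. */
static uint32_t read_ds_16(uint32_t initial)
{
    uint32_t v = initial;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("movw %%ds,%w0" : "+r" (v));
#else
    v &= 0xffff0000u;   /* stand-in: pretend a zero selector was read */
#endif
    return v;
}

/* 32-bit form ("movl %ds,%eax" style, no prefix in 32-bit code). */
static uint32_t read_ds_32(void)
{
    uint32_t v;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("movl %%ds,%0" : "=r" (v));
#else
    v = 0;              /* stand-in, matching read_ds_16 above */
#endif
    return v;
}
```

Both functions return the same selector in the low 16 bits, and
read_ds_16(0xdeadbeef) keeps 0xdead in the high bits because the 16-bit
form leaves them untouched.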

Now, those versions of gas may be so old that nobody cares, but the
explicit size still is a GOOD THING. The size DOES MATTER. People who want
the smaller and faster version do not want to just rely on gas
automatically getting it right, especially since gas has historically been
very very bad at getting things right.

What is the advantage of not allowing "movl %ds,mem"? Really? Especially
since I suspect the kernel is pretty much the only one who does this, and
the kernel really does do it on purpose. The kernel explicitly wants the
32-bit version, knowing that the upper bits are undefined.

Linus


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-28 Thread H. J. Lu
On Mon, Mar 28, 2005 at 09:46:00AM -0800, H. J. Lu wrote:
> On Mon, Mar 28, 2005 at 05:47:06PM +0200, Andi Kleen wrote:
> > "H. J. Lu" <[EMAIL PROTECTED]> writes:
> > > The new assembler will disallow them since those instructions with
> > > memory operand will only use the first 16bits. If the memory operand
> > > is 16bit, you won't see any problems. But if the memory destinatin
> > > is 32bit, the upper 16bits may have random values. The new assembler
> > 
> > Does it really have random values on existing x86 hardware?
> 
> The x86 hardwares will only change the first 16bits. The rest bits
> are unchanged. A simple test program can verify that.
> 
> > 
> > If it is a only a "theoretical" problem that does not happen
> > in practice I would advise to not do the change.
> > 
> 
> It depends on what the initial value in the upper bits is. The
> assembler in CVS generates the same binary code as
> 
>   movw %ds,(%eax)
> 
> for
> 
>   movl %ds,(%eax)
> 
> But the previous assemblers will generate
> 
>   66 8c 18   movw   %ds,(%eax)
> 
> for
> 
>   movw %ds,(%eax)
> 
> This bug has been fixed for a while. I guess that may be why Linux
> kernel uses
> 
>   movl %ds,(%eax)

It turns out that both old and new assemblers will generate

   0:   8c 18 movw   %ds,(%eax)

for

    mov %ds,(%eax)

So the kernel can use "mov" instead of "movl", and the binary output will
be the same.
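
To make the byte sequences discussed here concrete, the sketch below
decodes the two encodings by hand. It is only an illustration: the struct
and function names are invented, and it handles nothing beyond an
optional 0x66 prefix followed by opcode 0x8c and a ModRM byte. In the
ModRM byte 0x18, mod is 0, the reg field is 3 (which selects %ds for this
opcode) and rm is 0 (which is (%eax) with 32-bit addressing).

```c
#include <stddef.h>
#include <stdint.h>

struct sreg_store {
    int      has_opsize_prefix; /* 1 if the encoding starts with 0x66 */
    unsigned mod;               /* ModRM bits 7..6 */
    unsigned sreg;              /* ModRM bits 5..3: 0=ES 1=CS 2=SS 3=DS 4=FS 5=GS */
    unsigned rm;                /* ModRM bits 2..0 */
};

/* Decode "[66] 8c /r". Returns 0 on success, -1 for anything else. */
static int decode_sreg_store(const uint8_t *b, size_t len, struct sreg_store *out)
{
    size_t i = 0;

    out->has_opsize_prefix = 0;
    if (i < len && b[i] == 0x66) {      /* optional operand-size prefix */
        out->has_opsize_prefix = 1;
        i++;
    }
    if (len - i < 2 || b[i] != 0x8c)    /* need opcode plus ModRM */
        return -1;

    out->mod  = b[i + 1] >> 6;
    out->sreg = (b[i + 1] >> 3) & 7;
    out->rm   = b[i + 1] & 7;
    return 0;
}
```

Decoding {0x8c, 0x18} and {0x66, 0x8c, 0x18} yields the same mod, sreg
and rm fields; the two encodings differ only in the prefix byte.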


H.J.


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-28 Thread H. J. Lu
On Mon, Mar 28, 2005 at 05:47:06PM +0200, Andi Kleen wrote:
> "H. J. Lu" <[EMAIL PROTECTED]> writes:
> > The new assembler will disallow them since those instructions with
> > memory operand will only use the first 16bits. If the memory operand
> > is 16bit, you won't see any problems. But if the memory destinatin
> > is 32bit, the upper 16bits may have random values. The new assembler
> 
> Does it really have random values on existing x86 hardware?

x86 hardware only changes the low 16 bits. The remaining bits are left
unchanged. A simple test program can verify that.
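
A test along those lines could look like the sketch below. It is an
illustration rather than a program posted in this thread: the names
store_ds and upper_half_after_store are made up, GCC-style inline asm is
assumed, and the #else branch exists only so the file compiles on non-x86
targets. It fills a 32-bit slot with a known pattern, stores %ds into it,
and reports what is left in the upper 16 bits.

```c
#include <stdint.h>

/* Store the %ds selector into *slot the way "mov %ds,(mem)" does. */
static void store_ds(uint32_t *slot)
{
#if defined(__i386__) || defined(__x86_64__)
    /* No size suffix: the memory form always writes 16 bits. */
    __asm__ volatile("mov %%ds,%0" : "+m" (*slot));
#else
    *slot &= 0xffff0000u;   /* stand-in: pretend a zero selector was stored */
#endif
}

/* Return the upper 16 bits that remain in a 32-bit slot after the store. */
static uint32_t upper_half_after_store(uint32_t initial)
{
    uint32_t slot = initial;

    store_ds(&slot);
    return slot >> 16;
}
```

upper_half_after_store(0xdeadbeef) returns 0xdead: whatever was in the
upper half before the store is still there afterwards, which is why the
result depends on the initial value.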

> 
> If it is a only a "theoretical" problem that does not happen
> in practice I would advise to not do the change.
> 

It depends on what the initial value in the upper bits is. The
assembler in CVS generates the same binary code for

    movl %ds,(%eax)

as for

    movw %ds,(%eax)

But the previous assemblers will generate

    66 8c 18   movw   %ds,(%eax)

for

    movw %ds,(%eax)

This bug has been fixed for a while. I guess that may be why the Linux
kernel uses

    movl %ds,(%eax)


H.J.


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-28 Thread Andi Kleen
"H. J. Lu" <[EMAIL PROTECTED]> writes:
> The new assembler will disallow them since those instructions with
> memory operand will only use the first 16bits. If the memory operand
> is 16bit, you won't see any problems. But if the memory destinatin
> is 32bit, the upper 16bits may have random values. The new assembler

Does it really have random values on existing x86 hardware?

If it is only a "theoretical" problem that does not happen
in practice, I would advise against making the change.

> will force people to use
>
>   mov (%eax),%ds
>   movw (%eax),%ds
>   movw %ds,(%eax)
>   mov %ds,(%eax)
>
> Will it be a big problem for kernel people?

Well, we're getting used to the tool chain regularly breaking
perfectly good code.

You would not get more than the usual curses and only waste
a couple hundred man hours of testers worldwide scratching their heads
over why their kernel does not compile anymore. The world economy
will probably survive it ;-)

-Andi


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-28 Thread H. J. Lu
On Mon, Mar 28, 2005 at 05:47:06PM +0200, Andi Kleen wrote:
 H. J. Lu [EMAIL PROTECTED] writes:
  The new assembler will disallow them since those instructions with
  memory operand will only use the first 16bits. If the memory operand
  is 16bit, you won't see any problems. But if the memory destinatin
  is 32bit, the upper 16bits may have random values. The new assembler
 
 Does it really have random values on existing x86 hardware?

The x86 hardwares will only change the first 16bits. The rest bits
are unchanged. A simple test program can verify that.

 
 If it is a only a theoretical problem that does not happen
 in practice I would advise to not do the change.
 

It depends on what the initial value in the upper bits is. The
assembler in CVS generates the same binary code as

   movw %ds,(%eax)

for

   movl %ds,(%eax)

But the previous assemblers will generate

   66 8c 18   movw   %ds,(%eax)

for

   movw %ds,(%eax)

This bug has been fixed for a while. I guess that may be why the Linux
kernel uses

   movl %ds,(%eax)


H.J.


Re: i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-28 Thread H. J. Lu
On Mon, Mar 28, 2005 at 09:46:00AM -0800, H. J. Lu wrote:
> On Mon, Mar 28, 2005 at 05:47:06PM +0200, Andi Kleen wrote:
> > H. J. Lu <[EMAIL PROTECTED]> writes:
> > > The new assembler will disallow them since those instructions with
> > > memory operand will only use the first 16bits. If the memory operand
> > > is 16bit, you won't see any problems. But if the memory destination
> > > is 32bit, the upper 16bits may have random values. The new assembler
> >
> > Does it really have random values on existing x86 hardware?
>
> x86 hardware will only change the first 16 bits. The remaining bits
> are unchanged. A simple test program can verify that.
>
> >
> > If it is only a theoretical problem that does not happen
> > in practice I would advise to not do the change.
> >
>
> It depends on what the initial value in the upper bits is. The
> assembler in CVS generates the same binary code as
>
>    movw %ds,(%eax)
>
> for
>
>    movl %ds,(%eax)
>
> But the previous assemblers will generate
>
>    66 8c 18   movw   %ds,(%eax)
>
> for
>
>    movw %ds,(%eax)
>
> This bug has been fixed for a while. I guess that may be why the Linux
> kernel uses
>
>    movl %ds,(%eax)

It turns out that both old and new assemblers will generate

   0:   8c 18 movw   %ds,(%eax)

for

   mov %ds,(%eax)

So the kernel can use mov instead of movl and the binary output will
be the same.
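
For illustration, a suffix-less save macro might be written along these lines. This is a sketch rather than actual kernel code: struct seg_slots, SAVE_SEG and the two functions are hypothetical names, and GCC or Clang inline assembly on i386 or x86_64 is assumed. The slots are declared 16 bits wide, so "=m" describes the store accurately.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for per-thread segment slots (not a kernel struct). */
struct seg_slots {
    uint16_t fs;
    uint16_t gs;
};

#if defined(__i386__) || defined(__x86_64__)
/* No size suffix: the segment register operand fixes the store at 16 bits. */
#define SAVE_SEG(seg, slot) \
    __asm__ volatile("mov %%" #seg ",%0" : "=m"(slot))
#else
/* Not x86: store a zero selector so the sketch still builds (assumption). */
#define SAVE_SEG(seg, slot) ((slot) = 0)
#endif

/* Save %fs and %gs into their 16-bit slots. */
void save_segments(struct seg_slots *s)
{
    SAVE_SEG(fs, s->fs);
    SAVE_SEG(gs, s->gs);
}

/* Two saves in the same thread should agree whatever the slots held before. */
void check_save_segments(void)
{
    struct seg_slots a = { 0xffff, 0xffff };
    struct seg_slots b = { 0, 0 };
    save_segments(&a);
    save_segments(&b);
    assert(a.fs == b.fs && a.gs == b.gs);
}
```

Because the mnemonic carries no suffix, the same source should assemble with both the old and the new assembler.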


H.J.


i386/x86_64 segment register issuses (Re: PATCH: Fix x86 segment register access)

2005-03-27 Thread H. J. Lu
It turns out that the 2.4 kernel has

arch/i386/kernel/process.c: asm volatile("movl %%" #seg ",%0":"=m" (*(int *)&(value)))
arch/i386/kernel/process.c: asm volatile("movl %%fs,%0":"=m" (*(int *)&prev->fs));
arch/i386/kernel/process.c: asm volatile("movl %%gs,%0":"=m" (*(int *)&prev->gs));
arch/x86_64/kernel/process.c:   asm("movl %%gs,%0" : "=m" (p->thread.gsindex));
arch/x86_64/kernel/process.c:   asm("movl %%fs,%0" : "=m" (p->thread.fsindex));
arch/x86_64/kernel/process.c:   asm("movl %%es,%0" : "=m" (p->thread.es));
arch/x86_64/kernel/process.c:   asm("movl %%ds,%0" : "=m" (p->thread.ds));
arch/x86_64/kernel/process.c:   asm volatile("movl %%es,%0" : "=m" (prev->es));
arch/x86_64/kernel/process.c:   asm volatile ("movl %%ds,%0" : "=m" (prev->ds));

The 2.6 kernel has

arch/i386/kernel/process.c: asm volatile("movl %%fs,%0":"=m" (*(int *)&prev->fs));
arch/i386/kernel/process.c: asm volatile("movl %%gs,%0":"=m" (*(int *)&prev->gs));
arch/x86_64/kernel/process.c:   asm("movl %%gs,%0" : "=m" (p->thread.gsindex));
arch/x86_64/kernel/process.c:   asm("movl %%fs,%0" : "=m" (p->thread.fsindex));
arch/x86_64/kernel/process.c:   asm("movl %%es,%0" : "=m" (p->thread.es));
arch/x86_64/kernel/process.c:   asm("movl %%ds,%0" : "=m" (p->thread.ds));
arch/x86_64/kernel/process.c:   asm volatile("movl %%es,%0" : "=m" (prev->es));
arch/x86_64/kernel/process.c:   asm volatile ("movl %%ds,%0" : "=m" (prev->ds));
arch/x86_64/kernel/process.c:   asm volatile("movl %%fs,%0" : "=g" (fsindex));
arch/x86_64/kernel/process.c:   asm volatile("movl %%gs,%0" : "=g" (gsindex));

The new assembler will disallow them since those instructions with
memory operand will only use the first 16bits. If the memory operand
is 16bit, you won't see any problems. But if the memory destination
is 32bit, the upper 16bits may have random values. The new assembler
will force people to use

mov (%eax),%ds
movw (%eax),%ds
movw %ds,(%eax)
mov %ds,(%eax)

Will it be a big problem for kernel people?

BTW, I haven't checked glibc yet. It may have similar issues.
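
Where a 32-bit field is really wanted, one possible workaround is to go through a general-purpose register and store the full word from C. The following is only a sketch under assumptions: read_ds, save_ds and check_save_ds are hypothetical names, GCC or Clang inline assembly on i386 or x86_64 is assumed, and the value is masked explicitly because older processors may not clear the upper half of the register.

```c
#include <assert.h>
#include <stdint.h>

/* Read %ds via a general-purpose register and return a clean 16-bit value. */
uint32_t read_ds(void)
{
    uint32_t sel = 0;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("mov %%ds,%0" : "=r"(sel));
#endif
    /* Mask explicitly rather than rely on the CPU clearing bits 16-31. */
    return sel & 0xffffu;
}

/* Fill a 32-bit slot with the selector; every bit of *slot is defined afterwards. */
void save_ds(uint32_t *slot)
{
    *slot = read_ds();
}

/* Self-check: junk previously held in the slot must not survive. */
void check_save_ds(void)
{
    uint32_t slot = 0xffffffffu;
    save_ds(&slot);
    assert((slot >> 16) == 0u);
}
```

The store to the slot is then an ordinary 32-bit C assignment, so the assembler never sees a segment register paired with a 32-bit memory operand.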

H.J.
---
On Fri, Mar 25, 2005 at 06:05:06PM -0800, H. J. Lu wrote:
> X86 segment register access is special. We can move between a segment
> register and a 16/32/64bit general-purpose register. But we can only
> move between a segment register and a 16bit memory address. The current
> assembler allows "movl (%eax),%ds", but doesn't allow "movq %rax,%ds".
> The disassembler displays "movl (%eax),%ds". This patch tries to fix
> those.
> 
> 
> H.J.
> 
> gas/testsuite/
> 
> 2005-03-25  H.J. Lu  <[EMAIL PROTECTED]>
> 
>   * gas/i386/i386.exp: Run segment and inval-seg for i386. Run
>   x86-64-segment and x86-64-inval-seg for x86-64.
> 
>   * gas/i386/intel.d: Expect movw for moving between memory and
>   segment register.
>   * gas/i386/naked.d: Likewise.
>   * gas/i386/opcode.d: Likewise.
>   * gas/i386/x86-64-opcode.d: Likewise.
> 
>   * gas/i386/opcode.s: Use movw for moving between memory and
>   segment register.
>   * gas/i386/x86-64-opcode.s: Likewise.
> 
>   * : Likewise.
> 
>   * gas/i386/inval-seg.l: New.
>   * gas/i386/inval-seg.s: New.
>   * gas/i386/segment.l: New.
>   * gas/i386/segment.s: New.
>   * gas/i386/x86-64-inval-seg.l: New.
>   * gas/i386/x86-64-inval-seg.s: New.
>   * gas/i386/x86-64-segment.l: New.
>   * gas/i386/x86-64-segment.s: New.
> 
> include/opcode/
> 
> 2005-03-25  H.J. Lu  <[EMAIL PROTECTED]>
> 
>   * i386.h (i386_optab): Don't allow the `l' suffix for
>   moving between memory and segment register. Allow movq for
>   moving between general-purpose register and segment register.
> 
> opcodes/
> 
> 2005-03-25  H.J. Lu  <[EMAIL PROTECTED]>
> 
>   * i386-dis.c (SEG_Fixup): New.
>   (Sv): New.
>   (dis386): Use "Sv" for 0x8c and 0x8e.
> 

