ALSA + GCC 4.1.1 + -Os is known to be a bad combination on some
arches; see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27363 . (I
tripped over it on an ARM target, but my limited understanding of GCC
internals does not allow me to conclude that it is ARM-specific.) A
patch claiming to fix the
I want this:
char v[4];
...
memcmp(v, "abcd", 4) == 0
to compile to a single cmpl on i386. This (gcc 4.1.1) is ridiculous:
        call    memcmp
i686-linux-gcc (GCC) 4.2.0 20060410 (experimental)
        movl    $4, %ecx        #, tmp65
        cld
        movl    $v, %esi        #,
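For readers who want the request spelled out, here is a minimal sketch of the transformation being asked for, written by hand. The helper names are hypothetical and do not come from the thread. The fixed-size memcpy into a uint32_t keeps the code valid C regardless of alignment; whether a given compiler turns it into a single 32-bit compare is not guaranteed.

```c
#include <stdint.h>
#include <string.h>

/* The form the poster writes in source: a 4-byte memcmp against a literal. */
static int is_abcd_memcmp(const char v[4])
{
    return memcmp(v, "abcd", 4) == 0;
}

/* Hand-written equivalent: load both 4-byte objects into 32-bit
 * integers and compare them once. Equality of the two words is the
 * same as equality of the four bytes, independent of byte order. */
static int is_abcd_word(const char v[4])
{
    uint32_t a, b;

    memcpy(&a, v, sizeof a);
    memcpy(&b, "abcd", sizeof b);
    return a == b;
}
```

Both functions return the same answer for every input; the second only makes explicit the single-word compare the poster hopes the compiler will emit for the first.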
Linus Torvalds wrote:
(That said, I think __builtin_memcpy() does a reasonable job these days
with gcc, and we might drop the crap one day when we can trust the
compiler to do ok. It didn't use to, and we continued using our
ridiculous macro/__builtin_constant_p misuses just because it works
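As an illustration of the kind of macro being described, the sketch below dispatches on __builtin_constant_p so that a compile-time-constant length goes to the compiler builtin and everything else goes to an out-of-line routine. This is not the kernel's actual code: my_memcpy and copy_generic are hypothetical names, and the example assumes GCC or a compatible compiler.

```c
#include <stddef.h>

/* Hypothetical out-of-line fallback for lengths not known at compile time. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* If n is a compile-time constant, hand the copy to the builtin so the
 * compiler can expand it inline; otherwise call the generic routine.
 * __builtin_constant_p and __builtin_memcpy are GCC extensions. */
#define my_memcpy(dst, src, n)                          \
    (__builtin_constant_p(n)                            \
        ? __builtin_memcpy((dst), (src), (n))           \
        : copy_generic((dst), (src), (n)))
```

A call such as my_memcpy(buf, src, 8) takes the builtin path, while my_memcpy(buf, src, len) with a runtime len takes the generic one.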
On Sun, 7 Jan 2007, Denis Vlasenko wrote:
>
> I'd say "care about obvious, safe optimizations which we still not do".
> I want this:
>
> char v[4];
> ...
> memcmp(v, "abcd", 4) == 0
>
> compile to single cmpl on i386.
Yeah. For a more relevant case, look at the hoops we used to jump
On Thursday 04 January 2007 18:37, Linus Torvalds wrote:
> With 7+ million lines of C code and headers, I'm not interested in
> compilers that read the letter of the law. We don't want some really
> clever code generation that gets us .5% on some unrealistic load. We want
> good _solid_ code
On Sunday 07 January 2007 00:36, Pavel Machek wrote:
[snip]
> > However, this patch is mostly useless if you have a separate stack for
> > IRQ's (since if that happens, any interrupt will be taken on a different
> > stack which we don't see any more), so you should NOT enable the 4KSTACKS
> >
Hi!
> > (I realise with problems like these it's almost always some sort of obscure
> > hardware problem, but I find that very difficult to believe when I can toggle
> > from 3 years of stability to 6-18 hours crashing by switching compiler. I've
> > also run extensive stability test
For a different mailing list indeed; let me just point out
that for certain important quite common cases it's an ~50%
overall speedup.
Hmm, what code was that? 'signed int does not wrap around' does not
seem to provide _that_ much info...
One of the recent huge threads on the GCC dev list has
Hi!
> >IMHO you should play such games with "g++ -O9", but that's
> >a discussion for a different mailing list.
>
> For a different mailing list indeed; let me just point out
> that for certain important quite common cases it's an ~50%
> overall speedup.
Hmm, what code was that?
On Fri, 5 Jan 2007, Alistair John Strachan wrote:
>
> (I realise with problems like these it's almost always some sort of obscure
> hardware problem, but I find that very difficult to believe when I can toggle
> from 3 years of stability to 6-18 hours crashing by switching compiler. I've
>
On Friday 05 January 2007 16:02, Linus Torvalds wrote:
> On Fri, 5 Jan 2007, Alistair John Strachan wrote:
> > This didn't help. After about 14 hours, the machine crashed again.
> >
> > cmov is not the culprit.
>
> Ok. Have you ever tried to limit the drivers you have loaded? I notice you
> had
On Fri, 5 Jan 2007, Alistair John Strachan wrote:
>
> This didn't help. After about 14 hours, the machine crashed again.
>
> cmov is not the culprit.
Ok. Have you ever tried to limit the drivers you have loaded? I notice you
had the prism54 wireless thing in your modules list and the vt1211
On Wednesday 03 January 2007 02:20, Alistair John Strachan wrote:
> On Wednesday 03 January 2007 02:12, Mikael Pettersson wrote:
> > On Tue, 2 Jan 2007 17:43:00 -0800 (PST), Linus Torvalds wrote:
> > > > The suggestions I've had so far which I have not yet tried:
> > > >
> > > > - Select a
On Jan 4, 2007, at 13:34, Segher Boessenkool wrote:
The "signed wrap is undefined" thing doesn't fit in this category
though:
-- It is an important optimisation for loops with a signed
induction variable;
It certainly isn't that important. Even SpecINT compiled with
-O3 and top-of-tree
On Thu, Jan 04, 2007 at 09:47:01AM -0800, Linus Torvalds wrote:
> NOBODY will guarantee you that they follow all standards to the letter.
> Some use compiler extensions knowingly, but pretty much _everybody_ ends
> up depending on subtle issues without even realizing it. It's almost
>
(in which case, nearly all real-world code is broken)
Not "nearly all" -- but lots of code, yes.
I wouldn't say "lots of code". I would say "all real projects".
All projects that tell the compiler they're written in ISO C,
while they're not, can easily break, sure. You can't say this
is
I'll happily turn off compiler features that are "clever optimizations
that never actually matter in practice, but are just likely to possibly
cause problems".
The "signed wrap is undefined" thing doesn't fit in this category
though:
-- It is an important optimisation for loops with a signed
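A small sketch of what a "signed induction variable" means in practice may help here. It is not taken from the thread, and the function names are hypothetical. In the signed version the compiler is allowed to assume i never overflows, so the loop runs exactly n + 1 times for n >= 0; in the unsigned version wrap-around is defined, so the case n == UINT_MAX (where the loop condition is always true) has to be taken into account.

```c
/* Signed induction variable: overflow of i is undefined in ISO C, so
 * the compiler may assume the trip count is exactly n + 1 (for n >= 0). */
static long sum_signed(const int *a, int n)
{
    long s = 0;

    for (int i = 0; i <= n; i++)
        s += a[i];
    return s;
}

/* Unsigned induction variable: i wraps to 0 after UINT_MAX, so for
 * n == UINT_MAX the condition i <= n never becomes false. The compiler
 * cannot rule that case out when analysing the loop. */
static long sum_unsigned(const int *a, unsigned n)
{
    long s = 0;

    for (unsigned i = 0; i <= n; i++)
        s += a[i];
    return s;
}
```

For ordinary inputs both functions compute the same sum; the difference is only in what the optimizer is permitted to assume about the loop.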
"Albert Cahalan" <[EMAIL PROTECTED]> writes:
> FYI, the kernel also assumes that a "char" is 8 bits.
> Maybe you should run away screaming.
You are confusing "undefined" with "implementation defined". Those are
two quite different concepts.
Andreas.
--
Andreas Schwab, SuSE Labs, [EMAIL
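To make the distinction concrete, here is a minimal sketch (the helper names are hypothetical, not from the thread). Implementation-defined properties such as the width of char must be documented by the implementation, so a program can query them and refuse to build if its assumption does not hold. Undefined behaviour, such as signed overflow, has no documented result to query.

```c
#include <limits.h>

/* Implementation-defined: the value is documented and exposed through
 * <limits.h>, so the 8-bit assumption can be checked at compile time. */
#if CHAR_BIT != 8
#error "this code assumes an 8-bit char"
#endif

/* Number of bits in a char on this implementation. */
static int char_bits(void)
{
    return CHAR_BIT;
}

/* Whether plain char is signed is also implementation-defined,
 * and likewise can be tested. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* By contrast, there is no macro that reports what INT_MAX + 1
 * evaluates to: that expression is undefined, not merely unspecified
 * by this particular implementation. */
```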
On Thu, 4 Jan 2007, Segher Boessenkool wrote:
>
> > (in which case, nearly all real-world code is broken)
>
> Not "nearly all" -- but lots of code, yes.
I wouldn't say "lots of code". I would say "all real projects".
NOBODY will guarantee you that they follow all standards to the letter.
On Thu, 4 Jan 2007, Albert Cahalan wrote:
> On 1/4/07, Segher Boessenkool <[EMAIL PROTECTED]> wrote:
> >
> > Lack of the flag does not break any valid C code, only code
> > making unwarranted assumptions (i.e., buggy code).
>
> Right, if "C" means "strictly conforming ISO C" to you.
> (in
Lack of the flag does not break any valid C code, only code
making unwarranted assumptions (i.e., buggy code).
Right, if "C" means "strictly conforming ISO C" to you.
Without any further qualification, it of course does, yes.
(in which case, nearly all real-world code is broken)
Not
On 1/4/07, Segher Boessenkool <[EMAIL PROTECTED]> wrote:
> Adjusting gcc flags to eliminate optimizations is another way to go.
> Adding -fwrapv would be an excellent start. Lack of this flag breaks
> most code which checks for integer wrap-around.
Lack of the flag does not break any valid C
Adjusting gcc flags to eliminate optimizations is another way to go.
Adding -fwrapv would be an excellent start. Lack of this flag breaks
most code which checks for integer wrap-around.
Lack of the flag does not break any valid C code, only code
making unwarranted assumptions (i.e., buggy
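For context, the kind of check under dispatch here is the idiom "if (a + b < a)" on signed integers, which only works if overflow wraps. The sketch below is an assumption-labelled alternative, not code from the thread: add_overflows is a hypothetical helper that tests for overflow before doing the addition, so it does not depend on -fwrapv.

```c
#include <limits.h>

/* The wrap-dependent idiom, shown only as a comment because it is
 * undefined for signed int in ISO C unless built with -fwrapv:
 *
 *     if (a + b < a)  ... overflow happened ...
 */

/* Returns 1 if a + b would overflow int, 0 otherwise. The comparison
 * is done against the limits before adding, so no overflow occurs. */
static int add_overflows(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return 0;
}
```

A caller would write "if (!add_overflows(a, b)) sum = a + b;" instead of adding first and checking afterwards.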
On Thu, 4 Jan 2007, Zou, Nanhai wrote:
>
> cmov will stall on eflags in your test program.
And that is EXACTLY my point.
CMOV is a piece of CRAP for most things, exactly because it serializes
three streams of data: the two inputs, and the conditional.
My test-case was actually _good_ for
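The serialization point can be sketched in portable C. This is not the test program from the thread, and the function names are hypothetical. The first form lets the CPU predict the condition and continue with one value; the second has the same data dependency as a conditional move, since the result cannot be produced until cond, a and b are all available. A compiler is free to turn either form into the other, so this only illustrates the dependency structure.

```c
#include <stdint.h>

/* Branching form: with a predictable cond, the CPU can speculate past
 * the test and needs only the chosen input to make progress. */
static uint32_t choose_branch(uint32_t cond, uint32_t a, uint32_t b)
{
    if (cond)
        return a;
    return b;
}

/* Branch-free form: the result depends on all three inputs, which is
 * the dependency a cmov has (both sources plus the flags). */
static uint32_t choose_select(uint32_t cond, uint32_t a, uint32_t b)
{
    /* mask is all ones when cond is non-zero, all zeros otherwise */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);

    return (a & mask) | (b & ~mask);
}
```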
Linus Torvalds writes:
[probably Mikael Pettersson] writes:
The suggestions I've had so far which I have not yet tried:
- Select a different x86 CPU in the config.
- Unfortunately the C3-2 flags seem to simply tell GCC to
schedule for ppro (like i686) and enabled MMX and SSE
-
L PROTECTED];
> linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
> Subject: Re: kernel + gcc 4.1 = several problems
>
>
>
> On Wed, 3 Jan 2007, Grzegorz Kulewski wrote:
> >
> > Could you explain why CMOV is pointless now? Are there any benchmarks
> > proving
On Wed, 2007-01-03 at 08:03 -0800, Linus Torvalds wrote:
> and assuming the branch is AT ALL predictable (and 95+% of all branches
> are), the branch-over will actually be a LOT better for a CPU.
IF... Counterexample: Add-Compare-Select in a Viterbi Decoder. If the
compare can be predicted, you
On Wed, 3 Jan 2007, Denis Vlasenko wrote:
>
> IOW: yet another slot in instruction opcode matrix and thousands of
> transistors in instruction decoders are wasted because of this
> "clever invention", eh?
Well, in all fairness, it can probably help more on certain
microarchitectures. Intel is
On Wed, 3 Jan 2007, Thomas Sailer wrote:
>
> IF... Counterexample: Add-Compare-Select in a Viterbi Decoder.
Yes. [De]compression stuff tends to be (a) totally unpredictable and (b) a
situation where people care about performance. It's fairly rare in many
other situations.
That said, any
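For readers unfamiliar with the counterexample, an add-compare-select step looks roughly like the sketch below. The names and the convention that the smaller metric survives are assumptions for illustration, not details from the thread. The compare depends on noisy input data, which is why it is hard to predict.

```c
#include <stdint.h>

/* Result of one add-compare-select step (hypothetical layout). */
typedef struct {
    uint32_t metric;    /* surviving path metric */
    unsigned decision;  /* 0 = first predecessor kept, 1 = second */
} acs_result;

/* pm0/pm1 are the path metrics of the two predecessor states,
 * bm0/bm1 the branch metrics of the transitions into this state. */
static acs_result acs(uint32_t pm0, uint32_t bm0, uint32_t pm1, uint32_t bm1)
{
    uint32_t c0 = pm0 + bm0;            /* add */
    uint32_t c1 = pm1 + bm1;
    acs_result r;

    r.decision = c1 < c0;               /* compare */
    r.metric = r.decision ? c1 : c0;    /* select */
    return r;
}
```

The select on the last line is the data-dependent choice that a conditional move handles without a mispredict penalty.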
On Wednesday 03 January 2007 21:38, Linus Torvalds wrote:
> On Wed, 3 Jan 2007, Denis Vlasenko wrote:
> >
> > Why CPU people do not internally convert cmov into jmp,mov pair?
>
...
> It really all boils down to: there's simply no real reason to use cmov.
> It's not horrible either, so go ahead
On Wed, 3 Jan 2007, Denis Vlasenko wrote:
>
> Why CPU people do not internally convert cmov into jmp,mov pair?
Probably because
- it's not worth it. cmov's certainly _can_ be faster for unpredictable
  input. So especially if you teach your compiler (by using profiling) to
use cmov's
On Wed, 3 Jan 2007, Tim Schmielau wrote:
>
> Well, on a P4 (which is supposed to be soo bad) I get:
Interesting. My P4 gets basically exactly the same timings for the cmov
and branch cases. And my Core 2 is consistently faster (something like
15%) for the branch version.
Btw, the test-case
On Wed, Jan 03, 2007 at 04:25:09PM +0100, Udo van den Heuvel wrote:
> Hello,
>
> I just read about the subjects.
> I have a firewall which has some issues.
> First it was a VIA CL6000 (c3).
> Now it is a EK8000 (c3-2) with different power supply, RAM and board of
> course. Still I see strange
On Wednesday 03 January 2007 17:03, Linus Torvalds wrote:
> On Wed, 3 Jan 2007, Grzegorz Kulewski wrote:
> > Could you explain why CMOV is pointless now? Are there any benchmarks
> > proving
> > that?
>
> CMOV (and, more generically, any "predicated instruction") tends to
> generally be a bad idea
Hello,
> Here's an example program that you can test and time yourself.
>
> On my Core 2, I get
>
> [EMAIL PROTECTED] ~]$ gcc -DCMOV -Wall -O2 t.c
> [EMAIL PROTECTED] ~]$ time ./a.out
> 6
>
> real 0m0.194s
> user 0m0.192s
> sys 0m0.000s
>
Well, on a P4 (which is supposed to be soo bad) I get:
> gcc -O2 t.c -o t
> foreach x ( 1 2 3 4 5 )
>> time ./t > /dev/null
>> end
0.196u 0.004s 0:00.19 100.0% 0+0k 0+0io 0pf+0w
0.168u 0.004s 0:00.16 100.0% 0+0k 0+0io 0pf+0w
0.168u 0.000s 0:00.16 100.0% 0+0k 0+0io 0pf+0w
0.160u 0.000s
: Alan <[EMAIL PROTECTED]>, Mikael Pettersson <[EMAIL PROTECTED]>,
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], linux-kernel@vger.kernel.org,
[EMAIL PROTECTED]
Subject: Re: kernel + gcc 4.1 = several problems
Resent-Date: We
Just to make clearer why I am so curious, this from X86_64 X2 3800+:
DarkStar:{venom}:/tmp> gcc -DCMOV -Wall -O2 t.c
DarkStar:{venom}:/tmp> time ./a.out
6
real 0m0.151s
user 0m0.150s
sys 0m0.000s
DarkStar:{venom}:/tmp> gcc -Wall -O2 t.c
DarkStar:{venom}:/tmp> time ./a.out
On Wed, 3 Jan 2007, Alan wrote:
>
> > cmov is effectively the same cost as a compare and jump, in both cases
> > the cpu needs to do a prediction, and on a mispredict, restart.
>
> On a P4 it appears to be slower than compare/jump in most cases
On just about EVERYTHING it's slower than
On Wed, 3 Jan 2007, Grzegorz Kulewski wrote:
>
> Could you explain why CMOV is pointless now? Are there any benchmarks proving
> that?
CMOV (and, more generically, any "predicated instruction") tends to
generally be a bad idea on an aggressively out-of-order CPU. It doesn't
always have to be
Hello,
I just read about the subjects.
I have a firewall which has some issues.
First it was a VIA CL6000 (c3).
Now it is a EK8000 (c3-2) with different power supply, RAM and board of
course. Still I see strange things sometimes. Crashes, hangs, etc. Now
and then. Not too often.
I have in
> cmov is effectively the same cost as a compare and jump, in both cases
> the cpu needs to do a prediction, and on a mispredict, restart.
On a P4 it appears to be slower than compare/jump in most cases
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a
On Wed, Jan 03, 2007 at 05:32:16AM -0800, Arjan van de Ven wrote:
> On Wed, 2007-01-03 at 12:44 +, Alan wrote:
> > > > fixed. At that point an i686 kernel would contain i686 instructions and
> > > > actually run on all i686 processors ending all the i586 pain for most
> > > > users and
On Wed, 2007-01-03 at 12:44 +, Alan wrote:
> > > fixed. At that point an i686 kernel would contain i686 instructions and
> > > actually run on all i686 processors ending all the i586 pain for most
> > > users and distributions.
> >
> > Could you explain why CMOV is pointless now? Are there
> > fixed. At that point an i686 kernel would contain i686 instructions and
> > actually run on all i686 processors ending all the i586 pain for most
> > users and distributions.
>
> Could you explain why CMOV is pointless now? Are there any benchmarks
> proving that?
Take a look at the recent
Grzegorz Kulewski wrote:
On Wed, 3 Jan 2007, Alan wrote:
The proper fix for all of this mess is to fix the gcc compiler suite to
actually generate i686 code when told to use i686. CMOV is an optional
i686 extension which gcc uses without checking. In early PIV days it made
sense but on modern
On Wed, 3 Jan 2007, Alan wrote:
The proper fix for all of this mess is to fix the gcc compiler suite to
actually generate i686 code when told to use i686. CMOV is an optional
i686 extension which gcc uses without checking. In early PIV days it made
sense but on modern processors CMOV is so
> That's a good suggestion. Earlier C3s didn't have cmov so it's
> not entirely unlikely that cmov in C3-2 is broken in some cases.
> Configuring for P5MMX or 486 should be good safe alternatives.
The proper fix for all of this mess is to fix the gcc compiler suite to
actually generate i686 code
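Since CMOV is described here as optional, a program can ask the CPU whether it is present. The sketch below is an assumption-labelled illustration, not something proposed in the thread: cpu_has_cmov is a hypothetical helper, and it relies on the <cpuid.h> header and __get_cpuid function shipped with GCC and Clang on x86. CMOV support is reported in bit 15 of EDX for CPUID leaf 1.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

/* Returns 1 if CPUID reports CMOV support, 0 if it reports none, and
 * -1 if the question cannot be answered (non-x86 build, or CPUID
 * leaf 1 is unavailable). */
static int cpu_has_cmov(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return -1;
    return (edx >> 15) & 1;     /* CPUID.01H:EDX bit 15 = CMOV */
#else
    return -1;
#endif
}
```

A check like this only helps code that selects between implementations at run time; it does nothing for instructions the compiler has already emitted unconditionally.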
On Wed, Jan 03, 2007 at 03:12:13AM +0100, Mikael Pettersson wrote:
> On Tue, 2 Jan 2007 17:43:00 -0800 (PST), Linus Torvalds wrote:
> > > The suggestions I've had so far which I have not yet tried:
> > >
> > > - Select a different x86 CPU in the config.
> > > - Unfortunately the
On Tue, 2 Jan 2007 17:43:00 -0800 (PST), Linus Torvalds wrote:
> > The suggestions I've had so far which I have not yet tried:
> >
> > - Select a different x86 CPU in the config.
> > - Unfortunately the C3-2 flags seem to simply tell GCC
> > to schedule for
On Wednesday 03 January 2007 02:12, Mikael Pettersson wrote:
> On Tue, 2 Jan 2007 17:43:00 -0800 (PST), Linus Torvalds wrote:
> > > The suggestions I've had so far which I have not yet tried:
> > >
> > > - Select a different x86 CPU in the config.
> > > - Unfortunately the C3-2
D. Hazelton <[EMAIL PROTECTED]> wrote:
[...]
> None. I didn't file a report on this because I didn't find the bug, just
> noted a problem that appears to occur. In this case the calls generated
> seem to wrap loops - something I've never heard of anyone doing.
Example code showing this
On Tue, 2 Jan 2007, Alistair John Strachan wrote:
>
> eax: 0008 ebx: ecx: 0008 edx:
> esi: f70f3e9c edi: f7017c00 ebp: f70f3c1c esp: f70f3c0c
>
> Code: 58 01 00 00 0f 4f c2 09 c1 89 c8 83 c8 08 85 db 0f 44 c8 8b 5d f4 89 c8
> 8b 75 f8 8b 7d fc 89 ec 5d c3
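As a rough aid for reading dumps like this one, the sketch below scans the Code: bytes for the two-byte opcode range 0x0F 0x40 to 0x0F 0x4F, which is where the cmovcc family lives. It is a naive byte scan with hypothetical names, not a disassembler: it ignores instruction boundaries and can report false matches in other data. The byte array is copied from the dump above.

```c
#include <stddef.h>

/* Code: bytes from the oops quoted above. */
static const unsigned char oops_code[] = {
    0x58, 0x01, 0x00, 0x00, 0x0f, 0x4f, 0xc2, 0x09, 0xc1, 0x89, 0xc8, 0x83,
    0xc8, 0x08, 0x85, 0xdb, 0x0f, 0x44, 0xc8, 0x8b, 0x5d, 0xf4, 0x89, 0xc8,
    0x8b, 0x75, 0xf8, 0x8b, 0x7d, 0xfc, 0x89, 0xec, 0x5d, 0xc3
};

/* Count byte pairs 0x0F 0x4x, the opcode pattern of cmovcc. */
static size_t count_cmov_opcodes(const unsigned char *code, size_t len)
{
    size_t count = 0;

    for (size_t i = 0; i + 1 < len; i++)
        if (code[i] == 0x0f && (code[i + 1] & 0xf0) == 0x40)
            count++;
    return count;
}
```

On this dump the scan finds the sequences 0f 4f c2 and 0f 44 c8, which is consistent with the thread's suspicion that cmov instructions are present near the faulting code.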