--- Comment #11 from michaelni at gmx dot at 2008-03-24 00:08 ---
Subject: Re: Performance degradation when
building code that uses MMX intrinsics with gcc-4.0.0
On Sun, Mar 23, 2008 at 10:46:41AM -, ubizjak at gmail dot com wrote:
--- Comment #10 from ubizjak
--- Comment #9 from michaelni at gmx dot at 2008-03-23 02:49 ---
Subject: Re: Performance degradation when
building code that uses MMX intrinsics with gcc-4.0.0
On Sat, Mar 22, 2008 at 11:01:55AM -, ubizjak at gmail dot com wrote:
--- Comment #8 from ubizjak
--- Comment #6 from michaelni at gmx dot at 2008-03-22 02:15 ---
As Uros has challenged me to beat the performance of gcc-4.4 generated code with
hand-crafted assembly, using the example of PR 21395, here's my entry. Sadly I
only have gcc-4.3 compiled ATM for comparison, but 4.3 generates
--- Comment #37 from michaelni at gmx dot at 2008-03-22 02:39 ---
Subject: Re: compiled trivial vector intrinsic code is
inefficient
On Fri, Mar 21, 2008 at 10:34:00AM -, ubizjak at gmail dot com wrote:
--- Comment #36 from ubizjak at gmail dot com 2008-03-21 10
--- Comment #7 from michaelni at gmx dot at 2008-03-22 02:51 ---
You can also replace the inner loop by:
2: \n\t
pxor %%mm1, %%mm1 \n\t
movq (%%eax, %%ecx), %%mm0\n\t
psubw (%%esi, %%ecx), %%mm0\n\t
pcmpgtw %%mm0
--- Comment #35 from michaelni at gmx dot at 2008-03-20 17:18 ---
Subject: Re: compiled trivial vector intrinsic code is
inefficient
On Thu, Mar 20, 2008 at 09:49:22AM -, ubizjak at gmail dot com wrote:
--- Comment #34 from ubizjak at gmail dot com 2008-03-20 09
--- Comment #33 from michaelni at gmx dot at 2008-03-20 01:37 ---
Subject: Re: compiled trivial vector intrinsic code is
inefficient
On Wed, Mar 19, 2008 at 11:39:18PM -, uros at gcc dot gnu dot org wrote:
--- Comment #26 from uros at gcc dot gnu dot org 2008-03
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: michaelni at gmx dot at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35058
--- Comment #39 from michaelni at gmx dot at 2007-02-27 22:50 ---
(In reply to comment #38)
(In reply to comment #37)
now if there is an unwritten rule that m operands and variations of them
cannot be copied anywhere, then it would be very desirable to have an asm
constraint like
--- Comment #37 from michaelni at gmx dot at 2006-11-08 20:45 ---
(In reply to comment #36)
(In reply to comment #21)
asm volatile(
: "=m" (*(unsigned int*)(src + 0*stride)),
"=m" (*(unsigned int*)(src + 1*stride)),
"=m" (*(unsigned int*)(src + 2
--- Comment #8 from michaelni at gmx dot at 2006-02-11 11:40 ---
I really think this should be fixed, otherwise gcc won't be able to follow its
exponential decaying performance which it has so accurately followed since 2.95
at least. To show more clearly how much speed we could lose
--- Comment #11 from michaelni at gmx dot at 2006-02-11 13:54 ---
(In reply to comment #9)
Re. comment #8:
exponential decaying performance which it has so accurately followed since
2.95
Can you back this up with numbers, or are you just trolling? If the latter,
please don't do
: inline-asm
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: michaelni at gmx dot at
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: x86-linux
GCC host triplet: x86-linux
GCC target triplet: x86-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id
--- Additional Comments From michaelni at gmx dot at 2005-01-22 17:10 ---
(In reply to comment #14)
In any case, just because code is syntactically valid
GNU C doesn't mean gcc can always compile it. With this kind of inline asm,
you're bound to confuse the register allocator
--- Additional Comments From michaelni at gmx dot at 2005-01-01 18:57 ---
(In reply to comment #12)
Why do people write inline-asm like this?
why not? it's valid code and a compiler should compile valid code ...
It is crazy to do so. Split up the inline-asm correctly.
fix gcc
15 matches