--- Comment #34 from rakdver at gcc dot gnu dot org 2005-11-17 13:35 ---
It behaves somewhat erratically on SPEC2000 (it increases the overall score,
but there are some significant regressions). And, it also causes us to produce
worse code for this testcase at the moment, due to a
--- Comment #35 from rakdver at gcc dot gnu dot org 2005-11-17 15:09 ---
Created an attachment (id=10263)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10263&action=view)
Patch
After some playing with fold, I arrived at the following patch, which almost
works. With the patch, the
--- Comment #33 from steven at gcc dot gnu dot org 2005-11-16 09:42 ---
Zdenek, any news about your patch from comment #30?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19923
--- Comment #32 from mmitchel at gcc dot gnu dot org 2005-10-31 02:39 ---
Leaving as P2 as this is a significant pessimization on a significant piece of
code on relatively common processors.
--- Comment #31 from pinskia at gcc dot gnu dot org 2005-10-27 00:47 ---
(In reply to comment #30)
> This patch could help; I need to benchmark it before submitting it.
Any news about this patch?
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.0.2                       |4.0.3
--- Additional Comments From steven at gcc dot gnu dot org 2005-06-25 10:15 ---
Re. comment #25, as far as I can tell there are registers available in
that loop. To quote the loop from comment #12:
.L4:
        movb    (%esi), %al
        movb    %al, (%edx)
leal
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni
dot cz 2005-06-25 11:32 ---
Subject: Re: [4.0/4.1 Regression] openssl is slower when compiled with gcc 4.0
than 3.3
--- Additional Comments From steven at gcc dot gnu dot org 2005-06-25
10:15 ---
--- Additional Comments From dank at kegel dot com 2005-06-24 15:00 ---
Michael Meissner looked at the code, and saw that
gcc-2.95.3 converts the loop to a countdown loop,
but gcc-3.x doesn't, which wastes a precious register.
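
As a rough illustration of the transformation described above, here is a hand-written C sketch. It is not the testcase from this report, and the function names fill_up and fill_down are hypothetical. The first form counts up and compares against len on every iteration; the second counts the remaining length down to zero, so the loop has one fewer value to keep live. Whether a particular compiler version actually saves a register this way depends on its optimizer and flags.

```c
#include <stddef.h>

/* Hypothetical up-counting loop: the pointer, the index i and the
   bound len are all needed on every iteration. */
static void fill_up(unsigned char *p, size_t len, unsigned char v)
{
    size_t i;

    for (i = 0; i < len; i++)
        p[i] = v;
}

/* Hypothetical countdown form of the same loop: only the pointer and
   the remaining count are needed, and the exit test is a comparison
   against zero. */
static void fill_down(unsigned char *p, size_t len, unsigned char v)
{
    while (len--)
        *p++ = v;
}
```

Both functions store the same bytes; only the shape of the loop control differs, which is the part the register allocator cares about.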
--- Additional Comments From dank at kegel dot com 2005-06-24 15:01 ---
And, for what it's worth, the latest 4.1 snapshot also suffers from this.
--- Additional Comments From steven at gcc dot gnu dot org 2005-06-24 15:53 ---
I don't see how the precious register would matter much. But this compare
with memory is strange:
        cmpl    %ecx, 12(%ebp)
Why isn't len loaded into a register??
--
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni
dot cz 2005-06-24 16:24 ---
Subject: Re: [4.0/4.1 Regression] openssl is slower when compiled with gcc 4.0
than 3.3
> I don't see how the precious register would matter much. But this compare
> with memory is
--- Additional Comments From dann at godzilla dot ics dot uci dot edu
2005-06-24 17:41 ---
(In reply to comment #21)
The slow routine appears to be the buffer cleaning routine,
though I haven't verified this with oprofile yet.
Here's its loop:
static char cleanse_ctr;
...
--- Additional Comments From rakdver at gcc dot gnu dot org 2005-06-25 02:49 ---
Ivopts seems to make several quite doubtful decisions in this testcase.
--
What|Removed |Added
--- Additional Comments From dank at kegel dot com 2005-06-18 06:24 ---
Looks to me like gcc-3.4.3 is known to fail, too, depending on the CPU.
Anthony Danalis and I came up with a little script to run foo4.i
on various processors with various values for -mtune, which I'll
attach; here
--- Additional Comments From dank at kegel dot com 2005-06-18 06:38 ---
To be clear, here are the two most worrying rows from the above table,
expanded a bit. These are the runtimes of foo4.i in seconds.
The cpu family, model, and name are as shown by /proc/cpuinfo.
cpu family 15,
--- Additional Comments From dank at kegel dot com 2005-06-18 17:46 ---
The above tests did not use -mcpu on gcc-2.95.3,
so they were comparing apples to oranges, kind of.
I reran them on a PIII with gcc-2.95.3 -mcpu=$tune -O3
and gcc-[34] -mtune=$tune -O3. The problem persists
even
--- Additional Comments From dank at kegel dot com 2005-06-18 22:45 ---
I asked the fellow who posted the original problem report to give
me the results of 'cat /proc/cpuinfo' on the affected machine.
Here it is:
vendor_id : GenuineIntel
cpu family : 6
model : 8
--- Additional Comments From dank at kegel dot com 2005-06-17 00:59 ---
We're learning more about this bug.
Anthony Danalis has boiled down the testcase much further;
I'll attach the reduced testcase as foo4.i.
It looks like it shows up if your /proc/cpuinfo says
vendor_id :
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-06-17 01:10 ---
(In reply to comment #14)
> We're learning more about this bug.
> Anthony Danalis has boiled down the testcase much further;
> I'll attach the reduced testcase as foo4.i.
Yes you know what the difference is
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|openssl is slower when      |[4.0/4.1 Regression] openssl
                   |compiled with gcc 4.0 than  |is slower when compiled with