--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-06-24 06:34 ---
One use of this macro is to increase alignment of medium-size
data to make it all fit in fewer cache lines.
1) This potentially makes a single string fit into fewer cache lines
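The cache-line argument above can be checked with simple arithmetic. The following is a minimal sketch, not code from the bug report: the function name cache_lines_spanned and its parameters are hypothetical, and the line size is passed in because it differs between CPUs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the bug report): number of cache lines
   touched by an object of `size` bytes starting at byte address `addr`,
   for a cache line of `line` bytes. */
size_t cache_lines_spanned(uintptr_t addr, size_t size, size_t line)
{
    assert(line > 0 && size > 0);
    uintptr_t first = addr / line;              /* index of first line touched */
    uintptr_t last = (addr + size - 1) / line;  /* index of last line touched */
    return (size_t)(last - first + 1);
}
```

With a 32-byte line, a 20-byte string placed at offset 0 touches one line, while the same string placed at offset 20 touches two. Higher alignment can therefore reduce the number of lines touched, at the cost of padding between objects.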
ReportedBy: vda at port dot imtp dot ilyichevsk dot odessa dot ua
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i386-pc-linux-gnu
GCC host triplet: i386-pc-linux-gnu
GCC target triplet: i386-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22158
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-06-23 06:03 ---
Created an attachment (id=9129)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=9129&action=view)
Do not align at all if -Os
Sorry, I only have 3.4.1 sources available locally
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-06-23 06:04 ---
Created an attachment (id=9130)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=9130&action=view)
Same patch with slightly different formatting
Also run tested
--
http
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-06-23 06:59 ---
Created an attachment (id=9131)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=9131&action=view)
While we are at it, speed up ix86_data_alignment
All if()s below are true only
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-06-23 07:07 ---
Created an attachment (id=9132)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=9132&action=view)
Same for ix86_local_alignment()
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-06-23 12:56 ---
In the majority of cases, char msg[] = "A message"; is used for text strings.
These are _bytes_; they need no alignment whatsoever, let alone a 32-byte one.
I'm perfectly fine if other people
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-06-23 13:03 ---
Oh, I did look at http://gcc.gnu.org/ml/gcc-patches/2000-06/msg00860.html,
I see 128 and 256 bit alignment added, but I don't immediately see where it is
applied to byte arrays
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-06-14 07:06 ---
If I understand this correctly, older GCCs were able to
figure out that when there are 5 registers available,
"=g" (__d3) can only be matched with memory (on-stack local var)
whereas
: unassigned at gcc dot gnu dot org
ReportedBy: vda at port dot imtp dot ilyichevsk dot odessa dot ua
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i386-pc-linux-gnu
GCC host triplet: i386-pc-linux-gnu
GCC target triplet: i386-pc-linux-gnu
http://gcc.gnu.org/bugzilla
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-05-02 09:00 ---
Created an attachment (id=8790)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8790&action=view)
testcase
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21329
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-05-02 09:02 ---
Created an attachment (id=8791)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8791&action=view)
patch against 4.0.0
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21329
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-05-02 09:04 ---
Comparison between old and new code (-O2):
--- tO2.s Mon May 2 11:49:24 2005
+++ tO2-new.s Mon May 2 11:50:03 2005
@@ -35,8 +35,7 @@
 movl $t21, %edi
movl
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-05-02 09:10 ---
BTW, see the above comment: gcc -O2 allocated 24 bytes on the stack
and never used them?!
Now, unoptimized compilation comparison:
--- t.s Mon May 2 11:41:20 2005
+++ t-new.s Mon May
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-27 05:38 ---
Marking as invalid. I found out that this happens on Celeron
but doesn't happen on Athlon. Must be an instruction scheduling
artifact.
Same binaries were used:
# gcc -O2 -o twofish_O2
ReportedBy: vda at port dot imtp dot ilyichevsk dot odessa dot ua
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i386-pc-linux-gnu
GCC host triplet: i386-pc-linux-gnu
GCC target triplet: i386-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21202
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-25 07:34 ---
As you can see by inspecting .s file,
I replaced gcc 3.4.3 with gcc 4.0.0 between compiles.
Both of them produce extra moves.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21202
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-24 13:05 ---
With 4.0.0: gcc -O2 gives the same result as gcc -O3,
which is better than gcc 3.4.3 -O2 but worse than 3.4.3 -O3.
For example:
movl %edx, -20(%ebp)
orl %ecx
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-24 13:26 ---
I don't think the bug description is correct.
I believe a similar observation will be valid for byte extraction
from u32 and u16, and for u16-from-u32, etc.
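For reference, the kinds of extraction mentioned here can be written as plain shifts and truncating casts. This is only an illustrative sketch: the helper names are hypothetical and are not taken from the testcase attached to the bug.

```c
#include <stdint.h>

/* Hypothetical helpers (not from the attached testcase).
   Index 0 always refers to the least significant part. */

/* Byte n (0..3) of a 32-bit value. */
uint8_t byte_of_u32(uint32_t x, unsigned n)
{
    return (uint8_t)(x >> (8 * (n & 3)));
}

/* Byte n (0..1) of a 16-bit value. */
uint8_t byte_of_u16(uint16_t x, unsigned n)
{
    return (uint8_t)(x >> (8 * (n & 1)));
}

/* 16-bit half n (0..1) of a 32-bit value. */
uint16_t half_of_u32(uint32_t x, unsigned n)
{
    return (uint16_t)(x >> (16 * (n & 1)));
}
```

Compiling such functions with gcc -O2 -S and inspecting the generated assembly is one way to compare the code produced for each case.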
Update for latest gcc
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: vda at port dot imtp dot ilyichevsk dot odessa dot ua
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i386-pc-linux-gnu
GCC host triplet: i386-pc-linux-gnu
GCC target
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-23 22:32 ---
Created an attachment (id=8719)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8719&action=view)
testcase. change #if 0 into #if 1 and compare resulting asm
--
http
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-23 22:49 ---
Aha!
I found out that gcc will use registers with -O3, but not with -O2.
# gcc -O3 serpent.c -S -o serpent-O3.s
# gcc -O2 serpent.c -S -o serpent-O2.s
# ls -l
-rw-r--r-- 1 root root
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-23 22:54 ---
This is a comparison of the -O2 and -O3 code.
The -O3 code has all modified variables in registers
and thus is smaller and most likely faster.
serpent_encrypt:
pushl %ebp
: gcc
Version: 3.4.3
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: vda at port dot imtp dot ilyichevsk dot odessa dot ua
CC
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-21 06:08 ---
Created an attachment (id=8695)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8695&action=view)
testcase
Use gcc -O2 -S t.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-21 11:27 ---
Though on 4.0.0/4.1.0, we get better:
subl $260, %esp
It's way too good. Declared locals should take 512 bytes, plus
any temporaries for spills.
Please find fixed
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-21 11:29 ---
Whoops no, locals are 256 bytes only.
(/me is looking for some coffee)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21141
at gcc dot gnu dot org
ReportedBy: vda at port dot imtp dot ilyichevsk dot odessa dot ua
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i386-pc-linux-gnu
GCC host triplet: i386-pc-linux-gnu
GCC target triplet: i386-pc-linux-gnu
http://gcc.gnu.org/bugzilla
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-21 13:05 ---
Created an attachment (id=8700)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8700&action=view)
move return 0; around to find out where that happens
--
http
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: vda at port dot imtp dot ilyichevsk dot odessa dot ua
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-21 13:12 ---
Created an attachment (id=8701)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8701&action=view)
generate assembly with -S and compare results
--
http://gcc.gnu.org/bugzilla
--- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa
dot ua 2005-04-21 13:36 ---
The testcase measures how many twofish_setkey() calls can be executed per second.
By inserting an extra 'return 0;' in the body of that function and running
the testcase, we can measure