[Bug rtl-optimization/24810] [4.1/4.2 Regression] mov + mov + testl generated instead of testb

2005-12-29 Thread jakub at gcc dot gnu dot org


--- Comment #8 from jakub at gcc dot gnu dot org  2005-12-29 11:53 ---
I don't think this is a bug; in fact, not honoring the volatile qualifier in
GCC 4.0.x and earlier was a bug.  If you want to allow byte access rather than
word access, you really need to remove the volatile keyword; the function then
compiles to
restore_fpu:
testb   $1, boot_cpu_data+15
je  .L2
jmp foo
.L2:
jmp bar
.size   restore_fpu, .-restore_fpu
.ident  "GCC: (GNU) 4.2.0 20051223 (experimental)"

You should report this against the Linux kernel; it shouldn't be using
volatile there.


-- 

jakub at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||INVALID


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24810



[Bug rtl-optimization/24810] [4.1/4.2 Regression] mov + mov + testl generated instead of testb

2005-12-18 Thread kazu at gcc dot gnu dot org


--- Comment #7 from kazu at gcc dot gnu dot org  2005-12-19 00:37 ---
We are basically talking about narrowing the width of the memory load used
for the test.  Now, can we really optimize this case?  We've got

  const volatile unsigned long *addr

I am not sure that "volatile" allows us to change the width of a memory read.
I know of a chip that expects you to read memory at one address repeatedly to
transfer a block of data, and people probably use volatile for that kind of
case.  If the compiler changes the width of the memory access, we may be
screwing something up.

IMHO, if byte access is really desired, the code should be rewritten that way.


-- 

kazu at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||kazu at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24810



[Bug rtl-optimization/24810] [4.1/4.2 Regression] mov + mov + testl generated instead of testb

2005-12-18 Thread dann at godzilla dot ics dot uci dot edu


--- Comment #6 from dann at godzilla dot ics dot uci dot edu  2005-12-18 22:57 ---
(In reply to comment #5)
> Simplified testcase seems to work for me on 4.1 branch:
> restore_fpu:
> movl    4(%esp), %edx
> movl    boot_cpu_data+12, %eax
> testl   $16777216, %eax

4.0 still does better: it uses a single "testb" instruction instead of two
dependent movl + testl instructions.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24810



[Bug rtl-optimization/24810] [4.1/4.2 Regression] mov + mov + testl generated instead of testb

2005-12-18 Thread hubicka at gcc dot gnu dot org


--- Comment #5 from hubicka at gcc dot gnu dot org  2005-12-18 20:53 ---
Simplified testcase seems to work for me on 4.1 branch:
restore_fpu:
movl    4(%esp), %edx
movl    boot_cpu_data+12, %eax
testl   $16777216, %eax
je  .L2
jmp foo
.L2:
movl    %edx, 4(%esp)
jmp bar
"jmp foo" is not elliminated because we don't have pattern for conditional
tailcalls.  Should not be big issue to add the neccesary patterns however.
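With such a pattern, the branch-around could collapse into a conditional
direct tail call.  An illustrative sketch of the desired output (hand-written
for this comment, not actual GCC codegen) would be:

```
restore_fpu:
        testl   $16777216, boot_cpu_data+12
        jne     foo             # conditional tail call
        jmp     bar             # unconditional tail call
```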

Honza


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24810