[Bug rtl-optimization/24810] [4.1/4.2 Regression] mov + mov + testl generated instead of testb
--- Comment #8 from jakub at gcc dot gnu dot org 2005-12-29 11:53 ---
I don't think this is a bug; in fact, not honoring the volatile in GCC 4.0.x and earlier was a bug. If you want to allow byte access rather than word access, you really need to remove the volatile keyword, and then it compiles into:

    restore_fpu:
            testb   $1, boot_cpu_data+15
            je      .L2
            jmp     foo
    .L2:
            jmp     bar
            .size   restore_fpu, .-restore_fpu
            .ident  "GCC: (GNU) 4.2.0 20051223 (experimental)"

You should report this against the Linux kernel; it shouldn't use volatile there.

-- jakub at gcc dot gnu dot org changed:

               What|Removed     |Added
    ----------------------------------------
             Status|UNCONFIRMED |RESOLVED
         Resolution|            |INVALID

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24810
--- Comment #7 from kazu at gcc dot gnu dot org 2005-12-19 00:37 ---
We are basically talking about narrowing the memory access being loaded for testing. Now, can we really optimize this case? We've got:

    const volatile unsigned long *addr

I am not sure whether "volatile" allows us to change the width of a memory read. I know of a chip that expects you to read memory at one address repeatedly to transfer a block of data, and people probably use volatile for this kind of case. If the compiler changes the width of the memory access, we may break something. IMHO, if byte access is really desired, the code should be rewritten that way.

-- kazu at gcc dot gnu dot org changed:

               What|Removed|Added
                 CC|       |kazu at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24810
--- Comment #6 from dann at godzilla dot ics dot uci dot edu 2005-12-18 22:57 ---
(In reply to comment #5)
> Simplified testcase seems to work for me on 4.1 branch:
> restore_fpu:
>         movl    4(%esp), %edx
>         movl    boot_cpu_data+12, %eax
>         testl   $16777216, %eax

4.0 still does better: it uses a single "testb" instruction instead of two dependent movl + testl instructions.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24810
--- Comment #5 from hubicka at gcc dot gnu dot org 2005-12-18 20:53 ---
Simplified testcase seems to work for me on the 4.1 branch:

    restore_fpu:
            movl    4(%esp), %edx
            movl    boot_cpu_data+12, %eax
            testl   $16777216, %eax
            je      .L2
            jmp     foo
    .L2:
            movl    %edx, 4(%esp)
            jmp     bar

"jmp foo" is not eliminated because we don't have a pattern for conditional tailcalls. It should not be a big issue to add the necessary patterns, however.

Honza

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24810