Re: Fwd: constant hoisting out of loops
On Sun, Mar 21, 2010 at 3:43 AM, Jim Wilson wil...@codesourcery.com wrote: On Sun, 2010-03-21 at 03:40 +0800, fanqifei wrote: foor_expand_move is changed and it works now. However, I still don't understand why there was no such error if below condition was used and foor_expand_move was not changed. Both below condition and (register_operand(operands[0], SImode) || register_operand(operands[1],SImode)) ... does not accept memmem. The define_expand is used for generating RTL. The RTL expander calls the define_expand, which checks for MEMCONST, and then falls through generating the mem copy insn. The define_insn is used for matching RTL. After it has been generated, we look at the movsi define_insn, and see that MEMMEM doesn't match, so you get an error for unrecognized RTL. The define_expand must always match the define_insn(s). They are used in different phases, and they aren't checked against each other when gcc is built. If there is a mismatch, then you get a run-time error for unrecognized rtl. Jim Great thanks for your explanation. I will look into the internal reason why they must match although that is beyond my work and what I know. Thanks again! -- -Qifei Fan http://freshtime.org
Re: Fwd: constant hoisting out of loops
On Sat, Mar 20, 2010 at 6:24 AM, Jim Wilson wil...@codesourcery.com wrote: On Fri, 2010-03-19 at 22:06 +0800, fanqifei wrote: 1. I add movsi expander in which function foor_expand_move is used to force the operands[1] to reg and emit the move insn. Another change is that in the define of insn “*mov_mode_insn in which a condition is added to prevent a store const to mem RTL insn from being accepted. Are these changes necessary? Yes, this looks correct and necessary. 2. Is is correct to use emit_move_insn in foor_expand_move? in mips.md, the function mips_emit_move called both emit_move_insn and emit_move_insn_1. But I don’t quite understand the comment above the function. This looks like the kind of thing you don't need to understand now. Just call emit_move_insn, and worry about bizarre details like this later. It isn't obvious to me why it is there either. Before reload, you can create new pseudo-regs at any time if you need to load something into a register. After reload you can't. emit_move_insn_1 assumes its operands are valid. emit_move_insn checks to see if operands are valid and if not tries to fix them. Calling emit_move_insn after reload will fail if you have an invalid operand that needs to be loaded into a new pseudo. Calling emit_move_insn_1 with invalid operands will fail a different way. It looks like the mips port is trying to do something very tricky and subtle here. If you want to understand it, you probably have to find the patch that added it. Or find a testcase where it makes a difference. 3. My understanding of the internal flow about the issue is: The named insn “movsi” is used to generate RTL insn list from parse tree. The insn pattern “set mem, const” is expanded by function foor_expand_move(). For other forms of “set” insns, the template given in the pattern is inserted. Then the insn *mov_mode_insn is used to generate assembler code. In the generation, the condition of mov_mode_insn is checked. I am not fully confident the understanding is correct. That seems correct. movsi is used for generating RTL. mov_mode_insn is used for matching RTL. (define_insn *mov_mode_insn [(set (match_operand:BWD 0 nonimmediate_operand =r,m,r,r,r,r,r,r,x,r) (match_operand:BWD 1 foor_move_source_operand Z,r,L,I,Q,P,ni,x,r,r))] (!( (memory_operand(operands[0], SImode) (foor_const_operand_f(operands[1]))) ||(memory_operand(operands[0], HImode) (foor_const_operand_f(operands[1]))) ||(memory_operand(operands[0], QImode) (foor_const_operand_f(operands[1]))) BWD is presumably a mode macro. You can use BWDmode to get the enum mode name instead of having 3 copies of the test. Checking to reject memconst is equivalent to checking to accept reg||reg. The latter check is the conventional one and will be faster, as register_operand does less work than memory_operand, and short-cut evaluation means we only need one register_operand call in the common case. This assumes that 'x' is some kind of register which seems likely. predicates.md: (define_predicate foor_const_operand (match_test foor_const_operand_f(op))) You don't need the foor_const_operand function. You can just do (match_code const_int) Jim I changed the condition in *mov_insn_mode to below: ( (register_operand(operands[0], SImode) || register_operand(operands[1],SImode)) ||(register_operand(operands[0], HImode) || register_operand(operands[1],HImode)) ||(register_operand(operands[0], QImode) || register_operand(operands[1],QImode)) ) Then there is an error during gcc build: ../.././gcc/tmplibgcc_fp_bit.c: In function '_fpadd_parts': ../.././gcc/tmplibgcc_fp_bit.c:740: error: unrecognizable insn: (insn 41 40 42 12 ../.././gcc/tmplibgcc_fp_bit.c:637 (set (mem/s:SI (reg/v/f:SI 43 [ tmp ]) [2 S4 A32]) (mem/s:SI (reg/v/f:SI 41 [ a ]) [2 S4 A32])) -1 (nil)) ../.././gcc/tmplibgcc_fp_bit.c:740: internal compiler error: in extract_insn, at recog.c:1990 Please submit a full bug report, Seems that the pattern set mem,mem is not recognized. But how was it recognized when I was using memconst? The constraints don't contain mm. I don't know what extra information should I provide now. Thanks very much! -- -Qifei Fan http://freshtime.org
Re: Fwd: constant hoisting out of loops
On Sat, 2010-03-20 at 14:29 +0800, fanqifei wrote: I changed the condition in *mov_insn_mode to below: (register_operand(operands[0], SImode) || register_operand(operands[1],SImode)) I think you need the same change in foor_expand_move. I.e., if neither the source or dest is a register, then you force the source into a register. If you still have the memconst check there, then memmem will accidentally be accepted and generated. Jim
Re: Fwd: constant hoisting out of loops
On Sun, Mar 21, 2010 at 2:47 AM, Jim Wilson wil...@codesourcery.com wrote: On Sat, 2010-03-20 at 14:29 +0800, fanqifei wrote: I changed the condition in *mov_insn_mode to below: (register_operand(operands[0], SImode) || register_operand(operands[1],SImode)) I think you need the same change in foor_expand_move. I.e., if neither the source or dest is a register, then you force the source into a register. If you still have the memconst check there, then memmem will accidentally be accepted and generated. Jim foor_expand_move is changed and it works now. However, I still don't understand why there was no such error if below condition was used and foor_expand_move was not changed. Both below condition and (register_operand(operands[0], SImode) || register_operand(operands[1],SImode)) ... does not accept memmem. (!( (memory_operand(operands[0], SImode) (foor_const_operand_f(operands[1]))) ||(memory_operand(operands[0], HImode) (foor_const_operand_f(operands[1]))) ||(memory_operand(operands[0], QImode) (foor_const_operand_f(operands[1]))) )) Thanks. -- -Qifei Fan http://freshtime.org
Re: Fwd: constant hoisting out of loops
On Sun, 2010-03-21 at 03:40 +0800, fanqifei wrote: foor_expand_move is changed and it works now. However, I still don't understand why there was no such error if below condition was used and foor_expand_move was not changed. Both below condition and (register_operand(operands[0], SImode) || register_operand(operands[1],SImode)) ... does not accept memmem. The define_expand is used for generating RTL. The RTL expander calls the define_expand, which checks for MEMCONST, and then falls through generating the mem copy insn. The define_insn is used for matching RTL. After it has been generated, we look at the movsi define_insn, and see that MEMMEM doesn't match, so you get an error for unrecognized RTL. The define_expand must always match the define_insn(s). They are used in different phases, and they aren't checked against each other when gcc is built. If there is a mismatch, then you get a run-time error for unrecognized rtl. Jim
Re: Fwd: constant hoisting out of loops
On Fri, 2010-03-19 at 22:06 +0800, fanqifei wrote: 1. I add movsi expander in which function foor_expand_move is used to force the operands[1] to reg and emit the move insn. Another change is that in the define of insn “*mov_mode_insn in which a condition is added to prevent a store const to mem RTL insn from being accepted. Are these changes necessary? Yes, this looks correct and necessary. 2. Is is correct to use emit_move_insn in foor_expand_move? in mips.md, the function mips_emit_move called both emit_move_insn and emit_move_insn_1. But I don’t quite understand the comment above the function. This looks like the kind of thing you don't need to understand now. Just call emit_move_insn, and worry about bizarre details like this later. It isn't obvious to me why it is there either. Before reload, you can create new pseudo-regs at any time if you need to load something into a register. After reload you can't. emit_move_insn_1 assumes its operands are valid. emit_move_insn checks to see if operands are valid and if not tries to fix them. Calling emit_move_insn after reload will fail if you have an invalid operand that needs to be loaded into a new pseudo. Calling emit_move_insn_1 with invalid operands will fail a different way. It looks like the mips port is trying to do something very tricky and subtle here. If you want to understand it, you probably have to find the patch that added it. Or find a testcase where it makes a difference. 3. My understanding of the internal flow about the issue is: The named insn “movsi” is used to generate RTL insn list from parse tree. The insn pattern “set mem, const” is expanded by function foor_expand_move(). For other forms of “set” insns, the template given in the pattern is inserted. Then the insn *mov_mode_insn is used to generate assembler code. In the generation, the condition of mov_mode_insn is checked. I am not fully confident the understanding is correct. That seems correct. movsi is used for generating RTL. mov_mode_insn is used for matching RTL. (define_insn *mov_mode_insn [(set (match_operand:BWD 0 nonimmediate_operand =r,m,r,r,r,r,r,r,x,r) (match_operand:BWD 1 foor_move_source_operand Z,r,L,I,Q,P,ni,x,r,r))] (!( (memory_operand(operands[0], SImode) (foor_const_operand_f(operands[1]))) ||(memory_operand(operands[0], HImode) (foor_const_operand_f(operands[1]))) ||(memory_operand(operands[0], QImode) (foor_const_operand_f(operands[1]))) BWD is presumably a mode macro. You can use BWDmode to get the enum mode name instead of having 3 copies of the test. Checking to reject memconst is equivalent to checking to accept reg||reg. The latter check is the conventional one and will be faster, as register_operand does less work than memory_operand, and short-cut evaluation means we only need one register_operand call in the common case. This assumes that 'x' is some kind of register which seems likely. predicates.md: (define_predicate foor_const_operand (match_test foor_const_operand_f(op))) You don't need the foor_const_operand function. You can just do (match_code const_int) Jim
Re: constant hoisting out of loops
On Thu, Mar 18, 2010 at 2:30 AM, Jim Wilson wil...@codesourcery.com wrote: On Wed, 2010-03-17 at 11:27 +0800, fanqifei wrote: You are correct. The reload pass emitted the clr.w insn. However, I can see loop opt passes after reload: problem1.c.174r.loop2_invariant1 Not unless you have a modified toolchain. We don't run loop opt after register allocation. See the list of optimization passes in the FSF GCC passes.c file. loop2 occurs before ira/postreload. If you do have a modified toolchain, then I doubt that the current loop passes would work right, since they were designed to handle pseudo-regs not hard-regs. Jim That passes were added by me more than two months ago. I thought these passes could perform the optimization of hoisting constant out of loop. I just removed them. Thanks very much for your help! Now I am working on the movsi expander. -- -Qifei Fan http://freshtime.org
Re: constant hoisting out of loops
On Wed, 2010-03-17 at 11:27 +0800, fanqifei wrote: You are correct. The reload pass emitted the clr.w insn. However, I can see loop opt passes after reload: problem1.c.174r.loop2_invariant1 Not unless you have a modified toolchain. We don't run loop opt after register allocation. See the list of optimization passes in the FSF GCC passes.c file. loop2 occurs before ira/postreload. If you do have a modified toolchain, then I doubt that the current loop passes would work right, since they were designed to handle pseudo-regs not hard-regs. Jim
Re: constant hoisting out of loops
On Mon, Mar 15, 2010 at 5:24 AM, Jim Wilson wil...@codesourcery.com wrote: On 03/10/2010 10:48 PM, fanqifei wrote: For below piece of code, the instruction clr.w a15 obviously doesn't belong to the inner loop. 6: bd f4 clr.w a15; #clear to zero 8: 80 af 00 std.w a10 0x0 a15; There is info lacking here. Did you compile with optimization? What does the RTL look like before and after the loop opt passes? I'd guess that your movsi pattern is defined wrong. You probably have predicates that allow either registers or constants in the set source, which is normal, and constraints that only allow registers when the dest is a mem. But constraints are only used by the reload pass, so a store zero to mem rtl insn will be generated early, and then fixed late during the reload pass. So the loop opt did not move the clear insn out of the loop because there was no clear insn at this time. The way to fix this is to add a condition to the movsi pattern that excludes this case. For instance, something like this: (register_operand (operands[0], SImode) || register_operand (operands[1], SImode)) This will prevent a store zero to mem RTL insn from being accepted. In order to make this work, you need to make movsi an expander that accepts anything, and then forces the source to a register if you have a store constant to memory. See for instance the sparc_expand_move function or the mips_legitimize_move function. Use -da (old) or -fdump-rtl-all (new) to see the RTL dumps to see what is going on. Jim It's compiled with -O2. You are correct. The reload pass emitted the clr.w insn. However, I can see loop opt passes after reload: problem1.c.174r.loop2_invariant1 problem1.c.174r.redo_loop2_invariant problem1.c.175r.loop2_unswitch problem1.c.177r.redo_loop2_invariant After reload pass, the clr.w insn is in the loop. And after above loop2 passes, the insn is not moved outside of the loop. I am not sure the issue is in these loop2 passes. I guess there is. For the definition of movsi expander, I will try to do what you pointed out. (I am not very familiar with these code and that may take me some time.) current definition of mov pattern: (define_insn movmode [(set (match_operand:BWD 0 nonimmediate_operand =r,m,r,r,r,r,r,r,x,r) (match_operand:BWD 1 move_source_operand Z,r,L,I,Q,P,ni,x,r,r))] @ %L1m %0 %1; %S0m %0 %1; clrm %0; mv %0 %1; ... ... Thanks! -- -Qifei Fan http://freshtime.org
Re: constant hoisting out of loops
On 03/10/2010 10:48 PM, fanqifei wrote: For below piece of code, the instruction clr.w a15 obviously doesn't belong to the inner loop. 6: bd f4clr.w a15; #clear to zero 8: 80 af 00std.w a10 0x0 a15; There is info lacking here. Did you compile with optimization? What does the RTL look like before and after the loop opt passes? I'd guess that your movsi pattern is defined wrong. You probably have predicates that allow either registers or constants in the set source, which is normal, and constraints that only allow registers when the dest is a mem. But constraints are only used by the reload pass, so a store zero to mem rtl insn will be generated early, and then fixed late during the reload pass. So the loop opt did not move the clear insn out of the loop because there was no clear insn at this time. The way to fix this is to add a condition to the movsi pattern that excludes this case. For instance, something like this: (register_operand (operands[0], SImode) || register_operand (operands[1], SImode)) This will prevent a store zero to mem RTL insn from being accepted. In order to make this work, you need to make movsi an expander that accepts anything, and then forces the source to a register if you have a store constant to memory. See for instance the sparc_expand_move function or the mips_legitimize_move function. Use -da (old) or -fdump-rtl-all (new) to see the RTL dumps to see what is going on. Jim
constant hoisting out of loops
I am porting gcc 4.3.2 to my own micro-controller. For below piece of code, the instruction clr.w a15 obviously doesn't belong to the inner loop. Compiler should be smart enough to move it to the beginning of the function. How I can hoist the constant out of loops? Maybe the costs functions have to be changed, but I don't know how. Thanks. C code: void memzero_aligned(uint* ptr, uint size) { uint ptr_end = (uint)ptr + size; while ((uint)ptr ptr_end) { *ptr++ = 0; } } Disassembly code: _memzero_aligned: 0: bc ab 90 add.w a9 a10 a11; 3: f4 0e 0bbra 0xe; #branch unconditionally, no delay slot 6: bd f4clr.w a15; #clear to zero 8: 80 af 00std.w a10 0x0 a15; b: 90 aa 04 add.w a10 a10 0x4; e: b8 a9 06 cmp.w a10 a9; 11: f4 08 f5brc 0x6; 14: f8 00ret; -- -Qifei Fan http://freshtime.org