I'm porting gcc 4.0.1 to a new VLIW architecture.
Some of its functional units don't have internal hardware pipeline forwarding,
so I need to insert "nop" instructions in order to resolve the data hazards.

I used the automaton-based pipeline description for my port.
I described the data latencies with `define_insn_reservation',
and I'm trying to insert the "nop"s in the hook TARGET_MACHINE_DEPENDENT_REORG.

The implementation of this hook is simple.
I just run the DFA scheduler again manually,
and issue the insns the same way the 2nd sched pass does.
The following code is my implementation:
--------------------------------------------[TOP]---------------------------------------------------
void unicore_reorg(void)
{
        bool func_start = true;
        int stalls;
        rtx insn = NULL, new_insn;
        state_t dfa_state = alloca(state_size());

        dfa_start();
        state_reset(dfa_state);

        for (insn = get_insns(); insn; insn = NEXT_INSN(insn)) {
                if (!executable_insn_p(insn))
                        continue;
                if (!func_start && GET_MODE(insn) == TImode)
                        state_transition(dfa_state, NULL);
                stalls = state_transition(dfa_state, insn);
                if (stalls == 1) {
                        state_transition(dfa_state, NULL);
                        state_transition(dfa_state, insn);
                }
                if (stalls > 1) {
                        while (--stalls) {
                                new_insn = emit_insn_before(gen_nop(), insn);
                                if (flag_schedule_insns_after_reload)
                                        PUT_MODE(new_insn, TImode);
                                recog_memoized(new_insn);
                                state_transition(dfa_state, NULL);
                        }
                        state_transition(dfa_state, NULL);
                        state_transition(dfa_state, insn);
                }
                func_start = false;
        }
        dfa_finish();
}
--------------------------------------------[END]---------------------------------------------------


But I still see two dependent instructions issued in consecutive cycles:
--------------------------------------------[TOP]---------------------------------------------------
@(insn:TI 48 50 49 (set (reg:SI 32 d0 [134])
@        (minus:SI (reg:SI 6 r6 [orig:135 FLAG ] [135])
@            (reg:SI 33 d1))) 48 {*subsi3} (insn_list:REG_DEP_TRUE 175 (nil))
@    (expr_list:REG_EQUAL (neg:SI (reg:SI 3 r3 [orig:122 D.1804 ] [122]))
@ (nil))) sub .m0 d0, r6, d1 @ 48 *subsi3/4 [length = 4]
@(insn:TI 49 48 176 (set (reg:SI 32 d0 [136])
@        (smax:SI (reg:SI 3 r3 [orig:122 D.1804 ] [122])
@            (reg:SI 32 d0 [134]))) 66 {smaxsi3} (insn_list:REG_DEP_TRUE 48 (insn_list:REG_DEP_TRUE 42 (nil)))
@    (expr_list:REG_DEAD (reg:SI 3 r3 [orig:122 D.1804 ] [122])
@ (nil))) max .m0 d0, r3, d0 @ 49 smaxsi3/2 [length = 4]
--------------------------------------------[END]---------------------------------------------------
The destination operand of the `sub' instruction, d0, is written back in
the 4th cycle, and the `max' instruction uses it as a source operand
(i.e., there is a true data dependency).
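
To make the hazard concrete, the padding between a producer and its consumer
can be computed from the producer's issue cycle and its result latency. The
following is a toy calculation only, not GCC code; `nops_needed' and its
parameters are hypothetical names, and the latency value has to come from the
real pipeline.

```c
#include <assert.h>

/* Toy model, not GCC code: number of nop cycles to place between a
   producer and a dependent consumer.
   producer_cycle: cycle in which the producer is issued.
   latency:        cycles from issue until the result can be read (assumed).
   consumer_cycle: cycle in which the consumer would issue without padding. */
static int
nops_needed(int producer_cycle, int latency, int consumer_cycle)
{
        int ready_cycle = producer_cycle + latency;

        assert(latency >= 0 && consumer_cycle > producer_cycle);
        return ready_cycle > consumer_cycle ? ready_cycle - consumer_cycle : 0;
}
```

With `sub' issued in cycle 0 and `max' in cycle 1, nops_needed(0, 5, 1)
gives 4 and nops_needed(0, 4, 1) gives 3; which latency applies depends on
how "written back in the 4th cycle" is counted on the real hardware.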

I found that state_transition() returns -1 when I issue the `max'
instruction, and that it only returns a value > 0 when a hardware
structural hazard occurs.
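
The return values described above are what one would expect from a state that
tracks only functional-unit reservations. The sketch below is a toy model
under that assumption, not the automaton GCC generates; `toy_state' and
`toy_transition' are hypothetical names.

```c
#include <assert.h>

/* Toy model, not GCC's generated automaton: a single functional unit
   that stays reserved for a fixed number of cycles after each issue. */
struct toy_state {
        int busy_cycles;
};

/* Mimics the return convention described above: negative when the insn
   can issue now, otherwise the number of cycles to wait.
   Passing issue = 0 advances the clock by one cycle. */
static int
toy_transition(struct toy_state *s, int issue, int occupancy)
{
        assert(occupancy >= 0);
        if (!issue) {
                if (s->busy_cycles > 0)
                        s->busy_cycles--;
                return -1;
        }
        if (s->busy_cycles > 0)
                return s->busy_cycles;  /* structural hazard: unit reserved */
        s->busy_cycles = occupancy;     /* reserve the unit */
        return -1;                      /* issued; operands are never inspected */
}
```

In this model a consumer that needs the unit one cycle later always gets -1,
whatever registers it reads, which matches the `max' case above.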

Is there a way for me to insert 4 nops between these 2 insns?
Thanks a lot.