Hello Richard:

On 14/02/24 4:03 pm, Richard Sandiford wrote:
> Hi,
> 
> Thanks for working on this.
> 
> You posted a version of this patch on Sunday too.  If you need to repost
> to fix bugs or make other improvements, could you describe the changes
> that you've made since the previous version?  It makes things easier
> to follow.

Sure. Sorry for that I forgot to add that.

> 
> Also, sorry for starting with a meta discussion about reviews, but
> there are multiple types of review comment, including:
> 
> (1) Suggestions for changes that are worded as suggestions.
> 
> (2) Suggestions for changes that are worded as questions ("Wouldn't it be
>     better to do X?", etc).
> 
> (3) Questions asking for an explanation or for more information.
> 
> Just sending a new patch makes sense when the previous review comments
> were all like (1), and arguably also (1)+(2).  But Alex's previous review
> included (3) as well.  Could you go back and respond to his questions there?
> It would help understand some of the design choices.
>

I have responded to Alex comments for the previous patches.
I have incorporated all of his comments in this patch.

 
> A natural starting point when reviewing a patch like this is to diff
> the current aarch64-ldp-fusion.cc with the new pair-fusion.cc.  This shows
> many of the kind of changes that I'd expect.  But it also seems to include
> some code reordering, such as putting fuse_pair after try_fuse_pair.
> If some reordering is necessary, could you try to organise the patch as
> a series in which the reordering is a separate step?  It's a bit hard
> to review at the moment.  (Reordering for cosmetic reasons is also OK,
> but again please separate it out for ease of review.)
> 
> Maybe one way of making the review easier would be to split the aarch64
> pass into the "target-dependent" and "target-independent" pieces
> in-place, i.e. keeping everything within aarch64-ldp-fusion.cc, and then
> (as separate patches) move the target-independent pieces outside
> config/aarch64.
> 
Sure I will do that.

> The patch includes:
> 
>>      * emit-rtl.cc: Modify ge with gt on PolyINT data structure.
>>      * dce.cc: Add changes not to delete the load store pair.
>>      * rtl-ssa/changes.cc: Modified assert code.
>>      * var-tracking.cc: Modified assert code.
>>      * df-problems.cc: Not to generate REG_UNUSED for multi
>>      word registers that is requied for rs6000 target.
> 
> Please submit these separately, as independent preparatory patches,
> with an explanation for why they're needed & correct.  But:
> 
Sure I will do that.

>> diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc
>> index 88ee0dd67fc..a8d0ee7c4db 100644
>> --- a/gcc/df-problems.cc
>> +++ b/gcc/df-problems.cc
>> @@ -3360,7 +3360,7 @@ df_set_unused_notes_for_mw (rtx_insn *insn, struct 
>> df_mw_hardreg *mws,
>>    if (df_whole_mw_reg_unused_p (mws, live, artificial_uses))
>>      {
>>        unsigned int regno = mws->start_regno;
>> -      df_set_note (REG_UNUSED, insn, mws->mw_reg);
>> +      //df_set_note (REG_UNUSED, insn, mws->mw_reg);
>>        dead_debug_insert_temp (debug, regno, insn, 
>> DEBUG_TEMP_AFTER_WITH_REG);
>>  
>>        if (REG_DEAD_DEBUGGING)
>> @@ -3375,7 +3375,7 @@ df_set_unused_notes_for_mw (rtx_insn *insn, struct 
>> df_mw_hardreg *mws,
>>      if (!bitmap_bit_p (live, r)
>>          && !bitmap_bit_p (artificial_uses, r))
>>        {
>> -        df_set_note (REG_UNUSED, insn, regno_reg_rtx[r]);
>> +       // df_set_note (REG_UNUSED, insn, regno_reg_rtx[r]);
>>          dead_debug_insert_temp (debug, r, insn, DEBUG_TEMP_AFTER_WITH_REG);
>>          if (REG_DEAD_DEBUGGING)
>>            df_print_note ("adding 2: ", insn, REG_NOTES (insn));
>> @@ -3493,9 +3493,9 @@ df_create_unused_note (rtx_insn *insn, df_ref def,
>>      || bitmap_bit_p (artificial_uses, dregno)
>>      || df_ignore_stack_reg (dregno)))
>>      {
>> -      rtx reg = (DF_REF_LOC (def))
>> -                ? *DF_REF_REAL_LOC (def): DF_REF_REG (def);
>> -      df_set_note (REG_UNUSED, insn, reg);
>> +      //rtx reg = (DF_REF_LOC (def))
>> +      //          ? *DF_REF_REAL_LOC (def): DF_REF_REG (def);
>> +      //df_set_note (REG_UNUSED, insn, reg);
>>        dead_debug_insert_temp (debug, dregno, insn, 
>> DEBUG_TEMP_AFTER_WITH_REG);
>>        if (REG_DEAD_DEBUGGING)
>>      df_print_note ("adding 3: ", insn, REG_NOTES (insn));
> 
> I don't think this can be right.  The last hunk of the var-tracking.cc
> patch also seems to be reverting a correct change.
> 

We generate sequential registers using (subreg V16QI (reg 00mode) 16)
and (reg OOmode 0)
where OOmode is 256 bit and V16QI is 128 bits in order to generate
sequential register pair. If I keep the above REG_UNUSED notes ira generates
REG_UNUSED and in cprop_harreg pass and dce pass deletes store pairs and
we get incorrect code.

By commenting REG_UNUSED notes it is not generated and we get the correct store
pair fusion and cprop_hardreg and dce doesn't deletes them.

Please let me know is there are better ways to address this instead of 
commenting
above generation of REG_UNUSED notes.

Thanks & Regards
Ajit
> Thanks,
> Richard

Reply via email to