Hi Richi,

Thanks for your review comments on this and some others!

on 2023/6/30 19:37, Richard Biener wrote:
> On Tue, Jun 13, 2023 at 4:07 AM Kewen Lin <li...@linux.ibm.com> wrote:
>>
>> This patch series follows Richi's suggestion at the link [1],
>> which suggest structuring vectorizable_load to make costing
>> next to the transform, in order to make it easier to keep
>> costing and the transform in sync.  For now, it's a known
>> issue that what we cost can be inconsistent with what we
>> transform, as the case in PR82255 and some other associated
>> test cases in the patches of this series show.
>>
>> Basically this patch series makes costing not call function
>> vect_model_load_cost any more.  To make the review and
>> bisection easy, I organized the changes according to the
>> memory access types of vector load.  For each memory access
>> type, firstly it follows the handlings in the function
>> vect_model_load_costto avoid any missing, then refines
>> further by referring to the transform code, I also checked
>> them with some typical test cases to verify.  Hope the
>> subjects of patches are clear enough.
>>
>> The whole series can be bootstrapped and regtested
>> incrementally on:
>>   - x86_64-redhat-linux
>>   - aarch64-linux-gnu
>>   - powerpc64-linux-gnu P7, P8 and P9
>>   - powerpc64le-linux-gnu P8, P9 and P10
>>
>> By considering the current vector test buckets are mainly
>> tested without cost model, I also verified the whole patch
>> series was neutral for SPEC2017 int/fp on Power9 at O2,
>> O3 and Ofast separately.
> 
> I went through the series now and I like it overall (well, I suggested
> the change).
> Looking at the changes I think we want some followup to reduce the
> mess in the final loop nest.  We already have some VMAT_* cases handled
> separately, maybe we can split out some more cases.  Maybe we should

At first glance, the simple parts look to be the handlings for
VMAT_LOAD_STORE_LANES, and VMAT_GATHER_SCATTER (with ifn and emulated).
It seems a bit straightforward if it's fine to duplicate the nested loop,
but may need to care about removing some useless code.

> bite the bullet and duplicate that loop nest for the different VMAT_* cases.
> Maybe we can merge some of the if (!costing_p) checks by clever
> re-ordering.

I've tried a bit to merge them if possible, like the place to check
VMAT_CONTIGUOUS, VMAT_CONTIGUOUS_REVERSE and VMAT_CONTIGUOUS_PERMUTE.
But will keep in mind for the following updates.

> So what
> this series doesn't improve is overall readability of the code (indent and our
> 80 char line limit).

Sorry about that.

> 
> The change also makes it more difficult(?) to separate analysis and transform
> though in the end I hope that analysis will actually "code generate" to a 
> (SLP)
> data structure so the target will have a chance to see the actual flow of 
> insns.
> 
> That said, I'd like to hear from Richard whether he thinks this is a step
> in the right direction.
> 
> Are you willing to followup with doing the same re-structuring to
> vectorizable_store?

Yes, vectorizable_store was also pointed out in your original suggestion [1],
I planned to update this once this series meets your expectations and gets 
landed.

> 
> OK from my side with the few comments addressed.  The patch likely needs 
> refresh
> after the RVV changes in this area?

Thanks!  Yes, I've updated 2/9 and 3/9 according to your comments, and updated
5/9 and 9/9 as they had some conflicts when rebasing.  Re-testing is ongoing,
do the updated versions look good to you?  Is this series ok for trunk if all 
the
test runs go well again as before?

BR,
Kewen

Reply via email to