Hi Uros,

on 2021/7/15 3:17 PM, Uros Bizjak wrote:
> On Thu, Jul 15, 2021 at 9:07 AM Kewen.Lin <li...@linux.ibm.com> wrote:
>>
>> on 2021/7/14 3:45 PM, Kewen.Lin via Gcc-patches wrote:
>>> on 2021/7/14 2:38 PM, Richard Biener wrote:
>>>> On Tue, Jul 13, 2021 at 4:59 PM Kewen.Lin <li...@linux.ibm.com> wrote:
>>>>>
>>>>> on 2021/7/13 8:42 PM, Richard Biener wrote:
>>>>>> On Tue, Jul 13, 2021 at 12:25 PM Kewen.Lin <li...@linux.ibm.com> wrote:
>>>>
>>>>> I guess the proposed IFN would be directly mapped for [us]mul_highpart?
>>>>
>>>> Yes.
>>>>
>>>
>>> Thanks for confirming!  The related patch v2 is attached and the testing
>>> is ongoing.
>>>
>>
>> It's bootstrapped & regtested on powerpc64le-linux-gnu P9 and
>> aarch64-linux-gnu.  But on x86_64-redhat-linux there are XPASSes as below:
>>
>> XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhuw
>> XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhuw
>> XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhw
>> XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhw
> 
> These XFAILs should be removed after your patch.
> 
I'm curious whether omitting -fno-vect-cost-model from this test case
is intentional.  As noted above, the test is sensitive to how we cost
mult_highpart.  With the cost model disabled, the XFAILs could be removed
as soon as the mul_highpart pattern support is in, no matter how we cost
it (that is, with or without the x86 part of this patch).
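
To make concrete what is being costed, here is a minimal sketch of a
16-bit multiply-highpart loop, the kind of scalar code that is expected
to vectorize to pmulhw/pmulhuw.  It is only an illustration and not the
contents of pr100637-3w.c; the function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Signed high part: widen to 32 bits, multiply, keep the upper 16 bits.
   Right-shifting a negative value is implementation-defined in ISO C;
   GCC implements it as an arithmetic shift.  */
static int16_t mulh_s16(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * (int32_t)b) >> 16);
}

/* Unsigned high part: the casts to uint32_t avoid promotion to signed int.  */
static uint16_t mulh_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a * (uint32_t)b) >> 16);
}

/* Hypothetical loop over arrays; whether it gets vectorized depends on
   the cost model unless -fno-vect-cost-model is given.  */
static void mulh_s16_loop(int16_t *dst, const int16_t *a,
                          const int16_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = mulh_s16(a[i], b[i]);
}
```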

> This is PR100696 [1], we want PMULH.W here, so x86 part of the patch
> is actually not needed.
> 

Thanks for the information!  The justification for the x86 part is this:
IFN_MULH essentially covers MULT_HIGHPART_EXPR with mul_highpart optab
support, and the i386 port already customizes the costing for
MULT_HIGHPART_EXPR (which should include the case with mul_highpart optab
support).  If we don't do the same for IFN_MULH, I'm worried that we may
cost IFN_MULH wrongly.  If treating IFN_MULH as a normal stmt is the right
thing (that is, we shouldn't cost it specially), then at the very least
ix86_multiplication_cost needs adjusting for MULT_HIGHPART_EXPR when it
has direct mul_highpart optab support.  I think the two should be costed
consistently.  Does that sound reasonable?
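
As a rough illustration of the consistency concern, here is a toy model
in plain C.  It is not GCC code: the enum, the function names and the
flag are hypothetical, and only the two cost values (16 via the
multiplication cost path, 4 as a normal stmt) come from the numbers
observed in this thread.

```c
/* Toy statement kinds standing in for the two forms of multiply highpart.  */
enum toy_stmt_kind {
    TOY_MULT_HIGHPART_EXPR,
    TOY_IFN_MULH,
    TOY_OTHER
};

/* Cost observed in this thread for the multiplication cost path.  */
static int toy_multiplication_cost(void) { return 16; }

/* Cost observed in this thread for a normal stmt.  */
static int toy_normal_stmt_cost(void) { return 4; }

/* When ifn_uses_mult_cost is nonzero, both forms share one cost path;
   otherwise the IFN form falls through to the normal stmt cost and the
   same operation ends up with two different prices.  */
static int toy_stmt_cost(enum toy_stmt_kind kind, int ifn_uses_mult_cost)
{
    switch (kind) {
    case TOY_MULT_HIGHPART_EXPR:
        return toy_multiplication_cost();
    case TOY_IFN_MULH:
        return ifn_uses_mult_cost ? toy_multiplication_cost()
                                  : toy_normal_stmt_cost();
    default:
        return toy_normal_stmt_cost();
    }
}
```

With the flag off, the model gives 16 for the tree code and 4 for the
IFN; with it on, both give 16.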

BR,
Kewen

> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100696
> 
> Uros.
> 
>> They weren't exposed in the testing run with the previous patch, which
>> didn't go the IFN way.  After investigating, the difference comes from
>> the different costing of MULT_HIGHPART_EXPR and IFN_MULH.
>>
>> MULT_HIGHPART_EXPR is costed at 16 by the call below:
>>
>>         case MULT_EXPR:
>>         case WIDEN_MULT_EXPR:
>>         case MULT_HIGHPART_EXPR:
>>           stmt_cost = ix86_multiplication_cost (ix86_cost, mode);
>>
>> IFN_MULH, on the other hand, is costed at 4 as a normal stmt, so the total
>> cost becomes profitable and the expected vectorization happens.
>>
>> One conservative fix would be to make IFN_MULH costing go through the
>> same cost interface that is used for multiplication, that is:
>>
>>       case CFN_MULH:
>>         stmt_cost = ix86_multiplication_cost (ix86_cost, mode);
>>         break;
>>
>> Since the test case marks the checks as "xfail", it is probably worth
>> revisiting the costing of mul_highpart to ensure it isn't overpriced.
>>
>> The attached patch also addressed Richard S.'s review comments on
>> two reformatting hunks.  Is it ok for trunk?
>>
>> BR,
>> Kewen
>> -----
>> gcc/ChangeLog:
>>
>>         * internal-fn.c (first_commutative_argument): Add info for IFN_MULH.
>>         * internal-fn.def (IFN_MULH): New internal function.
>>         * tree-vect-patterns.c (vect_recog_mulhs_pattern): Add support to
>>         recog normal multiply highpart as IFN_MULH.
>>         * config/i386/i386.c (ix86_add_stmt_cost): Adjust for combined
>>         function CFN_MULH.
