On Wed, Feb 14, 2018 at 7:04 PM, Kyrill  Tkachov
<kyrylo.tkac...@foss.arm.com> wrote:
> On 13/02/18 16:45, Jeff Law wrote:
>> On 02/09/2018 07:50 AM, Kyrill Tkachov wrote:
>>> Hi Uros,
>>> On 08/02/18 22:54, Uros Bizjak wrote:
>>>> On Thu, Feb 8, 2018 at 6:11 PM, Kyrill  Tkachov
>>>> <kyrylo.tkac...@foss.arm.com> wrote:
>>>>> Hi all,
>>>>> This patch fixes some fallout in the i386 testsuite that occurs after
>>>>> the
>>>>> simplification in patch [1/3] [1].
>>>>> The gcc.target/i386/extract-2.c FAILs because it expects to match:
>>>>> (set (reg:CC 17 flags)
>>>>>       (compare:CC (subreg:QI (zero_extract:SI (reg:HI 98)
>>>>>                   (const_int 8 [0x8])
>>>>>                   (const_int 8 [0x8])) 0)
>>>>>           (const_int 4 [0x4])))
>>>>> which is the *cmpqi_ext_2 pattern in i386.md but with the new
>>>>> simplification
>>>>> the combine/simplify-rtx
>>>>> machinery produces:
>>>>> (set (reg:CC 17 flags)
>>>>>       (compare:CC (subreg:QI (zero_extract:HI (reg:HI 98)
>>>>>                   (const_int 8 [0x8])
>>>>>                   (const_int 8 [0x8])) 0)
>>>>>           (const_int 4 [0x4])))
>>>>> Notice that the zero_extract now has HImode like the register source
>>>>> rather
>>>>> than SImode.
>>>>> The existing *cmpqi_ext_<n> patterns however explicitly demand an
>>>>> SImode on
>>>>> the zero_extract.
>>>>> I'm not overly familiar with the i386 port but I think that's too
>>>>> restrictive.
>>>>> The RTL documentation says:
>>>>> For (zero_extract:m loc size pos) "The mode m is the same as the mode
>>>>> that
>>>>> would be used for loc if it were a register."
>>>>> I'm not sure if that means that the mode of the zero_extract and the
>>>>> source
>>>>> register must always match (as is the
>>>>> case after patch [1/3]) but in any case it shouldn't matter
>>>>> semantically
>>>>> since we're taking a QImode subreg of the whole
>>>>> thing anyway.
>>>>> So the proposed solution in this patch is to allow HI, SI and DImode
>>>>> zero_extracts in these patterns as these are the
>>>>> modes that the ext_register_operand predicate accepts, so that the
>>>>> patterns
>>>>> can match the new form above.
>>>>> With this patch the aforementioned test passes again and bootstrap and
>>>>> testing on x86_64-unknown-linux-gnu shows
>>>>> no regressions.
>>>>> Is this ok for trunk if the first patch is accepted?
>>>> Huh, there are many other zero-extract patterns besides cmpqi_ext_*
>>>> with QImode subreg of SImode zero_extract in i386.md, used to access
>>>> high QImode register of HImode pair. A quick grep shows these that
>>>> have _ext_ in their name:
>>>> (define_insn "*cmpqi_ext_1"
>>>> (define_insn "*cmpqi_ext_2"
>>>> (define_expand "cmpqi_ext_3"
>>>> (define_insn "*cmpqi_ext_3"
>>>> (define_insn "*cmpqi_ext_4"
>>>> (define_insn "addqi_ext_1"
>>>> (define_insn "*addqi_ext_2"
>>>> (define_expand "testqi_ext_1_ccno"
>>>> (define_insn "*testqi_ext_1"
>>>> (define_insn "*testqi_ext_2"
>>>> (define_insn_and_split "*testqi_ext_3"
>>>> (define_insn "andqi_ext_1"
>>>> (define_insn "*andqi_ext_1_cc"
>>>> (define_insn "*andqi_ext_2"
>>>> (define_insn "*<code>qi_ext_1"
>>>> (define_insn "*<code>qi_ext_2"
>>>> (define_expand "xorqi_ext_1_cc"
>>>> (define_insn "*xorqi_ext_1_cc"
>>>> There are also relevant splitters and peephole2 patterns.
>>> I see. Another approach I've looked at is removing the mode specifier
>>> from
>>> the zero_extract in these patterns. This means that they can be of any
>>> mode
>>> so they will match all of these modes without creating new patterns
>>> through
>>> iterators. That also works for the testcase and passes bootstrap and
>>> testing
>>> however there is the snag that the define_insns that don't start with a
>>> "*"
>>> are used to generate RTL through the gen_* mechanism and in that context
>>> the absence of a mode on the zero_extract would mean a VOIDmode
>>> zero_extract
>>> would be created, which I'm fairly sure is not good. So in my
>>> experiments I left
>>> those patterns alone (with an explicit SI on the zero_extract).
>>>> IIRC, SImode zero_extract was enough to catch all high-register uses.
>>>> There will be a pattern explosion if we want to handle all other
>>>> integer modes here. However, I'm not a RTL expert, so someone will
>>>> have to say what is the correct RTX form here.
>>> Jeff, Richard, could you please give us some guidance on this issue?
>>> Sorry for the trouble.
>> I don't think any of the patterns above are known to the generic code.
>> So you just have to check the x86 backend to see their precise uses in a
>> generator (ie gen_cmpqi_ext_3) and verify those do not allow a VOIDmode
>> (or any other undesirable mode) to slip through.
>> Jeff
> Thanks Jeff, I did have a look. I think we want to maintain the SImode on
> the
> RTL that gets created through these named expanders, as generating a
> VOIDmode
> zero_extract is not valid. So my patch leaves those intact.
> The patch removes the mode from the zero_extract RTXes in patterns that are
> only ever going to get matched (as opposed to generated). That is the ones
> that
> start with "*" in their name.
> This should allow them to match any of the zero_extract modes that
> might get generated by the midend.
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> Is this a preferable approach?

We are trying to avoid VOIDmode RTXes as much as possible (to tighten
pattern matching to avoid various surpsrises). Looking at these
instructions, I think that using SWI248 in insn and peephole2 patterns
should be OK.

So, if it survives regression tests, the patch is OK with SWI248 mode
instead of void mode.


> Thanks,
> Kyrill
> 2018-02-14  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>     PR target/84164
>     * config/i386/i386.md (*cmpqi_ext_1, *cmpqi_ext_2, *cmpqi_ext_3,
>     *cmpqi_ext_4, *extzvqi_mem_rex64, *extzvqi, QImode zero_extract
>     peephole, *addqi_ext_2, *testqi_ext_1, *testqi_ext_2, *andqi_ext_1_cc,
>     *andqi_ext_2, *<code>qi_ext_1, *<code>qi_ext_2, *xorqi_ext_1_cc
>     AND QImode subreg of zero_extract peephole): Remove mode from
> zero_extract.

Reply via email to