Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-13 Thread Richard Biener via Gcc-patches
On Fri, Sep 10, 2021 at 5:07 PM Segher Boessenkool
 wrote:
>
> On Fri, Sep 10, 2021 at 12:53:37PM +0200, Richard Biener wrote:
> > On Fri, Sep 10, 2021 at 1:50 AM Segher Boessenkool
> >  wrote:
> > > And many targets have strange rules for bit-strings in which modes can
> > > be used as bit-strings in which other modes, and at what offsets in
> > > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > > but changing this without a transition plan simply does not work.
> >
> > But we _do_ already allow some of them :/  Like
>
> Yes.  And all of this is old and ingrained, and targets depend on the
> status quo, so changing this will need more care and planning and
> cooperation.  It certainly is a worthwhile thing to improve, but it is
> not a small project, and it requires a plan.
>
> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> >  is the culprit here, and not the backends.  */
> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > ;
> >
> > so for the special case where 'regsize' matches osize it would be
> > a bit-cast of a full register from int to float.  But as written it also
> > allows (subreg:XF (reg:TI))  which will likely wreck havoc?
>
> That does not pass the isize >= osize test?  Or maybe I don't know
> what XFmode is well enough :-)  Hey I can read, I have source, and it
> is Friday...

TImode is 16 bytes but XFmode is 12.  I meant to construct a case
which passes all the tests but where definitely such kind of subreg
is very odd to be "allowed" by validate_subreg - a target may of course
have means to make sense of it (I don't see how x87 would though).

> Ah.  So XF has different size on 32-bit and on 64-bit, but that doesn't
> even matter here.
>
> > Similar for the omode == word_mode check which allows
> > (subreg:DI (reg:TF ..)).  That is, the existing special-cases look
> > too broad to me - and they probably exist because when validate_subreg
> > rejects sth then we can't put it together later when expand split it
> > into two subregs and a pseudo ...
>
> I said it before, and I'll say it again, it is a very important point:
> expand should not try to optimise this, *at all*.  And not just this,
> not *anything*.  Expand's job in the current compiler is only to
> translate Gimple to RTL, and nothing more.

+1 (or +10!), but unfortunately expand _is_ an important part
of optimization - in particular when it gets to avoiding stack usage
since we can never get rid of all effects of allocating a stack slot.

Splitting to multiple insns with pseudos should be fine though, but
it at least seems that splitting (subreg:FLOAT (reg:INT)) into
(subreg:INT (reg:INT)) (subreg:FLOAT (reg:INT)) isn't always valid.

Richard.


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Segher Boessenkool
On Fri, Sep 10, 2021 at 11:09:31AM +0800, Hongtao Liu wrote:
> On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
>  wrote:
> > way too long.  But from that same history it follows that anything you
> > do not super carefully (with testing everywhere) will cause some serious
> Frankly, testing everywhere is too heavy a burden for developers,
> after all, everyone has a limited variety of machines, and may not be
> familiar with using  other targets' simulators.

Your change should be tested on enough relevant targets that we can have
confidence it works properly.  That does not necessarily mean you have
to test everywhere yourself (although that is greatly appreciated, makes
life easier for everyone, including yourself, and as David points out
the cfarm is a great help).

So you didn't realise your patch would wreak havoc on most other targets
than what you tested on.  It happens, it helps if you can avoid it, but
you learn only from things that go wrong :-)

(The patch has still not been reverted btw.  I'll do this later tonight
if you don't).

> And back to the problem we were trying to solve at the beginning
> (subreg:HF(reg:SI)), I guess this is not just a problem in x86
> backend, any backend can encounter similar problems, that's why we
> remove all the weird cases in validate_subreg.

Expand should simply expand to two statements: one doing the real
subreg, the other doing the bit_cast.


Segher


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Segher Boessenkool
On Fri, Sep 10, 2021 at 12:53:37PM +0200, Richard Biener wrote:
> On Fri, Sep 10, 2021 at 1:50 AM Segher Boessenkool
>  wrote:
> > And many targets have strange rules for bit-strings in which modes can
> > be used as bit-strings in which other modes, and at what offsets in
> > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > but changing this without a transition plan simply does not work.
> 
> But we _do_ already allow some of them :/  Like

Yes.  And all of this is old and ingrained, and targets depend on the
status quo, so changing this will need more care and planning and
cooperation.  It certainly is a worthwhile thing to improve, but it is
not a small project, and it requires a plan.

>   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
>  is the culprit here, and not the backends.  */
>   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> ;
> 
> so for the special case where 'regsize' matches osize it would be
> a bit-cast of a full register from int to float.  But as written it also
> allows (subreg:XF (reg:TI))  which will likely wreck havoc?

That does not pass the isize >= osize test?  Or maybe I don't know
what XFmode is well enough :-)  Hey I can read, I have source, and it
is Friday...

Ah.  So XF has different size on 32-bit and on 64-bit, but that doesn't
even matter here.

> Similar for the omode == word_mode check which allows
> (subreg:DI (reg:TF ..)).  That is, the existing special-cases look
> too broad to me - and they probably exist because when validate_subreg
> rejects sth then we can't put it together later when expand split it
> into two subregs and a pseudo ...

I said it before, and I'll say it again, it is a very important point:
expand should not try to optimise this, *at all*.  And not just this,
not *anything*.  Expand's job in the current compiler is only to
translate Gimple to RTL, and nothing more.

When later passes try to optimise things they will always ask the target
if it agrees with the change (all RTL has to pass recog() for example).
This simplifies life.


Segher


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 10, 2021 at 10:08 PM David Edelsohn  wrote:
>
> On Thu, Sep 9, 2021 at 11:03 PM Hongtao Liu  wrote:
> >
> > On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
> >  wrote:
> > >
> > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > > > I think we should (longer term) get rid of the overloaded meanings and
> > > > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > > > "bit_cast" (or is there a nice short more traditional name for it?)
> > > >
> > > > But subreg _is_ bit_cast.
> > >
> > > It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> > > bit_cast.  It is taking a part of a register, or a single register from
> > > a multi-register thing.  Paradoxicals are not bit-casts either.
> > >
> > > Subregs from or to (but not both) integer modes are generally bit_cast,
> > > yeah.
> > >
> > > > What is odd to me is that a "disallowed" subreg
> > > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 
> > > > 0) 0).
> > > > Of course that's nested and invalid but just push the inner subreg to a
> > > > new pseudo and the thing becomes valid.
> > >
> > > Bingo.
> > >
> > > And many targets have strange rules for bit-strings in which modes can
> > > be used as bit-strings in which other modes, and at what offsets in
> > > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > > but changing this without a transition plan simply does not work.
> > >
> > > > > But that is not the core problem we had here.  The behaviour of the
> > > > > generic parts of the compiler was changed, without testing if that
> > > > > works on other targets but x86.  That is an understandable mistake, it
> > > > > takes some experience to know where the morasses are.  But this change
> > > > > should have been accompanied by testcases exercising the changed code.
> > > > > We would have clearly seen there are issues then, simply by watching
> > > > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > > > anyway).  Also, if there were testcases for this, we could have some
> > > > > confidence that a change in this area is robust.
> > > >
> > > > Well, that only works if some maintainers that are familiar enough
> > > > with all this chime in ;)
> > >
> > > Not really.  It works always.  And it works way better than the
> > > pandemonium we now have with broken targets left and right.
> > >
> > > With testcases anyone can see if any specific target is broken here.
> > >
> > > > It's stage1 so it's understandable that some
> > > > people (like me ...) are tyring to help people making progress even
> > > > if that involves trying to decipher 30 years of GCC history in this
> > > > area (without much success in the end as we see) ;)
> > >
> > > Yeah :-)  And my thanks to you and everyone involved for tackling this
> > > problematic part of GCC, which has been neglected and patched over for
> > > way too long.  But from that same history it follows that anything you
> > > do not super carefully (with testing everywhere) will cause some serious
> > Frankly, testing everywhere is too heavy a burden for developers,
> > after all, everyone has a limited variety of machines, and may not be
> > familiar with using  other targets' simulators.
>
> Hongtao,
>
> This is why the GNU Toolchain community sponsors the GCC Compile Farm.
> Not every architecture is provided, but a good subset.  And there are
> a few, powerful Linux on Power systems in the cluster, such as gcc135.
>
I see.
> And the comments in the code warned of port-specific problems.  Even
> if you cannot test the patch yourself, you should have publicly
> requested that other port maintainers test the patch to shake out
> problems.
Yes, sorry for the inconvenience.
>
> The Tree-SSA passes are mostly target-independent, but RTL is
> target-dependent, which can elicit widely different behavior in the
> RTL passes.  When you make changes to RTL passes in the common parts
> of the compiler, you must consider the impact on all targets.  GCC
> explicitly supports a wide variety of architectures, ABIs and OSes.
> All of the developers strive to ensure that changes don't adversely
> affect any targets.
>
> Thanks, David



-- 
BR,
Hongtao


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread David Edelsohn via Gcc-patches
On Thu, Sep 9, 2021 at 11:03 PM Hongtao Liu  wrote:
>
> On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
>  wrote:
> >
> > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > > I think we should (longer term) get rid of the overloaded meanings and
> > > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > > "bit_cast" (or is there a nice short more traditional name for it?)
> > >
> > > But subreg _is_ bit_cast.
> >
> > It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> > bit_cast.  It is taking a part of a register, or a single register from
> > a multi-register thing.  Paradoxicals are not bit-casts either.
> >
> > Subregs from or to (but not both) integer modes are generally bit_cast,
> > yeah.
> >
> > > What is odd to me is that a "disallowed" subreg
> > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 
> > > 0) 0).
> > > Of course that's nested and invalid but just push the inner subreg to a
> > > new pseudo and the thing becomes valid.
> >
> > Bingo.
> >
> > And many targets have strange rules for bit-strings in which modes can
> > be used as bit-strings in which other modes, and at what offsets in
> > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > but changing this without a transition plan simply does not work.
> >
> > > > But that is not the core problem we had here.  The behaviour of the
> > > > generic parts of the compiler was changed, without testing if that
> > > > works on other targets but x86.  That is an understandable mistake, it
> > > > takes some experience to know where the morasses are.  But this change
> > > > should have been accompanied by testcases exercising the changed code.
> > > > We would have clearly seen there are issues then, simply by watching
> > > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > > anyway).  Also, if there were testcases for this, we could have some
> > > > confidence that a change in this area is robust.
> > >
> > > Well, that only works if some maintainers that are familiar enough
> > > with all this chime in ;)
> >
> > Not really.  It works always.  And it works way better than the
> > pandemonium we now have with broken targets left and right.
> >
> > With testcases anyone can see if any specific target is broken here.
> >
> > > It's stage1 so it's understandable that some
> > > people (like me ...) are tyring to help people making progress even
> > > if that involves trying to decipher 30 years of GCC history in this
> > > area (without much success in the end as we see) ;)
> >
> > Yeah :-)  And my thanks to you and everyone involved for tackling this
> > problematic part of GCC, which has been neglected and patched over for
> > way too long.  But from that same history it follows that anything you
> > do not super carefully (with testing everywhere) will cause some serious
> Frankly, testing everywhere is too heavy a burden for developers,
> after all, everyone has a limited variety of machines, and may not be
> familiar with using  other targets' simulators.

Hongtao,

This is why the GNU Toolchain community sponsors the GCC Compile Farm.
Not every architecture is provided, but a good subset.  And there are
a few, powerful Linux on Power systems in the cluster, such as gcc135.

And the comments in the code warned of port-specific problems.  Even
if you cannot test the patch yourself, you should have publicly
requested that other port maintainers test the patch to shake out
problems.

The Tree-SSA passes are mostly target-independent, but RTL is
target-dependent, which can elicit widely different behavior in the
RTL passes.  When you make changes to RTL passes in the common parts
of the compiler, you must consider the impact on all targets.  GCC
explicitly supports a wide variety of architectures, ABIs and OSes.
All of the developers strive to ensure that changes don't adversely
affect any targets.

Thanks, David


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 10, 2021 at 7:25 PM Hongtao Liu  wrote:
>
> On Fri, Sep 10, 2021 at 6:54 PM Richard Biener
>  wrote:
> >
> > On Fri, Sep 10, 2021 at 5:03 AM Hongtao Liu  wrote:
> > >
> > > On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
> > >  wrote:
> > > >
> > > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > > > > I think we should (longer term) get rid of the overloaded meanings 
> > > > > > and
> > > > > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > > > > "bit_cast" (or is there a nice short more traditional name for it?)
> > > > >
> > > > > But subreg _is_ bit_cast.
> > > >
> > > > It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> > > > bit_cast.  It is taking a part of a register, or a single register from
> > > > a multi-register thing.  Paradoxicals are not bit-casts either.
> > > >
> > > > Subregs from or to (but not both) integer modes are generally bit_cast,
> > > > yeah.
> > > >
> > > > > What is odd to me is that a "disallowed" subreg
> > > > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > > > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI 
> > > > > ..) 0) 0).
> > > > > Of course that's nested and invalid but just push the inner subreg to 
> > > > > a
> > > > > new pseudo and the thing becomes valid.
> > > >
> > > > Bingo.
> > > >
> > > > And many targets have strange rules for bit-strings in which modes can
> > > > be used as bit-strings in which other modes, and at what offsets in
> > > > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > > > but changing this without a transition plan simply does not work.
> > > >
> > > > > > But that is not the core problem we had here.  The behaviour of the
> > > > > > generic parts of the compiler was changed, without testing if that
> > > > > > works on other targets but x86.  That is an understandable mistake, 
> > > > > > it
> > > > > > takes some experience to know where the morasses are.  But this 
> > > > > > change
> > > > > > should have been accompanied by testcases exercising the changed 
> > > > > > code.
> > > > > > We would have clearly seen there are issues then, simply by watching
> > > > > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > > > > anyway).  Also, if there were testcases for this, we could have some
> > > > > > confidence that a change in this area is robust.
> > > > >
> > > > > Well, that only works if some maintainers that are familiar enough
> > > > > with all this chime in ;)
> > > >
> > > > Not really.  It works always.  And it works way better than the
> > > > pandemonium we now have with broken targets left and right.
> > > >
> > > > With testcases anyone can see if any specific target is broken here.
> > > >
> > > > > It's stage1 so it's understandable that some
> > > > > people (like me ...) are tyring to help people making progress even
> > > > > if that involves trying to decipher 30 years of GCC history in this
> > > > > area (without much success in the end as we see) ;)
> > > >
> > > > Yeah :-)  And my thanks to you and everyone involved for tackling this
> > > > problematic part of GCC, which has been neglected and patched over for
> > > > way too long.  But from that same history it follows that anything you
> > > > do not super carefully (with testing everywhere) will cause some serious
> > > Frankly, testing everywhere is too heavy a burden for developers,
> > > after all, everyone has a limited variety of machines, and may not be
> > > familiar with using  other targets' simulators.
> > > And back to the problem we were trying to solve at the beginning
> > > (subreg:HF(reg:SI)), I guess this is not just a problem in x86
> > > backend, any backend can encounter similar problems, that's why we
> > > remove all the weird cases in validate_subreg.
> >
> > So can you please revert the change for now?  I think we need to go
> > back to the issue in extract_bit_field - does it somehow work to use
> > validate_subreg to avoid creating the subreg we ICE on in the first
> > place and what happens then to code quality?
> Sure, let me test the patch.
Survive regtest, code quality seems to be ok.
But lose some performance.
.cfi_startproc
-   pinsrw  $0, %edi, %xmm0
+   movw%di, -2(%rsp)
+   pinsrw  $0, -2(%rsp), %xmm0
ret
.cfi_endproc

Anyway I'll post the patches first and see if I can fix the
performance issue in the x86 backend.

> >
> > Thanks,
> > Richard.
> >
> > > > problems.  And nonse of these are easy to fix at all -- there is a
> > > > *reason* targets did this nastiness.
> > > >
> > > > > > p.s. Very unrelated...  Should we have __builtin_bit_cast for C as 
> > > > > > well?
> > > > > > Is there any reason this could not work?
> > > >
> > > > Still interested in this btw :-)  (And still very unrelated.)
> > > >
> > > >
> > > > Segher
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
>
>
>
> --
> 

Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 10, 2021 at 6:54 PM Richard Biener
 wrote:
>
> On Fri, Sep 10, 2021 at 5:03 AM Hongtao Liu  wrote:
> >
> > On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
> >  wrote:
> > >
> > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > > > I think we should (longer term) get rid of the overloaded meanings and
> > > > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > > > "bit_cast" (or is there a nice short more traditional name for it?)
> > > >
> > > > But subreg _is_ bit_cast.
> > >
> > > It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> > > bit_cast.  It is taking a part of a register, or a single register from
> > > a multi-register thing.  Paradoxicals are not bit-casts either.
> > >
> > > Subregs from or to (but not both) integer modes are generally bit_cast,
> > > yeah.
> > >
> > > > What is odd to me is that a "disallowed" subreg
> > > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 
> > > > 0) 0).
> > > > Of course that's nested and invalid but just push the inner subreg to a
> > > > new pseudo and the thing becomes valid.
> > >
> > > Bingo.
> > >
> > > And many targets have strange rules for bit-strings in which modes can
> > > be used as bit-strings in which other modes, and at what offsets in
> > > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > > but changing this without a transition plan simply does not work.
> > >
> > > > > But that is not the core problem we had here.  The behaviour of the
> > > > > generic parts of the compiler was changed, without testing if that
> > > > > works on other targets but x86.  That is an understandable mistake, it
> > > > > takes some experience to know where the morasses are.  But this change
> > > > > should have been accompanied by testcases exercising the changed code.
> > > > > We would have clearly seen there are issues then, simply by watching
> > > > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > > > anyway).  Also, if there were testcases for this, we could have some
> > > > > confidence that a change in this area is robust.
> > > >
> > > > Well, that only works if some maintainers that are familiar enough
> > > > with all this chime in ;)
> > >
> > > Not really.  It works always.  And it works way better than the
> > > pandemonium we now have with broken targets left and right.
> > >
> > > With testcases anyone can see if any specific target is broken here.
> > >
> > > > It's stage1 so it's understandable that some
> > > > people (like me ...) are tyring to help people making progress even
> > > > if that involves trying to decipher 30 years of GCC history in this
> > > > area (without much success in the end as we see) ;)
> > >
> > > Yeah :-)  And my thanks to you and everyone involved for tackling this
> > > problematic part of GCC, which has been neglected and patched over for
> > > way too long.  But from that same history it follows that anything you
> > > do not super carefully (with testing everywhere) will cause some serious
> > Frankly, testing everywhere is too heavy a burden for developers,
> > after all, everyone has a limited variety of machines, and may not be
> > familiar with using  other targets' simulators.
> > And back to the problem we were trying to solve at the beginning
> > (subreg:HF(reg:SI)), I guess this is not just a problem in x86
> > backend, any backend can encounter similar problems, that's why we
> > remove all the weird cases in validate_subreg.
>
> So can you please revert the change for now?  I think we need to go
> back to the issue in extract_bit_field - does it somehow work to use
> validate_subreg to avoid creating the subreg we ICE on in the first
> place and what happens then to code quality?
Sure, let me test the patch.
>
> Thanks,
> Richard.
>
> > > problems.  And nonse of these are easy to fix at all -- there is a
> > > *reason* targets did this nastiness.
> > >
> > > > > p.s. Very unrelated...  Should we have __builtin_bit_cast for C as 
> > > > > well?
> > > > > Is there any reason this could not work?
> > >
> > > Still interested in this btw :-)  (And still very unrelated.)
> > >
> > >
> > > Segher
> >
> >
> >
> > --
> > BR,
> > Hongtao



-- 
BR,
Hongtao


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Richard Biener via Gcc-patches
On Fri, Sep 10, 2021 at 5:03 AM Hongtao Liu  wrote:
>
> On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
>  wrote:
> >
> > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > > I think we should (longer term) get rid of the overloaded meanings and
> > > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > > "bit_cast" (or is there a nice short more traditional name for it?)
> > >
> > > But subreg _is_ bit_cast.
> >
> > It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> > bit_cast.  It is taking a part of a register, or a single register from
> > a multi-register thing.  Paradoxicals are not bit-casts either.
> >
> > Subregs from or to (but not both) integer modes are generally bit_cast,
> > yeah.
> >
> > > What is odd to me is that a "disallowed" subreg
> > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 
> > > 0) 0).
> > > Of course that's nested and invalid but just push the inner subreg to a
> > > new pseudo and the thing becomes valid.
> >
> > Bingo.
> >
> > And many targets have strange rules for bit-strings in which modes can
> > be used as bit-strings in which other modes, and at what offsets in
> > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > but changing this without a transition plan simply does not work.
> >
> > > > But that is not the core problem we had here.  The behaviour of the
> > > > generic parts of the compiler was changed, without testing if that
> > > > works on other targets but x86.  That is an understandable mistake, it
> > > > takes some experience to know where the morasses are.  But this change
> > > > should have been accompanied by testcases exercising the changed code.
> > > > We would have clearly seen there are issues then, simply by watching
> > > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > > anyway).  Also, if there were testcases for this, we could have some
> > > > confidence that a change in this area is robust.
> > >
> > > Well, that only works if some maintainers that are familiar enough
> > > with all this chime in ;)
> >
> > Not really.  It works always.  And it works way better than the
> > pandemonium we now have with broken targets left and right.
> >
> > With testcases anyone can see if any specific target is broken here.
> >
> > > It's stage1 so it's understandable that some
> > > people (like me ...) are tyring to help people making progress even
> > > if that involves trying to decipher 30 years of GCC history in this
> > > area (without much success in the end as we see) ;)
> >
> > Yeah :-)  And my thanks to you and everyone involved for tackling this
> > problematic part of GCC, which has been neglected and patched over for
> > way too long.  But from that same history it follows that anything you
> > do not super carefully (with testing everywhere) will cause some serious
> Frankly, testing everywhere is too heavy a burden for developers,
> after all, everyone has a limited variety of machines, and may not be
> familiar with using  other targets' simulators.
> And back to the problem we were trying to solve at the beginning
> (subreg:HF(reg:SI)), I guess this is not just a problem in x86
> backend, any backend can encounter similar problems, that's why we
> remove all the weird cases in validate_subreg.

So can you please revert the change for now?  I think we need to go
back to the issue in extract_bit_field - does it somehow work to use
validate_subreg to avoid creating the subreg we ICE on in the first
place and what happens then to code quality?

Thanks,
Richard.

> > problems.  And nonse of these are easy to fix at all -- there is a
> > *reason* targets did this nastiness.
> >
> > > > p.s. Very unrelated...  Should we have __builtin_bit_cast for C as well?
> > > > Is there any reason this could not work?
> >
> > Still interested in this btw :-)  (And still very unrelated.)
> >
> >
> > Segher
>
>
>
> --
> BR,
> Hongtao


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-10 Thread Richard Biener via Gcc-patches
On Fri, Sep 10, 2021 at 1:50 AM Segher Boessenkool
 wrote:
>
> On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > I think we should (longer term) get rid of the overloaded meanings and
> > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > "bit_cast" (or is there a nice short more traditional name for it?)
> >
> > But subreg _is_ bit_cast.
>
> It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> bit_cast.  It is taking a part of a register, or a single register from
> a multi-register thing.  Paradoxicals are not bit-casts either.
>
> Subregs from or to (but not both) integer modes are generally bit_cast,
> yeah.
>
> > What is odd to me is that a "disallowed" subreg
> > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 0) 
> > 0).
> > Of course that's nested and invalid but just push the inner subreg to a
> > new pseudo and the thing becomes valid.
>
> Bingo.
>
> And many targets have strange rules for bit-strings in which modes can
> be used as bit-strings in which other modes, and at what offsets in
> which registers.  Now perhaps none of that is optimal (I bet it isn't),
> but changing this without a transition plan simply does not work.

But we _do_ already allow some of them :/  Like

  /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
 is the culprit here, and not the backends.  */
  else if (known_ge (osize, regsize) && known_ge (isize, osize))
;

so for the special case where 'regsize' matches osize it would be
a bit-cast of a full register from int to float.  But as written it also
allows (subreg:XF (reg:TI))  which will likely wreck havoc?

Similar for the omode == word_mode check which allows
(subreg:DI (reg:TF ..)).  That is, the existing special-cases look
too broad to me - and they probably exist because when validate_subreg
rejects sth then we can't put it together later when expand split it
into two subregs and a pseudo ...

> > > But that is not the core problem we had here.  The behaviour of the
> > > generic parts of the compiler was changed, without testing if that
> > > works on other targets but x86.  That is an understandable mistake, it
> > > takes some experience to know where the morasses are.  But this change
> > > should have been accompanied by testcases exercising the changed code.
> > > We would have clearly seen there are issues then, simply by watching
> > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > anyway).  Also, if there were testcases for this, we could have some
> > > confidence that a change in this area is robust.
> >
> > Well, that only works if some maintainers that are familiar enough
> > with all this chime in ;)
>
> Not really.  It works always.  And it works way better than the
> pandemonium we now have with broken targets left and right.
>
> With testcases anyone can see if any specific target is broken here.
>
> > It's stage1 so it's understandable that some
> > people (like me ...) are tyring to help people making progress even
> > if that involves trying to decipher 30 years of GCC history in this
> > area (without much success in the end as we see) ;)
>
> Yeah :-)  And my thanks to you and everyone involved for tackling this
> problematic part of GCC, which has been neglected and patched over for
> way too long.  But from that same history it follows that anything you
> do not super carefully (with testing everywhere) will cause some serious
> problems.  And nonse of these are easy to fix at all -- there is a
> *reason* targets did this nastiness.
>
> > > p.s. Very unrelated...  Should we have __builtin_bit_cast for C as well?
> > > Is there any reason this could not work?
>
> Still interested in this btw :-)  (And still very unrelated.)

Sure, why not ...

Richard.

>
> Segher


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-09 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool
 wrote:
>
> On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > > I think we should (longer term) get rid of the overloaded meanings and
> > > uses of subregs.  One fairly simple thing is to make a new rtx code
> > > "bit_cast" (or is there a nice short more traditional name for it?)
> >
> > But subreg _is_ bit_cast.
>
> It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
> bit_cast.  It is taking a part of a register, or a single register from
> a multi-register thing.  Paradoxicals are not bit-casts either.
>
> Subregs from or to (but not both) integer modes are generally bit_cast,
> yeah.
>
> > What is odd to me is that a "disallowed" subreg
> > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 0) 
> > 0).
> > Of course that's nested and invalid but just push the inner subreg to a
> > new pseudo and the thing becomes valid.
>
> Bingo.
>
> And many targets have strange rules for bit-strings in which modes can
> be used as bit-strings in which other modes, and at what offsets in
> which registers.  Now perhaps none of that is optimal (I bet it isn't),
> but changing this without a transition plan simply does not work.
>
> > > But that is not the core problem we had here.  The behaviour of the
> > > generic parts of the compiler was changed, without testing if that
> > > works on other targets but x86.  That is an understandable mistake, it
> > > takes some experience to know where the morasses are.  But this change
> > > should have been accompanied by testcases exercising the changed code.
> > > We would have clearly seen there are issues then, simply by watching
> > > gcc-testresults@ (and/or maintainers are on top of the test results
> > > anyway).  Also, if there were testcases for this, we could have some
> > > confidence that a change in this area is robust.
> >
> > Well, that only works if some maintainers that are familiar enough
> > with all this chime in ;)
>
> Not really.  It works always.  And it works way better than the
> pandemonium we now have with broken targets left and right.
>
> With testcases anyone can see if any specific target is broken here.
>
> > It's stage1 so it's understandable that some
> > people (like me ...) are tyring to help people making progress even
> > if that involves trying to decipher 30 years of GCC history in this
> > area (without much success in the end as we see) ;)
>
> Yeah :-)  And my thanks to you and everyone involved for tackling this
> problematic part of GCC, which has been neglected and patched over for
> way too long.  But from that same history it follows that anything you
> do not super carefully (with testing everywhere) will cause some serious
Frankly, testing everywhere is too heavy a burden for developers,
after all, everyone has a limited variety of machines, and may not be
familiar with using  other targets' simulators.
And back to the problem we were trying to solve at the beginning
(subreg:HF(reg:SI)), I guess this is not just a problem in x86
backend, any backend can encounter similar problems, that's why we
remove all the weird cases in validate_subreg.
> problems.  And nonse of these are easy to fix at all -- there is a
> *reason* targets did this nastiness.
>
> > > p.s. Very unrelated...  Should we have __builtin_bit_cast for C as well?
> > > Is there any reason this could not work?
>
> Still interested in this btw :-)  (And still very unrelated.)
>
>
> Segher



-- 
BR,
Hongtao


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-09 Thread Segher Boessenkool
On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> > I think we should (longer term) get rid of the overloaded meanings and
> > uses of subregs.  One fairly simple thing is to make a new rtx code
> > "bit_cast" (or is there a nice short more traditional name for it?)
> 
> But subreg _is_ bit_cast.

It is not.  (subreg:M (reg:N) O) for O>0, little-endian, is not a
bit_cast.  It is taking a part of a register, or a single register from
a multi-register thing.  Paradoxicals are not bit-casts either.

Subregs from or to (but not both) integer modes are generally bit_cast,
yeah.

> What is odd to me is that a "disallowed" subreg
> like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 0) 0).
> Of course that's nested and invalid but just push the inner subreg to a
> new pseudo and the thing becomes valid.

Bingo.

And many targets have strange rules for bit-strings in which modes can
be used as bit-strings in which other modes, and at what offsets in
which registers.  Now perhaps none of that is optimal (I bet it isn't),
but changing this without a transition plan simply does not work.

> > But that is not the core problem we had here.  The behaviour of the
> > generic parts of the compiler was changed, without testing if that
> > works on other targets but x86.  That is an understandable mistake, it
> > takes some experience to know where the morasses are.  But this change
> > should have been accompanied by testcases exercising the changed code.
> > We would have clearly seen there are issues then, simply by watching
> > gcc-testresults@ (and/or maintainers are on top of the test results
> > anyway).  Also, if there were testcases for this, we could have some
> > confidence that a change in this area is robust.
> 
> Well, that only works if some maintainers that are familiar enough
> with all this chime in ;)

Not really.  It works always.  And it works way better than the
pandemonium we now have with broken targets left and right.

With testcases anyone can see if any specific target is broken here.

> It's stage1 so it's understandable that some
> people (like me ...) are tyring to help people making progress even
> if that involves trying to decipher 30 years of GCC history in this
> area (without much success in the end as we see) ;)

Yeah :-)  And my thanks to you and everyone involved for tackling this
problematic part of GCC, which has been neglected and patched over for
way too long.  But from that same history it follows that anything you
do not super carefully (with testing everywhere) will cause some serious
problems.  And nonse of these are easy to fix at all -- there is a
*reason* targets did this nastiness.

> > p.s. Very unrelated...  Should we have __builtin_bit_cast for C as well?
> > Is there any reason this could not work?

Still interested in this btw :-)  (And still very unrelated.)


Segher


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-09 Thread Michael Meissner via Gcc-patches
On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote:
> But subreg _is_ bit_cast.  What is odd to me is that a "disallowed" subreg
> like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
> validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 0) 0).
> Of course that's nested and invalid but just push the inner subreg to a
> new pseudo and the thing becomes valid.

Basically I only checked the mode of the SUBREG and the value in SUBREG_REG.  I
agree (subreg:SF (subreg:SI (reg:TI))) should not be allowed.

When I wrote the changes we didn't see SUBREG's as much to access the bottom or
top element of a structure that was held in a larger int field.  If the patch
is not reverted, then we have to deal with this somehow.  Such as with my patch
that allows (subreg:SF (reg:TI xxx)) and (subreg:SF (reg:DI xxx)).

The target hook idea might be too big of a hammer.  I would need to play with
it to see if there are other losses.

On the PowerPC, the problematical subregs only occur when you are moving a
register from GPR to a vector/float register or vice versa.  If you move a
subreg between two GPR registers or two vector/float registers, then you can
just do the copy.

Part of the issue is for some moves, you need to use a temporary register to
convert the results, but of course we have the long standing 'all moves for a
mode must be done in a single insn that covers all of the cases'.  Then you get
into secondary reload territory.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-09 Thread Jim Wilson
On Tue, Sep 7, 2021 at 12:12 AM Michael Meissner via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> This patch fixes the breakage in the PowerPC due to a recent change in
> SUBREG
> behavior.  While it is arguable that the patch that caused the breakage
> should
> be reverted, this patch should be a bandage to prevent these changes from
> happening again.
>

FYI I'm dealing with the same problem in the RISC-V backend via
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102211

My current proposed solution is a new predicate f_register_operand to be
used in patterns that use the f constraint which disallows the subregs that
can't work in FP registers due to NaN-boxing.  This just borrows a few
lines of code that were deleted from validate_subreg and puts it in a
RISC-V specific predicate function.  It worked for newlib and glibc
builds.  I'm trying bootstrap testing now, but my target is a little slower
than yours so this will take some time.

Jim


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-09 Thread Richard Biener via Gcc-patches
On Wed, Sep 8, 2021 at 9:18 PM Segher Boessenkool
 wrote:
>
> On Wed, Sep 08, 2021 at 08:39:31PM +0200, Richard Biener wrote:
> > On September 8, 2021 7:08:09 PM GMT+02:00, Segher Boessenkool 
> >  wrote:
> > >It is not a good idea to do allow all those things.  Most backends can
> > >only support a few combinations of them, and everything else results in
> > >*worse* machine code, best case, and more and more complicated (and more
> > >buggy!) backend code.
> > >
> > >But that is a code quality issue.  The current problem is that we have
> > >at least PR102211 and PR102154 (as well as reports elsewhere of bugs on
> > >other targets).  Code that used before doesn't anymore, and we have no
> > >clear way out, no recommendation how to fix this and a) keep the same
> > >functionality without huge changes, and b) keep the same machine code
> > >quality.
> > >
> > >I do not think something like that can be done.  That is why I am asking
> > >for the patch to be reverted until all of the groundwork for it has been
> > >done.  That includes making generic testcases that show how such subregs
> > >behave, so that we can see in testresults what changes do to existing
> > >targets.
> >
> > Heh, I understood your earlier reply that you supported the change in 
> > principle based on the fact that nested subregs are invalid.
>
> Ah.  No.  I meant to lament the fact that we use subregs for multiple
> things, so that doing a bit_cast of a real subreg has to be expressed as
> just one subreg, which is clearly sub-optimal for most backends.
>
> I say *has to* because a subreg of a subreg is not valid RTL; it has to
> be written as just one subreg.  Which makes thing more confusing and
> confused than this already non-trivial thing has to be.
>
> > Now, I don't think that validate_subreg is supposed to be the decision 
> > maker on what a target allows.
>
> Right, some target hook or macro or whatnot should.
>
> > For subregs of hardregs we seem to have a good way of validating, but
> > what do we have for subregs of pseudos? Is it the passes generating
> > the new unsupported subregs that should do different things? Should
> > validate_subreg use a target hook to allow those special casings we
> > removed which all were necessary just for specific targets but
> > appearantly did not do any harm for other targets?
>
> Currently, we can disallow things in predicates and/or the instruction
> conditions.
>
> > Can you give advice as to how to address the needs of the HFmode subregs on 
> > x86 if not by adding another (generic) narrow exception in validate_subreg?
>
> If I knew what the problem was, perhaps?  Is this explained in some mail
> somewhere?
>
> > That said, I fail to see a good way forward after now two appearantly 
> > failed attempts.
> >
> > Btw, I'm fine reverting the patch but then what's the solution here?
>
> I think we should (longer term) get rid of the overloaded meanings and
> uses of subregs.  One fairly simple thing is to make a new rtx code
> "bit_cast" (or is there a nice short more traditional name for it?)

But subreg _is_ bit_cast.  What is odd to me is that a "disallowed" subreg
like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of
validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 0) 0).
Of course that's nested and invalid but just push the inner subreg to a
new pseudo and the thing becomes valid.

That was my original idea for dealing with the issue described in
the thread including
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578437.html
where the original issue is that extract_integral_bit_field creates
a subreg that's not valid according to validate_subreg.  So I thought
well, extracting bitfields should be done on integer modes (thus
pun everything to ints and back), but that had even more(?) fallout.

> But that is not the core problem we had here.  The behaviour of the
> generic parts of the compiler was changed, without testing if that
> works on other targets but x86.  That is an understandable mistake, it
> takes some experience to know where the morasses are.  But this change
> should have been accompanied by testcases exercising the changed code.
> We would have clearly seen there are issues then, simply by watching
> gcc-testresults@ (and/or maintainers are on top of the test results
> anyway).  Also, if there were testcases for this, we could have some
> confidence that a change in this area is robust.

Well, that only works if some maintainers that are familiar enough
with all this chime in ;)   It's stage1 so it's understandable that some
people (like me ...) are tyring to help people making progress even
if that involves trying to decipher 30 years of GCC history in this
area (without much success in the end as we see) ;)

Given validate_subreg's _intent_ is to disallow float subregs
that change size exceptions like

  /* ??? This should not be here.  Temporarily continue to allow word_mode
 subregs of anything.  The most common 

Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-08 Thread Hongtao Liu via Gcc-patches
On Thu, Sep 9, 2021 at 3:17 AM Segher Boessenkool
 wrote:
>
> On Wed, Sep 08, 2021 at 08:39:31PM +0200, Richard Biener wrote:
> > On September 8, 2021 7:08:09 PM GMT+02:00, Segher Boessenkool 
> >  wrote:
> > >It is not a good idea to do allow all those things.  Most backends can
> > >only support a few combinations of them, and everything else results in
> > >*worse* machine code, best case, and more and more complicated (and more
> > >buggy!) backend code.
> > >
> > >But that is a code quality issue.  The current problem is that we have
> > >at least PR102211 and PR102154 (as well as reports elsewhere of bugs on
> > >other targets).  Code that used before doesn't anymore, and we have no
> > >clear way out, no recommendation how to fix this and a) keep the same
> > >functionality without huge changes, and b) keep the same machine code
> > >quality.
> > >
> > >I do not think something like that can be done.  That is why I am asking
> > >for the patch to be reverted until all of the groundwork for it has been
> > >done.  That includes making generic testcases that show how such subregs
> > >behave, so that we can see in testresults what changes do to existing
> > >targets.
> >
> > Heh, I understood your earlier reply that you supported the change in 
> > principle based on the fact that nested subregs are invalid.
>
> Ah.  No.  I meant to lament the fact that we use subregs for multiple
> things, so that doing a bit_cast of a real subreg has to be expressed as
> just one subreg, which is clearly sub-optimal for most backends.
>
> I say *has to* because a subreg of a subreg is not valid RTL; it has to
> be written as just one subreg.  Which makes thing more confusing and
> confused than this already non-trivial thing has to be.
>
> > Now, I don't think that validate_subreg is supposed to be the decision 
> > maker on what a target allows.
>
> Right, some target hook or macro or whatnot should.
>
> > For subregs of hardregs we seem to have a good way of validating, but
> > what do we have for subregs of pseudos? Is it the passes generating
> > the new unsupported subregs that should do different things? Should
> > validate_subreg use a target hook to allow those special casings we
> > removed which all were necessary just for specific targets but
> > appearantly did not do any harm for other targets?
>
> Currently, we can disallow things in predicates and/or the instruction
> conditions.
>
> > Can you give advice as to how to address the needs of the HFmode subregs on 
> > x86 if not by adding another (generic) narrow exception in validate_subreg?
>
> If I knew what the problem was, perhaps?  Is this explained in some mail
> somewhere?
it's at [1], target_hook may not be the best solution, if other
targets support HFmode, they may encounter the same problem, and
revisit it again.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576827.html

>
> > That said, I fail to see a good way forward after now two appearantly 
> > failed attempts.
> >
> > Btw, I'm fine reverting the patch but then what's the solution here?
>
> I think we should (longer term) get rid of the overloaded meanings and
> uses of subregs.  One fairly simple thing is to make a new rtx code
> "bit_cast" (or is there a nice short more traditional name for it?)
>
> But that is not the core problem we had here.  The behaviour of the
> generic parts of the compiler was changed, without testing if that
> works on other targets but x86.  That is an understandable mistake, it
> takes some experience to know where the morasses are.  But this change
> should have been accompanied by testcases exercising the changed code.
> We would have clearly seen there are issues then, simply by watching
> gcc-testresults@ (and/or maintainers are on top of the test results
> anyway).  Also, if there were testcases for this, we could have some
> confidence that a change in this area is robust.
>
>
> Segher
>
>
> p.s. Very unrelated...  Should we have __builtin_bit_cast for C as well?
> Is there any reason this could not work?



-- 
BR,
Hongtao


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-08 Thread Hongtao Liu via Gcc-patches
On Thu, Sep 9, 2021 at 3:17 AM Segher Boessenkool
 wrote:
>
> On Wed, Sep 08, 2021 at 08:39:31PM +0200, Richard Biener wrote:
> > On September 8, 2021 7:08:09 PM GMT+02:00, Segher Boessenkool 
> >  wrote:
> > >It is not a good idea to do allow all those things.  Most backends can
> > >only support a few combinations of them, and everything else results in
> > >*worse* machine code, best case, and more and more complicated (and more
> > >buggy!) backend code.
> > >
> > >But that is a code quality issue.  The current problem is that we have
> > >at least PR102211 and PR102154 (as well as reports elsewhere of bugs on
> > >other targets).  Code that used before doesn't anymore, and we have no
> > >clear way out, no recommendation how to fix this and a) keep the same
> > >functionality without huge changes, and b) keep the same machine code
> > >quality.
> > >
> > >I do not think something like that can be done.  That is why I am asking
> > >for the patch to be reverted until all of the groundwork for it has been
> > >done.  That includes making generic testcases that show how such subregs
> > >behave, so that we can see in testresults what changes do to existing
> > >targets.
> >
> > Heh, I understood your earlier reply that you supported the change in 
> > principle based on the fact that nested subregs are invalid.
>
> Ah.  No.  I meant to lament the fact that we use subregs for multiple
> things, so that doing a bit_cast of a real subreg has to be expressed as
> just one subreg, which is clearly sub-optimal for most backends.
>
> I say *has to* because a subreg of a subreg is not valid RTL; it has to
> be written as just one subreg.  Which makes thing more confusing and
> confused than this already non-trivial thing has to be.
>
> > Now, I don't think that validate_subreg is supposed to be the decision 
> > maker on what a target allows.
Since there is no good solution yet, I think we can add a target hook
(targetm.validate_subreg) to recover the original removed code as a
default behavior, and then x86 itself defines the hook as empty
(meaning that the backend allows all kinds of subreg).
something like

@@ -922,6 +922,9 @@ validate_subreg (machine_mode omode, machine_mode imode,

   poly_uint64 regsize = REGMODE_NATURAL_SIZE (imode);

+  if (!targetm.valiate_subreg (omode, imode, reg, offset))
+return false;
+
   /* Paradoxical subregs must have offset zero.  */
   if (maybe_gt (osize, isize))
 return known_eq (offset, 0U);

>
> Right, some target hook or macro or whatnot should.
>
> > For subregs of hardregs we seem to have a good way of validating, but
> > what do we have for subregs of pseudos? Is it the passes generating
> > the new unsupported subregs that should do different things? Should
> > validate_subreg use a target hook to allow those special casings we
> > removed which all were necessary just for specific targets but
> > appearantly did not do any harm for other targets?
>
> Currently, we can disallow things in predicates and/or the instruction
> conditions.
>
> > Can you give advice as to how to address the needs of the HFmode subregs on 
> > x86 if not by adding another (generic) narrow exception in validate_subreg?
>
> If I knew what the problem was, perhaps?  Is this explained in some mail
> somewhere?
>
> > That said, I fail to see a good way forward after now two appearantly 
> > failed attempts.
> >
> > Btw, I'm fine reverting the patch but then what's the solution here?
>
> I think we should (longer term) get rid of the overloaded meanings and
> uses of subregs.  One fairly simple thing is to make a new rtx code
> "bit_cast" (or is there a nice short more traditional name for it?)
>
> But that is not the core problem we had here.  The behaviour of the
> generic parts of the compiler was changed, without testing if that
> works on other targets but x86.  That is an understandable mistake, it
> takes some experience to know where the morasses are.  But this change
> should have been accompanied by testcases exercising the changed code.
> We would have clearly seen there are issues then, simply by watching
> gcc-testresults@ (and/or maintainers are on top of the test results
> anyway).  Also, if there were testcases for this, we could have some
> confidence that a change in this area is robust.
>
>
> Segher
>
>
> p.s. Very unrelated...  Should we have __builtin_bit_cast for C as well?
> Is there any reason this could not work?



-- 
BR,
Hongtao


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-08 Thread Segher Boessenkool
On Wed, Sep 08, 2021 at 08:39:31PM +0200, Richard Biener wrote:
> On September 8, 2021 7:08:09 PM GMT+02:00, Segher Boessenkool 
>  wrote:
> >It is not a good idea to do allow all those things.  Most backends can
> >only support a few combinations of them, and everything else results in
> >*worse* machine code, best case, and more and more complicated (and more
> >buggy!) backend code.
> >
> >But that is a code quality issue.  The current problem is that we have
> >at least PR102211 and PR102154 (as well as reports elsewhere of bugs on
> >other targets).  Code that used before doesn't anymore, and we have no
> >clear way out, no recommendation how to fix this and a) keep the same
> >functionality without huge changes, and b) keep the same machine code
> >quality.
> >
> >I do not think something like that can be done.  That is why I am asking
> >for the patch to be reverted until all of the groundwork for it has been
> >done.  That includes making generic testcases that show how such subregs
> >behave, so that we can see in testresults what changes do to existing
> >targets.
> 
> Heh, I understood your earlier reply that you supported the change in 
> principle based on the fact that nested subregs are invalid.

Ah.  No.  I meant to lament the fact that we use subregs for multiple
things, so that doing a bit_cast of a real subreg has to be expressed as
just one subreg, which is clearly sub-optimal for most backends.

I say *has to* because a subreg of a subreg is not valid RTL; it has to
be written as just one subreg.  Which makes thing more confusing and
confused than this already non-trivial thing has to be.

> Now, I don't think that validate_subreg is supposed to be the decision maker 
> on what a target allows.

Right, some target hook or macro or whatnot should.

> For subregs of hardregs we seem to have a good way of validating, but
> what do we have for subregs of pseudos? Is it the passes generating
> the new unsupported subregs that should do different things? Should
> validate_subreg use a target hook to allow those special casings we
> removed which all were necessary just for specific targets but
> appearantly did not do any harm for other targets? 

Currently, we can disallow things in predicates and/or the instruction
conditions.

> Can you give advice as to how to address the needs of the HFmode subregs on 
> x86 if not by adding another (generic) narrow exception in validate_subreg? 

If I knew what the problem was, perhaps?  Is this explained in some mail
somewhere?

> That said, I fail to see a good way forward after now two appearantly failed 
> attempts. 
> 
> Btw, I'm fine reverting the patch but then what's the solution here? 

I think we should (longer term) get rid of the overloaded meanings and
uses of subregs.  One fairly simple thing is to make a new rtx code
"bit_cast" (or is there a nice short more traditional name for it?)

But that is not the core problem we had here.  The behaviour of the
generic parts of the compiler was changed, without testing if that
works on other targets but x86.  That is an understandable mistake, it
takes some experience to know where the morasses are.  But this change
should have been accompanied by testcases exercising the changed code.
We would have clearly seen there are issues then, simply by watching
gcc-testresults@ (and/or maintainers are on top of the test results
anyway).  Also, if there were testcases for this, we could have some
confidence that a change in this area is robust.


Segher


p.s. Very unrelated...  Should we have __builtin_bit_cast for C as well?
Is there any reason this could not work?


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-08 Thread Richard Biener via Gcc-patches
On September 8, 2021 7:08:09 PM GMT+02:00, Segher Boessenkool 
 wrote:
>Hi!
>
>On Wed, Sep 08, 2021 at 08:42:44AM +0200, Richard Biener wrote:
>> On Wed, Sep 8, 2021 at 1:08 AM Segher Boessenkool
>>  wrote:
>> > The core of the problem is that subreg of pseudos has three meanings:
>> >   -- Paradoxical subregs;
>> >   -- Actual subregs;
>> >   -- "bit_cast" thingies: treat the same bits as something else.  Like
>> >  looking at the bits of a float as its memory image.
>> >
>> > Ignoring paradoxical subregs (as well as subregs of mem, which should
>> > have disappeared by now), and subregs of hard registers as well (those
>> > have *different* semantics after all), the other two kinds can be mixed,
>> > and *have to* be mixed, because subregs of subregs are non-canonical.
>> >
>> > Is there any reason why not to allow this kind of subreg?
>> 
>> In fact the causing rev. in question 
>> (d2874d905647a1d146dafa60199d440e837adc4d)
>> made all those subregs "valid" in terms of what validate_subreg is verifying
>> and thus now the few places using validate_subreg to check whether some
>> subreg is valid will now happily do float<->int converting subregs.
>
>Like in PR102154:
>
>error: unrecognizable insn:
>(insn 11 10 12 2 (set (reg:SF 118 [  ])
>(subreg:SF (reg:DI 122) 0))
>
>This generated valid code before, and now it is not anymore.  If the
>patch was meant to allow *more*, it failed that goal: it allows *less*.
>
>The issue is that we do not allow SF subregs in all cases.  We do not
>store SFmode quantities in IEEE SP format in registers normally, so it
>would require expensive extra code to allow such subregs.
>
>Whatever pass made these subregs is wrong, because they do not pass
>recog().
>
>> I do agree that those subregs should be allowed and that the above rev. is
>> a strict improvement (given it removes a lot of "but allow special case X
>> because target Y wants it" cases by simply allowing all of them).  But the
>> previous code seems to have papered over quite some backend issues.
>
>It is not a good idea to do allow all those things.  Most backends can
>only support a few combinations of them, and everything else results in
>*worse* machine code, best case, and more and more complicated (and more
>buggy!) backend code.
>
>But that is a code quality issue.  The current problem is that we have
>at least PR102211 and PR102154 (as well as reports elsewhere of bugs on
>other targets).  Code that used before doesn't anymore, and we have no
>clear way out, no recommendation how to fix this and a) keep the same
>functionality without huge changes, and b) keep the same machine code
>quality.
>
>I do not think something like that can be done.  That is why I am asking
>for the patch to be reverted until all of the groundwork for it has been
>done.  That includes making generic testcases that show how such subregs
>behave, so that we can see in testresults what changes do to existing
>targets.

Heh, I understood your earlier reply that you supported the change in principle 
based on the fact that nested subregs are invalid.

Now, I don't think that validate_subreg is supposed to be the decision maker on 
what a target allows. For subregs of hardregs we seem to have a good way of 
validating, but what do we have for subregs of pseudos? Is it the passes 
generating the new unsupported subregs that should do different things? Should 
validate_subreg use a target hook to allow those special casings we removed 
which all were necessary just for specific targets but appearantly did not do 
any harm for other targets? 

Can you give advice as to how to address the needs of the HFmode subregs on x86 
if not by adding another (generic) narrow exception in validate_subreg? 

That said, I fail to see a good way forward after now two appearantly failed 
attempts. 

Btw, I'm fine reverting the patch but then what's the solution here? 

Richard. 

>
>Segher



Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-08 Thread David Edelsohn via Gcc-patches
On Wed, Sep 8, 2021 at 1:10 PM Segher Boessenkool
 wrote:
>
> Hi!
>
> On Wed, Sep 08, 2021 at 08:42:44AM +0200, Richard Biener wrote:
> > On Wed, Sep 8, 2021 at 1:08 AM Segher Boessenkool
> >  wrote:
> > > The core of the problem is that subreg of pseudos has three meanings:
> > >   -- Paradoxical subregs;
> > >   -- Actual subregs;
> > >   -- "bit_cast" thingies: treat the same bits as something else.  Like
> > >  looking at the bits of a float as its memory image.
> > >
> > > Ignoring paradoxical subregs (as well as subregs of mem, which should
> > > have disappeared by now), and subregs of hard registers as well (those
> > > have *different* semantics after all), the other two kinds can be mixed,
> > > and *have to* be mixed, because subregs of subregs are non-canonical.
> > >
> > > Is there any reason why not to allow this kind of subreg?
> >
> > In fact the causing rev. in question 
> > (d2874d905647a1d146dafa60199d440e837adc4d)
> > made all those subregs "valid" in terms of what validate_subreg is verifying
> > and thus now the few places using validate_subreg to check whether some
> > subreg is valid will now happily do float<->int converting subregs.
>
> Like in PR102154:
>
> error: unrecognizable insn:
> (insn 11 10 12 2 (set (reg:SF 118 [  ])
> (subreg:SF (reg:DI 122) 0))
>
> This generated valid code before, and now it is not anymore.  If the
> patch was meant to allow *more*, it failed that goal: it allows *less*.
>
> The issue is that we do not allow SF subregs in all cases.  We do not
> store SFmode quantities in IEEE SP format in registers normally, so it
> would require expensive extra code to allow such subregs.
>
> Whatever pass made these subregs is wrong, because they do not pass
> recog().
>
> > I do agree that those subregs should be allowed and that the above rev. is
> > a strict improvement (given it removes a lot of "but allow special case X
> > because target Y wants it" cases by simply allowing all of them).  But the
> > previous code seems to have papered over quite some backend issues.
>
> It is not a good idea to do allow all those things.  Most backends can
> only support a few combinations of them, and everything else results in
> *worse* machine code, best case, and more and more complicated (and more
> buggy!) backend code.
>
> But that is a code quality issue.  The current problem is that we have
> at least PR102211 and PR102154 (as well as reports elsewhere of bugs on
> other targets).  Code that used before doesn't anymore, and we have no
> clear way out, no recommendation how to fix this and a) keep the same
> functionality without huge changes, and b) keep the same machine code
> quality.
>
> I do not think something like that can be done.  That is why I am asking
> for the patch to be reverted until all of the groundwork for it has been
> done.  That includes making generic testcases that show how such subregs
> behave, so that we can see in testresults what changes do to existing
> targets.

Richi,

As was mentioned in an earlier reply, the patch removed comments in
the code itself that explained why the code was necessary:

/* ??? This should not be here.  Temporarily continue to allow word_mode
subregs of anything.  The most common offender is (subreg:SI (reg:DF)).
Generally, backends are doing something sketchy but it'll take time to
fix them all.  */

/* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
is the culprit here, and not the backends.  */

The sources of the problem are issues in backends and issues in the
common part of the compiler.  Removal of this functionality should
start with a preemptive analysis of the compiler to determine how this
ugly behavior is being utilized and a plan to fix all backends and the
common code generation paths in the compiler, not a sweeping change to
the backend that flushes out problems.

The GCC Regression policy places the burden on the developer whose
patch introduced the problem.  Liuhong did not test any other targets
although the code comments clearly stated that the behavior was
required by many targets, and he is not stepping forward to resolve
the problems.  This patch should be reverted until the problems
exposed by it can be addressed in a careful and deliberate manner.

Thanks, David


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-08 Thread Segher Boessenkool
Hi!

On Wed, Sep 08, 2021 at 08:42:44AM +0200, Richard Biener wrote:
> On Wed, Sep 8, 2021 at 1:08 AM Segher Boessenkool
>  wrote:
> > The core of the problem is that subreg of pseudos has three meanings:
> >   -- Paradoxical subregs;
> >   -- Actual subregs;
> >   -- "bit_cast" thingies: treat the same bits as something else.  Like
> >  looking at the bits of a float as its memory image.
> >
> > Ignoring paradoxical subregs (as well as subregs of mem, which should
> > have disappeared by now), and subregs of hard registers as well (those
> > have *different* semantics after all), the other two kinds can be mixed,
> > and *have to* be mixed, because subregs of subregs are non-canonical.
> >
> > Is there any reason why not to allow this kind of subreg?
> 
> In fact the causing rev. in question 
> (d2874d905647a1d146dafa60199d440e837adc4d)
> made all those subregs "valid" in terms of what validate_subreg is verifying
> and thus now the few places using validate_subreg to check whether some
> subreg is valid will now happily do float<->int converting subregs.

Like in PR102154:

error: unrecognizable insn:
(insn 11 10 12 2 (set (reg:SF 118 [  ])
(subreg:SF (reg:DI 122) 0))

This generated valid code before, and now it is not anymore.  If the
patch was meant to allow *more*, it failed that goal: it allows *less*.

The issue is that we do not allow SF subregs in all cases.  We do not
store SFmode quantities in IEEE SP format in registers normally, so it
would require expensive extra code to allow such subregs.

Whatever pass made these subregs is wrong, because they do not pass
recog().

> I do agree that those subregs should be allowed and that the above rev. is
> a strict improvement (given it removes a lot of "but allow special case X
> because target Y wants it" cases by simply allowing all of them).  But the
> previous code seems to have papered over quite some backend issues.

It is not a good idea to do allow all those things.  Most backends can
only support a few combinations of them, and everything else results in
*worse* machine code, best case, and more and more complicated (and more
buggy!) backend code.

But that is a code quality issue.  The current problem is that we have
at least PR102211 and PR102154 (as well as reports elsewhere of bugs on
other targets).  Code that used before doesn't anymore, and we have no
clear way out, no recommendation how to fix this and a) keep the same
functionality without huge changes, and b) keep the same machine code
quality.

I do not think something like that can be done.  That is why I am asking
for the patch to be reverted until all of the groundwork for it has been
done.  That includes making generic testcases that show how such subregs
behave, so that we can see in testresults what changes do to existing
targets.


Segher


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-08 Thread Michael Meissner via Gcc-patches
On Tue, Sep 07, 2021 at 06:07:30PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Sep 07, 2021 at 03:12:36AM -0400, Michael Meissner wrote:
> > [PATCH] Fix SFmode subreg of DImode and TImode
> > 
> > This patch fixes the breakage in the PowerPC due to a recent change in 
> > SUBREG
> > behavior.
> 
> But what was that change?  And was that intentional?  If so, why wasn't
> it documented, was the existing behaviour considered buggy?  But the
> documentation agrees with the previous behaviour afaics.

I haven't investigated what the machine independent change was that suddenly
started allowing:

(subreg:SF (reg:TI ...))

This patch just allows the subreg's to exist instead of getting an insn not
found message.  An alternative approach would be to allow wider subregs in the
insn matching and deal with getting the right physical register when
splitting.

> > While it is arguable that the patch that caused the breakage should
> > be reverted, this patch should be a bandage to prevent these changes from
> > happening again.
> 
> NAK.  This patch will likely cause us to generate worse code.  If that
> is not the case it will need a long, in-depth explanation of why not.

Trying to replicate the problem, I get exactly the same code with an older
compiler.  I will attach the test program as an attachment.

> Sorry.
> 
> > I first noticed it in building the Spec 2017 wrf_r and blender_r
> > benchmarks.  Once I applied this patch, I also noticed several of the
> > tests now pass.
> > 
> > The core of the problem is we need to treat SUBREG's of SFmode and SImode
> > specially on the PowerPC.  This is due to the fact that SFmode values that 
> > are
> > in the vector and floating point registers are represented as DFmode.  When 
> > we
> > want to do a direct move between the GPR registers and the vector 
> > registers, we
> > have to convert the value from the DFmode representation to/from the SFmode
> > representation.
> 
> The core of the problem is that subreg of pseudos has three meanings:
>   -- Paradoxical subregs;
>   -- Actual subregs;
>   -- "bit_cast" thingies: treat the same bits as something else.  Like
>  looking at the bits of a float as its memory image.
> 
> Ignoring paradoxical subregs (as well as subregs of mem, which should
> have disappeared by now), and subregs of hard registers as well (those
> have *different* semantics after all), the other two kinds can be mixed,
> and *have to* be mixed, because subregs of subregs are non-canonical.

This isn't subregs of subregs.  This is a subreg of a larger integer type, for
example having a subreg representing an __int128_t or structure, and accessing
the bottom/top element.  Instead of doing an extract, the compiler generates a
subreg.

Before the machine independent part of the compiler would not generate these
subregs, and now it does.

> 
> Is there any reason why not to allow this kind of subreg?
> 
> If we want to not allow mixing bit_cast with subregs, we should make it
> its own RTL code.
> 
> > +  /* In case we are given a SUBREG for a larger type, reduce it to
> > +SImode.  */
> > +  if (mode == SFmode && GET_MODE_SIZE (inner_mode) > 4)
> > +   {
> > + rtx tmp = gen_reg_rtx (SImode);
> > + emit_move_insn (tmp, gen_lowpart (SImode, source));
> > + emit_insn (gen_movsf_from_si (dest, tmp));
> > + return true;
> > +   }
> 
> This makes it two separate insns.  Is that always optimised to code that
> is at least as good as before?

It is likely even if it generates an extra move, it will be better, because the
default would be to do the transfer via store on one register bank and load of
the other.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
union u_128 {
  __int128_t i128;
  long l[2];
  int i[4];
  double d[2];
  float f[4];
};

union u_64 {
  long l;
  int i[2];
  double d;
  float f[2];
};

float ret_u128_f0 (long a, long b, long x, long y)
{
  union u_128 u;

  u.l[0] = a ^ x;
  u.l[1] = b ^ y;

  return u.f[0];
}

float ret_u128_f1 (long a, long b, long x, long y)
{
  union u_128 u;

  u.l[0] = a ^ x;
  u.l[1] = b ^ y;

  return u.f[1];
}

float ret_u128_f2 (long a, long b, long x, long y)
{
  union u_128 u;

  u.l[0] = a ^ x;
  u.l[1] = b ^ x;

  return u.f[2];
}

float ret_u128_f3 (long a, long b, long x, long y)
{
  union u_128 u;

  u.l[0] = a ^ x;
  u.l[1] = b ^ x;

  return u.f[3];
}

float ret_u64_f0 (long a, long x)
{
  union u_64 u;

  u.l = a ^ x;

  return u.f[0];
}

float ret_u64_f1 (long a, long x)
{
  union u_64 u;

  u.l = a ^ x;

  return u.f[1];
}


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-08 Thread Richard Biener via Gcc-patches
On Wed, Sep 8, 2021 at 1:08 AM Segher Boessenkool
 wrote:
>
> Hi!
>
> On Tue, Sep 07, 2021 at 03:12:36AM -0400, Michael Meissner wrote:
> > [PATCH] Fix SFmode subreg of DImode and TImode
> >
> > This patch fixes the breakage in the PowerPC due to a recent change in 
> > SUBREG
> > behavior.
>
> But what was that change?  And was that intentional?  If so, why wasn't
> it documented, was the existing behaviour considered buggy?  But the
> documentation agrees with the previous behaviour afaics.
>
> > While it is arguable that the patch that caused the breakage should
> > be reverted, this patch should be a bandage to prevent these changes from
> > happening again.
>
> NAK.  This patch will likely cause us to generate worse code.  If that
> is not the case it will need a long, in-depth explanation of why not.
>
> Sorry.
>
> > I first noticed it in building the Spec 2017 wrf_r and blender_r
> > benchmarks.  Once I applied this patch, I also noticed several of the
> > tests now pass.
> >
> > The core of the problem is we need to treat SUBREG's of SFmode and SImode
> > specially on the PowerPC.  This is due to the fact that SFmode values that 
> > are
> > in the vector and floating point registers are represented as DFmode.  When 
> > we
> > want to do a direct move between the GPR registers and the vector 
> > registers, we
> > have to convert the value from the DFmode representation to/from the SFmode
> > representation.
>
> The core of the problem is that subreg of pseudos has three meanings:
>   -- Paradoxical subregs;
>   -- Actual subregs;
>   -- "bit_cast" thingies: treat the same bits as something else.  Like
>  looking at the bits of a float as its memory image.
>
> Ignoring paradoxical subregs (as well as subregs of mem, which should
> have disappeared by now), and subregs of hard registers as well (those
> have *different* semantics after all), the other two kinds can be mixed,
> and *have to* be mixed, because subregs of subregs are non-canonical.
>
> Is there any reason why not to allow this kind of subreg?

In fact the causing rev. in question (d2874d905647a1d146dafa60199d440e837adc4d)
made all those subregs "valid" in terms of what validate_subreg is verifying
and thus now the few places using validate_subreg to check whether some
subreg is valid will now happily do float<->int converting subregs.

I do agree that those subregs should be allowed and that the above rev. is
a strict improvement (given it removes a lot of "but allow special case X
because target Y wants it" cases by simply allowing all of them).  But the
previous code seems to have papered over quite some backend issues.

Now I have no opinion on the rs6000 patch fixing one of those issues.

Richard.

> If we want to not allow mixing bit_cast with subregs, we should make it
> its own RTL code.
>
> > +  /* In case we are given a SUBREG for a larger type, reduce it to
> > +  SImode.  */
> > +  if (mode == SFmode && GET_MODE_SIZE (inner_mode) > 4)
> > + {
> > +   rtx tmp = gen_reg_rtx (SImode);
> > +   emit_move_insn (tmp, gen_lowpart (SImode, source));
> > +   emit_insn (gen_movsf_from_si (dest, tmp));
> > +   return true;
> > + }
>
> This makes it two separate insns.  Is that always optimised to code that
> is at least as good as before?
>
>
> Segher


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-07 Thread Segher Boessenkool
Hi!

On Tue, Sep 07, 2021 at 03:12:36AM -0400, Michael Meissner wrote:
> [PATCH] Fix SFmode subreg of DImode and TImode
> 
> This patch fixes the breakage in the PowerPC due to a recent change in SUBREG
> behavior.

But what was that change?  And was that intentional?  If so, why wasn't
it documented, was the existing behaviour considered buggy?  But the
documentation agrees with the previous behaviour afaics.

> While it is arguable that the patch that caused the breakage should
> be reverted, this patch should be a bandage to prevent these changes from
> happening again.

NAK.  This patch will likely cause us to generate worse code.  If that
is not the case it will need a long, in-depth explanation of why not.

Sorry.

> I first noticed it in building the Spec 2017 wrf_r and blender_r
> benchmarks.  Once I applied this patch, I also noticed several of the
> tests now pass.
> 
> The core of the problem is we need to treat SUBREG's of SFmode and SImode
> specially on the PowerPC.  This is due to the fact that SFmode values that are
> in the vector and floating point registers are represented as DFmode.  When we
> want to do a direct move between the GPR registers and the vector registers, 
> we
> have to convert the value from the DFmode representation to/from the SFmode
> representation.

The core of the problem is that subreg of pseudos has three meanings:
  -- Paradoxical subregs;
  -- Actual subregs;
  -- "bit_cast" thingies: treat the same bits as something else.  Like
 looking at the bits of a float as its memory image.

Ignoring paradoxical subregs (as well as subregs of mem, which should
have disappeared by now), and subregs of hard registers as well (those
have *different* semantics after all), the other two kinds can be mixed,
and *have to* be mixed, because subregs of subregs are non-canonical.

Is there any reason why not to allow this kind of subreg?

If we want to not allow mixing bit_cast with subregs, we should make it
its own RTL code.

> +  /* In case we are given a SUBREG for a larger type, reduce it to
> +  SImode.  */
> +  if (mode == SFmode && GET_MODE_SIZE (inner_mode) > 4)
> + {
> +   rtx tmp = gen_reg_rtx (SImode);
> +   emit_move_insn (tmp, gen_lowpart (SImode, source));
> +   emit_insn (gen_movsf_from_si (dest, tmp));
> +   return true;
> + }

This makes it two separate insns.  Is that always optimised to code that
is at least as good as before?


Segher


[PATCH] Fix SFmode subreg of DImode and TImode

2021-09-07 Thread Michael Meissner via Gcc-patches
[PATCH] Fix SFmode subreg of DImode and TImode

This patch fixes the breakage in the PowerPC due to a recent change in SUBREG
behavior.  While it is arguable that the patch that caused the breakage should
be reverted, this patch should be a bandage to prevent these changes from
happening again.

I first noticed it in building the Spec 2017 wrf_r and blender_r
benchmarks.  Once I applied this patch, I also noticed several of the
tests now pass.

The core of the problem is we need to treat SUBREG's of SFmode and SImode
specially on the PowerPC.  This is due to the fact that SFmode values that are
in the vector and floating point registers are represented as DFmode.  When we
want to do a direct move between the GPR registers and the vector registers, we
have to convert the value from the DFmode representation to/from the SFmode
representation.

By doing this special processing instead of doing the transfer via store and
load, we were able to speed up the math library which at times want to use the
SFmode values in a union, and do logical operations on it (to test exponent
ranges, etc.) and then move it over to use as a floating point value.

I did a bootstrap build on a little endian power9 system with and without the
patch applied.  There was no regression in the tests.  I'm doing a build on a
big endian power8 system, but it hasn't finished yet as I sent this email.  I
will check on the big endian progress tomorrow morning.

The following tests now pass once again with the test.

C tests:

gcc.c-torture/compile/20071102-1.c
gcc.c-torture/compile/pr55921.c
gcc.c-torture/compile/pr85945.c
gcc.c-torture/execute/complex-3.c
gcc.dg/atomic/c11-atomic-exec-1.c
gcc.dg/atomic/c11-atomic-exec-2.c
gcc.dg/atomic/c11-atomic-exec-4.c
gcc.dg/atomic/c11-atomic-exec-5.c
gcc.dg/c11-atomic-2.c
gcc.dg/pr42475.c
gcc.dg/pr47201.c
gcc.dg/pr48335-1.c
gcc.dg/torture/pr67741.c
gcc.dg/tree-ssa/ssa-dom-thread-10.c
gcc.dg/tsan/pr88030.c
gcc.dg/ubsan/float-cast-overflow-atomic.c
gcc.dg/vect/no-tree-sra-bb-slp-pr50730.c

C++ tests:
==
g++.dg/opt/alias1.C
g++.dg/template/koenig6.C
g++.dg/torture/pr40924.C
tmpdir-g++.dg-struct-layout-1/t001

Fortran tests:
==
gfortran.dg/array_constructor_type_22.f03
gfortran.dg/array_function_6.f90
gfortran.dg/derived_comp_array_ref_7.f90
gfortran.dg/elemental_scalar_args_1.f90
gfortran.dg/elemental_subroutine_1.f90
gfortran.dg/inline_matmul_5.f90
gfortran.dg/inline_matmul_8.f90
gfortran.dg/inline_matmul_9.f90
gfortran.dg/matmul_bounds_6.f90
gfortran.dg/operator_1.f90
gfortran.dg/past_eor.f90
gfortran.dg/pr101121.f
gfortran.dg/pr91552.f90
gfortran.dg/spread_shape_1.f90
gfortran.dg/typebound_operator_3.f03
gfortran.dg/value_1.f90
gfortran.fortran-torture/execute/entry_4.f90
gfortran.fortran-torture/execute/intrinsic_dotprod.f90
gfortran.fortran-torture/execute/intrinsic_matmul.f90

Can I check this fix into the master branch?

2021-09-06  Michael Meissner  

gcc/

* config/rs6000/rs6000.c (rs6000_emit_move_si_sf_subreg): Deal
with SUBREGs of TImode and DImode.
---
 gcc/config/rs6000/rs6000.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index b9ebd56c993..7bbf29a3e1c 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -10942,6 +10942,16 @@ rs6000_emit_move_si_sf_subreg (rtx dest, rtx source, 
machine_mode mode)
  return true;
}
 
+  /* In case we are given a SUBREG for a larger type, reduce it to
+SImode.  */
+  if (mode == SFmode && GET_MODE_SIZE (inner_mode) > 4)
+   {
+ rtx tmp = gen_reg_rtx (SImode);
+ emit_move_insn (tmp, gen_lowpart (SImode, source));
+ emit_insn (gen_movsf_from_si (dest, tmp));
+ return true;
+   }
+
   if (mode == SFmode && inner_mode == SImode)
{
  emit_insn (gen_movsf_from_si (dest, inner_source));
-- 
2.31.1


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797