Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Fri, Sep 10, 2021 at 5:07 PM Segher Boessenkool wrote: > > On Fri, Sep 10, 2021 at 12:53:37PM +0200, Richard Biener wrote: > > On Fri, Sep 10, 2021 at 1:50 AM Segher Boessenkool > > wrote: > > > And many targets have strange rules for bit-strings in which modes can > > > be used as bit-strings in which other modes, and at what offsets in > > > which registers. Now perhaps none of that is optimal (I bet it isn't), > > > but changing this without a transition plan simply does not work. > > > > But we _do_ already allow some of them :/ Like > > Yes. And all of this is old and ingrained, and targets depend on the > status quo, so changing this will need more care and planning and > cooperation. It certainly is a worthwhile thing to improve, but it is > not a small project, and it requires a plan. > > > /* ??? Similarly, e.g. with (subreg:DF (reg:TI)). Though store_bit_field > > is the culprit here, and not the backends. */ > > else if (known_ge (osize, regsize) && known_ge (isize, osize)) > > ; > > > > so for the special case where 'regsize' matches osize it would be > > a bit-cast of a full register from int to float. But as written it also > > allows (subreg:XF (reg:TI)) which will likely wreck havoc? > > That does not pass the isize >= osize test? Or maybe I don't know > what XFmode is well enough :-) Hey I can read, I have source, and it > is Friday... TImode is 16 bytes but XFmode is 12. I meant to construct a case which passes all the tests but where definitely such kind of subreg is very odd to be "allowed" by validate_subreg - a target may of course have means to make sense of it (I don't see how x87 would though). > Ah. So XF has different size on 32-bit and on 64-bit, but that doesn't > even matter here. > > > Similar for the omode == word_mode check which allows > > (subreg:DI (reg:TF ..)). That is, the existing special-cases look > > too broad to me - and they probably exist because when validate_subreg > > rejects sth then we can't put it together later when expand split it > > into two subregs and a pseudo ... > > I said it before, and I'll say it again, it is a very important point: > expand should not try to optimise this, *at all*. And not just this, > not *anything*. Expand's job in the current compiler is only to > translate Gimple to RTL, and nothing more. +1 (or +10!), but unfortunately expand _is_ an important part of optimization - in particular when it gets to avoiding stack usage since we can never get rid of all effects of allocating a stack slot. Splitting to multiple insns with pseudos should be fine though, but it at least seems that splitting (subreg:FLOAT (reg:INT)) into (subreg:INT (reg:INT)) (subreg:FLOAT (reg:INT)) isn't always valid. Richard.
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Fri, Sep 10, 2021 at 11:09:31AM +0800, Hongtao Liu wrote: > On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool > wrote: > > way too long. But from that same history it follows that anything you > > do not super carefully (with testing everywhere) will cause some serious > Frankly, testing everywhere is too heavy a burden for developers, > after all, everyone has a limited variety of machines, and may not be > familiar with using other targets' simulators. Your change should be tested on enough relevant targets that we can have confidence it works properly. That does not necessarily mean you have to test everywhere yourself (although that is greatly appreciated, makes life easier for everyone, including yourself, and as David points out the cfarm is a great help). So you didn't realise your patch would wreak havoc on most other targets than what you tested on. It happens, it helps if you can avoid it, but you learn only from things that go wrong :-) (The patch has still not been reverted btw. I'll do this later tonight if you don't). > And back to the problem we were trying to solve at the beginning > (subreg:HF(reg:SI)), I guess this is not just a problem in x86 > backend, any backend can encounter similar problems, that's why we > remove all the weird cases in validate_subreg. Expand should simply expand to two statements: one doing the real subreg, the other doing the bit_cast. Segher
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Fri, Sep 10, 2021 at 12:53:37PM +0200, Richard Biener wrote: > On Fri, Sep 10, 2021 at 1:50 AM Segher Boessenkool > wrote: > > And many targets have strange rules for bit-strings in which modes can > > be used as bit-strings in which other modes, and at what offsets in > > which registers. Now perhaps none of that is optimal (I bet it isn't), > > but changing this without a transition plan simply does not work. > > But we _do_ already allow some of them :/ Like Yes. And all of this is old and ingrained, and targets depend on the status quo, so changing this will need more care and planning and cooperation. It certainly is a worthwhile thing to improve, but it is not a small project, and it requires a plan. > /* ??? Similarly, e.g. with (subreg:DF (reg:TI)). Though store_bit_field > is the culprit here, and not the backends. */ > else if (known_ge (osize, regsize) && known_ge (isize, osize)) > ; > > so for the special case where 'regsize' matches osize it would be > a bit-cast of a full register from int to float. But as written it also > allows (subreg:XF (reg:TI)) which will likely wreck havoc? That does not pass the isize >= osize test? Or maybe I don't know what XFmode is well enough :-) Hey I can read, I have source, and it is Friday... Ah. So XF has different size on 32-bit and on 64-bit, but that doesn't even matter here. > Similar for the omode == word_mode check which allows > (subreg:DI (reg:TF ..)). That is, the existing special-cases look > too broad to me - and they probably exist because when validate_subreg > rejects sth then we can't put it together later when expand split it > into two subregs and a pseudo ... I said it before, and I'll say it again, it is a very important point: expand should not try to optimise this, *at all*. And not just this, not *anything*. Expand's job in the current compiler is only to translate Gimple to RTL, and nothing more. When later passes try to optimise things they will always ask the target if it agrees with the change (all RTL has to pass recog() for example). This simplifies life. Segher
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Fri, Sep 10, 2021 at 10:08 PM David Edelsohn wrote: > > On Thu, Sep 9, 2021 at 11:03 PM Hongtao Liu wrote: > > > > On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool > > wrote: > > > > > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote: > > > > > I think we should (longer term) get rid of the overloaded meanings and > > > > > uses of subregs. One fairly simple thing is to make a new rtx code > > > > > "bit_cast" (or is there a nice short more traditional name for it?) > > > > > > > > But subreg _is_ bit_cast. > > > > > > It is not. (subreg:M (reg:N) O) for O>0, little-endian, is not a > > > bit_cast. It is taking a part of a register, or a single register from > > > a multi-register thing. Paradoxicals are not bit-casts either. > > > > > > Subregs from or to (but not both) integer modes are generally bit_cast, > > > yeah. > > > > > > > What is odd to me is that a "disallowed" subreg > > > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of > > > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) > > > > 0) 0). > > > > Of course that's nested and invalid but just push the inner subreg to a > > > > new pseudo and the thing becomes valid. > > > > > > Bingo. > > > > > > And many targets have strange rules for bit-strings in which modes can > > > be used as bit-strings in which other modes, and at what offsets in > > > which registers. Now perhaps none of that is optimal (I bet it isn't), > > > but changing this without a transition plan simply does not work. > > > > > > > > But that is not the core problem we had here. The behaviour of the > > > > > generic parts of the compiler was changed, without testing if that > > > > > works on other targets but x86. That is an understandable mistake, it > > > > > takes some experience to know where the morasses are. But this change > > > > > should have been accompanied by testcases exercising the changed code. > > > > > We would have clearly seen there are issues then, simply by watching > > > > > gcc-testresults@ (and/or maintainers are on top of the test results > > > > > anyway). Also, if there were testcases for this, we could have some > > > > > confidence that a change in this area is robust. > > > > > > > > Well, that only works if some maintainers that are familiar enough > > > > with all this chime in ;) > > > > > > Not really. It works always. And it works way better than the > > > pandemonium we now have with broken targets left and right. > > > > > > With testcases anyone can see if any specific target is broken here. > > > > > > > It's stage1 so it's understandable that some > > > > people (like me ...) are tyring to help people making progress even > > > > if that involves trying to decipher 30 years of GCC history in this > > > > area (without much success in the end as we see) ;) > > > > > > Yeah :-) And my thanks to you and everyone involved for tackling this > > > problematic part of GCC, which has been neglected and patched over for > > > way too long. But from that same history it follows that anything you > > > do not super carefully (with testing everywhere) will cause some serious > > Frankly, testing everywhere is too heavy a burden for developers, > > after all, everyone has a limited variety of machines, and may not be > > familiar with using other targets' simulators. > > Hongtao, > > This is why the GNU Toolchain community sponsors the GCC Compile Farm. > Not every architecture is provided, but a good subset. And there are > a few, powerful Linux on Power systems in the cluster, such as gcc135. > I see. > And the comments in the code warned of port-specific problems. Even > if you cannot test the patch yourself, you should have publicly > requested that other port maintainers test the patch to shake out > problems. Yes, sorry for the inconvenience. > > The Tree-SSA passes are mostly target-independent, but RTL is > target-dependent, which can elicit widely different behavior in the > RTL passes. When you make changes to RTL passes in the common parts > of the compiler, you must consider the impact on all targets. GCC > explicitly supports a wide variety of architectures, ABIs and OSes. > All of the developers strive to ensure that changes don't adversely > affect any targets. > > Thanks, David -- BR, Hongtao
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Thu, Sep 9, 2021 at 11:03 PM Hongtao Liu wrote: > > On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool > wrote: > > > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote: > > > > I think we should (longer term) get rid of the overloaded meanings and > > > > uses of subregs. One fairly simple thing is to make a new rtx code > > > > "bit_cast" (or is there a nice short more traditional name for it?) > > > > > > But subreg _is_ bit_cast. > > > > It is not. (subreg:M (reg:N) O) for O>0, little-endian, is not a > > bit_cast. It is taking a part of a register, or a single register from > > a multi-register thing. Paradoxicals are not bit-casts either. > > > > Subregs from or to (but not both) integer modes are generally bit_cast, > > yeah. > > > > > What is odd to me is that a "disallowed" subreg > > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of > > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) > > > 0) 0). > > > Of course that's nested and invalid but just push the inner subreg to a > > > new pseudo and the thing becomes valid. > > > > Bingo. > > > > And many targets have strange rules for bit-strings in which modes can > > be used as bit-strings in which other modes, and at what offsets in > > which registers. Now perhaps none of that is optimal (I bet it isn't), > > but changing this without a transition plan simply does not work. > > > > > > But that is not the core problem we had here. The behaviour of the > > > > generic parts of the compiler was changed, without testing if that > > > > works on other targets but x86. That is an understandable mistake, it > > > > takes some experience to know where the morasses are. But this change > > > > should have been accompanied by testcases exercising the changed code. > > > > We would have clearly seen there are issues then, simply by watching > > > > gcc-testresults@ (and/or maintainers are on top of the test results > > > > anyway). Also, if there were testcases for this, we could have some > > > > confidence that a change in this area is robust. > > > > > > Well, that only works if some maintainers that are familiar enough > > > with all this chime in ;) > > > > Not really. It works always. And it works way better than the > > pandemonium we now have with broken targets left and right. > > > > With testcases anyone can see if any specific target is broken here. > > > > > It's stage1 so it's understandable that some > > > people (like me ...) are tyring to help people making progress even > > > if that involves trying to decipher 30 years of GCC history in this > > > area (without much success in the end as we see) ;) > > > > Yeah :-) And my thanks to you and everyone involved for tackling this > > problematic part of GCC, which has been neglected and patched over for > > way too long. But from that same history it follows that anything you > > do not super carefully (with testing everywhere) will cause some serious > Frankly, testing everywhere is too heavy a burden for developers, > after all, everyone has a limited variety of machines, and may not be > familiar with using other targets' simulators. Hongtao, This is why the GNU Toolchain community sponsors the GCC Compile Farm. Not every architecture is provided, but a good subset. And there are a few, powerful Linux on Power systems in the cluster, such as gcc135. And the comments in the code warned of port-specific problems. Even if you cannot test the patch yourself, you should have publicly requested that other port maintainers test the patch to shake out problems. The Tree-SSA passes are mostly target-independent, but RTL is target-dependent, which can elicit widely different behavior in the RTL passes. When you make changes to RTL passes in the common parts of the compiler, you must consider the impact on all targets. GCC explicitly supports a wide variety of architectures, ABIs and OSes. All of the developers strive to ensure that changes don't adversely affect any targets. Thanks, David
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Fri, Sep 10, 2021 at 7:25 PM Hongtao Liu wrote: > > On Fri, Sep 10, 2021 at 6:54 PM Richard Biener > wrote: > > > > On Fri, Sep 10, 2021 at 5:03 AM Hongtao Liu wrote: > > > > > > On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool > > > wrote: > > > > > > > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote: > > > > > > I think we should (longer term) get rid of the overloaded meanings > > > > > > and > > > > > > uses of subregs. One fairly simple thing is to make a new rtx code > > > > > > "bit_cast" (or is there a nice short more traditional name for it?) > > > > > > > > > > But subreg _is_ bit_cast. > > > > > > > > It is not. (subreg:M (reg:N) O) for O>0, little-endian, is not a > > > > bit_cast. It is taking a part of a register, or a single register from > > > > a multi-register thing. Paradoxicals are not bit-casts either. > > > > > > > > Subregs from or to (but not both) integer modes are generally bit_cast, > > > > yeah. > > > > > > > > > What is odd to me is that a "disallowed" subreg > > > > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of > > > > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI > > > > > ..) 0) 0). > > > > > Of course that's nested and invalid but just push the inner subreg to > > > > > a > > > > > new pseudo and the thing becomes valid. > > > > > > > > Bingo. > > > > > > > > And many targets have strange rules for bit-strings in which modes can > > > > be used as bit-strings in which other modes, and at what offsets in > > > > which registers. Now perhaps none of that is optimal (I bet it isn't), > > > > but changing this without a transition plan simply does not work. > > > > > > > > > > But that is not the core problem we had here. The behaviour of the > > > > > > generic parts of the compiler was changed, without testing if that > > > > > > works on other targets but x86. That is an understandable mistake, > > > > > > it > > > > > > takes some experience to know where the morasses are. But this > > > > > > change > > > > > > should have been accompanied by testcases exercising the changed > > > > > > code. > > > > > > We would have clearly seen there are issues then, simply by watching > > > > > > gcc-testresults@ (and/or maintainers are on top of the test results > > > > > > anyway). Also, if there were testcases for this, we could have some > > > > > > confidence that a change in this area is robust. > > > > > > > > > > Well, that only works if some maintainers that are familiar enough > > > > > with all this chime in ;) > > > > > > > > Not really. It works always. And it works way better than the > > > > pandemonium we now have with broken targets left and right. > > > > > > > > With testcases anyone can see if any specific target is broken here. > > > > > > > > > It's stage1 so it's understandable that some > > > > > people (like me ...) are tyring to help people making progress even > > > > > if that involves trying to decipher 30 years of GCC history in this > > > > > area (without much success in the end as we see) ;) > > > > > > > > Yeah :-) And my thanks to you and everyone involved for tackling this > > > > problematic part of GCC, which has been neglected and patched over for > > > > way too long. But from that same history it follows that anything you > > > > do not super carefully (with testing everywhere) will cause some serious > > > Frankly, testing everywhere is too heavy a burden for developers, > > > after all, everyone has a limited variety of machines, and may not be > > > familiar with using other targets' simulators. > > > And back to the problem we were trying to solve at the beginning > > > (subreg:HF(reg:SI)), I guess this is not just a problem in x86 > > > backend, any backend can encounter similar problems, that's why we > > > remove all the weird cases in validate_subreg. > > > > So can you please revert the change for now? I think we need to go > > back to the issue in extract_bit_field - does it somehow work to use > > validate_subreg to avoid creating the subreg we ICE on in the first > > place and what happens then to code quality? > Sure, let me test the patch. Survive regtest, code quality seems to be ok. But lose some performance. .cfi_startproc - pinsrw $0, %edi, %xmm0 + movw%di, -2(%rsp) + pinsrw $0, -2(%rsp), %xmm0 ret .cfi_endproc Anyway I'll post the patches first and see if I can fix the performance issue in the x86 backend. > > > > Thanks, > > Richard. > > > > > > problems. And nonse of these are easy to fix at all -- there is a > > > > *reason* targets did this nastiness. > > > > > > > > > > p.s. Very unrelated... Should we have __builtin_bit_cast for C as > > > > > > well? > > > > > > Is there any reason this could not work? > > > > > > > > Still interested in this btw :-) (And still very unrelated.) > > > > > > > > > > > > Segher > > > > > > > > > > > > -- > > > BR, > > > Hongtao > > > > -- >
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Fri, Sep 10, 2021 at 6:54 PM Richard Biener wrote: > > On Fri, Sep 10, 2021 at 5:03 AM Hongtao Liu wrote: > > > > On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool > > wrote: > > > > > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote: > > > > > I think we should (longer term) get rid of the overloaded meanings and > > > > > uses of subregs. One fairly simple thing is to make a new rtx code > > > > > "bit_cast" (or is there a nice short more traditional name for it?) > > > > > > > > But subreg _is_ bit_cast. > > > > > > It is not. (subreg:M (reg:N) O) for O>0, little-endian, is not a > > > bit_cast. It is taking a part of a register, or a single register from > > > a multi-register thing. Paradoxicals are not bit-casts either. > > > > > > Subregs from or to (but not both) integer modes are generally bit_cast, > > > yeah. > > > > > > > What is odd to me is that a "disallowed" subreg > > > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of > > > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) > > > > 0) 0). > > > > Of course that's nested and invalid but just push the inner subreg to a > > > > new pseudo and the thing becomes valid. > > > > > > Bingo. > > > > > > And many targets have strange rules for bit-strings in which modes can > > > be used as bit-strings in which other modes, and at what offsets in > > > which registers. Now perhaps none of that is optimal (I bet it isn't), > > > but changing this without a transition plan simply does not work. > > > > > > > > But that is not the core problem we had here. The behaviour of the > > > > > generic parts of the compiler was changed, without testing if that > > > > > works on other targets but x86. That is an understandable mistake, it > > > > > takes some experience to know where the morasses are. But this change > > > > > should have been accompanied by testcases exercising the changed code. > > > > > We would have clearly seen there are issues then, simply by watching > > > > > gcc-testresults@ (and/or maintainers are on top of the test results > > > > > anyway). Also, if there were testcases for this, we could have some > > > > > confidence that a change in this area is robust. > > > > > > > > Well, that only works if some maintainers that are familiar enough > > > > with all this chime in ;) > > > > > > Not really. It works always. And it works way better than the > > > pandemonium we now have with broken targets left and right. > > > > > > With testcases anyone can see if any specific target is broken here. > > > > > > > It's stage1 so it's understandable that some > > > > people (like me ...) are tyring to help people making progress even > > > > if that involves trying to decipher 30 years of GCC history in this > > > > area (without much success in the end as we see) ;) > > > > > > Yeah :-) And my thanks to you and everyone involved for tackling this > > > problematic part of GCC, which has been neglected and patched over for > > > way too long. But from that same history it follows that anything you > > > do not super carefully (with testing everywhere) will cause some serious > > Frankly, testing everywhere is too heavy a burden for developers, > > after all, everyone has a limited variety of machines, and may not be > > familiar with using other targets' simulators. > > And back to the problem we were trying to solve at the beginning > > (subreg:HF(reg:SI)), I guess this is not just a problem in x86 > > backend, any backend can encounter similar problems, that's why we > > remove all the weird cases in validate_subreg. > > So can you please revert the change for now? I think we need to go > back to the issue in extract_bit_field - does it somehow work to use > validate_subreg to avoid creating the subreg we ICE on in the first > place and what happens then to code quality? Sure, let me test the patch. > > Thanks, > Richard. > > > > problems. And nonse of these are easy to fix at all -- there is a > > > *reason* targets did this nastiness. > > > > > > > > p.s. Very unrelated... Should we have __builtin_bit_cast for C as > > > > > well? > > > > > Is there any reason this could not work? > > > > > > Still interested in this btw :-) (And still very unrelated.) > > > > > > > > > Segher > > > > > > > > -- > > BR, > > Hongtao -- BR, Hongtao
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Fri, Sep 10, 2021 at 5:03 AM Hongtao Liu wrote: > > On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool > wrote: > > > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote: > > > > I think we should (longer term) get rid of the overloaded meanings and > > > > uses of subregs. One fairly simple thing is to make a new rtx code > > > > "bit_cast" (or is there a nice short more traditional name for it?) > > > > > > But subreg _is_ bit_cast. > > > > It is not. (subreg:M (reg:N) O) for O>0, little-endian, is not a > > bit_cast. It is taking a part of a register, or a single register from > > a multi-register thing. Paradoxicals are not bit-casts either. > > > > Subregs from or to (but not both) integer modes are generally bit_cast, > > yeah. > > > > > What is odd to me is that a "disallowed" subreg > > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of > > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) > > > 0) 0). > > > Of course that's nested and invalid but just push the inner subreg to a > > > new pseudo and the thing becomes valid. > > > > Bingo. > > > > And many targets have strange rules for bit-strings in which modes can > > be used as bit-strings in which other modes, and at what offsets in > > which registers. Now perhaps none of that is optimal (I bet it isn't), > > but changing this without a transition plan simply does not work. > > > > > > But that is not the core problem we had here. The behaviour of the > > > > generic parts of the compiler was changed, without testing if that > > > > works on other targets but x86. That is an understandable mistake, it > > > > takes some experience to know where the morasses are. But this change > > > > should have been accompanied by testcases exercising the changed code. > > > > We would have clearly seen there are issues then, simply by watching > > > > gcc-testresults@ (and/or maintainers are on top of the test results > > > > anyway). Also, if there were testcases for this, we could have some > > > > confidence that a change in this area is robust. > > > > > > Well, that only works if some maintainers that are familiar enough > > > with all this chime in ;) > > > > Not really. It works always. And it works way better than the > > pandemonium we now have with broken targets left and right. > > > > With testcases anyone can see if any specific target is broken here. > > > > > It's stage1 so it's understandable that some > > > people (like me ...) are tyring to help people making progress even > > > if that involves trying to decipher 30 years of GCC history in this > > > area (without much success in the end as we see) ;) > > > > Yeah :-) And my thanks to you and everyone involved for tackling this > > problematic part of GCC, which has been neglected and patched over for > > way too long. But from that same history it follows that anything you > > do not super carefully (with testing everywhere) will cause some serious > Frankly, testing everywhere is too heavy a burden for developers, > after all, everyone has a limited variety of machines, and may not be > familiar with using other targets' simulators. > And back to the problem we were trying to solve at the beginning > (subreg:HF(reg:SI)), I guess this is not just a problem in x86 > backend, any backend can encounter similar problems, that's why we > remove all the weird cases in validate_subreg. So can you please revert the change for now? I think we need to go back to the issue in extract_bit_field - does it somehow work to use validate_subreg to avoid creating the subreg we ICE on in the first place and what happens then to code quality? Thanks, Richard. > > problems. And nonse of these are easy to fix at all -- there is a > > *reason* targets did this nastiness. > > > > > > p.s. Very unrelated... Should we have __builtin_bit_cast for C as well? > > > > Is there any reason this could not work? > > > > Still interested in this btw :-) (And still very unrelated.) > > > > > > Segher > > > > -- > BR, > Hongtao
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Fri, Sep 10, 2021 at 1:50 AM Segher Boessenkool wrote: > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote: > > > I think we should (longer term) get rid of the overloaded meanings and > > > uses of subregs. One fairly simple thing is to make a new rtx code > > > "bit_cast" (or is there a nice short more traditional name for it?) > > > > But subreg _is_ bit_cast. > > It is not. (subreg:M (reg:N) O) for O>0, little-endian, is not a > bit_cast. It is taking a part of a register, or a single register from > a multi-register thing. Paradoxicals are not bit-casts either. > > Subregs from or to (but not both) integer modes are generally bit_cast, > yeah. > > > What is odd to me is that a "disallowed" subreg > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 0) > > 0). > > Of course that's nested and invalid but just push the inner subreg to a > > new pseudo and the thing becomes valid. > > Bingo. > > And many targets have strange rules for bit-strings in which modes can > be used as bit-strings in which other modes, and at what offsets in > which registers. Now perhaps none of that is optimal (I bet it isn't), > but changing this without a transition plan simply does not work. But we _do_ already allow some of them :/ Like /* ??? Similarly, e.g. with (subreg:DF (reg:TI)). Though store_bit_field is the culprit here, and not the backends. */ else if (known_ge (osize, regsize) && known_ge (isize, osize)) ; so for the special case where 'regsize' matches osize it would be a bit-cast of a full register from int to float. But as written it also allows (subreg:XF (reg:TI)) which will likely wreck havoc? Similar for the omode == word_mode check which allows (subreg:DI (reg:TF ..)). That is, the existing special-cases look too broad to me - and they probably exist because when validate_subreg rejects sth then we can't put it together later when expand split it into two subregs and a pseudo ... > > > But that is not the core problem we had here. The behaviour of the > > > generic parts of the compiler was changed, without testing if that > > > works on other targets but x86. That is an understandable mistake, it > > > takes some experience to know where the morasses are. But this change > > > should have been accompanied by testcases exercising the changed code. > > > We would have clearly seen there are issues then, simply by watching > > > gcc-testresults@ (and/or maintainers are on top of the test results > > > anyway). Also, if there were testcases for this, we could have some > > > confidence that a change in this area is robust. > > > > Well, that only works if some maintainers that are familiar enough > > with all this chime in ;) > > Not really. It works always. And it works way better than the > pandemonium we now have with broken targets left and right. > > With testcases anyone can see if any specific target is broken here. > > > It's stage1 so it's understandable that some > > people (like me ...) are tyring to help people making progress even > > if that involves trying to decipher 30 years of GCC history in this > > area (without much success in the end as we see) ;) > > Yeah :-) And my thanks to you and everyone involved for tackling this > problematic part of GCC, which has been neglected and patched over for > way too long. But from that same history it follows that anything you > do not super carefully (with testing everywhere) will cause some serious > problems. And nonse of these are easy to fix at all -- there is a > *reason* targets did this nastiness. > > > > p.s. Very unrelated... Should we have __builtin_bit_cast for C as well? > > > Is there any reason this could not work? > > Still interested in this btw :-) (And still very unrelated.) Sure, why not ... Richard. > > Segher
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Fri, Sep 10, 2021 at 7:49 AM Segher Boessenkool wrote: > > On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote: > > > I think we should (longer term) get rid of the overloaded meanings and > > > uses of subregs. One fairly simple thing is to make a new rtx code > > > "bit_cast" (or is there a nice short more traditional name for it?) > > > > But subreg _is_ bit_cast. > > It is not. (subreg:M (reg:N) O) for O>0, little-endian, is not a > bit_cast. It is taking a part of a register, or a single register from > a multi-register thing. Paradoxicals are not bit-casts either. > > Subregs from or to (but not both) integer modes are generally bit_cast, > yeah. > > > What is odd to me is that a "disallowed" subreg > > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of > > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 0) > > 0). > > Of course that's nested and invalid but just push the inner subreg to a > > new pseudo and the thing becomes valid. > > Bingo. > > And many targets have strange rules for bit-strings in which modes can > be used as bit-strings in which other modes, and at what offsets in > which registers. Now perhaps none of that is optimal (I bet it isn't), > but changing this without a transition plan simply does not work. > > > > But that is not the core problem we had here. The behaviour of the > > > generic parts of the compiler was changed, without testing if that > > > works on other targets but x86. That is an understandable mistake, it > > > takes some experience to know where the morasses are. But this change > > > should have been accompanied by testcases exercising the changed code. > > > We would have clearly seen there are issues then, simply by watching > > > gcc-testresults@ (and/or maintainers are on top of the test results > > > anyway). Also, if there were testcases for this, we could have some > > > confidence that a change in this area is robust. > > > > Well, that only works if some maintainers that are familiar enough > > with all this chime in ;) > > Not really. It works always. And it works way better than the > pandemonium we now have with broken targets left and right. > > With testcases anyone can see if any specific target is broken here. > > > It's stage1 so it's understandable that some > > people (like me ...) are tyring to help people making progress even > > if that involves trying to decipher 30 years of GCC history in this > > area (without much success in the end as we see) ;) > > Yeah :-) And my thanks to you and everyone involved for tackling this > problematic part of GCC, which has been neglected and patched over for > way too long. But from that same history it follows that anything you > do not super carefully (with testing everywhere) will cause some serious Frankly, testing everywhere is too heavy a burden for developers, after all, everyone has a limited variety of machines, and may not be familiar with using other targets' simulators. And back to the problem we were trying to solve at the beginning (subreg:HF(reg:SI)), I guess this is not just a problem in x86 backend, any backend can encounter similar problems, that's why we remove all the weird cases in validate_subreg. > problems. And nonse of these are easy to fix at all -- there is a > *reason* targets did this nastiness. > > > > p.s. Very unrelated... Should we have __builtin_bit_cast for C as well? > > > Is there any reason this could not work? > > Still interested in this btw :-) (And still very unrelated.) > > > Segher -- BR, Hongtao
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote: > > I think we should (longer term) get rid of the overloaded meanings and > > uses of subregs. One fairly simple thing is to make a new rtx code > > "bit_cast" (or is there a nice short more traditional name for it?) > > But subreg _is_ bit_cast. It is not. (subreg:M (reg:N) O) for O>0, little-endian, is not a bit_cast. It is taking a part of a register, or a single register from a multi-register thing. Paradoxicals are not bit-casts either. Subregs from or to (but not both) integer modes are generally bit_cast, yeah. > What is odd to me is that a "disallowed" subreg > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 0) 0). > Of course that's nested and invalid but just push the inner subreg to a > new pseudo and the thing becomes valid. Bingo. And many targets have strange rules for bit-strings in which modes can be used as bit-strings in which other modes, and at what offsets in which registers. Now perhaps none of that is optimal (I bet it isn't), but changing this without a transition plan simply does not work. > > But that is not the core problem we had here. The behaviour of the > > generic parts of the compiler was changed, without testing if that > > works on other targets but x86. That is an understandable mistake, it > > takes some experience to know where the morasses are. But this change > > should have been accompanied by testcases exercising the changed code. > > We would have clearly seen there are issues then, simply by watching > > gcc-testresults@ (and/or maintainers are on top of the test results > > anyway). Also, if there were testcases for this, we could have some > > confidence that a change in this area is robust. > > Well, that only works if some maintainers that are familiar enough > with all this chime in ;) Not really. It works always. And it works way better than the pandemonium we now have with broken targets left and right. With testcases anyone can see if any specific target is broken here. > It's stage1 so it's understandable that some > people (like me ...) are tyring to help people making progress even > if that involves trying to decipher 30 years of GCC history in this > area (without much success in the end as we see) ;) Yeah :-) And my thanks to you and everyone involved for tackling this problematic part of GCC, which has been neglected and patched over for way too long. But from that same history it follows that anything you do not super carefully (with testing everywhere) will cause some serious problems. And nonse of these are easy to fix at all -- there is a *reason* targets did this nastiness. > > p.s. Very unrelated... Should we have __builtin_bit_cast for C as well? > > Is there any reason this could not work? Still interested in this btw :-) (And still very unrelated.) Segher
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Thu, Sep 09, 2021 at 08:16:16AM +0200, Richard Biener wrote: > But subreg _is_ bit_cast. What is odd to me is that a "disallowed" subreg > like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of > validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 0) 0). > Of course that's nested and invalid but just push the inner subreg to a > new pseudo and the thing becomes valid. Basically I only checked the mode of the SUBREG and the value in SUBREG_REG. I agree (subreg:SF (subreg:SI (reg:TI))) should not be allowed. When I wrote the changes we didn't see SUBREG's as much to access the bottom or top element of a structure that was held in a larger int field. If the patch is not reverted, then we have to deal with this somehow. Such as with my patch that allows (subreg:SF (reg:TI xxx)) and (subreg:SF (reg:DI xxx)). The target hook idea might be too big of a hammer. I would need to play with it to see if there are other losses. On the PowerPC, the problematical subregs only occur when you are moving a register from GPR to a vector/float register or vice versa. If you move a subreg between two GPR registers or two vector/float registers, then you can just do the copy. Part of the issue is for some moves, you need to use a temporary register to convert the results, but of course we have the long standing 'all moves for a mode must be done in a single insn that covers all of the cases'. Then you get into secondary reload territory. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Tue, Sep 7, 2021 at 12:12 AM Michael Meissner via Gcc-patches < gcc-patches@gcc.gnu.org> wrote: > This patch fixes the breakage in the PowerPC due to a recent change in > SUBREG > behavior. While it is arguable that the patch that caused the breakage > should > be reverted, this patch should be a bandage to prevent these changes from > happening again. > FYI I'm dealing with the same problem in the RISC-V backend via https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102211 My current proposed solution is a new predicate f_register_operand to be used in patterns that use the f constraint which disallows the subregs that can't work in FP registers due to NaN-boxing. This just borrows a few lines of code that were deleted from validate_subreg and puts it in a RISC-V specific predicate function. It worked for newlib and glibc builds. I'm trying bootstrap testing now, but my target is a little slower than yours so this will take some time. Jim
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Wed, Sep 8, 2021 at 9:18 PM Segher Boessenkool wrote: > > On Wed, Sep 08, 2021 at 08:39:31PM +0200, Richard Biener wrote: > > On September 8, 2021 7:08:09 PM GMT+02:00, Segher Boessenkool > > wrote: > > >It is not a good idea to do allow all those things. Most backends can > > >only support a few combinations of them, and everything else results in > > >*worse* machine code, best case, and more and more complicated (and more > > >buggy!) backend code. > > > > > >But that is a code quality issue. The current problem is that we have > > >at least PR102211 and PR102154 (as well as reports elsewhere of bugs on > > >other targets). Code that used before doesn't anymore, and we have no > > >clear way out, no recommendation how to fix this and a) keep the same > > >functionality without huge changes, and b) keep the same machine code > > >quality. > > > > > >I do not think something like that can be done. That is why I am asking > > >for the patch to be reverted until all of the groundwork for it has been > > >done. That includes making generic testcases that show how such subregs > > >behave, so that we can see in testresults what changes do to existing > > >targets. > > > > Heh, I understood your earlier reply that you supported the change in > > principle based on the fact that nested subregs are invalid. > > Ah. No. I meant to lament the fact that we use subregs for multiple > things, so that doing a bit_cast of a real subreg has to be expressed as > just one subreg, which is clearly sub-optimal for most backends. > > I say *has to* because a subreg of a subreg is not valid RTL; it has to > be written as just one subreg. Which makes thing more confusing and > confused than this already non-trivial thing has to be. > > > Now, I don't think that validate_subreg is supposed to be the decision > > maker on what a target allows. > > Right, some target hook or macro or whatnot should. > > > For subregs of hardregs we seem to have a good way of validating, but > > what do we have for subregs of pseudos? Is it the passes generating > > the new unsupported subregs that should do different things? Should > > validate_subreg use a target hook to allow those special casings we > > removed which all were necessary just for specific targets but > > appearantly did not do any harm for other targets? > > Currently, we can disallow things in predicates and/or the instruction > conditions. > > > Can you give advice as to how to address the needs of the HFmode subregs on > > x86 if not by adding another (generic) narrow exception in validate_subreg? > > If I knew what the problem was, perhaps? Is this explained in some mail > somewhere? > > > That said, I fail to see a good way forward after now two appearantly > > failed attempts. > > > > Btw, I'm fine reverting the patch but then what's the solution here? > > I think we should (longer term) get rid of the overloaded meanings and > uses of subregs. One fairly simple thing is to make a new rtx code > "bit_cast" (or is there a nice short more traditional name for it?) But subreg _is_ bit_cast. What is odd to me is that a "disallowed" subreg like (subreg:SF (reg:TI ..) 0) magically becomes valid (in terms of validate_subreg) if you rewrite it as (subreg:SF (subreg:SI (reg:TI ..) 0) 0). Of course that's nested and invalid but just push the inner subreg to a new pseudo and the thing becomes valid. That was my original idea for dealing with the issue described in the thread including https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578437.html where the original issue is that extract_integral_bit_field creates a subreg that's not valid according to validate_subreg. So I thought well, extracting bitfields should be done on integer modes (thus pun everything to ints and back), but that had even more(?) fallout. > But that is not the core problem we had here. The behaviour of the > generic parts of the compiler was changed, without testing if that > works on other targets but x86. That is an understandable mistake, it > takes some experience to know where the morasses are. But this change > should have been accompanied by testcases exercising the changed code. > We would have clearly seen there are issues then, simply by watching > gcc-testresults@ (and/or maintainers are on top of the test results > anyway). Also, if there were testcases for this, we could have some > confidence that a change in this area is robust. Well, that only works if some maintainers that are familiar enough with all this chime in ;) It's stage1 so it's understandable that some people (like me ...) are tyring to help people making progress even if that involves trying to decipher 30 years of GCC history in this area (without much success in the end as we see) ;) Given validate_subreg's _intent_ is to disallow float subregs that change size exceptions like /* ??? This should not be here. Temporarily continue to allow word_mode subregs of anything. The most common
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Thu, Sep 9, 2021 at 3:17 AM Segher Boessenkool wrote: > > On Wed, Sep 08, 2021 at 08:39:31PM +0200, Richard Biener wrote: > > On September 8, 2021 7:08:09 PM GMT+02:00, Segher Boessenkool > > wrote: > > >It is not a good idea to do allow all those things. Most backends can > > >only support a few combinations of them, and everything else results in > > >*worse* machine code, best case, and more and more complicated (and more > > >buggy!) backend code. > > > > > >But that is a code quality issue. The current problem is that we have > > >at least PR102211 and PR102154 (as well as reports elsewhere of bugs on > > >other targets). Code that used before doesn't anymore, and we have no > > >clear way out, no recommendation how to fix this and a) keep the same > > >functionality without huge changes, and b) keep the same machine code > > >quality. > > > > > >I do not think something like that can be done. That is why I am asking > > >for the patch to be reverted until all of the groundwork for it has been > > >done. That includes making generic testcases that show how such subregs > > >behave, so that we can see in testresults what changes do to existing > > >targets. > > > > Heh, I understood your earlier reply that you supported the change in > > principle based on the fact that nested subregs are invalid. > > Ah. No. I meant to lament the fact that we use subregs for multiple > things, so that doing a bit_cast of a real subreg has to be expressed as > just one subreg, which is clearly sub-optimal for most backends. > > I say *has to* because a subreg of a subreg is not valid RTL; it has to > be written as just one subreg. Which makes thing more confusing and > confused than this already non-trivial thing has to be. > > > Now, I don't think that validate_subreg is supposed to be the decision > > maker on what a target allows. > > Right, some target hook or macro or whatnot should. > > > For subregs of hardregs we seem to have a good way of validating, but > > what do we have for subregs of pseudos? Is it the passes generating > > the new unsupported subregs that should do different things? Should > > validate_subreg use a target hook to allow those special casings we > > removed which all were necessary just for specific targets but > > appearantly did not do any harm for other targets? > > Currently, we can disallow things in predicates and/or the instruction > conditions. > > > Can you give advice as to how to address the needs of the HFmode subregs on > > x86 if not by adding another (generic) narrow exception in validate_subreg? > > If I knew what the problem was, perhaps? Is this explained in some mail > somewhere? it's at [1], target_hook may not be the best solution, if other targets support HFmode, they may encounter the same problem, and revisit it again. [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576827.html > > > That said, I fail to see a good way forward after now two appearantly > > failed attempts. > > > > Btw, I'm fine reverting the patch but then what's the solution here? > > I think we should (longer term) get rid of the overloaded meanings and > uses of subregs. One fairly simple thing is to make a new rtx code > "bit_cast" (or is there a nice short more traditional name for it?) > > But that is not the core problem we had here. The behaviour of the > generic parts of the compiler was changed, without testing if that > works on other targets but x86. That is an understandable mistake, it > takes some experience to know where the morasses are. But this change > should have been accompanied by testcases exercising the changed code. > We would have clearly seen there are issues then, simply by watching > gcc-testresults@ (and/or maintainers are on top of the test results > anyway). Also, if there were testcases for this, we could have some > confidence that a change in this area is robust. > > > Segher > > > p.s. Very unrelated... Should we have __builtin_bit_cast for C as well? > Is there any reason this could not work? -- BR, Hongtao
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Thu, Sep 9, 2021 at 3:17 AM Segher Boessenkool wrote: > > On Wed, Sep 08, 2021 at 08:39:31PM +0200, Richard Biener wrote: > > On September 8, 2021 7:08:09 PM GMT+02:00, Segher Boessenkool > > wrote: > > >It is not a good idea to do allow all those things. Most backends can > > >only support a few combinations of them, and everything else results in > > >*worse* machine code, best case, and more and more complicated (and more > > >buggy!) backend code. > > > > > >But that is a code quality issue. The current problem is that we have > > >at least PR102211 and PR102154 (as well as reports elsewhere of bugs on > > >other targets). Code that used before doesn't anymore, and we have no > > >clear way out, no recommendation how to fix this and a) keep the same > > >functionality without huge changes, and b) keep the same machine code > > >quality. > > > > > >I do not think something like that can be done. That is why I am asking > > >for the patch to be reverted until all of the groundwork for it has been > > >done. That includes making generic testcases that show how such subregs > > >behave, so that we can see in testresults what changes do to existing > > >targets. > > > > Heh, I understood your earlier reply that you supported the change in > > principle based on the fact that nested subregs are invalid. > > Ah. No. I meant to lament the fact that we use subregs for multiple > things, so that doing a bit_cast of a real subreg has to be expressed as > just one subreg, which is clearly sub-optimal for most backends. > > I say *has to* because a subreg of a subreg is not valid RTL; it has to > be written as just one subreg. Which makes thing more confusing and > confused than this already non-trivial thing has to be. > > > Now, I don't think that validate_subreg is supposed to be the decision > > maker on what a target allows. Since there is no good solution yet, I think we can add a target hook (targetm.validate_subreg) to recover the original removed code as a default behavior, and then x86 itself defines the hook as empty (meaning that the backend allows all kinds of subreg). something like @@ -922,6 +922,9 @@ validate_subreg (machine_mode omode, machine_mode imode, poly_uint64 regsize = REGMODE_NATURAL_SIZE (imode); + if (!targetm.valiate_subreg (omode, imode, reg, offset)) +return false; + /* Paradoxical subregs must have offset zero. */ if (maybe_gt (osize, isize)) return known_eq (offset, 0U); > > Right, some target hook or macro or whatnot should. > > > For subregs of hardregs we seem to have a good way of validating, but > > what do we have for subregs of pseudos? Is it the passes generating > > the new unsupported subregs that should do different things? Should > > validate_subreg use a target hook to allow those special casings we > > removed which all were necessary just for specific targets but > > appearantly did not do any harm for other targets? > > Currently, we can disallow things in predicates and/or the instruction > conditions. > > > Can you give advice as to how to address the needs of the HFmode subregs on > > x86 if not by adding another (generic) narrow exception in validate_subreg? > > If I knew what the problem was, perhaps? Is this explained in some mail > somewhere? > > > That said, I fail to see a good way forward after now two appearantly > > failed attempts. > > > > Btw, I'm fine reverting the patch but then what's the solution here? > > I think we should (longer term) get rid of the overloaded meanings and > uses of subregs. One fairly simple thing is to make a new rtx code > "bit_cast" (or is there a nice short more traditional name for it?) > > But that is not the core problem we had here. The behaviour of the > generic parts of the compiler was changed, without testing if that > works on other targets but x86. That is an understandable mistake, it > takes some experience to know where the morasses are. But this change > should have been accompanied by testcases exercising the changed code. > We would have clearly seen there are issues then, simply by watching > gcc-testresults@ (and/or maintainers are on top of the test results > anyway). Also, if there were testcases for this, we could have some > confidence that a change in this area is robust. > > > Segher > > > p.s. Very unrelated... Should we have __builtin_bit_cast for C as well? > Is there any reason this could not work? -- BR, Hongtao
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Wed, Sep 08, 2021 at 08:39:31PM +0200, Richard Biener wrote: > On September 8, 2021 7:08:09 PM GMT+02:00, Segher Boessenkool > wrote: > >It is not a good idea to do allow all those things. Most backends can > >only support a few combinations of them, and everything else results in > >*worse* machine code, best case, and more and more complicated (and more > >buggy!) backend code. > > > >But that is a code quality issue. The current problem is that we have > >at least PR102211 and PR102154 (as well as reports elsewhere of bugs on > >other targets). Code that used before doesn't anymore, and we have no > >clear way out, no recommendation how to fix this and a) keep the same > >functionality without huge changes, and b) keep the same machine code > >quality. > > > >I do not think something like that can be done. That is why I am asking > >for the patch to be reverted until all of the groundwork for it has been > >done. That includes making generic testcases that show how such subregs > >behave, so that we can see in testresults what changes do to existing > >targets. > > Heh, I understood your earlier reply that you supported the change in > principle based on the fact that nested subregs are invalid. Ah. No. I meant to lament the fact that we use subregs for multiple things, so that doing a bit_cast of a real subreg has to be expressed as just one subreg, which is clearly sub-optimal for most backends. I say *has to* because a subreg of a subreg is not valid RTL; it has to be written as just one subreg. Which makes thing more confusing and confused than this already non-trivial thing has to be. > Now, I don't think that validate_subreg is supposed to be the decision maker > on what a target allows. Right, some target hook or macro or whatnot should. > For subregs of hardregs we seem to have a good way of validating, but > what do we have for subregs of pseudos? Is it the passes generating > the new unsupported subregs that should do different things? Should > validate_subreg use a target hook to allow those special casings we > removed which all were necessary just for specific targets but > appearantly did not do any harm for other targets? Currently, we can disallow things in predicates and/or the instruction conditions. > Can you give advice as to how to address the needs of the HFmode subregs on > x86 if not by adding another (generic) narrow exception in validate_subreg? If I knew what the problem was, perhaps? Is this explained in some mail somewhere? > That said, I fail to see a good way forward after now two appearantly failed > attempts. > > Btw, I'm fine reverting the patch but then what's the solution here? I think we should (longer term) get rid of the overloaded meanings and uses of subregs. One fairly simple thing is to make a new rtx code "bit_cast" (or is there a nice short more traditional name for it?) But that is not the core problem we had here. The behaviour of the generic parts of the compiler was changed, without testing if that works on other targets but x86. That is an understandable mistake, it takes some experience to know where the morasses are. But this change should have been accompanied by testcases exercising the changed code. We would have clearly seen there are issues then, simply by watching gcc-testresults@ (and/or maintainers are on top of the test results anyway). Also, if there were testcases for this, we could have some confidence that a change in this area is robust. Segher p.s. Very unrelated... Should we have __builtin_bit_cast for C as well? Is there any reason this could not work?
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On September 8, 2021 7:08:09 PM GMT+02:00, Segher Boessenkool wrote: >Hi! > >On Wed, Sep 08, 2021 at 08:42:44AM +0200, Richard Biener wrote: >> On Wed, Sep 8, 2021 at 1:08 AM Segher Boessenkool >> wrote: >> > The core of the problem is that subreg of pseudos has three meanings: >> > -- Paradoxical subregs; >> > -- Actual subregs; >> > -- "bit_cast" thingies: treat the same bits as something else. Like >> > looking at the bits of a float as its memory image. >> > >> > Ignoring paradoxical subregs (as well as subregs of mem, which should >> > have disappeared by now), and subregs of hard registers as well (those >> > have *different* semantics after all), the other two kinds can be mixed, >> > and *have to* be mixed, because subregs of subregs are non-canonical. >> > >> > Is there any reason why not to allow this kind of subreg? >> >> In fact the causing rev. in question >> (d2874d905647a1d146dafa60199d440e837adc4d) >> made all those subregs "valid" in terms of what validate_subreg is verifying >> and thus now the few places using validate_subreg to check whether some >> subreg is valid will now happily do float<->int converting subregs. > >Like in PR102154: > >error: unrecognizable insn: >(insn 11 10 12 2 (set (reg:SF 118 [ ]) >(subreg:SF (reg:DI 122) 0)) > >This generated valid code before, and now it is not anymore. If the >patch was meant to allow *more*, it failed that goal: it allows *less*. > >The issue is that we do not allow SF subregs in all cases. We do not >store SFmode quantities in IEEE SP format in registers normally, so it >would require expensive extra code to allow such subregs. > >Whatever pass made these subregs is wrong, because they do not pass >recog(). > >> I do agree that those subregs should be allowed and that the above rev. is >> a strict improvement (given it removes a lot of "but allow special case X >> because target Y wants it" cases by simply allowing all of them). But the >> previous code seems to have papered over quite some backend issues. > >It is not a good idea to do allow all those things. Most backends can >only support a few combinations of them, and everything else results in >*worse* machine code, best case, and more and more complicated (and more >buggy!) backend code. > >But that is a code quality issue. The current problem is that we have >at least PR102211 and PR102154 (as well as reports elsewhere of bugs on >other targets). Code that used before doesn't anymore, and we have no >clear way out, no recommendation how to fix this and a) keep the same >functionality without huge changes, and b) keep the same machine code >quality. > >I do not think something like that can be done. That is why I am asking >for the patch to be reverted until all of the groundwork for it has been >done. That includes making generic testcases that show how such subregs >behave, so that we can see in testresults what changes do to existing >targets. Heh, I understood your earlier reply that you supported the change in principle based on the fact that nested subregs are invalid. Now, I don't think that validate_subreg is supposed to be the decision maker on what a target allows. For subregs of hardregs we seem to have a good way of validating, but what do we have for subregs of pseudos? Is it the passes generating the new unsupported subregs that should do different things? Should validate_subreg use a target hook to allow those special casings we removed which all were necessary just for specific targets but appearantly did not do any harm for other targets? Can you give advice as to how to address the needs of the HFmode subregs on x86 if not by adding another (generic) narrow exception in validate_subreg? That said, I fail to see a good way forward after now two appearantly failed attempts. Btw, I'm fine reverting the patch but then what's the solution here? Richard. > >Segher
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Wed, Sep 8, 2021 at 1:10 PM Segher Boessenkool wrote: > > Hi! > > On Wed, Sep 08, 2021 at 08:42:44AM +0200, Richard Biener wrote: > > On Wed, Sep 8, 2021 at 1:08 AM Segher Boessenkool > > wrote: > > > The core of the problem is that subreg of pseudos has three meanings: > > > -- Paradoxical subregs; > > > -- Actual subregs; > > > -- "bit_cast" thingies: treat the same bits as something else. Like > > > looking at the bits of a float as its memory image. > > > > > > Ignoring paradoxical subregs (as well as subregs of mem, which should > > > have disappeared by now), and subregs of hard registers as well (those > > > have *different* semantics after all), the other two kinds can be mixed, > > > and *have to* be mixed, because subregs of subregs are non-canonical. > > > > > > Is there any reason why not to allow this kind of subreg? > > > > In fact the causing rev. in question > > (d2874d905647a1d146dafa60199d440e837adc4d) > > made all those subregs "valid" in terms of what validate_subreg is verifying > > and thus now the few places using validate_subreg to check whether some > > subreg is valid will now happily do float<->int converting subregs. > > Like in PR102154: > > error: unrecognizable insn: > (insn 11 10 12 2 (set (reg:SF 118 [ ]) > (subreg:SF (reg:DI 122) 0)) > > This generated valid code before, and now it is not anymore. If the > patch was meant to allow *more*, it failed that goal: it allows *less*. > > The issue is that we do not allow SF subregs in all cases. We do not > store SFmode quantities in IEEE SP format in registers normally, so it > would require expensive extra code to allow such subregs. > > Whatever pass made these subregs is wrong, because they do not pass > recog(). > > > I do agree that those subregs should be allowed and that the above rev. is > > a strict improvement (given it removes a lot of "but allow special case X > > because target Y wants it" cases by simply allowing all of them). But the > > previous code seems to have papered over quite some backend issues. > > It is not a good idea to do allow all those things. Most backends can > only support a few combinations of them, and everything else results in > *worse* machine code, best case, and more and more complicated (and more > buggy!) backend code. > > But that is a code quality issue. The current problem is that we have > at least PR102211 and PR102154 (as well as reports elsewhere of bugs on > other targets). Code that used before doesn't anymore, and we have no > clear way out, no recommendation how to fix this and a) keep the same > functionality without huge changes, and b) keep the same machine code > quality. > > I do not think something like that can be done. That is why I am asking > for the patch to be reverted until all of the groundwork for it has been > done. That includes making generic testcases that show how such subregs > behave, so that we can see in testresults what changes do to existing > targets. Richi, As was mentioned in an earlier reply, the patch removed comments in the code itself that explained why the code was necessary: /* ??? This should not be here. Temporarily continue to allow word_mode subregs of anything. The most common offender is (subreg:SI (reg:DF)). Generally, backends are doing something sketchy but it'll take time to fix them all. */ /* ??? Similarly, e.g. with (subreg:DF (reg:TI)). Though store_bit_field is the culprit here, and not the backends. */ The sources of the problem are issues in backends and issues in the common part of the compiler. Removal of this functionality should start with a preemptive analysis of the compiler to determine how this ugly behavior is being utilized and a plan to fix all backends and the common code generation paths in the compiler, not a sweeping change to the backend that flushes out problems. The GCC Regression policy places the burden on the developer whose patch introduced the problem. Liuhong did not test any other targets although the code comments clearly stated that the behavior was required by many targets, and he is not stepping forward to resolve the problems. This patch should be reverted until the problems exposed by it can be addressed in a careful and deliberate manner. Thanks, David
Re: [PATCH] Fix SFmode subreg of DImode and TImode
Hi! On Wed, Sep 08, 2021 at 08:42:44AM +0200, Richard Biener wrote: > On Wed, Sep 8, 2021 at 1:08 AM Segher Boessenkool > wrote: > > The core of the problem is that subreg of pseudos has three meanings: > > -- Paradoxical subregs; > > -- Actual subregs; > > -- "bit_cast" thingies: treat the same bits as something else. Like > > looking at the bits of a float as its memory image. > > > > Ignoring paradoxical subregs (as well as subregs of mem, which should > > have disappeared by now), and subregs of hard registers as well (those > > have *different* semantics after all), the other two kinds can be mixed, > > and *have to* be mixed, because subregs of subregs are non-canonical. > > > > Is there any reason why not to allow this kind of subreg? > > In fact the causing rev. in question > (d2874d905647a1d146dafa60199d440e837adc4d) > made all those subregs "valid" in terms of what validate_subreg is verifying > and thus now the few places using validate_subreg to check whether some > subreg is valid will now happily do float<->int converting subregs. Like in PR102154: error: unrecognizable insn: (insn 11 10 12 2 (set (reg:SF 118 [ ]) (subreg:SF (reg:DI 122) 0)) This generated valid code before, and now it is not anymore. If the patch was meant to allow *more*, it failed that goal: it allows *less*. The issue is that we do not allow SF subregs in all cases. We do not store SFmode quantities in IEEE SP format in registers normally, so it would require expensive extra code to allow such subregs. Whatever pass made these subregs is wrong, because they do not pass recog(). > I do agree that those subregs should be allowed and that the above rev. is > a strict improvement (given it removes a lot of "but allow special case X > because target Y wants it" cases by simply allowing all of them). But the > previous code seems to have papered over quite some backend issues. It is not a good idea to do allow all those things. Most backends can only support a few combinations of them, and everything else results in *worse* machine code, best case, and more and more complicated (and more buggy!) backend code. But that is a code quality issue. The current problem is that we have at least PR102211 and PR102154 (as well as reports elsewhere of bugs on other targets). Code that used before doesn't anymore, and we have no clear way out, no recommendation how to fix this and a) keep the same functionality without huge changes, and b) keep the same machine code quality. I do not think something like that can be done. That is why I am asking for the patch to be reverted until all of the groundwork for it has been done. That includes making generic testcases that show how such subregs behave, so that we can see in testresults what changes do to existing targets. Segher
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Tue, Sep 07, 2021 at 06:07:30PM -0500, Segher Boessenkool wrote: > Hi! > > On Tue, Sep 07, 2021 at 03:12:36AM -0400, Michael Meissner wrote: > > [PATCH] Fix SFmode subreg of DImode and TImode > > > > This patch fixes the breakage in the PowerPC due to a recent change in > > SUBREG > > behavior. > > But what was that change? And was that intentional? If so, why wasn't > it documented, was the existing behaviour considered buggy? But the > documentation agrees with the previous behaviour afaics. I haven't investigated what the machine independent change was that suddenly started allowing: (subreg:SF (reg:TI ...)) This patch just allows the subreg's to exist instead of getting an insn not found message. An alternative approach would be to allow wider subregs in the insn matching and deal with getting the right physical register when splitting. > > While it is arguable that the patch that caused the breakage should > > be reverted, this patch should be a bandage to prevent these changes from > > happening again. > > NAK. This patch will likely cause us to generate worse code. If that > is not the case it will need a long, in-depth explanation of why not. Trying to replicate the problem, I get exactly the same code with an older compiler. I will attach the test program as an attachment. > Sorry. > > > I first noticed it in building the Spec 2017 wrf_r and blender_r > > benchmarks. Once I applied this patch, I also noticed several of the > > tests now pass. > > > > The core of the problem is we need to treat SUBREG's of SFmode and SImode > > specially on the PowerPC. This is due to the fact that SFmode values that > > are > > in the vector and floating point registers are represented as DFmode. When > > we > > want to do a direct move between the GPR registers and the vector > > registers, we > > have to convert the value from the DFmode representation to/from the SFmode > > representation. > > The core of the problem is that subreg of pseudos has three meanings: > -- Paradoxical subregs; > -- Actual subregs; > -- "bit_cast" thingies: treat the same bits as something else. Like > looking at the bits of a float as its memory image. > > Ignoring paradoxical subregs (as well as subregs of mem, which should > have disappeared by now), and subregs of hard registers as well (those > have *different* semantics after all), the other two kinds can be mixed, > and *have to* be mixed, because subregs of subregs are non-canonical. This isn't subregs of subregs. This is a subreg of a larger integer type, for example having a subreg representing an __int128_t or structure, and accessing the bottom/top element. Instead of doing an extract, the compiler generates a subreg. Before the machine independent part of the compiler would not generate these subregs, and now it does. > > Is there any reason why not to allow this kind of subreg? > > If we want to not allow mixing bit_cast with subregs, we should make it > its own RTL code. > > > + /* In case we are given a SUBREG for a larger type, reduce it to > > +SImode. */ > > + if (mode == SFmode && GET_MODE_SIZE (inner_mode) > 4) > > + { > > + rtx tmp = gen_reg_rtx (SImode); > > + emit_move_insn (tmp, gen_lowpart (SImode, source)); > > + emit_insn (gen_movsf_from_si (dest, tmp)); > > + return true; > > + } > > This makes it two separate insns. Is that always optimised to code that > is at least as good as before? It is likely even if it generates an extra move, it will be better, because the default would be to do the transfer via store on one register bank and load of the other. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797 union u_128 { __int128_t i128; long l[2]; int i[4]; double d[2]; float f[4]; }; union u_64 { long l; int i[2]; double d; float f[2]; }; float ret_u128_f0 (long a, long b, long x, long y) { union u_128 u; u.l[0] = a ^ x; u.l[1] = b ^ y; return u.f[0]; } float ret_u128_f1 (long a, long b, long x, long y) { union u_128 u; u.l[0] = a ^ x; u.l[1] = b ^ y; return u.f[1]; } float ret_u128_f2 (long a, long b, long x, long y) { union u_128 u; u.l[0] = a ^ x; u.l[1] = b ^ x; return u.f[2]; } float ret_u128_f3 (long a, long b, long x, long y) { union u_128 u; u.l[0] = a ^ x; u.l[1] = b ^ x; return u.f[3]; } float ret_u64_f0 (long a, long x) { union u_64 u; u.l = a ^ x; return u.f[0]; } float ret_u64_f1 (long a, long x) { union u_64 u; u.l = a ^ x; return u.f[1]; }
Re: [PATCH] Fix SFmode subreg of DImode and TImode
On Wed, Sep 8, 2021 at 1:08 AM Segher Boessenkool wrote: > > Hi! > > On Tue, Sep 07, 2021 at 03:12:36AM -0400, Michael Meissner wrote: > > [PATCH] Fix SFmode subreg of DImode and TImode > > > > This patch fixes the breakage in the PowerPC due to a recent change in > > SUBREG > > behavior. > > But what was that change? And was that intentional? If so, why wasn't > it documented, was the existing behaviour considered buggy? But the > documentation agrees with the previous behaviour afaics. > > > While it is arguable that the patch that caused the breakage should > > be reverted, this patch should be a bandage to prevent these changes from > > happening again. > > NAK. This patch will likely cause us to generate worse code. If that > is not the case it will need a long, in-depth explanation of why not. > > Sorry. > > > I first noticed it in building the Spec 2017 wrf_r and blender_r > > benchmarks. Once I applied this patch, I also noticed several of the > > tests now pass. > > > > The core of the problem is we need to treat SUBREG's of SFmode and SImode > > specially on the PowerPC. This is due to the fact that SFmode values that > > are > > in the vector and floating point registers are represented as DFmode. When > > we > > want to do a direct move between the GPR registers and the vector > > registers, we > > have to convert the value from the DFmode representation to/from the SFmode > > representation. > > The core of the problem is that subreg of pseudos has three meanings: > -- Paradoxical subregs; > -- Actual subregs; > -- "bit_cast" thingies: treat the same bits as something else. Like > looking at the bits of a float as its memory image. > > Ignoring paradoxical subregs (as well as subregs of mem, which should > have disappeared by now), and subregs of hard registers as well (those > have *different* semantics after all), the other two kinds can be mixed, > and *have to* be mixed, because subregs of subregs are non-canonical. > > Is there any reason why not to allow this kind of subreg? In fact the causing rev. in question (d2874d905647a1d146dafa60199d440e837adc4d) made all those subregs "valid" in terms of what validate_subreg is verifying and thus now the few places using validate_subreg to check whether some subreg is valid will now happily do float<->int converting subregs. I do agree that those subregs should be allowed and that the above rev. is a strict improvement (given it removes a lot of "but allow special case X because target Y wants it" cases by simply allowing all of them). But the previous code seems to have papered over quite some backend issues. Now I have no opinion on the rs6000 patch fixing one of those issues. Richard. > If we want to not allow mixing bit_cast with subregs, we should make it > its own RTL code. > > > + /* In case we are given a SUBREG for a larger type, reduce it to > > + SImode. */ > > + if (mode == SFmode && GET_MODE_SIZE (inner_mode) > 4) > > + { > > + rtx tmp = gen_reg_rtx (SImode); > > + emit_move_insn (tmp, gen_lowpart (SImode, source)); > > + emit_insn (gen_movsf_from_si (dest, tmp)); > > + return true; > > + } > > This makes it two separate insns. Is that always optimised to code that > is at least as good as before? > > > Segher
Re: [PATCH] Fix SFmode subreg of DImode and TImode
Hi! On Tue, Sep 07, 2021 at 03:12:36AM -0400, Michael Meissner wrote: > [PATCH] Fix SFmode subreg of DImode and TImode > > This patch fixes the breakage in the PowerPC due to a recent change in SUBREG > behavior. But what was that change? And was that intentional? If so, why wasn't it documented, was the existing behaviour considered buggy? But the documentation agrees with the previous behaviour afaics. > While it is arguable that the patch that caused the breakage should > be reverted, this patch should be a bandage to prevent these changes from > happening again. NAK. This patch will likely cause us to generate worse code. If that is not the case it will need a long, in-depth explanation of why not. Sorry. > I first noticed it in building the Spec 2017 wrf_r and blender_r > benchmarks. Once I applied this patch, I also noticed several of the > tests now pass. > > The core of the problem is we need to treat SUBREG's of SFmode and SImode > specially on the PowerPC. This is due to the fact that SFmode values that are > in the vector and floating point registers are represented as DFmode. When we > want to do a direct move between the GPR registers and the vector registers, > we > have to convert the value from the DFmode representation to/from the SFmode > representation. The core of the problem is that subreg of pseudos has three meanings: -- Paradoxical subregs; -- Actual subregs; -- "bit_cast" thingies: treat the same bits as something else. Like looking at the bits of a float as its memory image. Ignoring paradoxical subregs (as well as subregs of mem, which should have disappeared by now), and subregs of hard registers as well (those have *different* semantics after all), the other two kinds can be mixed, and *have to* be mixed, because subregs of subregs are non-canonical. Is there any reason why not to allow this kind of subreg? If we want to not allow mixing bit_cast with subregs, we should make it its own RTL code. > + /* In case we are given a SUBREG for a larger type, reduce it to > + SImode. */ > + if (mode == SFmode && GET_MODE_SIZE (inner_mode) > 4) > + { > + rtx tmp = gen_reg_rtx (SImode); > + emit_move_insn (tmp, gen_lowpart (SImode, source)); > + emit_insn (gen_movsf_from_si (dest, tmp)); > + return true; > + } This makes it two separate insns. Is that always optimised to code that is at least as good as before? Segher
[PATCH] Fix SFmode subreg of DImode and TImode
[PATCH] Fix SFmode subreg of DImode and TImode This patch fixes the breakage in the PowerPC due to a recent change in SUBREG behavior. While it is arguable that the patch that caused the breakage should be reverted, this patch should be a bandage to prevent these changes from happening again. I first noticed it in building the Spec 2017 wrf_r and blender_r benchmarks. Once I applied this patch, I also noticed several of the tests now pass. The core of the problem is we need to treat SUBREG's of SFmode and SImode specially on the PowerPC. This is due to the fact that SFmode values that are in the vector and floating point registers are represented as DFmode. When we want to do a direct move between the GPR registers and the vector registers, we have to convert the value from the DFmode representation to/from the SFmode representation. By doing this special processing instead of doing the transfer via store and load, we were able to speed up the math library which at times want to use the SFmode values in a union, and do logical operations on it (to test exponent ranges, etc.) and then move it over to use as a floating point value. I did a bootstrap build on a little endian power9 system with and without the patch applied. There was no regression in the tests. I'm doing a build on a big endian power8 system, but it hasn't finished yet as I sent this email. I will check on the big endian progress tomorrow morning. The following tests now pass once again with the test. C tests: gcc.c-torture/compile/20071102-1.c gcc.c-torture/compile/pr55921.c gcc.c-torture/compile/pr85945.c gcc.c-torture/execute/complex-3.c gcc.dg/atomic/c11-atomic-exec-1.c gcc.dg/atomic/c11-atomic-exec-2.c gcc.dg/atomic/c11-atomic-exec-4.c gcc.dg/atomic/c11-atomic-exec-5.c gcc.dg/c11-atomic-2.c gcc.dg/pr42475.c gcc.dg/pr47201.c gcc.dg/pr48335-1.c gcc.dg/torture/pr67741.c gcc.dg/tree-ssa/ssa-dom-thread-10.c gcc.dg/tsan/pr88030.c gcc.dg/ubsan/float-cast-overflow-atomic.c gcc.dg/vect/no-tree-sra-bb-slp-pr50730.c C++ tests: == g++.dg/opt/alias1.C g++.dg/template/koenig6.C g++.dg/torture/pr40924.C tmpdir-g++.dg-struct-layout-1/t001 Fortran tests: == gfortran.dg/array_constructor_type_22.f03 gfortran.dg/array_function_6.f90 gfortran.dg/derived_comp_array_ref_7.f90 gfortran.dg/elemental_scalar_args_1.f90 gfortran.dg/elemental_subroutine_1.f90 gfortran.dg/inline_matmul_5.f90 gfortran.dg/inline_matmul_8.f90 gfortran.dg/inline_matmul_9.f90 gfortran.dg/matmul_bounds_6.f90 gfortran.dg/operator_1.f90 gfortran.dg/past_eor.f90 gfortran.dg/pr101121.f gfortran.dg/pr91552.f90 gfortran.dg/spread_shape_1.f90 gfortran.dg/typebound_operator_3.f03 gfortran.dg/value_1.f90 gfortran.fortran-torture/execute/entry_4.f90 gfortran.fortran-torture/execute/intrinsic_dotprod.f90 gfortran.fortran-torture/execute/intrinsic_matmul.f90 Can I check this fix into the master branch? 2021-09-06 Michael Meissner gcc/ * config/rs6000/rs6000.c (rs6000_emit_move_si_sf_subreg): Deal with SUBREGs of TImode and DImode. --- gcc/config/rs6000/rs6000.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index b9ebd56c993..7bbf29a3e1c 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -10942,6 +10942,16 @@ rs6000_emit_move_si_sf_subreg (rtx dest, rtx source, machine_mode mode) return true; } + /* In case we are given a SUBREG for a larger type, reduce it to +SImode. */ + if (mode == SFmode && GET_MODE_SIZE (inner_mode) > 4) + { + rtx tmp = gen_reg_rtx (SImode); + emit_move_insn (tmp, gen_lowpart (SImode, source)); + emit_insn (gen_movsf_from_si (dest, tmp)); + return true; + } + if (mode == SFmode && inner_mode == SImode) { emit_insn (gen_movsf_from_si (dest, inner_source)); -- 2.31.1 -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797