Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-29 Thread Richard Biener via Gcc-patches
On Wed, Jul 29, 2020 at 9:49 AM Stefan Schulze Frielinghaus
 wrote:
>
> On Wed, Jul 29, 2020 at 09:11:12AM +0200, Richard Biener wrote:
> > On Tue, Jul 28, 2020 at 5:36 PM Stefan Schulze Frielinghaus
> >  wrote:
> > >
> > > On Tue, Jul 28, 2020 at 08:55:57AM +0200, Richard Biener wrote:
> > > > On Mon, Jul 27, 2020 at 4:20 PM Stefan Schulze Frielinghaus
> > > >  wrote:
> > > > >
> > > > > On Mon, Jul 27, 2020 at 12:29:11PM +0200, Richard Biener wrote:
> > > > > > On Mon, Jul 27, 2020 at 11:45 AM Richard Sandiford
> > > > > >  wrote:
> > > > > > >
> > > > > > > Richard Biener  writes:
> > > > > > > > On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
> > > > > > > >  wrote:
> > > > > > > >>
> > > > > > > >> Richard Biener via Gcc-patches  
> > > > > > > >> writes:
> > > > > > > >> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus 
> > > > > > > >> > via
> > > > > > > >> > Gcc-patches  wrote:
> > > > > > > >> >>
> > > > > > > >> >> This is a follow up to commit 5c9669a0e6c respectively 
> > > > > > > >> >> discussion
> > > > > > > >> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
> > > > > > > >> >>
> > > > > > > >> >> In case that an alignment constraint is less than the size 
> > > > > > > >> >> of a
> > > > > > > >> >> corresponding scalar type, ensure that we advance at least 
> > > > > > > >> >> by one
> > > > > > > >> >> iteration.  For example, on s390x we have for a long double 
> > > > > > > >> >> an alignment
> > > > > > > >> >> constraint of 8 bytes whereas the size is 16 bytes.  
> > > > > > > >> >> Therefore,
> > > > > > > >> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite 
> > > > > > > >> >> loop which
> > > > > > > >> >> can be reproduced by the following MWE:
> > > > > > > >> >
> > > > > > > >> > But we guard this case with vector_alignment_reachable_p, so 
> > > > > > > >> > we shouldn't
> > > > > > > >> > have ended up here and the patch looks bogus.
> > > > > > > >>
> > > > > > > >> The above sounds like it ought to count as reachable alignment 
> > > > > > > >> though.
> > > > > > > >> If a type requires a lower alignment than its size, then 
> > > > > > > >> that's even
> > > > > > > >> more easily reachable than a type that requires the same 
> > > > > > > >> alignment as
> > > > > > > >> the size.  I guess at one extreme, a target alignment of 1 is 
> > > > > > > >> always
> > > > > > > >> reachable.
> > > > > > > >
> > > > > > > > Well, if the element alignment is 8 but its size is 16 then 
> > > > > > > > when presumably
> > > > > > > > the desired vector alignment is a multiple of 16 we can never 
> > > > > > > > reach it.
> > > > > > > > Isn't this the case here?
> > > > > > >
> > > > > > > If the desired vector alignment (TARGET_ALIGN) is a multiple of 
> > > > > > > 16 then
> > > > > > > TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch 
> > > > > > > is
> > > > > > > fixing wouldn't occur.  I agree that we might never be able to 
> > > > > > > reach
> > > > > > > that alignment if the pointer starts out misaligned by 8 bytes.
> > > > > > >
> > > > > > > But I think that's why it makes sense for the target to only ask
> > > > > > > for 8-byte alignment for vectors too, if it can cope with it.  
> > > > > > > 8-byte
> > > > > > > alignment should always be achievable if the scalars are 
> > > > > > > ABI-aligned.
> > > > > > > And if the target does ask for only 8-byte alignment, 
> > > > > > > TARGET_ALIGN /
> > > > > > > DR_SIZE would be zero and the loop would never progress, which is 
> > > > > > > the
> > > > > > > problem that the patch is fixing.
> > > > > > >
> > > > > > > It would even make sense for the target to ask for 1-byte 
> > > > > > > alignment,
> > > > > > > if the target doesn't care about alignment at all.
> > > > > >
> > > > > > Hmm, OK.  Guess I still think we should detect this somewhere upward
> > > > > > and avoid this peeling compute at all.  Somehow.
> > > > >
> > > > > I've been playing around with another solution which works for me by
> > > > > changing vector_alignment_reachable_p to return also false if the
> > > > > alignment requirements are already satisfied, i.e., by adding:
> > > > >
> > > > > if (known_alignment_for_access_p (dr_info) && aligned_access_p 
> > > > > (dr_info))
> > > > >   return false;
> > > >
> > > > That sounds wrong, instead ...
> > >
> > > Can you elaborate on that?  A similar test exists for predicate
> > > vector_alignment_reachable_p where the second conjunct is the same but
> > > negated in order to test for the case where a misalignment is known:
> > > https://gcc.gnu.org/git?p=gcc.git;a=blob;f=gcc/tree-vect-data-refs.c;h=e35a215e042478d11d6545f1f829d816d0c3620f;hb=refs/heads/master#l1263
> > > Therefore, I'm wondering why the non-negated case should be wrong.
> > >
> > > > > Though, I'm not entirely sure whether this makes it better or not.
> > > > > Strictly speaking if the alignment was reachable before peeling, then
> > > > > reaching alignment with 

Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-29 Thread Stefan Schulze Frielinghaus via Gcc-patches
On Wed, Jul 29, 2020 at 09:11:12AM +0200, Richard Biener wrote:
> On Tue, Jul 28, 2020 at 5:36 PM Stefan Schulze Frielinghaus
>  wrote:
> >
> > On Tue, Jul 28, 2020 at 08:55:57AM +0200, Richard Biener wrote:
> > > On Mon, Jul 27, 2020 at 4:20 PM Stefan Schulze Frielinghaus
> > >  wrote:
> > > >
> > > > On Mon, Jul 27, 2020 at 12:29:11PM +0200, Richard Biener wrote:
> > > > > On Mon, Jul 27, 2020 at 11:45 AM Richard Sandiford
> > > > >  wrote:
> > > > > >
> > > > > > Richard Biener  writes:
> > > > > > > On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
> > > > > > >  wrote:
> > > > > > >>
> > > > > > >> Richard Biener via Gcc-patches  writes:
> > > > > > >> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
> > > > > > >> > Gcc-patches  wrote:
> > > > > > >> >>
> > > > > > >> >> This is a follow up to commit 5c9669a0e6c respectively 
> > > > > > >> >> discussion
> > > > > > >> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
> > > > > > >> >>
> > > > > > >> >> In case that an alignment constraint is less than the size of 
> > > > > > >> >> a
> > > > > > >> >> corresponding scalar type, ensure that we advance at least by 
> > > > > > >> >> one
> > > > > > >> >> iteration.  For example, on s390x we have for a long double 
> > > > > > >> >> an alignment
> > > > > > >> >> constraint of 8 bytes whereas the size is 16 bytes.  
> > > > > > >> >> Therefore,
> > > > > > >> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite 
> > > > > > >> >> loop which
> > > > > > >> >> can be reproduced by the following MWE:
> > > > > > >> >
> > > > > > >> > But we guard this case with vector_alignment_reachable_p, so 
> > > > > > >> > we shouldn't
> > > > > > >> > have ended up here and the patch looks bogus.
> > > > > > >>
> > > > > > >> The above sounds like it ought to count as reachable alignment 
> > > > > > >> though.
> > > > > > >> If a type requires a lower alignment than its size, then that's 
> > > > > > >> even
> > > > > > >> more easily reachable than a type that requires the same 
> > > > > > >> alignment as
> > > > > > >> the size.  I guess at one extreme, a target alignment of 1 is 
> > > > > > >> always
> > > > > > >> reachable.
> > > > > > >
> > > > > > > Well, if the element alignment is 8 but its size is 16 then when 
> > > > > > > presumably
> > > > > > > the desired vector alignment is a multiple of 16 we can never 
> > > > > > > reach it.
> > > > > > > Isn't this the case here?
> > > > > >
> > > > > > If the desired vector alignment (TARGET_ALIGN) is a multiple of 16 
> > > > > > then
> > > > > > TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch is
> > > > > > fixing wouldn't occur.  I agree that we might never be able to reach
> > > > > > that alignment if the pointer starts out misaligned by 8 bytes.
> > > > > >
> > > > > > But I think that's why it makes sense for the target to only ask
> > > > > > for 8-byte alignment for vectors too, if it can cope with it.  
> > > > > > 8-byte
> > > > > > alignment should always be achievable if the scalars are 
> > > > > > ABI-aligned.
> > > > > > And if the target does ask for only 8-byte alignment, TARGET_ALIGN /
> > > > > > DR_SIZE would be zero and the loop would never progress, which is 
> > > > > > the
> > > > > > problem that the patch is fixing.
> > > > > >
> > > > > > It would even make sense for the target to ask for 1-byte alignment,
> > > > > > if the target doesn't care about alignment at all.
> > > > >
> > > > > Hmm, OK.  Guess I still think we should detect this somewhere upward
> > > > > and avoid this peeling compute at all.  Somehow.
> > > >
> > > > I've been playing around with another solution which works for me by
> > > > changing vector_alignment_reachable_p to return also false if the
> > > > alignment requirements are already satisfied, i.e., by adding:
> > > >
> > > > if (known_alignment_for_access_p (dr_info) && aligned_access_p 
> > > > (dr_info))
> > > >   return false;
> > >
> > > That sounds wrong, instead ...
> >
> > Can you elaborate on that?  A similar test exists for predicate
> > vector_alignment_reachable_p where the second conjunct is the same but
> > negated in order to test for the case where a misalignment is known:
> > https://gcc.gnu.org/git?p=gcc.git;a=blob;f=gcc/tree-vect-data-refs.c;h=e35a215e042478d11d6545f1f829d816d0c3620f;hb=refs/heads/master#l1263
> > Therefore, I'm wondering why the non-negated case should be wrong.
> >
> > > > Though, I'm not entirely sure whether this makes it better or not.
> > > > Strictly speaking if the alignment was reachable before peeling, then
> > > > reaching alignment with peeling is also possible but probably not what
> > > > was intended.  So I guess returning false in this case is sensible.  Any
> > > > comments?
> > >
> > > ... why is the DR considered for peeling at all?  If it is already
> > > aligned there's
> > > no point to do that.
> >
> > Isn't the whole point of vector_alignment_reachable_p to check DRs in
> > order 

Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-29 Thread Richard Biener via Gcc-patches
On Tue, Jul 28, 2020 at 5:36 PM Stefan Schulze Frielinghaus
 wrote:
>
> On Tue, Jul 28, 2020 at 08:55:57AM +0200, Richard Biener wrote:
> > On Mon, Jul 27, 2020 at 4:20 PM Stefan Schulze Frielinghaus
> >  wrote:
> > >
> > > On Mon, Jul 27, 2020 at 12:29:11PM +0200, Richard Biener wrote:
> > > > On Mon, Jul 27, 2020 at 11:45 AM Richard Sandiford
> > > >  wrote:
> > > > >
> > > > > Richard Biener  writes:
> > > > > > On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
> > > > > >  wrote:
> > > > > >>
> > > > > >> Richard Biener via Gcc-patches  writes:
> > > > > >> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
> > > > > >> > Gcc-patches  wrote:
> > > > > >> >>
> > > > > >> >> This is a follow up to commit 5c9669a0e6c respectively 
> > > > > >> >> discussion
> > > > > >> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
> > > > > >> >>
> > > > > >> >> In case that an alignment constraint is less than the size of a
> > > > > >> >> corresponding scalar type, ensure that we advance at least by 
> > > > > >> >> one
> > > > > >> >> iteration.  For example, on s390x we have for a long double an 
> > > > > >> >> alignment
> > > > > >> >> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
> > > > > >> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite 
> > > > > >> >> loop which
> > > > > >> >> can be reproduced by the following MWE:
> > > > > >> >
> > > > > >> > But we guard this case with vector_alignment_reachable_p, so we 
> > > > > >> > shouldn't
> > > > > >> > have ended up here and the patch looks bogus.
> > > > > >>
> > > > > >> The above sounds like it ought to count as reachable alignment 
> > > > > >> though.
> > > > > >> If a type requires a lower alignment than its size, then that's 
> > > > > >> even
> > > > > >> more easily reachable than a type that requires the same alignment 
> > > > > >> as
> > > > > >> the size.  I guess at one extreme, a target alignment of 1 is 
> > > > > >> always
> > > > > >> reachable.
> > > > > >
> > > > > > Well, if the element alignment is 8 but its size is 16 then when 
> > > > > > presumably
> > > > > > the desired vector alignment is a multiple of 16 we can never reach 
> > > > > > it.
> > > > > > Isn't this the case here?
> > > > >
> > > > > If the desired vector alignment (TARGET_ALIGN) is a multiple of 16 
> > > > > then
> > > > > TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch is
> > > > > fixing wouldn't occur.  I agree that we might never be able to reach
> > > > > that alignment if the pointer starts out misaligned by 8 bytes.
> > > > >
> > > > > But I think that's why it makes sense for the target to only ask
> > > > > for 8-byte alignment for vectors too, if it can cope with it.  8-byte
> > > > > alignment should always be achievable if the scalars are ABI-aligned.
> > > > > And if the target does ask for only 8-byte alignment, TARGET_ALIGN /
> > > > > DR_SIZE would be zero and the loop would never progress, which is the
> > > > > problem that the patch is fixing.
> > > > >
> > > > > It would even make sense for the target to ask for 1-byte alignment,
> > > > > if the target doesn't care about alignment at all.
> > > >
> > > > Hmm, OK.  Guess I still think we should detect this somewhere upward
> > > > and avoid this peeling compute at all.  Somehow.
> > >
> > > I've been playing around with another solution which works for me by
> > > changing vector_alignment_reachable_p to return also false if the
> > > alignment requirements are already satisfied, i.e., by adding:
> > >
> > > if (known_alignment_for_access_p (dr_info) && aligned_access_p (dr_info))
> > >   return false;
> >
> > That sounds wrong, instead ...
>
> Can you elaborate on that?  A similar test exists for predicate
> vector_alignment_reachable_p where the second conjunct is the same but
> negated in order to test for the case where a misalignment is known:
> https://gcc.gnu.org/git?p=gcc.git;a=blob;f=gcc/tree-vect-data-refs.c;h=e35a215e042478d11d6545f1f829d816d0c3620f;hb=refs/heads/master#l1263
> Therefore, I'm wondering why the non-negated case should be wrong.
>
> > > Though, I'm not entirely sure whether this makes it better or not.
> > > Strictly speaking if the alignment was reachable before peeling, then
> > > reaching alignment with peeling is also possible but probably not what
> > > was intended.  So I guess returning false in this case is sensible.  Any
> > > comments?
> >
> > ... why is the DR considered for peeling at all?  If it is already
> > aligned there's
> > no point to do that.
>
> Isn't the whole point of vector_alignment_reachable_p to check DRs in
> order to decide whether peeling should be done or not?  At least this is
> my intuition and the reason why I was suggesting to return false in case
> it is aligned.

Doh, you are right - I confused the function to be a mere wrapper
around the VECTOR_ALIGNMENT_REACHABLE target hook.  But
yes, it's exactly what you say.  But with your suggested extra 

Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-28 Thread Stefan Schulze Frielinghaus via Gcc-patches
On Tue, Jul 28, 2020 at 08:55:57AM +0200, Richard Biener wrote:
> On Mon, Jul 27, 2020 at 4:20 PM Stefan Schulze Frielinghaus
>  wrote:
> >
> > On Mon, Jul 27, 2020 at 12:29:11PM +0200, Richard Biener wrote:
> > > On Mon, Jul 27, 2020 at 11:45 AM Richard Sandiford
> > >  wrote:
> > > >
> > > > Richard Biener  writes:
> > > > > On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
> > > > >  wrote:
> > > > >>
> > > > >> Richard Biener via Gcc-patches  writes:
> > > > >> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
> > > > >> > Gcc-patches  wrote:
> > > > >> >>
> > > > >> >> This is a follow up to commit 5c9669a0e6c respectively discussion
> > > > >> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
> > > > >> >>
> > > > >> >> In case that an alignment constraint is less than the size of a
> > > > >> >> corresponding scalar type, ensure that we advance at least by one
> > > > >> >> iteration.  For example, on s390x we have for a long double an 
> > > > >> >> alignment
> > > > >> >> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
> > > > >> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop 
> > > > >> >> which
> > > > >> >> can be reproduced by the following MWE:
> > > > >> >
> > > > >> > But we guard this case with vector_alignment_reachable_p, so we 
> > > > >> > shouldn't
> > > > >> > have ended up here and the patch looks bogus.
> > > > >>
> > > > >> The above sounds like it ought to count as reachable alignment 
> > > > >> though.
> > > > >> If a type requires a lower alignment than its size, then that's even
> > > > >> more easily reachable than a type that requires the same alignment as
> > > > >> the size.  I guess at one extreme, a target alignment of 1 is always
> > > > >> reachable.
> > > > >
> > > > > Well, if the element alignment is 8 but its size is 16 then when 
> > > > > presumably
> > > > > the desired vector alignment is a multiple of 16 we can never reach 
> > > > > it.
> > > > > Isn't this the case here?
> > > >
> > > > If the desired vector alignment (TARGET_ALIGN) is a multiple of 16 then
> > > > TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch is
> > > > fixing wouldn't occur.  I agree that we might never be able to reach
> > > > that alignment if the pointer starts out misaligned by 8 bytes.
> > > >
> > > > But I think that's why it makes sense for the target to only ask
> > > > for 8-byte alignment for vectors too, if it can cope with it.  8-byte
> > > > alignment should always be achievable if the scalars are ABI-aligned.
> > > > And if the target does ask for only 8-byte alignment, TARGET_ALIGN /
> > > > DR_SIZE would be zero and the loop would never progress, which is the
> > > > problem that the patch is fixing.
> > > >
> > > > It would even make sense for the target to ask for 1-byte alignment,
> > > > if the target doesn't care about alignment at all.
> > >
> > > Hmm, OK.  Guess I still think we should detect this somewhere upward
> > > and avoid this peeling compute at all.  Somehow.
> >
> > I've been playing around with another solution which works for me by
> > changing vector_alignment_reachable_p to return also false if the
> > alignment requirements are already satisfied, i.e., by adding:
> >
> > if (known_alignment_for_access_p (dr_info) && aligned_access_p (dr_info))
> >   return false;
> 
> That sounds wrong, instead ...

Can you elaborate on that?  A similar test exists for predicate
vector_alignment_reachable_p where the second conjunct is the same but
negated in order to test for the case where a misalignment is known:
https://gcc.gnu.org/git?p=gcc.git;a=blob;f=gcc/tree-vect-data-refs.c;h=e35a215e042478d11d6545f1f829d816d0c3620f;hb=refs/heads/master#l1263
Therefore, I'm wondering why the non-negated case should be wrong.

> > Though, I'm not entirely sure whether this makes it better or not.
> > Strictly speaking if the alignment was reachable before peeling, then
> > reaching alignment with peeling is also possible but probably not what
> > was intended.  So I guess returning false in this case is sensible.  Any
> > comments?
> 
> ... why is the DR considered for peeling at all?  If it is already
> aligned there's
> no point to do that.

Isn't the whole point of vector_alignment_reachable_p to check DRs in
order to decide whether peeling should be done or not?  At least this is
my intuition and the reason why I was suggesting to return false in case
it is aligned.

Cheers,
Stefan

> If we want to align another DR then the loop you fix
> should run on that DRs align/size, no?
> 
> Richard.
> 
> > Thanks,
> > Stefan
> >
> > >
> > > Richard.
> > >
> > > > Thanks,
> > > > Richard


Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-28 Thread Richard Biener via Gcc-patches
On Mon, Jul 27, 2020 at 4:20 PM Stefan Schulze Frielinghaus
 wrote:
>
> On Mon, Jul 27, 2020 at 12:29:11PM +0200, Richard Biener wrote:
> > On Mon, Jul 27, 2020 at 11:45 AM Richard Sandiford
> >  wrote:
> > >
> > > Richard Biener  writes:
> > > > On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
> > > >  wrote:
> > > >>
> > > >> Richard Biener via Gcc-patches  writes:
> > > >> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
> > > >> > Gcc-patches  wrote:
> > > >> >>
> > > >> >> This is a follow up to commit 5c9669a0e6c respectively discussion
> > > >> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
> > > >> >>
> > > >> >> In case that an alignment constraint is less than the size of a
> > > >> >> corresponding scalar type, ensure that we advance at least by one
> > > >> >> iteration.  For example, on s390x we have for a long double an 
> > > >> >> alignment
> > > >> >> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
> > > >> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop 
> > > >> >> which
> > > >> >> can be reproduced by the following MWE:
> > > >> >
> > > >> > But we guard this case with vector_alignment_reachable_p, so we 
> > > >> > shouldn't
> > > >> > have ended up here and the patch looks bogus.
> > > >>
> > > >> The above sounds like it ought to count as reachable alignment though.
> > > >> If a type requires a lower alignment than its size, then that's even
> > > >> more easily reachable than a type that requires the same alignment as
> > > >> the size.  I guess at one extreme, a target alignment of 1 is always
> > > >> reachable.
> > > >
> > > > Well, if the element alignment is 8 but its size is 16 then when 
> > > > presumably
> > > > the desired vector alignment is a multiple of 16 we can never reach it.
> > > > Isn't this the case here?
> > >
> > > If the desired vector alignment (TARGET_ALIGN) is a multiple of 16 then
> > > TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch is
> > > fixing wouldn't occur.  I agree that we might never be able to reach
> > > that alignment if the pointer starts out misaligned by 8 bytes.
> > >
> > > But I think that's why it makes sense for the target to only ask
> > > for 8-byte alignment for vectors too, if it can cope with it.  8-byte
> > > alignment should always be achievable if the scalars are ABI-aligned.
> > > And if the target does ask for only 8-byte alignment, TARGET_ALIGN /
> > > DR_SIZE would be zero and the loop would never progress, which is the
> > > problem that the patch is fixing.
> > >
> > > It would even make sense for the target to ask for 1-byte alignment,
> > > if the target doesn't care about alignment at all.
> >
> > Hmm, OK.  Guess I still think we should detect this somewhere upward
> > and avoid this peeling compute at all.  Somehow.
>
> I've been playing around with another solution which works for me by
> changing vector_alignment_reachable_p to return also false if the
> alignment requirements are already satisfied, i.e., by adding:
>
> if (known_alignment_for_access_p (dr_info) && aligned_access_p (dr_info))
>   return false;

That sounds wrong, instead ...

> Though, I'm not entirely sure whether this makes it better or not.
> Strictly speaking if the alignment was reachable before peeling, then
> reaching alignment with peeling is also possible but probably not what
> was intended.  So I guess returning false in this case is sensible.  Any
> comments?

... why is the DR considered for peeling at all?  If it is already
aligned there's
no point to do that.  If we want to align another DR then the loop you fix
should run on that DRs align/size, no?

Richard.

> Thanks,
> Stefan
>
> >
> > Richard.
> >
> > > Thanks,
> > > Richard


Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-27 Thread Stefan Schulze Frielinghaus via Gcc-patches
On Mon, Jul 27, 2020 at 12:29:11PM +0200, Richard Biener wrote:
> On Mon, Jul 27, 2020 at 11:45 AM Richard Sandiford
>  wrote:
> >
> > Richard Biener  writes:
> > > On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
> > >  wrote:
> > >>
> > >> Richard Biener via Gcc-patches  writes:
> > >> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
> > >> > Gcc-patches  wrote:
> > >> >>
> > >> >> This is a follow up to commit 5c9669a0e6c respectively discussion
> > >> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
> > >> >>
> > >> >> In case that an alignment constraint is less than the size of a
> > >> >> corresponding scalar type, ensure that we advance at least by one
> > >> >> iteration.  For example, on s390x we have for a long double an 
> > >> >> alignment
> > >> >> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
> > >> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
> > >> >> can be reproduced by the following MWE:
> > >> >
> > >> > But we guard this case with vector_alignment_reachable_p, so we 
> > >> > shouldn't
> > >> > have ended up here and the patch looks bogus.
> > >>
> > >> The above sounds like it ought to count as reachable alignment though.
> > >> If a type requires a lower alignment than its size, then that's even
> > >> more easily reachable than a type that requires the same alignment as
> > >> the size.  I guess at one extreme, a target alignment of 1 is always
> > >> reachable.
> > >
> > > Well, if the element alignment is 8 but its size is 16 then when 
> > > presumably
> > > the desired vector alignment is a multiple of 16 we can never reach it.
> > > Isn't this the case here?
> >
> > If the desired vector alignment (TARGET_ALIGN) is a multiple of 16 then
> > TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch is
> > fixing wouldn't occur.  I agree that we might never be able to reach
> > that alignment if the pointer starts out misaligned by 8 bytes.
> >
> > But I think that's why it makes sense for the target to only ask
> > for 8-byte alignment for vectors too, if it can cope with it.  8-byte
> > alignment should always be achievable if the scalars are ABI-aligned.
> > And if the target does ask for only 8-byte alignment, TARGET_ALIGN /
> > DR_SIZE would be zero and the loop would never progress, which is the
> > problem that the patch is fixing.
> >
> > It would even make sense for the target to ask for 1-byte alignment,
> > if the target doesn't care about alignment at all.
> 
> Hmm, OK.  Guess I still think we should detect this somewhere upward
> and avoid this peeling compute at all.  Somehow.

I've been playing around with another solution which works for me by
changing vector_alignment_reachable_p to return also false if the
alignment requirements are already satisfied, i.e., by adding:

if (known_alignment_for_access_p (dr_info) && aligned_access_p (dr_info))
  return false;

Though, I'm not entirely sure whether this makes it better or not.
Strictly speaking if the alignment was reachable before peeling, then
reaching alignment with peeling is also possible but probably not what
was intended.  So I guess returning false in this case is sensible.  Any
comments?

Thanks,
Stefan

> 
> Richard.
> 
> > Thanks,
> > Richard


Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-27 Thread Richard Biener via Gcc-patches
On Mon, Jul 27, 2020 at 11:45 AM Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
> >  wrote:
> >>
> >> Richard Biener via Gcc-patches  writes:
> >> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
> >> > Gcc-patches  wrote:
> >> >>
> >> >> This is a follow up to commit 5c9669a0e6c respectively discussion
> >> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
> >> >>
> >> >> In case that an alignment constraint is less than the size of a
> >> >> corresponding scalar type, ensure that we advance at least by one
> >> >> iteration.  For example, on s390x we have for a long double an alignment
> >> >> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
> >> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
> >> >> can be reproduced by the following MWE:
> >> >
> >> > But we guard this case with vector_alignment_reachable_p, so we shouldn't
> >> > have ended up here and the patch looks bogus.
> >>
> >> The above sounds like it ought to count as reachable alignment though.
> >> If a type requires a lower alignment than its size, then that's even
> >> more easily reachable than a type that requires the same alignment as
> >> the size.  I guess at one extreme, a target alignment of 1 is always
> >> reachable.
> >
> > Well, if the element alignment is 8 but its size is 16 then when presumably
> > the desired vector alignment is a multiple of 16 we can never reach it.
> > Isn't this the case here?
>
> If the desired vector alignment (TARGET_ALIGN) is a multiple of 16 then
> TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch is
> fixing wouldn't occur.  I agree that we might never be able to reach
> that alignment if the pointer starts out misaligned by 8 bytes.
>
> But I think that's why it makes sense for the target to only ask
> for 8-byte alignment for vectors too, if it can cope with it.  8-byte
> alignment should always be achievable if the scalars are ABI-aligned.
> And if the target does ask for only 8-byte alignment, TARGET_ALIGN /
> DR_SIZE would be zero and the loop would never progress, which is the
> problem that the patch is fixing.
>
> It would even make sense for the target to ask for 1-byte alignment,
> if the target doesn't care about alignment at all.

Hmm, OK.  Guess I still think we should detect this somewhere upward
and avoid this peeling compute at all.  Somehow.

Richard.

> Thanks,
> Richard


Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-27 Thread Richard Sandiford
Richard Biener  writes:
> On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
>  wrote:
>>
>> Richard Biener via Gcc-patches  writes:
>> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
>> > Gcc-patches  wrote:
>> >>
>> >> This is a follow up to commit 5c9669a0e6c respectively discussion
>> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
>> >>
>> >> In case that an alignment constraint is less than the size of a
>> >> corresponding scalar type, ensure that we advance at least by one
>> >> iteration.  For example, on s390x we have for a long double an alignment
>> >> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
>> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
>> >> can be reproduced by the following MWE:
>> >
>> > But we guard this case with vector_alignment_reachable_p, so we shouldn't
>> > have ended up here and the patch looks bogus.
>>
>> The above sounds like it ought to count as reachable alignment though.
>> If a type requires a lower alignment than its size, then that's even
>> more easily reachable than a type that requires the same alignment as
>> the size.  I guess at one extreme, a target alignment of 1 is always
>> reachable.
>
> Well, if the element alignment is 8 but its size is 16 then when presumably
> the desired vector alignment is a multiple of 16 we can never reach it.
> Isn't this the case here?

If the desired vector alignment (TARGET_ALIGN) is a multiple of 16 then
TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch is
fixing wouldn't occur.  I agree that we might never be able to reach
that alignment if the pointer starts out misaligned by 8 bytes.

But I think that's why it makes sense for the target to only ask
for 8-byte alignment for vectors too, if it can cope with it.  8-byte
alignment should always be achievable if the scalars are ABI-aligned.
And if the target does ask for only 8-byte alignment, TARGET_ALIGN /
DR_SIZE would be zero and the loop would never progress, which is the
problem that the patch is fixing.

It would even make sense for the target to ask for 1-byte alignment,
if the target doesn't care about alignment at all.

Thanks,
Richard


Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-27 Thread Richard Biener via Gcc-patches
On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
 wrote:
>
> Richard Biener via Gcc-patches  writes:
> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
> > Gcc-patches  wrote:
> >>
> >> This is a follow up to commit 5c9669a0e6c respectively discussion
> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
> >>
> >> In case that an alignment constraint is less than the size of a
> >> corresponding scalar type, ensure that we advance at least by one
> >> iteration.  For example, on s390x we have for a long double an alignment
> >> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
> >> can be reproduced by the following MWE:
> >
> > But we guard this case with vector_alignment_reachable_p, so we shouldn't
> > have ended up here and the patch looks bogus.
>
> The above sounds like it ought to count as reachable alignment though.
> If a type requires a lower alignment than its size, then that's even
> more easily reachable than a type that requires the same alignment as
> the size.  I guess at one extreme, a target alignment of 1 is always
> reachable.

Well, if the element alignment is 8 but its size is 16 then when presumably
the desired vector alignment is a multiple of 16 we can never reach it.
Isn't this the case here?

Richard.

> Thanks,
> Richard


Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-27 Thread Richard Sandiford
Richard Biener via Gcc-patches  writes:
> On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
> Gcc-patches  wrote:
>>
>> This is a follow up to commit 5c9669a0e6c respectively discussion
>> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
>>
>> In case that an alignment constraint is less than the size of a
>> corresponding scalar type, ensure that we advance at least by one
>> iteration.  For example, on s390x we have for a long double an alignment
>> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
>> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
>> can be reproduced by the following MWE:
>
> But we guard this case with vector_alignment_reachable_p, so we shouldn't
> have ended up here and the patch looks bogus.

The above sounds like it ought to count as reachable alignment though.
If a type requires a lower alignment than its size, then that's even
more easily reachable than a type that requires the same alignment as
the size.  I guess at one extreme, a target alignment of 1 is always
reachable.

Thanks,
Richard


Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-27 Thread Richard Biener via Gcc-patches
On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
Gcc-patches  wrote:
>
> This is a follow up to commit 5c9669a0e6c respectively discussion
> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
>
> In case that an alignment constraint is less than the size of a
> corresponding scalar type, ensure that we advance at least by one
> iteration.  For example, on s390x we have for a long double an alignment
> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
> can be reproduced by the following MWE:

But we guard this case with vector_alignment_reachable_p, so we shouldn't
have ended up here and the patch looks bogus.

Richard.

> extern long double *a;
> extern double *b;
> void fun(void) {
>   for (int i = 0; i < 42; i++)
> a[i] = b[i];
> }
>
> Increasing the number of peelings in each iteration at least by one
> fixes the issue for me.  Any comments?
>
> Bootstrapped and regtested on s390x.
>
> gcc/ChangeLog:
>
> * tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
> Ensure that loop variable npeel_tmp advances in each iteration.
> ---
>  gcc/tree-vect-data-refs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index e35a215e042..a78ae61d1b0 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -1779,7 +1779,7 @@ vect_enhance_data_refs_alignment (loop_vec_info 
> loop_vinfo)
>  {
>vect_peeling_hash_insert (_htab, loop_vinfo,
> dr_info, npeel_tmp);
> - npeel_tmp += target_align / dr_size;
> + npeel_tmp += MAX (1, target_align / dr_size);
>  }
>
>   one_misalignment_known = true;
> --
> 2.25.3
>


Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-24 Thread Richard Sandiford
Stefan Schulze Frielinghaus via Gcc-patches  writes:
> This is a follow up to commit 5c9669a0e6c respectively discussion
> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
>
> In case that an alignment constraint is less than the size of a
> corresponding scalar type, ensure that we advance at least by one
> iteration.  For example, on s390x we have for a long double an alignment
> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
> can be reproduced by the following MWE:
>
> extern long double *a;
> extern double *b;
> void fun(void) {
>   for (int i = 0; i < 42; i++)
> a[i] = b[i];
> }
>
> Increasing the number of peelings in each iteration at least by one
> fixes the issue for me.  Any comments?
>
> Bootstrapped and regtested on s390x.
>
> gcc/ChangeLog:
>
>   * tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
>   Ensure that loop variable npeel_tmp advances in each iteration.

OK, thanks.  (For the record, I wondered whether changing the
/ to CEIL (…) would be better, but given that the alignment must
be a power of 2, I guess there's no practical difference.)

Richard

> ---
>  gcc/tree-vect-data-refs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index e35a215e042..a78ae61d1b0 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -1779,7 +1779,7 @@ vect_enhance_data_refs_alignment (loop_vec_info 
> loop_vinfo)
>  {
>vect_peeling_hash_insert (_htab, loop_vinfo,
>   dr_info, npeel_tmp);
> -   npeel_tmp += target_align / dr_size;
> +   npeel_tmp += MAX (1, target_align / dr_size);
>  }
>  
> one_misalignment_known = true;


[PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-22 Thread Stefan Schulze Frielinghaus via Gcc-patches
This is a follow up to commit 5c9669a0e6c respectively discussion
https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html

In case that an alignment constraint is less than the size of a
corresponding scalar type, ensure that we advance at least by one
iteration.  For example, on s390x we have for a long double an alignment
constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
can be reproduced by the following MWE:

extern long double *a;
extern double *b;
void fun(void) {
  for (int i = 0; i < 42; i++)
a[i] = b[i];
}

Increasing the number of peelings in each iteration at least by one
fixes the issue for me.  Any comments?

Bootstrapped and regtested on s390x.

gcc/ChangeLog:

* tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
Ensure that loop variable npeel_tmp advances in each iteration.
---
 gcc/tree-vect-data-refs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index e35a215e042..a78ae61d1b0 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1779,7 +1779,7 @@ vect_enhance_data_refs_alignment (loop_vec_info 
loop_vinfo)
 {
   vect_peeling_hash_insert (_htab, loop_vinfo,
dr_info, npeel_tmp);
- npeel_tmp += target_align / dr_size;
+ npeel_tmp += MAX (1, target_align / dr_size);
 }
 
  one_misalignment_known = true;
-- 
2.25.3