Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS

2017-07-03 Thread Richard Biener
On Fri, Jun 30, 2017 at 12:37 PM, Bin.Cheng  wrote:
> On Wed, Jun 28, 2017 at 8:29 AM, Richard Biener
>  wrote:
>> On Tue, Jun 27, 2017 at 6:46 PM, Bin.Cheng  wrote:
>>> On Tue, Jun 27, 2017 at 3:59 PM, Richard Biener
>>>  wrote:
 On June 27, 2017 4:27:17 PM GMT+02:00, "Bin.Cheng"  
 wrote:
>On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
> wrote:
>> On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng 
>wrote:
>>> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng 
>wrote:
 Hi,
 I was asked by upstream to split the loop distribution patch into
>small ones.
 It is hard because data structure and algorithm are closely coupled
>together.
 Anyway, this is the patch series with smaller patches.  Basically I
>tried to
 separate data structure and bug-fix changes apart with one as the
>main patch.
 Note I only made necessary code refactoring in order to separate
>patch, apart
 from that, there is no change against the last version.

 This is the first patch introducing new internal function
>IFN_LOOP_DIST_ALIAS.
 GCC will distribute loops under condition of this function call.

 Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>> Hi,
>>> I need to update this patch fixing an issue in
>>> vect_loop_dist_alias_call.  The previous patch fails to find some
>>> IFN_LOOP_DIST_ALIAS calls.
>>>
>>> Bootstrap and test in series.  Is it OK?
>>
>> So I wonder if we really need to track ldist_alias_id or if we can do
>sth
>Yes, it is needed because otherwise we probably falsely trying to
>search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
>loop.
>
>> more "general", like tracking a copy_of or origin and then directly
>> go to nearest_common_dominator (loop->header, copy_of->header)
>> to find the controlling condition?
>I tend to not record any pointer in loop structure, it can easily go
>dangling for a across passes data structure.

 I didn't mean to record a pointer, just rename your field and make it more 
 general.  The common dominator thing shod still work, no?
>>> I might not be following.  If we record the original loop->num in the
>>> renamed field, nearest_common_dominator can't work because we don't
>>> have basic blocks to start the call?  The original loop could be
>>> eliminated at several points, for example, instantly after
>>> distribution, or folded in vectorizer for other loops distributed from
>>> the original loop.
>>> BTW, setting the copy_of/origin field in loop_version is not enough
>>> for this use case, all generated loops (actually, except the versioned
>>> loop) from distribution need to be set.
>>
>> Of course it would need to be set for all distributed loops.
>>
>> I'm not sure "loop vanishes" is the case to optimize for though.  If the loop
>> is still available then origin->header should work as BB.  If not then can't
>> we conclude, for the purpose of IFN_LOOP_DIST_ALIAS, that the whole
>> region must be dead?  We still have to identify it of course, but it means
>> we can fold stray IFN_LOOP_DIST_ALIAS calls left over after vectorization
>> to whatever we like?
>>

 As far as memory usage
>is concerned.  I actually don't need a whole integer to record the
>loop num.  I can simply restrict number of distributions in one
>function to at most 256, and record such id in a char field in struct
>loop?  Does this sounds better?

 As said, tracking loop origin sounds useful anyway so I'd rather add and 
 use that somehow.
>>> To be honest, I don't know.  the current field works like a unique
>>> index of distribution operation.  The original loop could be destroyed
>>> at different points thus no longer exists, this makes the recorded
>>> copy_of/origin less meaningful?
>>
>> I think we talked about prologue and epilogue loops to be easier identifiable
>> as so (and as to what "main" loop).  So lets say we have one "origin" field
>> and accompaning flags "origin_is_loop_dist_alias_version",
>> "origin_is_main_loop_of_prologue", etc.?  I can't think of the case where
>> origin would be two things at the same time (you can always walk up
>> the origin tree).
> Hi,
> Here is the updated patch working in this way.  There is still one
> problem with this method.  Considering one distributed loop is
> if-converted later, the orig loop for if-converted distributed loop is
> different.  Though we can update orig_loop_num, it's inaccurate and
> one consequence is we need to walk up dominance tree till entry_block.
> Note if orig_loop_num is not shared, we can stop once basic block goes
> beyond outer loop.
> I didn't introduce flags in this patch, which can be done along with
> solving the above problem.
>
>>
>> Vanishing origins could also be cleared btw, or we can make the new origin
>> the origin of the vanishing origin...
> Yes, I 

Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS

2017-06-30 Thread Bin.Cheng
On Wed, Jun 28, 2017 at 8:29 AM, Richard Biener
 wrote:
> On Tue, Jun 27, 2017 at 6:46 PM, Bin.Cheng  wrote:
>> On Tue, Jun 27, 2017 at 3:59 PM, Richard Biener
>>  wrote:
>>> On June 27, 2017 4:27:17 PM GMT+02:00, "Bin.Cheng"  
>>> wrote:
On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
 wrote:
> On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng 
wrote:
>> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng 
wrote:
>>> Hi,
>>> I was asked by upstream to split the loop distribution patch into
small ones.
>>> It is hard because data structure and algorithm are closely coupled
together.
>>> Anyway, this is the patch series with smaller patches.  Basically I
tried to
>>> separate data structure and bug-fix changes apart with one as the
main patch.
>>> Note I only made necessary code refactoring in order to separate
patch, apart
>>> from that, there is no change against the last version.
>>>
>>> This is the first patch introducing new internal function
IFN_LOOP_DIST_ALIAS.
>>> GCC will distribute loops under condition of this function call.
>>>
>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>> Hi,
>> I need to update this patch fixing an issue in
>> vect_loop_dist_alias_call.  The previous patch fails to find some
>> IFN_LOOP_DIST_ALIAS calls.
>>
>> Bootstrap and test in series.  Is it OK?
>
> So I wonder if we really need to track ldist_alias_id or if we can do
sth
Yes, it is needed because otherwise we probably falsely trying to
search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
loop.

> more "general", like tracking a copy_of or origin and then directly
> go to nearest_common_dominator (loop->header, copy_of->header)
> to find the controlling condition?
I tend to not record any pointer in loop structure, it can easily go
dangling for a across passes data structure.
>>>
>>> I didn't mean to record a pointer, just rename your field and make it more 
>>> general.  The common dominator thing shod still work, no?
>> I might not be following.  If we record the original loop->num in the
>> renamed field, nearest_common_dominator can't work because we don't
>> have basic blocks to start the call?  The original loop could be
>> eliminated at several points, for example, instantly after
>> distribution, or folded in vectorizer for other loops distributed from
>> the original loop.
>> BTW, setting the copy_of/origin field in loop_version is not enough
>> for this use case, all generated loops (actually, except the versioned
>> loop) from distribution need to be set.
>
> Of course it would need to be set for all distributed loops.
>
> I'm not sure "loop vanishes" is the case to optimize for though.  If the loop
> is still available then origin->header should work as BB.  If not then can't
> we conclude, for the purpose of IFN_LOOP_DIST_ALIAS, that the whole
> region must be dead?  We still have to identify it of course, but it means
> we can fold stray IFN_LOOP_DIST_ALIAS calls left over after vectorization
> to whatever we like?
>
>>>
>>> As far as memory usage
is concerned.  I actually don't need a whole integer to record the
loop num.  I can simply restrict number of distributions in one
function to at most 256, and record such id in a char field in struct
loop?  Does this sounds better?
>>>
>>> As said, tracking loop origin sounds useful anyway so I'd rather add and 
>>> use that somehow.
>> To be honest, I don't know.  the current field works like a unique
>> index of distribution operation.  The original loop could be destroyed
>> at different points thus no longer exists, this makes the recorded
>> copy_of/origin less meaningful?
>
> I think we talked about prologue and epilogue loops to be easier identifiable
> as so (and as to what "main" loop).  So lets say we have one "origin" field
> and accompaning flags "origin_is_loop_dist_alias_version",
> "origin_is_main_loop_of_prologue", etc.?  I can't think of the case where
> origin would be two things at the same time (you can always walk up
> the origin tree).
Hi,
Here is the updated patch working in this way.  There is still one
problem with this method.  Considering one distributed loop is
if-converted later, the orig loop for if-converted distributed loop is
different.  Though we can update orig_loop_num, it's inaccurate and
one consequence is we need to walk up dominance tree till entry_block.
Note if orig_loop_num is not shared, we can stop once basic block goes
beyond outer loop.
I didn't introduce flags in this patch, which can be done along with
solving the above problem.

>
> Vanishing origins could also be cleared btw, or we can make the new origin
> the origin of the vanishing origin...
Yes, I did this in distribution patch, so that dominance walking
routine can be simplified.

Bootstrap and test.  Is it OK?

Thanks,
bin

2017-06-27  Bin Cheng  

* cfgloop.h (st

Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS

2017-06-28 Thread Richard Biener
On Tue, Jun 27, 2017 at 6:46 PM, Bin.Cheng  wrote:
> On Tue, Jun 27, 2017 at 3:59 PM, Richard Biener
>  wrote:
>> On June 27, 2017 4:27:17 PM GMT+02:00, "Bin.Cheng"  
>> wrote:
>>>On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
>>> wrote:
 On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng 
>>>wrote:
> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng 
>>>wrote:
>> Hi,
>> I was asked by upstream to split the loop distribution patch into
>>>small ones.
>> It is hard because data structure and algorithm are closely coupled
>>>together.
>> Anyway, this is the patch series with smaller patches.  Basically I
>>>tried to
>> separate data structure and bug-fix changes apart with one as the
>>>main patch.
>> Note I only made necessary code refactoring in order to separate
>>>patch, apart
>> from that, there is no change against the last version.
>>
>> This is the first patch introducing new internal function
>>>IFN_LOOP_DIST_ALIAS.
>> GCC will distribute loops under condition of this function call.
>>
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
> Hi,
> I need to update this patch fixing an issue in
> vect_loop_dist_alias_call.  The previous patch fails to find some
> IFN_LOOP_DIST_ALIAS calls.
>
> Bootstrap and test in series.  Is it OK?

 So I wonder if we really need to track ldist_alias_id or if we can do
>>>sth
>>>Yes, it is needed because otherwise we probably falsely trying to
>>>search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
>>>loop.
>>>
 more "general", like tracking a copy_of or origin and then directly
 go to nearest_common_dominator (loop->header, copy_of->header)
 to find the controlling condition?
>>>I tend to not record any pointer in loop structure, it can easily go
>>>dangling for a across passes data structure.
>>
>> I didn't mean to record a pointer, just rename your field and make it more 
>> general.  The common dominator thing shod still work, no?
> I might not be following.  If we record the original loop->num in the
> renamed field, nearest_common_dominator can't work because we don't
> have basic blocks to start the call?  The original loop could be
> eliminated at several points, for example, instantly after
> distribution, or folded in vectorizer for other loops distributed from
> the original loop.
> BTW, setting the copy_of/origin field in loop_version is not enough
> for this use case, all generated loops (actually, except the versioned
> loop) from distribution need to be set.

Of course it would need to be set for all distributed loops.

I'm not sure "loop vanishes" is the case to optimize for though.  If the loop
is still available then origin->header should work as BB.  If not then can't
we conclude, for the purpose of IFN_LOOP_DIST_ALIAS, that the whole
region must be dead?  We still have to identify it of course, but it means
we can fold stray IFN_LOOP_DIST_ALIAS calls left over after vectorization
to whatever we like?

>>
>> As far as memory usage
>>>is concerned.  I actually don't need a whole integer to record the
>>>loop num.  I can simply restrict number of distributions in one
>>>function to at most 256, and record such id in a char field in struct
>>>loop?  Does this sounds better?
>>
>> As said, tracking loop origin sounds useful anyway so I'd rather add and use 
>> that somehow.
> To be honest, I don't know.  the current field works like a unique
> index of distribution operation.  The original loop could be destroyed
> at different points thus no longer exists, this makes the recorded
> copy_of/origin less meaningful?

I think we talked about prologue and epilogue loops to be easier identifiable
as so (and as to what "main" loop).  So lets say we have one "origin" field
and accompaning flags "origin_is_loop_dist_alias_version",
"origin_is_main_loop_of_prologue", etc.?  I can't think of the case where
origin would be two things at the same time (you can always walk up
the origin tree).

Vanishing origins could also be cleared btw, or we can make the new origin
the origin of the vanishing origin...

Richard.

> Thanks,
> bin
>>
>>>Thanks,
>>>bin

 That said "ldist_alias_id" is a bit too narrow of purpose to "waste"
 an int inside struct loop?  I'd set copy_of/origi in loop_version for
>>>example.
 'origin' would probably be better given the ldist cases aren't really
 full "copies".

 fold_loop_dist_alias_call should re-use / rename
>>>fold_loop_vectorized_call
 by just passing folded_value to it.

 Richard.

> Thanks,
> bin
>>
>> Thanks,
>> bin
>> 2017-06-07  Bin Cheng  
>>
>> * cfgloop.h (struct loop): New field ldist_alias_id.
>> * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment
>>>change.
>> * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
>> * internal-fn.def (LOOP_DIST_ALIAS): New.
>> * tree-vectorizer.c (vect

Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS

2017-06-27 Thread Bin.Cheng
On Tue, Jun 27, 2017 at 3:59 PM, Richard Biener
 wrote:
> On June 27, 2017 4:27:17 PM GMT+02:00, "Bin.Cheng"  
> wrote:
>>On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
>> wrote:
>>> On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng 
>>wrote:
 On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng 
>>wrote:
> Hi,
> I was asked by upstream to split the loop distribution patch into
>>small ones.
> It is hard because data structure and algorithm are closely coupled
>>together.
> Anyway, this is the patch series with smaller patches.  Basically I
>>tried to
> separate data structure and bug-fix changes apart with one as the
>>main patch.
> Note I only made necessary code refactoring in order to separate
>>patch, apart
> from that, there is no change against the last version.
>
> This is the first patch introducing new internal function
>>IFN_LOOP_DIST_ALIAS.
> GCC will distribute loops under condition of this function call.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?
 Hi,
 I need to update this patch fixing an issue in
 vect_loop_dist_alias_call.  The previous patch fails to find some
 IFN_LOOP_DIST_ALIAS calls.

 Bootstrap and test in series.  Is it OK?
>>>
>>> So I wonder if we really need to track ldist_alias_id or if we can do
>>sth
>>Yes, it is needed because otherwise we probably falsely trying to
>>search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
>>loop.
>>
>>> more "general", like tracking a copy_of or origin and then directly
>>> go to nearest_common_dominator (loop->header, copy_of->header)
>>> to find the controlling condition?
>>I tend to not record any pointer in loop structure, it can easily go
>>dangling for a across passes data structure.
>
> I didn't mean to record a pointer, just rename your field and make it more 
> general.  The common dominator thing shod still work, no?
I might not be following.  If we record the original loop->num in the
renamed field, nearest_common_dominator can't work because we don't
have basic blocks to start the call?  The original loop could be
eliminated at several points, for example, instantly after
distribution, or folded in vectorizer for other loops distributed from
the original loop.
BTW, setting the copy_of/origin field in loop_version is not enough
for this use case, all generated loops (actually, except the versioned
loop) from distribution need to be set.
>
> As far as memory usage
>>is concerned.  I actually don't need a whole integer to record the
>>loop num.  I can simply restrict number of distributions in one
>>function to at most 256, and record such id in a char field in struct
>>loop?  Does this sounds better?
>
> As said, tracking loop origin sounds useful anyway so I'd rather add and use 
> that somehow.
To be honest, I don't know.  the current field works like a unique
index of distribution operation.  The original loop could be destroyed
at different points thus no longer exists, this makes the recorded
copy_of/origin less meaningful?

Thanks,
bin
>
>>Thanks,
>>bin
>>>
>>> That said "ldist_alias_id" is a bit too narrow of purpose to "waste"
>>> an int inside struct loop?  I'd set copy_of/origi in loop_version for
>>example.
>>> 'origin' would probably be better given the ldist cases aren't really
>>> full "copies".
>>>
>>> fold_loop_dist_alias_call should re-use / rename
>>fold_loop_vectorized_call
>>> by just passing folded_value to it.
>>>
>>> Richard.
>>>
 Thanks,
 bin
>
> Thanks,
> bin
> 2017-06-07  Bin Cheng  
>
> * cfgloop.h (struct loop): New field ldist_alias_id.
> * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment
>>change.
> * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
> * internal-fn.def (LOOP_DIST_ALIAS): New.
> * tree-vectorizer.c (vect_loop_dist_alias_call): New
>>function.
> (fold_loop_dist_alias_call): New function.
> (vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending
>>on
> successful vectorization or not.
>


Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS

2017-06-27 Thread Richard Biener
On June 27, 2017 4:27:17 PM GMT+02:00, "Bin.Cheng"  
wrote:
>On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
> wrote:
>> On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng 
>wrote:
>>> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng 
>wrote:
 Hi,
 I was asked by upstream to split the loop distribution patch into
>small ones.
 It is hard because data structure and algorithm are closely coupled
>together.
 Anyway, this is the patch series with smaller patches.  Basically I
>tried to
 separate data structure and bug-fix changes apart with one as the
>main patch.
 Note I only made necessary code refactoring in order to separate
>patch, apart
 from that, there is no change against the last version.

 This is the first patch introducing new internal function
>IFN_LOOP_DIST_ALIAS.
 GCC will distribute loops under condition of this function call.

 Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>> Hi,
>>> I need to update this patch fixing an issue in
>>> vect_loop_dist_alias_call.  The previous patch fails to find some
>>> IFN_LOOP_DIST_ALIAS calls.
>>>
>>> Bootstrap and test in series.  Is it OK?
>>
>> So I wonder if we really need to track ldist_alias_id or if we can do
>sth
>Yes, it is needed because otherwise we probably falsely trying to
>search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
>loop.
>
>> more "general", like tracking a copy_of or origin and then directly
>> go to nearest_common_dominator (loop->header, copy_of->header)
>> to find the controlling condition?
>I tend to not record any pointer in loop structure, it can easily go
>dangling for a across passes data structure.  

I didn't mean to record a pointer, just rename your field and make it more 
general.  The common dominator thing shod still work, no?

As far as memory usage
>is concerned.  I actually don't need a whole integer to record the
>loop num.  I can simply restrict number of distributions in one
>function to at most 256, and record such id in a char field in struct
>loop?  Does this sounds better?

As said, tracking loop origin sounds useful anyway so I'd rather add and use 
that somehow.

>Thanks,
>bin
>>
>> That said "ldist_alias_id" is a bit too narrow of purpose to "waste"
>> an int inside struct loop?  I'd set copy_of/origi in loop_version for
>example.
>> 'origin' would probably be better given the ldist cases aren't really
>> full "copies".
>>
>> fold_loop_dist_alias_call should re-use / rename
>fold_loop_vectorized_call
>> by just passing folded_value to it.
>>
>> Richard.
>>
>>> Thanks,
>>> bin

 Thanks,
 bin
 2017-06-07  Bin Cheng  

 * cfgloop.h (struct loop): New field ldist_alias_id.
 * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment
>change.
 * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
 * internal-fn.def (LOOP_DIST_ALIAS): New.
 * tree-vectorizer.c (vect_loop_dist_alias_call): New
>function.
 (fold_loop_dist_alias_call): New function.
 (vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending
>on
 successful vectorization or not.



Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS

2017-06-27 Thread Bin.Cheng
On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
 wrote:
> On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng  wrote:
>> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng  wrote:
>>> Hi,
>>> I was asked by upstream to split the loop distribution patch into small 
>>> ones.
>>> It is hard because data structure and algorithm are closely coupled 
>>> together.
>>> Anyway, this is the patch series with smaller patches.  Basically I tried to
>>> separate data structure and bug-fix changes apart with one as the main 
>>> patch.
>>> Note I only made necessary code refactoring in order to separate patch, 
>>> apart
>>> from that, there is no change against the last version.
>>>
>>> This is the first patch introducing new internal function 
>>> IFN_LOOP_DIST_ALIAS.
>>> GCC will distribute loops under condition of this function call.
>>>
>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>> Hi,
>> I need to update this patch fixing an issue in
>> vect_loop_dist_alias_call.  The previous patch fails to find some
>> IFN_LOOP_DIST_ALIAS calls.
>>
>> Bootstrap and test in series.  Is it OK?
>
> So I wonder if we really need to track ldist_alias_id or if we can do sth
Yes, it is needed because otherwise we probably falsely trying to
search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
loop.

> more "general", like tracking a copy_of or origin and then directly
> go to nearest_common_dominator (loop->header, copy_of->header)
> to find the controlling condition?
I tend to not record any pointer in loop structure, it can easily go
dangling for a across passes data structure.  As far as memory usage
is concerned.  I actually don't need a whole integer to record the
loop num.  I can simply restrict number of distributions in one
function to at most 256, and record such id in a char field in struct
loop?  Does this sounds better?

Thanks,
bin
>
> That said "ldist_alias_id" is a bit too narrow of purpose to "waste"
> an int inside struct loop?  I'd set copy_of/origi in loop_version for example.
> 'origin' would probably be better given the ldist cases aren't really
> full "copies".
>
> fold_loop_dist_alias_call should re-use / rename fold_loop_vectorized_call
> by just passing folded_value to it.
>
> Richard.
>
>> Thanks,
>> bin
>>>
>>> Thanks,
>>> bin
>>> 2017-06-07  Bin Cheng  
>>>
>>> * cfgloop.h (struct loop): New field ldist_alias_id.
>>> * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment change.
>>> * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
>>> * internal-fn.def (LOOP_DIST_ALIAS): New.
>>> * tree-vectorizer.c (vect_loop_dist_alias_call): New function.
>>> (fold_loop_dist_alias_call): New function.
>>> (vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending on
>>> successful vectorization or not.


Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS

2017-06-27 Thread Richard Biener
On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng  wrote:
> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng  wrote:
>> Hi,
>> I was asked by upstream to split the loop distribution patch into small ones.
>> It is hard because data structure and algorithm are closely coupled together.
>> Anyway, this is the patch series with smaller patches.  Basically I tried to
>> separate data structure and bug-fix changes apart with one as the main patch.
>> Note I only made necessary code refactoring in order to separate patch, apart
>> from that, there is no change against the last version.
>>
>> This is the first patch introducing new internal function 
>> IFN_LOOP_DIST_ALIAS.
>> GCC will distribute loops under condition of this function call.
>>
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
> Hi,
> I need to update this patch fixing an issue in
> vect_loop_dist_alias_call.  The previous patch fails to find some
> IFN_LOOP_DIST_ALIAS calls.
>
> Bootstrap and test in series.  Is it OK?

So I wonder if we really need to track ldist_alias_id or if we can do sth
more "general", like tracking a copy_of or origin and then directly
go to nearest_common_dominator (loop->header, copy_of->header)
to find the controlling condition?

That said "ldist_alias_id" is a bit too narrow of purpose to "waste"
an int inside struct loop?  I'd set copy_of/origi in loop_version for example.
'origin' would probably be better given the ldist cases aren't really
full "copies".

fold_loop_dist_alias_call should re-use / rename fold_loop_vectorized_call
by just passing folded_value to it.

Richard.

> Thanks,
> bin
>>
>> Thanks,
>> bin
>> 2017-06-07  Bin Cheng  
>>
>> * cfgloop.h (struct loop): New field ldist_alias_id.
>> * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment change.
>> * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
>> * internal-fn.def (LOOP_DIST_ALIAS): New.
>> * tree-vectorizer.c (vect_loop_dist_alias_call): New function.
>> (fold_loop_dist_alias_call): New function.
>> (vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending on
>> successful vectorization or not.


Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS

2017-06-26 Thread Richard Sandiford
Just a couple of cosmetic things:

"Bin.Cheng"  writes:
> @@ -225,6 +225,15 @@ struct GTY ((chain_next ("%h.next"))) loop {
>   builtins.  */
>tree simduid;
>  
> +  /* For loops generated by distribution with runtime alias checks, this
> + is a unique identifier of the original distributed loop.  Generally
> + it is the number of the original loop.  IFN_LOOP_DIST_ALIAS builtin
> + uses this id as its first argument.  Give a loop with an id, we can
> + look upward in dominance tree for the corresponding IFN_LOOP_DIST_ALIAS
> + buildin.  Note this id has no meanling after IFN_LOOP_DIST_ALIAS is

s/meanling/meaning/

> +/* Fold LOOP_DIST_ALIAS internal call stmt according to KEEP_P and update
> +   any immediate uses of it's LHS.  Stmt is folded to its second argument

s/it's/its/

Thanks,
Richard


Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS

2017-06-23 Thread Bin.Cheng
On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng  wrote:
> Hi,
> I was asked by upstream to split the loop distribution patch into small ones.
> It is hard because data structure and algorithm are closely coupled together.
> Anyway, this is the patch series with smaller patches.  Basically I tried to
> separate data structure and bug-fix changes apart with one as the main patch.
> Note I only made necessary code refactoring in order to separate patch, apart
> from that, there is no change against the last version.
>
> This is the first patch introducing new internal function IFN_LOOP_DIST_ALIAS.
> GCC will distribute loops under condition of this function call.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?
Hi,
I need to update this patch fixing an issue in
vect_loop_dist_alias_call.  The previous patch fails to find some
IFN_LOOP_DIST_ALIAS calls.

Bootstrap and test in series.  Is it OK?

Thanks,
bin
>
> Thanks,
> bin
> 2017-06-07  Bin Cheng  
>
> * cfgloop.h (struct loop): New field ldist_alias_id.
> * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment change.
> * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
> * internal-fn.def (LOOP_DIST_ALIAS): New.
> * tree-vectorizer.c (vect_loop_dist_alias_call): New function.
> (fold_loop_dist_alias_call): New function.
> (vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending on
> successful vectorization or not.
From ab8334c5f109c593610df3efcf1aa5a2edcf6be9 Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 7 Jun 2017 13:04:03 +0100
Subject: [PATCH 01/13] ifn_loop_dist_alias-20170608.txt

---
 gcc/cfgloop.h |  9 ++
 gcc/cfgloopmanip.c|  3 +-
 gcc/internal-fn.c |  8 ++
 gcc/internal-fn.def   |  1 +
 gcc/tree-vectorizer.c | 79 ++-
 5 files changed, 98 insertions(+), 2 deletions(-)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index a8bec1d..be4187a 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -225,6 +225,15 @@ struct GTY ((chain_next ("%h.next"))) loop {
  builtins.  */
   tree simduid;
 
+  /* For loops generated by distribution with runtime alias checks, this
+ is a unique identifier of the original distributed loop.  Generally
+ it is the number of the original loop.  IFN_LOOP_DIST_ALIAS builtin
+ uses this id as its first argument.  Give a loop with an id, we can
+ look upward in dominance tree for the corresponding IFN_LOOP_DIST_ALIAS
+ buildin.  Note this id has no meanling after IFN_LOOP_DIST_ALIAS is
+ folded and eliminated.  */
+  int ldist_alias_id;
+
   /* Upper bound on number of iterations of a loop.  */
   struct nb_iter_bound *bounds;
 
diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index d764ab9..adb2f65 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1653,7 +1653,8 @@ force_single_succ_latches (void)
 
   THEN_PROB is the probability of then branch of the condition.
   ELSE_PROB is the probability of else branch. Note that they may be both
-  REG_BR_PROB_BASE when condition is IFN_LOOP_VECTORIZED.  */
+  REG_BR_PROB_BASE when condition is IFN_LOOP_VECTORIZED or
+  IFN_LOOP_DIST_ALIAS.  */
 
 static basic_block
 lv_adjust_loop_entry_edge (basic_block first_head, basic_block second_head,
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 75fe027..96e40cb 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2250,6 +2250,14 @@ expand_LOOP_VECTORIZED (internal_fn, gcall *)
   gcc_unreachable ();
 }
 
+/* This should get folded in tree-vectorizer.c.  */
+
+static void
+expand_LOOP_DIST_ALIAS (internal_fn, gcall *)
+{
+  gcc_unreachable ();
+}
+
 /* Expand MASK_LOAD call STMT using optab OPTAB.  */
 
 static void
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index e162d81..79c19fb 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -158,6 +158,7 @@ DEF_INTERNAL_FN (GOMP_SIMD_LAST_LANE, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOMP_SIMD_ORDERED_START, ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOMP_SIMD_ORDERED_END, ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (LOOP_VECTORIZED, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (LOOP_DIST_ALIAS, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (ANNOTATE,  ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (UBSAN_NULL, ECF_LEAF | ECF_NOTHROW, ".R.")
 DEF_INTERNAL_FN (UBSAN_BOUNDS, ECF_LEAF | ECF_NOTHROW, NULL)
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 1bef2e4..05e9f84 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -469,6 +469,67 @@ fold_loop_vectorized_call (gimple *g, tree value)
 }
 }
 
+/* If LOOP has been versioned during loop distribution, return the internal
+   call guarding it.  */
+
+static gimple *
+vect_loop_dist_alias_call (struct loop *loop)
+{
+  gimple_stmt_iterator gsi;
+  gimple *g;
+  basic_block bb = loop_preheader_edge (loop)->src;
+  basic_block entry = EN

[PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS

2017-06-12 Thread Bin Cheng
Hi,
I was asked by upstream to split the loop distribution patch into small ones.
It is hard because data structure and algorithm are closely coupled together.
Anyway, this is the patch series with smaller patches.  Basically I tried to
separate data structure and bug-fix changes apart with one as the main patch.
Note I only made necessary code refactoring in order to separate patch, apart
from that, there is no change against the last version.

This is the first patch introducing new internal function IFN_LOOP_DIST_ALIAS.
GCC will distribute loops under condition of this function call.

Bootstrap and test on x86_64 and AArch64.  Is it OK?

Thanks,
bin
2017-06-07  Bin Cheng  

* cfgloop.h (struct loop): New field ldist_alias_id.
* cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment change.
* internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
* internal-fn.def (LOOP_DIST_ALIAS): New.
* tree-vectorizer.c (vect_loop_dist_alias_call): New function.
(fold_loop_dist_alias_call): New function.
(vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending on
successful vectorization or not.From 3598491598e0b425f1cfa4b7bb4c180886a08bef Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 7 Jun 2017 13:04:03 +0100
Subject: [PATCH 01/14] ifn_loop_dist_alias-20170607.txt

---
 gcc/cfgloop.h |  9 +++
 gcc/cfgloopmanip.c|  3 ++-
 gcc/internal-fn.c |  8 ++
 gcc/internal-fn.def   |  1 +
 gcc/tree-vectorizer.c | 75 ++-
 5 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index a8bec1d..be4187a 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -225,6 +225,15 @@ struct GTY ((chain_next ("%h.next"))) loop {
  builtins.  */
   tree simduid;
 
+  /* For loops generated by distribution with runtime alias checks, this
+ is a unique identifier of the original distributed loop.  Generally
+ it is the number of the original loop.  IFN_LOOP_DIST_ALIAS builtin
+ uses this id as its first argument.  Give a loop with an id, we can
+ look upward in dominance tree for the corresponding IFN_LOOP_DIST_ALIAS
+ buildin.  Note this id has no meanling after IFN_LOOP_DIST_ALIAS is
+ folded and eliminated.  */
+  int ldist_alias_id;
+
   /* Upper bound on number of iterations of a loop.  */
   struct nb_iter_bound *bounds;
 
diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index d764ab9..adb2f65 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1653,7 +1653,8 @@ force_single_succ_latches (void)
 
   THEN_PROB is the probability of then branch of the condition.
   ELSE_PROB is the probability of else branch. Note that they may be both
-  REG_BR_PROB_BASE when condition is IFN_LOOP_VECTORIZED.  */
+  REG_BR_PROB_BASE when condition is IFN_LOOP_VECTORIZED or
+  IFN_LOOP_DIST_ALIAS.  */
 
 static basic_block
 lv_adjust_loop_entry_edge (basic_block first_head, basic_block second_head,
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 75fe027..96e40cb 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2250,6 +2250,14 @@ expand_LOOP_VECTORIZED (internal_fn, gcall *)
   gcc_unreachable ();
 }
 
+/* This should get folded in tree-vectorizer.c.  */
+
+static void
+expand_LOOP_DIST_ALIAS (internal_fn, gcall *)
+{
+  gcc_unreachable ();
+}
+
 /* Expand MASK_LOAD call STMT using optab OPTAB.  */
 
 static void
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index e162d81..79c19fb 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -158,6 +158,7 @@ DEF_INTERNAL_FN (GOMP_SIMD_LAST_LANE, ECF_CONST | ECF_LEAF 
| ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOMP_SIMD_ORDERED_START, ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOMP_SIMD_ORDERED_END, ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (LOOP_VECTORIZED, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (LOOP_DIST_ALIAS, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (ANNOTATE,  ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (UBSAN_NULL, ECF_LEAF | ECF_NOTHROW, ".R.")
 DEF_INTERNAL_FN (UBSAN_BOUNDS, ECF_LEAF | ECF_NOTHROW, NULL)
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 1bef2e4..0d83d33 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -469,6 +469,63 @@ fold_loop_vectorized_call (gimple *g, tree value)
 }
 }
 
+/* If LOOP has been versioned during loop distribution, return the internal
+   call guarding it.  */
+
+static gimple *
+vect_loop_dist_alias_call (struct loop *loop)
+{
+  gimple_stmt_iterator gsi;
+  gimple *g;
+  basic_block bb = loop_preheader_edge (loop)->src;
+  struct loop *outer_loop = bb->loop_father;
+
+  /* Look upward in dominance tree.  */
+  for (; bb != ENTRY_BLOCK_PTR_FOR_FN (cfun) && bb->loop_father == outer_loop;
+   bb = get_immediate_dominator (CDI_DOMINATORS, bb))
+{
+  g = last_stmt (bb);
+  if (g == NULL || gimple_code (g) != GIMPLE_COND)
+