On Wed, 4 Nov 2020, Tamar Christina wrote:

> Hi Richi,
> 
> > -----Original Message-----
> > From: rguent...@c653.arch.suse.de <rguent...@c653.arch.suse.de> On
> > Behalf Of Richard Biener
> > Sent: Wednesday, November 4, 2020 8:07 AM
> > To: Tamar Christina <tamar.christ...@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz
> > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > 
> > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > 
> > > Hi Richi,
> > >
> > > We decided to take the regression in any code-gen this could give and
> > > fix it properly next stage-1.  As such here's a new patch based on
> > > your previous feedback.
> > >
> > > Ok for master?
> > 
> > Looks good sofar but be aware that you elide the
> > 
> > -             && vect_store_lanes_supported
> > -                  (STMT_VINFO_VECTYPE (scalar_stmts[0]), group_size,
> > false))
> > 
> > part of the check - that is, you don't verify the store part of the 
> > instance can
> > use store-lanes.  Btw, this means the original code cancelled an instance 
> > only
> > when the SLP graph entry is a store-lane capable store but your variant
> > would also cancel in case there's a load-lane capable reduction.
> > 
> 
> I do still have it,
> 
>         if (loads_permuted
>             && vect_store_lanes_supported (vectype, group_size, false))
> 
> I just grab the type from the SLP_TREE_VECTYPE (slp_root); which should be 
> the store if
> one exists. 
> 
> > I think that you eventually want to re-instantiate the store-lane check but
> > treat it the same as any of the load checks (thus not require all instances 
> > to
> > be stores for the cancellation).
> > But at least when a store cannot use store-lanes we probably shouldn't
> > cancel the SLP.
> 
> I did however elide the kind check, that was added as part of the rebase, it 
> looked like kind wasn't
> Being stored inside the SLP instance and I'd have to redo the analysis to 
> find it.
> 
> Does it does reasonable to include kind as a field in the SLP instance?
> 
> > 
> > Anyway, the patch is OK for master.  The store-lane check part can be re-
> > added as followup.
> > 
> 
> Thanks! Will do.

Btw, the patch regressed

FAIL: gcc.dg/vect/slp-11b.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect "vectorizing stmts 
using SLP" 1
FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump 
vect "Built SLP cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump 
vect "LOAD_LANES"
FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump 
vect "STORE_LANES"
FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "Built SLP cancelled: 
can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "LOAD_LANES"
FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "STORE_LANES"

on x86_64.  The slp-11b.c testcase is interesting since there
extract_muldiv folding makes the group of four stores
not matching so we split into a size of 3 and one remaining store.
This causes us to arrive at

note:   node 0x441a940 (max_nunits=4, refcnt=2)
note:       stmt 0 _2 = in[_1];
note:       stmt 1 _6 = in[_5];
note:       stmt 2 _10 = in[_8];
note:       load permutation { 0 2 1 }

which on x86_64 we in the end cannot handle (without SSE4 I think)
so it fails to SLP there.  Guess arm can do the permute but not
the load-lane here.

For gcc.dg/vect/slp-perm-6.c the XFAILs shouldn't be done for
!vect_load_lanes targets.  Not sure if that's possible easily,
like with a { target vect_load_lanes } { xfail vect_load_lanes }
combo ...?  I suggest to make it xfail everywhere instead and
add a comment as to we're expecting those only for vect_load_lanes
targets.

Richard.

> > Thanks,
> > Richard.
> > 
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >   * tree-vect-slp.c (vect_analyze_slp_instance): Moved load/store
> > lanes
> > >   check to ...
> > >   * tree-vect-loop.c (vect_analyze_loop_2): ..Here
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   * gcc.dg/vect/slp-11b.c: Update output scan.
> > >   * gcc.dg/vect/slp-perm-6.c: Likewise.
> > >
> > > > -----Original Message-----
> > > > From: rguent...@c653.arch.suse.de <rguent...@c653.arch.suse.de> On
> > > > Behalf Of Richard Biener
> > > > Sent: Thursday, October 22, 2020 9:44 AM
> > > > To: Tamar Christina <tamar.christ...@arm.com>
> > > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz
> > > > Subject: Re: [PATCH] SLP: Move load/store-lanes check till late
> > > >
> > > > On Wed, 21 Oct 2020, Tamar Christina wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > This moves the code that checks for load/store lanes further in
> > > > > the pipeline and places it after slp_optimize.  This would allow
> > > > > us to perform optimizations on the SLP tree and only bail out if
> > > > > we really have a
> > > > permute.
> > > > >
> > > > > With this change it allows us to handle permutes such as {1,1,1,1}
> > > > > which should be handled by a load and replicate.
> > > > >
> > > > > This change however makes it all or nothing. Either all instances
> > > > > can be handled or none at all.  This is why some of the test cases
> > > > > have been
> > > > adjusted.
> > > >
> > > > So this possibly leaves a loop unvectorized in case there's a
> > > > ldN/stN opportunity but another SLP instance with a permutation not
> > > > handled by interleaving is present.  What I was originally
> > > > suggesting is to only cancel the SLP build if _all_ instances can be 
> > > > handled
> > with ldN/stN.
> > > >
> > > > Of course I'm also happy with completely removing this heuristics.
> > > >
> > > > Note some of the comments look off now, also the assignment to ok
> > > > before the goto is pointless and you should probably turn this into
> > > > a dump print instead.
> > > >
> > > > Thanks,
> > > > Richard.
> > > >
> > > > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > > > > -x86_64-pc-linux-gnu and no issues.
> > > > >
> > > > > Ok for master?
> > > >
> > > >
> > > >
> > > > > Thanks,
> > > > > Tamar
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >       * tree-vect-slp.c (vect_analyze_slp_instance): Moved load/store
> > > > lanes
> > > > >       check to ...
> > > > >       * tree-vect-loop.c (vect_analyze_loop_2): ..Here
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > >       * gcc.dg/vect/slp-11b.c: Update output scan.
> > > > >       * gcc.dg/vect/slp-perm-6.c: Likewise.
> > > > >
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguent...@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > > > Nuernberg, Germany; GF: Felix Imend
> > >
> > 
> > --
> > Richard Biener <rguent...@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > Nuernberg, Germany; GF: Felix Imend
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend

Reply via email to