On Fri, 6 Nov 2020 at 11:18, Tamar Christina <tamar.christ...@arm.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Christophe Lyon <christophe.l...@linaro.org>
> > Sent: Friday, November 6, 2020 8:42 AM
> > To: Tamar Christina <tamar.christ...@arm.com>
> > Cc: Richard Biener <rguent...@suse.de>; nd <n...@arm.com>; gcc-
> > patc...@gcc.gnu.org; o...@ucw.cz
> > Subject: Re: [PATCH] SLP: Move load/store-lanes check till late
> >
> > On Thu, 5 Nov 2020 at 11:21, Tamar Christina via Gcc-patches <gcc-
> > patc...@gcc.gnu.org> wrote:
> > >
> > > > -----Original Message-----
> > > > From: rguent...@c653.arch.suse.de <rguent...@c653.arch.suse.de> On
> > > > Behalf Of Richard Biener
> > > > Sent: Thursday, November 5, 2020 10:17 AM
> > > > To: Tamar Christina <tamar.christ...@arm.com>
> > > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz
> > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > > >
> > > > On Wed, 4 Nov 2020, Tamar Christina wrote:
> > > >
> > > > > Hi Richi,
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: rguent...@c653.arch.suse.de <rguent...@c653.arch.suse.de>
> > > > > > On Behalf Of Richard Biener
> > > > > > Sent: Wednesday, November 4, 2020 8:07 AM
> > > > > > To: Tamar Christina <tamar.christ...@arm.com>
> > > > > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz
> > > > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > > > > >
> > > > > > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > > > > >
> > > > > > > Hi Richi,
> > > > > > >
> > > > > > > We decided to take the regression in any code-gen this could
> > > > > > > give and fix it properly next stage-1.  As such here's a new
> > > > > > > patch based on your previous feedback.
> > > > > > >
> > > > > > > Ok for master?
> > > > > >
> > > > > > Looks good sofar but be aware that you elide the
> > > > > >
> > > > > > -             && vect_store_lanes_supported
> > > > > > -                  (STMT_VINFO_VECTYPE (scalar_stmts[0]), 
> > > > > > group_size,
> > > > > > false))
> > > > > >
> > > > > > part of the check - that is, you don't verify the store part of
> > > > > > the instance can use store-lanes.  Btw, this means the original
> > > > > > code cancelled an instance only when the SLP graph entry is a
> > > > > > store-lane capable store but your variant would also cancel in
> > > > > > case there's a load-
> > > > lane capable reduction.
> > > > > >
> > > > >
> > > > > I do still have it,
> > > > >
> > > > >       if (loads_permuted
> > > > >           && vect_store_lanes_supported (vectype, group_size,
> > > > > false))
> > > > >
> > > > > I just grab the type from the SLP_TREE_VECTYPE (slp_root); which
> > > > > should be the store if one exists.
> > > > >
> > > > > > I think that you eventually want to re-instantiate the
> > > > > > store-lane check but treat it the same as any of the load checks
> > > > > > (thus not require all instances to be stores for the cancellation).
> > > > > > But at least when a store cannot use store-lanes we probably
> > > > > > shouldn't cancel the SLP.
> > > > >
> > > > > I did however elide the kind check, that was added as part of the
> > > > > rebase, it looked like kind wasn't Being stored inside the SLP
> > > > > instance and
> > > > I'd have to redo the analysis to find it.
> > > > >
> > > > > Does it does reasonable to include kind as a field in the SLP 
> > > > > instance?
> > > > >
> > > > > >
> > > > > > Anyway, the patch is OK for master.  The store-lane check part
> > > > > > can be re- added as followup.
> > > > > >
> > > > >
> > > > > Thanks! Will do.
> > > >
> > > > Btw, the patch regressed
> > > >
> > > > FAIL: gcc.dg/vect/slp-11b.c -flto -ffat-lto-objects
> > > > scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > > > FAIL: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect "vectorizing
> > > > stmts using SLP" 1
> > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > scan-tree-dump vect "Built SLP cancelled: can use load/store-lanes"
> > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > scan-tree-dump vect "LOAD_LANES"
> > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > scan-tree-dump vect "STORE_LANES"
> > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "Built SLP cancelled:
> > > > can use load/store-lanes"
> > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "LOAD_LANES"
> > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "STORE_LANES"
> > > >
> > > > on x86_64.  The slp-11b.c testcase is interesting since there
> > > > extract_muldiv folding makes the group of four stores not matching
> > > > so we split into a size of
> > > > 3 and one remaining store.
> > > > This causes us to arrive at
> > > >
> > > > note:   node 0x441a940 (max_nunits=4, refcnt=2)
> > > > note:       stmt 0 _2 = in[_1];
> > > > note:       stmt 1 _6 = in[_5];
> > > > note:       stmt 2 _10 = in[_8];
> > > > note:       load permutation { 0 2 1 }
> > > >
> > > > which on x86_64 we in the end cannot handle (without SSE4 I think)
> > > > so it fails to SLP there.  Guess arm can do the permute but not the 
> > > > load-
> > lane here.
> > > >
> > > > For gcc.dg/vect/slp-perm-6.c the XFAILs shouldn't be done for
> > > > !vect_load_lanes targets.  Not sure if that's possible easily, like
> > > > with a { target vect_load_lanes } { xfail vect_load_lanes } combo
> > > > ...?  I suggest to make it xfail everywhere instead and add a
> > > > comment as to we're expecting those only for vect_load_lanes targets.
> > >
> > > Yes just fixed these, the change in gcc.dg/vect/slp-11b.c shouldn't be
> > > there and I updated the target selector properly :)
> > >
> >
> > Hi Tamar,
> >
> > This patch (r11-4728) introduced a regression on arm:
> > FAIL:    gcc.dg/vect/slp-reduc-9.c -flto -ffat-lto-objects
> > scan-tree-dump vect "vectorized 1 loops"
> > FAIL:    gcc.dg/vect/slp-reduc-9.c -flto -ffat-lto-objects
> > scan-tree-dump vect "vectorizing stmts using SLP"
> > FAIL:    gcc.dg/vect/slp-reduc-9.c scan-tree-dump vect "vectorized 1 loops"
> > FAIL:    gcc.dg/vect/slp-reduc-9.c scan-tree-dump vect "vectorizing
> > stmts using SLP"
> >
> > Can you check?
>
> Hi Christoph,
>
> Which arm target is this?
> I checked arm-none-linux-gnueabihf and slp-reduc-9.c is passing there.
>

Weird, I wasn't more specific because it fails for all the arm targets
I'm testing, see:
https://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/r11-4728-g4d76079fdfa3d36ebd95ac8b489519945e8ee88f/report-build-info.html

For instance:
--target=arm-none-linux-gnueabihf
--prefix=/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools
--with-sysroot=/aci-gcc-fsf/builds/gcc-fsf-gccsrc/sysroot-arm-none-linux-gnueabihf
--disable-nls --disable-libgomp --disable-libmudflap
--disable-libcilkrts --enable-checking
--enable-languages=c,c++,fortran --with-float=hard
--enable-build-with-cxx --with-mode=arm --with-cpu=cortex-a9
--with-fpu=neon-fp16

Christophe

> Regards,
> Tamar
> >
> > > Cheers,
> > > Tamar
> > >
> > > >
> > > > Richard.
> > > >
> > > > > > Thanks,
> > > > > > Richard.
> > > > > >
> > > > > > > Thanks,
> > > > > > > Tamar
> > > > > > >
> > > > > > > gcc/ChangeLog:
> > > > > > >
> > > > > > >         * tree-vect-slp.c (vect_analyze_slp_instance): Moved
> > > > > > > load/store
> > > > > > lanes
> > > > > > >         check to ...
> > > > > > >         * tree-vect-loop.c (vect_analyze_loop_2): ..Here
> > > > > > >
> > > > > > > gcc/testsuite/ChangeLog:
> > > > > > >
> > > > > > >         * gcc.dg/vect/slp-11b.c: Update output scan.
> > > > > > >         * gcc.dg/vect/slp-perm-6.c: Likewise.
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: rguent...@c653.arch.suse.de
> > > > > > > > <rguent...@c653.arch.suse.de> On Behalf Of Richard Biener
> > > > > > > > Sent: Thursday, October 22, 2020 9:44 AM
> > > > > > > > To: Tamar Christina <tamar.christ...@arm.com>
> > > > > > > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz
> > > > > > > > Subject: Re: [PATCH] SLP: Move load/store-lanes check till
> > > > > > > > late
> > > > > > > >
> > > > > > > > On Wed, 21 Oct 2020, Tamar Christina wrote:
> > > > > > > >
> > > > > > > > > Hi All,
> > > > > > > > >
> > > > > > > > > This moves the code that checks for load/store lanes
> > > > > > > > > further in the pipeline and places it after slp_optimize.
> > > > > > > > > This would allow us to perform optimizations on the SLP
> > > > > > > > > tree and only bail out if we really have a
> > > > > > > > permute.
> > > > > > > > >
> > > > > > > > > With this change it allows us to handle permutes such as
> > > > > > > > > {1,1,1,1} which should be handled by a load and replicate.
> > > > > > > > >
> > > > > > > > > This change however makes it all or nothing. Either all
> > > > > > > > > instances can be handled or none at all.  This is why some
> > > > > > > > > of the test cases have been
> > > > > > > > adjusted.
> > > > > > > >
> > > > > > > > So this possibly leaves a loop unvectorized in case there's
> > > > > > > > a ldN/stN opportunity but another SLP instance with a
> > > > > > > > permutation not handled by interleaving is present.  What I
> > > > > > > > was originally suggesting is to only cancel the SLP build if
> > > > > > > > _all_ instances can be handled
> > > > > > with ldN/stN.
> > > > > > > >
> > > > > > > > Of course I'm also happy with completely removing this 
> > > > > > > > heuristics.
> > > > > > > >
> > > > > > > > Note some of the comments look off now, also the assignment
> > > > > > > > to ok before the goto is pointless and you should probably
> > > > > > > > turn this into a dump print instead.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Richard.
> > > > > > > >
> > > > > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > > > > > > > > -x86_64-pc-linux-gnu and no issues.
> > > > > > > > >
> > > > > > > > > Ok for master?
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Tamar
> > > > > > > > >
> > > > > > > > > gcc/ChangeLog:
> > > > > > > > >
> > > > > > > > >     * tree-vect-slp.c (vect_analyze_slp_instance): Moved
> > > > > > > > > load/store
> > > > > > > > lanes
> > > > > > > > >     check to ...
> > > > > > > > >     * tree-vect-loop.c (vect_analyze_loop_2): ..Here
> > > > > > > > >
> > > > > > > > > gcc/testsuite/ChangeLog:
> > > > > > > > >
> > > > > > > > >     * gcc.dg/vect/slp-11b.c: Update output scan.
> > > > > > > > >     * gcc.dg/vect/slp-perm-6.c: Likewise.
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Richard Biener <rguent...@suse.de> SUSE Software Solutions
> > > > > > > > Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
> > GF:
> > > > > > > > Felix Imend
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Richard Biener <rguent...@suse.de> SUSE Software Solutions
> > > > > > Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF:
> > > > > > Felix Imend
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguent...@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > > > Nuernberg, Germany; GF: Felix Imend

Reply via email to