> -----Original Message----- > From: Christophe Lyon <christophe.l...@linaro.org> > Sent: Friday, November 6, 2020 8:42 AM > To: Tamar Christina <tamar.christ...@arm.com> > Cc: Richard Biener <rguent...@suse.de>; nd <n...@arm.com>; gcc- > patc...@gcc.gnu.org; o...@ucw.cz > Subject: Re: [PATCH] SLP: Move load/store-lanes check till late > > On Thu, 5 Nov 2020 at 11:21, Tamar Christina via Gcc-patches <gcc- > patc...@gcc.gnu.org> wrote: > > > > > -----Original Message----- > > > From: rguent...@c653.arch.suse.de <rguent...@c653.arch.suse.de> On > > > Behalf Of Richard Biener > > > Sent: Thursday, November 5, 2020 10:17 AM > > > To: Tamar Christina <tamar.christ...@arm.com> > > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late > > > > > > On Wed, 4 Nov 2020, Tamar Christina wrote: > > > > > > > Hi Richi, > > > > > > > > > -----Original Message----- > > > > > From: rguent...@c653.arch.suse.de <rguent...@c653.arch.suse.de> > > > > > On Behalf Of Richard Biener > > > > > Sent: Wednesday, November 4, 2020 8:07 AM > > > > > To: Tamar Christina <tamar.christ...@arm.com> > > > > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz > > > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late > > > > > > > > > > On Tue, 3 Nov 2020, Tamar Christina wrote: > > > > > > > > > > > Hi Richi, > > > > > > > > > > > > We decided to take the regression in any code-gen this could > > > > > > give and fix it properly next stage-1. As such here's a new > > > > > > patch based on your previous feedback. > > > > > > > > > > > > Ok for master? > > > > > > > > > > Looks good sofar but be aware that you elide the > > > > > > > > > > - && vect_store_lanes_supported > > > > > - (STMT_VINFO_VECTYPE (scalar_stmts[0]), group_size, > > > > > false)) > > > > > > > > > > part of the check - that is, you don't verify the store part of > > > > > the instance can use store-lanes. Btw, this means the original > > > > > code cancelled an instance only when the SLP graph entry is a > > > > > store-lane capable store but your variant would also cancel in > > > > > case there's a load- > > > lane capable reduction. > > > > > > > > > > > > > I do still have it, > > > > > > > > if (loads_permuted > > > > && vect_store_lanes_supported (vectype, group_size, > > > > false)) > > > > > > > > I just grab the type from the SLP_TREE_VECTYPE (slp_root); which > > > > should be the store if one exists. > > > > > > > > > I think that you eventually want to re-instantiate the > > > > > store-lane check but treat it the same as any of the load checks > > > > > (thus not require all instances to be stores for the cancellation). > > > > > But at least when a store cannot use store-lanes we probably > > > > > shouldn't cancel the SLP. > > > > > > > > I did however elide the kind check, that was added as part of the > > > > rebase, it looked like kind wasn't Being stored inside the SLP > > > > instance and > > > I'd have to redo the analysis to find it. > > > > > > > > Does it does reasonable to include kind as a field in the SLP instance? > > > > > > > > > > > > > > Anyway, the patch is OK for master. The store-lane check part > > > > > can be re- added as followup. > > > > > > > > > > > > > Thanks! Will do. > > > > > > Btw, the patch regressed > > > > > > FAIL: gcc.dg/vect/slp-11b.c -flto -ffat-lto-objects > > > scan-tree-dump-times vect "vectorizing stmts using SLP" 1 > > > FAIL: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect "vectorizing > > > stmts using SLP" 1 > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects > > > scan-tree-dump vect "Built SLP cancelled: can use load/store-lanes" > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects > > > scan-tree-dump vect "LOAD_LANES" > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects > > > scan-tree-dump vect "STORE_LANES" > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "Built SLP cancelled: > > > can use load/store-lanes" > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "LOAD_LANES" > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "STORE_LANES" > > > > > > on x86_64. The slp-11b.c testcase is interesting since there > > > extract_muldiv folding makes the group of four stores not matching > > > so we split into a size of > > > 3 and one remaining store. > > > This causes us to arrive at > > > > > > note: node 0x441a940 (max_nunits=4, refcnt=2) > > > note: stmt 0 _2 = in[_1]; > > > note: stmt 1 _6 = in[_5]; > > > note: stmt 2 _10 = in[_8]; > > > note: load permutation { 0 2 1 } > > > > > > which on x86_64 we in the end cannot handle (without SSE4 I think) > > > so it fails to SLP there. Guess arm can do the permute but not the load- > lane here. > > > > > > For gcc.dg/vect/slp-perm-6.c the XFAILs shouldn't be done for > > > !vect_load_lanes targets. Not sure if that's possible easily, like > > > with a { target vect_load_lanes } { xfail vect_load_lanes } combo > > > ...? I suggest to make it xfail everywhere instead and add a > > > comment as to we're expecting those only for vect_load_lanes targets. > > > > Yes just fixed these, the change in gcc.dg/vect/slp-11b.c shouldn't be > > there and I updated the target selector properly :) > > > > Hi Tamar, > > This patch (r11-4728) introduced a regression on arm: > FAIL: gcc.dg/vect/slp-reduc-9.c -flto -ffat-lto-objects > scan-tree-dump vect "vectorized 1 loops" > FAIL: gcc.dg/vect/slp-reduc-9.c -flto -ffat-lto-objects > scan-tree-dump vect "vectorizing stmts using SLP" > FAIL: gcc.dg/vect/slp-reduc-9.c scan-tree-dump vect "vectorized 1 loops" > FAIL: gcc.dg/vect/slp-reduc-9.c scan-tree-dump vect "vectorizing > stmts using SLP" > > Can you check?
Hi Christoph, Which arm target is this? I checked arm-none-linux-gnueabihf and slp-reduc-9.c is passing there. Regards, Tamar > > > Cheers, > > Tamar > > > > > > > > Richard. > > > > > > > > Thanks, > > > > > Richard. > > > > > > > > > > > Thanks, > > > > > > Tamar > > > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > > > * tree-vect-slp.c (vect_analyze_slp_instance): Moved > > > > > > load/store > > > > > lanes > > > > > > check to ... > > > > > > * tree-vect-loop.c (vect_analyze_loop_2): ..Here > > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > > > * gcc.dg/vect/slp-11b.c: Update output scan. > > > > > > * gcc.dg/vect/slp-perm-6.c: Likewise. > > > > > > > > > > > > > -----Original Message----- > > > > > > > From: rguent...@c653.arch.suse.de > > > > > > > <rguent...@c653.arch.suse.de> On Behalf Of Richard Biener > > > > > > > Sent: Thursday, October 22, 2020 9:44 AM > > > > > > > To: Tamar Christina <tamar.christ...@arm.com> > > > > > > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz > > > > > > > Subject: Re: [PATCH] SLP: Move load/store-lanes check till > > > > > > > late > > > > > > > > > > > > > > On Wed, 21 Oct 2020, Tamar Christina wrote: > > > > > > > > > > > > > > > Hi All, > > > > > > > > > > > > > > > > This moves the code that checks for load/store lanes > > > > > > > > further in the pipeline and places it after slp_optimize. > > > > > > > > This would allow us to perform optimizations on the SLP > > > > > > > > tree and only bail out if we really have a > > > > > > > permute. > > > > > > > > > > > > > > > > With this change it allows us to handle permutes such as > > > > > > > > {1,1,1,1} which should be handled by a load and replicate. > > > > > > > > > > > > > > > > This change however makes it all or nothing. Either all > > > > > > > > instances can be handled or none at all. This is why some > > > > > > > > of the test cases have been > > > > > > > adjusted. > > > > > > > > > > > > > > So this possibly leaves a loop unvectorized in case there's > > > > > > > a ldN/stN opportunity but another SLP instance with a > > > > > > > permutation not handled by interleaving is present. What I > > > > > > > was originally suggesting is to only cancel the SLP build if > > > > > > > _all_ instances can be handled > > > > > with ldN/stN. > > > > > > > > > > > > > > Of course I'm also happy with completely removing this heuristics. > > > > > > > > > > > > > > Note some of the comments look off now, also the assignment > > > > > > > to ok before the goto is pointless and you should probably > > > > > > > turn this into a dump print instead. > > > > > > > > > > > > > > Thanks, > > > > > > > Richard. > > > > > > > > > > > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu, > > > > > > > > -x86_64-pc-linux-gnu and no issues. > > > > > > > > > > > > > > > > Ok for master? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Tamar > > > > > > > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > > > > > > > * tree-vect-slp.c (vect_analyze_slp_instance): Moved > > > > > > > > load/store > > > > > > > lanes > > > > > > > > check to ... > > > > > > > > * tree-vect-loop.c (vect_analyze_loop_2): ..Here > > > > > > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > > > > > > > * gcc.dg/vect/slp-11b.c: Update output scan. > > > > > > > > * gcc.dg/vect/slp-perm-6.c: Likewise. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Richard Biener <rguent...@suse.de> SUSE Software Solutions > > > > > > > Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; > GF: > > > > > > > Felix Imend > > > > > > > > > > > > > > > > -- > > > > > Richard Biener <rguent...@suse.de> SUSE Software Solutions > > > > > Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: > > > > > Felix Imend > > > > > > > > > > -- > > > Richard Biener <rguent...@suse.de> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 > > > Nuernberg, Germany; GF: Felix Imend