-----Original Message-----
From: Jeff Law [mailto:l...@redhat.com]
Sent: Friday, November 13, 2015 3:28 AM
To: Richard Biener
Cc: Ajit Kumar Agarwal; GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation

On 11/12/2015 11:32 AM, Jeff Law wrote:
> On 11/12/2015 10:05 AM, Jeff Law wrote:
>>> But IIRC you mentioned it should enable vectorization or so? In
>>> this case that's obviously too late.
>> The opposite. Path splitting interferes with if-conversion &
>> vectorization. Path splitting mucks up the CFG enough that
>> if-conversion won't fire and as a result vectorization is inhibited.
>> It also creates multi-latch loops, which isn't a great situation either.
>>
>> It *may* be the case that dropping it that far down in the pipeline
>> and making the modifications necessary to handle simple latches may
>> in turn make the path splitting code play better with if-conversion
>> and vectorization and avoid creation of multi-latch loops. At least
>> that's how it looks on paper when I draw out the CFG manipulations.
>>
>> I'll do some experiments.
> It doesn't look too terrible to revamp the recognition code to work
> later in the pipeline with simple latches. Sadly that doesn't seem to
> have fixed the bad interactions with if-conversion.
>
> *But* that does open up the possibility of moving the path splitting
> pass even deeper in the pipeline -- in particular we can move it past
> the vectorizer. Which may be a win.
>
> So the big question is whether or not we'll still see enough benefits
> from having it so late in the pipeline. It's still early enough that
> we get DOM, VRP, reassoc, forwprop, phiopt, etc.
>
> Ajit, I'll pass along an updated patch after doing some more testing.

Hello Jeff:

>>So here's what I'm working with. It runs after the vectorizer now.

>>Ajit, if you could benchmark this it would be greatly appreciated. I know
>>you saw significant improvements on one or more benchmarks in the past. It'd
>>be good to know that the updated placement of the pass doesn't invalidate
>>the gains you saw.

>>With the updated pass placement, we don't have to worry about switching the
>>pass on/off based on whether or not the vectorizer & if-conversion are
>>enabled. So that hackery is gone.

>>I think I've beefed up the test to identify the diamond patterns we want so
>>that it's stricter in what we accept. The call to ignore_bb_p is a part of
>>that test so that we're actually looking at the right block in a world
>>where we're doing this transformation with simple latches.

>>I've also put a graphical comment before perform_path_splitting which
>>hopefully shows the CFG transformation we're making a bit clearer.

>>This bootstraps and regression tests cleanly on x86_64-linux-gnu.

Thank you for the inputs. I will build the compiler, run the SPEC CPU 2000 benchmarks for the x86 target, and respond as soon as the run is done. I will also run the EEMBC/MiBench benchmarks for the MicroBlaze target. I will let you know the results as soon as possible.

Thanks & Regards
Ajit