On Mon, Mar 14, 2016 at 11:32 PM, Jeff Law <l...@redhat.com> wrote:
> On 03/11/2016 03:02 AM, Richard Biener wrote:
>>
>> For the other part I noticed a few things
>> 1) having a bitmap_count_ior_bits () would be an improvement
>
> Yea, I almost built one. That's easy enough to add.
>
>> 2) you might end up with redundant work(?) as you are iterating
>> over SSA name coalesce candidates but look at partition conflicts
>
> We'd have redundant work if the elements mapped back to SSA_NAMEs which in
> turn mapped to partitions which appeared as a coalescing pair already. But
> there's no way to know that in advance.
>
> This is mitigated somewhat in the next version which computes the conflict
> sizes lazily when the qsort comparison function is given two conflict pairs
> with an equal cost.
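[Editor's note: the bitmap_count_ior_bits () helper suggested above does not exist in the sources under discussion; a sketch of the idea, counting the population of the union of two bitmaps without materializing a temporary, might look roughly like this. GCC's real bitmap type is a sparse linked list of words, so this flat-array version only illustrates the shape of the loop; all names here are hypothetical.]

```c
#include <stdint.h>
#include <stddef.h>

/* Count the bits set in (a | b) without allocating the union bitmap.
   Flat word arrays stand in for GCC's sparse bitmaps; illustrative only.  */
static unsigned
count_ior_bits (const uint64_t *a, const uint64_t *b, size_t nwords)
{
  unsigned count = 0;
  for (size_t i = 0; i < nwords; i++)
    count += __builtin_popcountll (a[i] | b[i]);
  return count;
}
```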
That sounds good.

>> 3) having this extra heuristic might be best guarded by
>> flag_expensive_optimizations
>
> Perhaps. I don't see this tie breaker as being all that expensive. But I
> won't object to guarding with flag_expensive_optimizations.

Yeah, we should first address the quadraticness of the live compute which
is usually what hits us first when hitting a bottleneck in coalescing /
out-of-SSA.

>> as it is a quite expensive "tie breaker" - maybe even improve things
>> by first sorting after cost and then only doing the tie breaking when
>> necessary, re-sorting the sub-sequence with same original cost. It may
>> also be enough to only perform this for "important" candidates, say
>> within the first 100 of the function or so or with cost > X.
>
> The problem with this is qsort's interface into the comparison function has
> a terribly narrow API and I don't think we want to rely on qsort_r. In fact
> that's the whole reason why I didn't do lazy evaluation on the conflict
> sizes initially.
>
> To work around the narrow API in the comparison function we have to either
> store additional data in each node or have them available in globals. The
> former would be horribly wasteful, the latter is just ugly. I chose the
> latter in the lazy evaluation of the conflicts version.

Works for me.

>> And finally - if we really think that looking at the conflict size
>> increase is the way to go it would maybe be better to use a fibheap
>> updating keys in attempt_coalesce when we merge the conflicts. That
>> would also mean to work on a list (fibheap) of coalesces of partitions
>> rather than SSA names.
>
> I really doubt it's worth this effort. The literature I've been looking at
> in this space essentially says that given a reasonable coalescer,
> improvements, while measurable, are very very small in terms of the
> efficiency of the final code.
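[Editor's note: the "narrow qsort API" workaround described above, passing auxiliary data to the comparator through a file-scope global instead of qsort_r, can be sketched as below. The struct and table names are illustrative, not the actual identifiers in tree-ssa-coalesce.c.]

```c
#include <stdlib.h>

/* A coalesce candidate: its cost and an index into auxiliary tables.  */
struct coalesce_pair { int cost; int index; };

/* Auxiliary data the comparator needs but cannot receive as a
   parameter through qsort's narrow interface.  */
static const int *conflict_size_table;

static int
compare_pairs (const void *pa, const void *pb)
{
  const struct coalesce_pair *a = pa, *b = pb;
  /* Primary key: coalesce cost, higher cost sorts first.  */
  if (a->cost != b->cost)
    return b->cost - a->cost;
  /* Tie breaker, consulted only on equal cost: prefer the pair with
     the smaller conflict size.  */
  return conflict_size_table[a->index] - conflict_size_table[b->index];
}
```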
>
> Thus I rejected conservative coalescing + iteration, biased coalescing, &
> trial coalescing as too expensive given the trivial benefit. Similarly I
> rejected trying to update the costs as we coalesce partitions. A single
> successful coalesce could have a significant ripple effect. Perhaps that
> could be mitigated by realizing that many updates wouldn't be needed, but
> it's just a level of complexity that's not needed here.

Ok.

> And note we work on partitions, not SSA_NAMEs. It just happens that we
> start with each SSA_NAME in its own partition. Most SSA_NAMEs actually
> don't participate in coalescing as they're not used in a copy instruction
> or as a phi arg/result. That's why we compact the partitions after we've
> scanned the IL for names that are going to participate in coalescing.
>
>> I think the patch is reasonable enough for GCC 6 if we can bring
>> compile-time cost down a bit (it can be quadratic in the number of SSA
>> names if we have a lot of coalesce candidates and nearly full conflict
>> bitmaps - of course that's not a case we handle very well right now but
>> still). I would have hoped the index part of the patch fixed the
>> regression (by luck)...
>
> I'd hoped it'd fix the regression by luck as well, but that was not the
> case :(
>
>> As far as a testcase goes we want to scan the dumps for the actual
>> coalesces being done. Might be a bit fragile though...
>
> I suspect that's going to be quite fragile and may have more target
> dependencies than we'd like (due to branch costing and such).

Yes.

Otherwise -ENOPATCH.

Thanks,
Richard.

> Jeff
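[Editor's note: the partition scheme described above, where every SSA_NAME starts in its own singleton partition and a successful coalesce unions two partitions, is the classic union-find structure. GCC uses libiberty's partition type for this; the minimal stand-in below, with path halving, is only an illustration of the idea.]

```c
#include <stdlib.h>

/* Create n singleton partitions: each element is its own representative.  */
static int *
partition_create (int n)
{
  int *rep = malloc (n * sizeof (int));
  for (int i = 0; i < n; i++)
    rep[i] = i;
  return rep;
}

/* Find the representative of x's partition, halving the path as we go.  */
static int
partition_find (int *rep, int x)
{
  while (rep[x] != x)
    x = rep[x] = rep[rep[x]];
  return x;
}

/* Coalesce the partitions containing a and b.  */
static void
partition_union (int *rep, int a, int b)
{
  rep[partition_find (rep, a)] = partition_find (rep, b);
}
```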