Richard Biener <rguent...@suse.de> writes: > On Wed, 14 Feb 2024, Richard Biener wrote: > >> For the testcase in PR113910 we spend a lot of time in PTA comparing >> bitmaps for looking up equivalence class members. This points to >> the very weak bitmap_hash function which effectively hashes set >> and a subset of not set bits. The following improves it by mixing >> that weak result with the population count of the bitmap, reducing >> the number of collisions significantly. It's still by no means >> a good hash function. >> >> One major problem with it was that it simply truncated the >> BITMAP_WORD sized intermediate hash to hashval_t which is >> unsigned int, effectively not hashing half of the bits. That solves >> most of the slowness. Mixing in the population count improves >> compile-time by another 30% though. >> >> This reduces the compile-time for the testcase from tens of minutes >> to 30 seconds and PTA time from 99% to 25%. bitmap_equal_p is gone >> from the profile. >> >> Bootstrap and regtest running on x86_64-unknown-linux-gnu, will >> push to trunk and branches. > > Ha, and it breaks bootstrap because I misunderstood > bitmap_count_bits_in_word (should be word_s_). Fixing this turns > out that hashing the population count doesn't help anything > so I'm re-testing the following simpler variant, giving up on the > cheap last 25% but solving the regression as well. > > Richard. > > From a76aebfdc4b6247db6a061e6395fd088a5694122 Mon Sep 17 00:00:00 2001 > From: Richard Biener <rguent...@suse.de> > Date: Wed, 14 Feb 2024 12:33:13 +0100 > Subject: [PATCH] tree-optimization/113910 - huge compile time during PTA > To: gcc-patches@gcc.gnu.org > > For the testcase in PR113910 we spend a lot of time in PTA comparing > bitmaps for looking up equivalence class members. This points to > the very weak bitmap_hash function which effectively hashes set > and a subset of not set bits. > > The major problem with it is that it simply truncates the > BITMAP_WORD sized intermediate hash to hashval_t which is > unsigned int, effectively not hashing half of the bits. > > This reduces the compile-time for the testcase from tens of minutes > to 42 seconds and PTA time from 99% to 46%. > > PR tree-optimization/113910 > * bitmap.cc (bitmap_hash): Mix the full element "hash" to > the hashval_t hash. > --- > gcc/bitmap.cc | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/bitmap.cc b/gcc/bitmap.cc > index 6cf326bca5a..459e32c1ad1 100644 > --- a/gcc/bitmap.cc > +++ b/gcc/bitmap.cc > @@ -2706,7 +2706,7 @@ bitmap_hash (const_bitmap head) > for (ix = 0; ix != BITMAP_ELEMENT_WORDS; ix++) > hash ^= ptr->bits[ix]; > } > - return (hashval_t)hash; > + return iterative_hash (&hash, sizeof (hash), 0); > } > >
LGTM FWIW, but just curious: does using the iterative hash routines for each update (instead of ^) help too, or is it too slow? Or maybe do an iterative hash for the idx part and keep ^ for the bits accumulation? Also wonder whether using + rather than ^ for the bits accumulation would help... Thanks, Richard