Richard Biener <rguent...@suse.de> writes:
> On Wed, 14 Feb 2024, Richard Biener wrote:
>
>> For the testcase in PR113910 we spend a lot of time in PTA comparing
>> bitmaps for looking up equivalence class members.  This points to
>> the very weak bitmap_hash function which effectively hashes set
>> and a subset of not set bits.  The following improves it by mixing
>> that weak result with the population count of the bitmap, reducing
>> the number of collisions significantly.  It's still by no means
>> a good hash function.
>> 
>> One major problem with it was that it simply truncated the
>> BITMAP_WORD sized intermediate hash to hashval_t which is
>> unsigned int, effectively not hashing half of the bits.  That solves
>> most of the slowness.  Mixing in the population count improves
>> compile-time by another 30% though.
>> 
>> This reduces the compile-time for the testcase from tens of minutes
>> to 30 seconds and PTA time from 99% to 25%.  bitmap_equal_p is gone
>> from the profile.
>> 
>> Bootstrap and regtest running on x86_64-unknown-linux-gnu, will
>> push to trunk and branches.
>
> Ha, and it breaks bootstrap because I misunderstood
> bitmap_count_bits_in_word (should be word_s_).  Fixing this turns
> out that hashing the population count doesn't help anything
> so I'm re-testing the following simpler variant, giving up on the
> cheap last 25% but solving the regression as well.
>
> Richard.
>
> From a76aebfdc4b6247db6a061e6395fd088a5694122 Mon Sep 17 00:00:00 2001
> From: Richard Biener <rguent...@suse.de>
> Date: Wed, 14 Feb 2024 12:33:13 +0100
> Subject: [PATCH] tree-optimization/113910 - huge compile time during PTA
> To: gcc-patches@gcc.gnu.org
>
> For the testcase in PR113910 we spend a lot of time in PTA comparing
> bitmaps for looking up equivalence class members.  This points to
> the very weak bitmap_hash function which effectively hashes set
> and a subset of not set bits.
>
> The major problem with it is that it simply truncates the
> BITMAP_WORD sized intermediate hash to hashval_t which is
> unsigned int, effectively not hashing half of the bits.
>
> This reduces the compile-time for the testcase from tens of minutes
> to 42 seconds and PTA time from 99% to 46%.
>
>       PR tree-optimization/113910
>       * bitmap.cc (bitmap_hash): Mix the full element "hash" to
>       the hashval_t hash.
> ---
>  gcc/bitmap.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/bitmap.cc b/gcc/bitmap.cc
> index 6cf326bca5a..459e32c1ad1 100644
> --- a/gcc/bitmap.cc
> +++ b/gcc/bitmap.cc
> @@ -2706,7 +2706,7 @@ bitmap_hash (const_bitmap head)
>        for (ix = 0; ix != BITMAP_ELEMENT_WORDS; ix++)
>       hash ^= ptr->bits[ix];
>      }
> -  return (hashval_t)hash;
> +  return iterative_hash (&hash, sizeof (hash), 0);
>  }
>  
>  

LGTM FWIW, but just curious: does using the iterative hash routines for
each update (instead of ^) help too, or is it too slow?  Or maybe do an
iterative hash for the idx part and keep ^ for the bits accumulation?
Also wonder whether using + rather than ^ for the bits accumulation
would help...

Thanks,
Richard


Reply via email to