Re: [PATCH 3/9] Use murmurhash3 for pointer map hashing

2013-04-23 Thread Richard Biener
On Mon, Apr 22, 2013 at 5:42 PM, Andi Kleen a...@linux.intel.com wrote:
 On Mon, Apr 22, 2013 at 01:46:58PM +0200, Richard Biener wrote:
 On Fri, Apr 19, 2013 at 11:31 PM, Andi Kleen a...@firstfloor.org wrote:
  From: Andi Kleen a...@linux.intel.com
 
  For a large LTO test case The previous pointer hash change brought
  the collision rate for the WPA gimple type hash table from 90% to
  70. This patch uses the well known murmur3 to improve it further
  to 64%.

 But if they are pointers then pointer_hash should be good enough... ?

 The original pointer hash (ptr  3) % hashsize and throwing away most bits is
 very poor.

 The evahash based on I sent earlier is better, but murmur3 is even better than
 that, at least for this case.

I'd rather not have different pointer hashes for things where there isn't a
fundamental difference between the pointer values.

Richard.

 -Andi



Re: [PATCH 3/9] Use murmurhash3 for pointer map hashing

2013-04-23 Thread Andi Kleen
On Tue, Apr 23, 2013 at 12:27:43PM +0200, Richard Biener wrote:
 On Mon, Apr 22, 2013 at 5:42 PM, Andi Kleen a...@linux.intel.com wrote:
  On Mon, Apr 22, 2013 at 01:46:58PM +0200, Richard Biener wrote:
  On Fri, Apr 19, 2013 at 11:31 PM, Andi Kleen a...@firstfloor.org wrote:
   From: Andi Kleen a...@linux.intel.com
  
   For a large LTO test case The previous pointer hash change brought
   the collision rate for the WPA gimple type hash table from 90% to
   70. This patch uses the well known murmur3 to improve it further
   to 64%.
 
  But if they are pointers then pointer_hash should be good enough... ?
 
  The original pointer hash (ptr  3) % hashsize and throwing away most bits 
  is
  very poor.
 
  The evahash based on I sent earlier is better, but murmur3 is even better 
  than
  that, at least for this case.
 
 I'd rather not have different pointer hashes for things where there isn't a
 fundamental difference between the pointer values.

One of the reasons I did it explicitely is that the murmur3 reference code is 
C++
(well really only the mixed code/declarations I think) and libiberty seems to be
C only. I suppose can port it to C and put it into libiberty though, and make
hashtab.c always use it.

My understanding is that murmur is generally stronger than evahash.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only


Re: [PATCH 3/9] Use murmurhash3 for pointer map hashing

2013-04-23 Thread Richard Biener
On Tue, Apr 23, 2013 at 4:08 PM, Andi Kleen a...@linux.intel.com wrote:
 On Tue, Apr 23, 2013 at 12:27:43PM +0200, Richard Biener wrote:
 On Mon, Apr 22, 2013 at 5:42 PM, Andi Kleen a...@linux.intel.com wrote:
  On Mon, Apr 22, 2013 at 01:46:58PM +0200, Richard Biener wrote:
  On Fri, Apr 19, 2013 at 11:31 PM, Andi Kleen a...@firstfloor.org wrote:
   From: Andi Kleen a...@linux.intel.com
  
   For a large LTO test case The previous pointer hash change brought
   the collision rate for the WPA gimple type hash table from 90% to
   70. This patch uses the well known murmur3 to improve it further
   to 64%.
 
  But if they are pointers then pointer_hash should be good enough... ?
 
  The original pointer hash (ptr  3) % hashsize and throwing away most 
  bits is
  very poor.
 
  The evahash based on I sent earlier is better, but murmur3 is even better 
  than
  that, at least for this case.

 I'd rather not have different pointer hashes for things where there isn't a
 fundamental difference between the pointer values.

 One of the reasons I did it explicitely is that the murmur3 reference code is 
 C++
 (well really only the mixed code/declarations I think) and libiberty seems to 
 be
 C only. I suppose can port it to C and put it into libiberty though, and make
 hashtab.c always use it.

 My understanding is that murmur is generally stronger than evahash.

It would be nice to back this with some numbers on the collision rate
for GCC hashtables, for example during bootstrap (or just for a set of
.ii files from libbackend.a sources for example).  pointer-set.c also
contains its own hash function but doesn't contain infrastructure for
statistics.  Also the new hash-table.h C++-style hashtable has its
own pointer-hash which is still

template typename Type
inline hashval_t
pointer_hash Type::hash (const value_type *candidate)
{
  /* This is a really poor hash function, but it is what the current code uses,
 so I am reusing it to avoid an additional axis in testing.  */
  return (hashval_t) ((intptr_t)candidate  3);
}

Richard.

 -Andi

 --
 a...@linux.intel.com -- Speaking for myself only


Re: [PATCH 3/9] Use murmurhash3 for pointer map hashing

2013-04-23 Thread Andi Kleen
Richard Biener richard.guent...@gmail.com writes:

 One of the reasons I did it explicitely is that the murmur3 reference code 
 is C++
 (well really only the mixed code/declarations I think) and libiberty seems 
 to be
 C only. I suppose can port it to C and put it into libiberty though, and make
 hashtab.c always use it.

 My understanding is that murmur is generally stronger than evahash.

 It would be nice to back this with some numbers on the collision rate
 for GCC hashtables,

I tested it for the preprocessor symbol hash at some point, but had some
second thoughts because it could potentially be called with unaligned 
arguments (ok on x86 with murmur, but may not be elsewhere). Don't 
remember the exact numbers, but I think it was a bit better.

For WPA types at least murmur3 got me from 70% collisions with the
eva ptrhash to 64%. The original was much worse.

Good point about the new hash table. Right now nothing using it is
on my performance radar though.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only


Re: [PATCH 3/9] Use murmurhash3 for pointer map hashing

2013-04-22 Thread Richard Biener
On Fri, Apr 19, 2013 at 11:31 PM, Andi Kleen a...@firstfloor.org wrote:
 From: Andi Kleen a...@linux.intel.com

 For a large LTO test case The previous pointer hash change brought
 the collision rate for the WPA gimple type hash table from 90% to
 70. This patch uses the well known murmur3 to improve it further
 to 64%.

But if they are pointers then pointer_hash should be good enough... ?

That said, I still have that large type merging reorg pending ... (just my
day only has 24h ...)

Richard.

 gcc/:

 2013-04-18  Andi Kleen a...@linux.intel.com

 * Makefile.in (tree.o): Add murmurhash3.h dependency.
 * tree.c (tree_map_base_hash): Use murmurhash3.
 ---
  gcc/Makefile.in | 2 +-
  gcc/tree.c  | 4 +++-
  2 files changed, 4 insertions(+), 2 deletions(-)

 diff --git a/gcc/Makefile.in b/gcc/Makefile.in
 index 109f865..28815a2 100644
 --- a/gcc/Makefile.in
 +++ b/gcc/Makefile.in
 @@ -2216,7 +2216,7 @@ tree.o: tree.c $(CONFIG_H) $(SYSTEM_H) coretypes.h 
 $(TM_H) $(TREE_H) \
 $(BASIC_BLOCK_H) $(TREE_FLOW_H) $(OBSTACK_H) pointer-set.h \
 $(TREE_PASS_H) $(LANGHOOKS_DEF_H) $(DIAGNOSTIC_H) $(CGRAPH_H) \
 $(EXCEPT_H) debug.h intl.h tree-diagnostic.h $(TREE_PRETTY_PRINT_H) \
 -   $(COMMON_TARGET_H)
 +   $(COMMON_TARGET_H) murmurhash.h
  tree-dump.o: tree-dump.c $(CONFIG_H) $(SYSTEM_H) $(TM_H) $(TREE_H) \
 langhooks.h $(TREE_DUMP_H) tree-iterator.h $(TREE_PRETTY_PRINT_H)
  tree-inline.o : tree-inline.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
 diff --git a/gcc/tree.c b/gcc/tree.c
 index d8f2424..2fb732f 100644
 --- a/gcc/tree.c
 +++ b/gcc/tree.c
 @@ -59,6 +59,7 @@ along with GCC; see the file COPYING3.  If not see
  #include except.h
  #include debug.h
  #include intl.h
 +#include murmurhash3.h

  /* Tree code classes.  */

 @@ -5935,7 +5936,8 @@ tree_map_base_eq (const void *va, const void *vb)
  unsigned int
  tree_map_base_hash (const void *item)
  {
 -  return htab_hash_pointer (((const struct tree_map_base *)item)-from);
 +  return murmurhash3_32 (((const struct tree_map_base *)item)-from,
 +sizeof (void *), 0);
  }

  /* Return true if this tree map structure is marked for garbage collection
 --
 1.8.1.4



Re: [PATCH 3/9] Use murmurhash3 for pointer map hashing

2013-04-22 Thread Andi Kleen
On Mon, Apr 22, 2013 at 01:46:58PM +0200, Richard Biener wrote:
 On Fri, Apr 19, 2013 at 11:31 PM, Andi Kleen a...@firstfloor.org wrote:
  From: Andi Kleen a...@linux.intel.com
 
  For a large LTO test case The previous pointer hash change brought
  the collision rate for the WPA gimple type hash table from 90% to
  70. This patch uses the well known murmur3 to improve it further
  to 64%.
 
 But if they are pointers then pointer_hash should be good enough... ?

The original pointer hash (ptr  3) % hashsize and throwing away most bits is
very poor.

The evahash based on I sent earlier is better, but murmur3 is even better than
that, at least for this case.

-Andi



[PATCH 3/9] Use murmurhash3 for pointer map hashing

2013-04-19 Thread Andi Kleen
From: Andi Kleen a...@linux.intel.com

For a large LTO test case The previous pointer hash change brought
the collision rate for the WPA gimple type hash table from 90% to
70. This patch uses the well known murmur3 to improve it further
to 64%.

gcc/:

2013-04-18  Andi Kleen a...@linux.intel.com

* Makefile.in (tree.o): Add murmurhash3.h dependency.
* tree.c (tree_map_base_hash): Use murmurhash3.
---
 gcc/Makefile.in | 2 +-
 gcc/tree.c  | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 109f865..28815a2 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2216,7 +2216,7 @@ tree.o: tree.c $(CONFIG_H) $(SYSTEM_H) coretypes.h 
$(TM_H) $(TREE_H) \
$(BASIC_BLOCK_H) $(TREE_FLOW_H) $(OBSTACK_H) pointer-set.h \
$(TREE_PASS_H) $(LANGHOOKS_DEF_H) $(DIAGNOSTIC_H) $(CGRAPH_H) \
$(EXCEPT_H) debug.h intl.h tree-diagnostic.h $(TREE_PRETTY_PRINT_H) \
-   $(COMMON_TARGET_H)
+   $(COMMON_TARGET_H) murmurhash.h
 tree-dump.o: tree-dump.c $(CONFIG_H) $(SYSTEM_H) $(TM_H) $(TREE_H) \
langhooks.h $(TREE_DUMP_H) tree-iterator.h $(TREE_PRETTY_PRINT_H)
 tree-inline.o : tree-inline.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
diff --git a/gcc/tree.c b/gcc/tree.c
index d8f2424..2fb732f 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -59,6 +59,7 @@ along with GCC; see the file COPYING3.  If not see
 #include except.h
 #include debug.h
 #include intl.h
+#include murmurhash3.h
 
 /* Tree code classes.  */
 
@@ -5935,7 +5936,8 @@ tree_map_base_eq (const void *va, const void *vb)
 unsigned int
 tree_map_base_hash (const void *item)
 {
-  return htab_hash_pointer (((const struct tree_map_base *)item)-from);
+  return murmurhash3_32 (((const struct tree_map_base *)item)-from, 
+sizeof (void *), 0);
 }
 
 /* Return true if this tree map structure is marked for garbage collection
-- 
1.8.1.4