I actually forgot to attach the patch, so here it is.

On 29 May 2016 at 22:27, lemonboy <[email protected]> wrote:
> Hello,
> While investigating the ticket #1293 [1] I noticed that the hashing
> procedures would stop hashing all the vectors
> at the 4th element and while that's a valuable way to speed up the
> hashing process it causes some problems in
> the wild.
>
> If you use a record with N > 4 slots as a key and then set! one of the
> last slots after the 4th one the equal?-hash
> procedure would return the same hash where equal? definitely fails.
>
> The attached patch fixes this problem by lifting the limit when
> hashing tagged vectors aka records, the rationale
> behind this decision is that the number of slots a record has is
> usually very small and it's more likely to use a
> record as a key instead of a huge vector.
>
> While inspecting this I also came across another issue, eq?-hash
> crunches some plain values by just xor-ing them
> with a random seed, while it calls hash-equal? for complex objects
> like records/vectors, that's IMO quite a convoluted
> way to implement it (and it might also be wrong as eq? performs a
> shallow comparison only) that could be replaced by
> stirring [2] the raw C_Word value as that's faster (and incidentally
> also what guile does [3]).
> (Rinse and repeat for the eqv? version)
>
> Cheers,
> LemonBoy
>
> [3] 
> http://git.savannah.gnu.org/gitweb/?p=guile.git;a=blob;f=libguile/hash.c;h=d6ddb6b3b631f8f694171fe18eaa7dcb5325df1f;hb=HEAD
> [2] https://gist.github.com/badboy/6267743
> [1] http://bugs.call-cc.org/ticket/1293
From b9e1b45232a7b8c73e8b726a8263225173626e05 Mon Sep 17 00:00:00 2001
From: LemonBoy <[email protected]>
Date: Sun, 29 May 2016 12:03:11 +0200
Subject: [PATCH] Hash all the record slots in vector-hash
To: [email protected]

Records are represented with tagged vectors and as such they are hashed
by vector-hash that has a cutoff limit that's configurable using
the *recursive-hash-max-length* parameter.
Lift the restriction when the vector being hashed is a record so that
all the slots are effectively hashed.
---
 srfi-69.scm | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/srfi-69.scm b/srfi-69.scm
index bdc7f72..3594920 100644
--- a/srfi-69.scm
+++ b/srfi-69.scm
@@ -306,17 +306,20 @@
 
   ; Recurse into some portion of the vector's slots
   (define (vector-hash obj seed depth start rnd)
-    (let ([len (##sys#size obj)])
-      (let loop ([hsh (fx+ len (fxxor seed rnd))]
-		 [i start]
-		 [len (fx- (fxmax start (fxmin *recursive-hash-max-length* len)) start)] )
+    (let* ((len (##sys#size obj))
+	   (struct? (##sys#generic-structure? obj))
+	   ; Make sure to hash all the slots when obj is a record
+	   (max-length (if struct? len *recursive-hash-max-length*)))
+      (let loop ((hsh (fx+ len (fxxor seed rnd)))
+		 (i start)
+		 (len (fx- (fxmax start (fxmin max-length len)) start)))
 	(if (fx= len 0)
 	    hsh
 	    (loop (fx+ hsh
 		       (fx+ (fxshl hsh 4)
 			    (recursive-hash (##sys#slot obj i) (fx+ depth 1) rnd)))
 		  (fx+ i 1)
-		  (fx- len 1) ) ) ) ) )
+		  (fx- len 1))))))
 
   ; Recurse into structured objects
   (define (recursive-hash obj depth rnd)
-- 
2.8.3

_______________________________________________
Chicken-hackers mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/chicken-hackers

Reply via email to