Hi Olivier,

thank you for the very informative response.

Olivier Dion <[email protected]> writes:

> TLDR: Go to the end for root cause.
>
> On Sat, 23 May 2026, Tomas Volf <[email protected]> wrote:
>> Hello,
>>
>> I just noticed this warning, and I am not sure how to address it (or
>> whether I should).  Given the following file (`test.scm')
> [...]
>> Am I doing something I am not supposed to (like, is define-record-type
>> limited to once per module or something)?  Looking at the expansion of
>> the macro, it *seems* safe to ignore the warning?
>
> Nothing, you did fine!

Thank you, I will just ignore the warning for now, since it seems
harmless.

> You just encountered a bug in the hashing system of Guile!  Basically,
> you seem to have hit this paragraph in "Hygiene and the Top-Level":
>
>      Note that the introduced binding has the same name!  This is because
> the source expression, ‘(define t 42)’, was the same.  Probably you will
> never see an error in this area
>
>
> Now let me explain why.  When you introduce a top-level binding, Guile
> will hash the form to generate a symbol for you.  Here's an example:
>
>   (define-syntax-rule (define-const id value)
>    (begin
>     (define t value)
>     (define (id) t)))
>
>   scheme@(guile-user)> ,expand (define-const x 10)
>    => (begin (define t-798f2ffcb9d7f9 10) (define (x) t-798f2ffcb9d7f9))
>
> As you can see, the introduced identifier `t' was changed to include a
> hashed value to make it hygienic.  This hash value is actually:
>
>   scheme@(guile-user)> (number->string (hash '(define t 10)
>                                              most-positive-fixnum) 16)
>    => "798f2ffcb9d7f9"
>
> So the question is: why is this a problem for RNRS records?  Well, if
> we have a look at `define-record-type0' we find this:
>
> #`(begin
>     (define record-name
>       (make-record-type-descriptor
>        (quote record-name)
>        #,parent-rtd #,uid #,sealed? #,opaque?
>        #,field-names))
>     (define constructor-name
>       (record-constructor
>        (make-record-constructor-descriptor
>         record-name #,parent-cd #,protocol)))
>     (define dummy
>       (let ()
>         (register-record-type 
>          (quote record-name)
>          record-name (make-record-constructor-descriptor 
>                       record-name #,parent-cd #,protocol))
>         'dummy))
>     (define predicate-name (record-predicate record-name))
>     #,@field-accessors
>     #,@field-mutators)
>
> I have no idea why `dummy' is necessary here.  I'm sure someone who
> knows more about RNRS could explain.  Perhaps we don't need it and that
> would solve your issue.
>
> But this still does not explain why the introduction of `dummy' is
> duplicated here.  You mentioned having the identifier
> `dummy-1a78708d3c9406a3'.
>
> So let's try that:
>
>   scheme@(guile-user)> (number->string (hash '(define dummy
>                         (let ()
>                           (register-record-type
>                            (quote a)
>                            record-name (make-record-constructor-descriptor
>                                         record-name #,parent-cd #,protocol))
>                           'dummy))
>   most-positive-fixnum) 16)
>    => "1a78708d3c9406a3"
>    
>   scheme@(guile-user)> (number->string (hash '(define dummy
>                         (let ()
>                           (register-record-type
>                            (quote b)
>                            record-name (make-record-constructor-descriptor
>                                         record-name #,parent-cd #,protocol))
>                           'dummy))
>   most-positive-fixnum) 16)
>    => "1a78708d3c9406a3"
>
>
> Two different lists, same hash values.  Weird.  But it's true:
>
>   scheme@(guile-user)> (= (hash '(1 2 3 (4)) most-positive-fixnum)
>                           (hash '(1 2 3 (4 5)) most-positive-fixnum))
>     => #t
>
> and it gets worse than that:
>
>   scheme@(guile-user)> (= (hash '(1 2 3 4 5) most-positive-fixnum)
>                           (hash '(1 2 3 4 5 6) most-positive-fixnum))
>     => #t
>
> To me this screams that list hashing is fundamentally broken.  Only the
> first 5 elements are used for hashing.  This is why you got the same
> hash for both transformations.
>
> I had planned to make some changes to the hashing system for next release,
> but this would be an incompatible change (hash values will change).
> I wonder though if we want to fix this issue for the stable release.

This was quite an interesting read.  The problem of generating
identifiers described above seems non-trivial.  I probably do not have
anything novel to add to the debate, but two questions popped up in my
head.

1. What do other Schemes do?
2. Would incorporating source location (if available) into the hash be
   valid?
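
To make the second question concrete, here is a rough sketch of what I
have in mind.  It is an assumption on my part, not how the expander
works today, and the helper name `form-hash/location' is made up.  It
hashes the whole printed form together with whatever source properties
the reader attached to it, so the result does not depend only on the
first few elements of the list:

```scheme
;; Hypothetical helper, not part of Guile.  Hashes the complete printed
;; representation of FORM, prefixed with its source properties (file
;; name, line, column) when the reader recorded any.
(define (form-hash/location form)
  (let ((loc (if (pair? form)
                 (source-properties form) ; '() when nothing was recorded
                 '())))
    ;; object->string writes the whole datum, so every element of the
    ;; form contributes to the string that gets hashed.
    (string-hash (object->string (cons loc form))
                 most-positive-fixnum)))

;; Example use, mirroring the REPL snippets quoted above:
;;   (number->string (form-hash/location '(define t 10)) 16)
```

In the expander the location would presumably have to come from the
syntax object rather than from the datum, and forms without a recorded
location would fall back to hashing the form alone.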

Thanks again and have a nice day,

Tomas

-- 
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.
