#24: non-equivalence of equal Hash key strings
----------------------+-----------------------------------------------------
 Reporter:  pmichaud  |       Owner:     
     Type:  bug       |      Status:  new
 Priority:  major     |   Milestone:     
Component:  core      |     Version:     
 Severity:  medium    |    Keywords:     
     Lang:  perl6     |       Patch:     
 Platform:            |  
----------------------+-----------------------------------------------------

Comment(by pmichaud):

 I worked on this bug a bit, hoping to get Rakudo to use fixed-width
 strings for more of its processing.  In r39282 I changed the check at
 src/hash.c:205 from
 {{{
     if (s1->hashval != s2->hashval)
          return 1;
 }}}
 to
 {{{
     if (s1->charset == s2->charset && s1->hashval != s2->hashval)
          return 1;
 }}}

 The original code assumed that different string hashvals always imply
 logically unequal strings, but this holds only when both strings use the
 same charset.  With the above change, the original test script passes.
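
 To illustrate why the charset guard matters, here is a minimal,
 self-contained sketch.  It is not Parrot's code: the toy_* names, the
 FNV-1a byte hash, and the assumption that the unicode string is stored as
 UTF-8 are all hypothetical stand-ins.  It only shows that a hash computed
 over encoded bytes differs between two encodings of the same text, so a
 hashval mismatch can prove inequality only within a single charset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for Parrot's string internals (illustration only). */
typedef enum { CS_ISO_8859_1, CS_UNICODE_UTF8 } charset_t;

typedef struct {
    charset_t charset;
    const unsigned char *bytes;
    size_t len;
} toy_string;

/* FNV-1a over the raw bytes (an assumed hash, not Parrot's): the result
 * depends on how the text is encoded, not only on its characters. */
static uint32_t toy_hashval(const toy_string *s) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < s->len; i++) {
        h ^= s->bytes[i];
        h *= 16777619u;
    }
    return h;
}

/* Decode the next code point.  Handles ISO-8859-1 and 1- or 2-byte UTF-8
 * sequences only, which is enough for U+0000..U+07FF in this sketch. */
static uint32_t toy_next_cp(const toy_string *s, size_t *pos) {
    unsigned char b = s->bytes[(*pos)++];
    if (s->charset == CS_ISO_8859_1 || b < 0x80)
        return b;
    assert((b & 0xE0) == 0xC0 && *pos < s->len);
    return ((uint32_t)(b & 0x1F) << 6) | (uint32_t)(s->bytes[(*pos)++] & 0x3F);
}

/* Returns 0 if the strings are logically equal, 1 otherwise.
 * check_charset == 0 mimics the original shortcut; 1 mimics the guarded one. */
static int toy_compare(const toy_string *a, const toy_string *b,
                       int check_charset) {
    if ((!check_charset || a->charset == b->charset)
            && toy_hashval(a) != toy_hashval(b))
        return 1;   /* hash mismatch taken as proof of inequality */

    /* Slow path: compare code point by code point. */
    size_t i = 0, j = 0;
    while (i < a->len && j < b->len) {
        uint32_t ca = toy_next_cp(a, &i);
        uint32_t cb = toy_next_cp(b, &j);
        if (ca != cb)
            return 1;
    }
    return !(i == a->len && j == b->len);
}

/* "infix:" followed by U+00B1 in each encoding. */
static const unsigned char LATIN1_BYTES[] =
    { 'i', 'n', 'f', 'i', 'x', ':', 0xB1 };
static const unsigned char UTF8_BYTES[] =
    { 'i', 'n', 'f', 'i', 'x', ':', 0xC2, 0xB1 };

static const toy_string str_latin1 =
    { CS_ISO_8859_1, LATIN1_BYTES, sizeof LATIN1_BYTES };
static const toy_string str_utf8 =
    { CS_UNICODE_UTF8, UTF8_BYTES, sizeof UTF8_BYTES };
```

 In this sketch, toy_compare(&str_utf8, &str_latin1, 0) reports the strings
 as unequal because their byte hashes differ, while passing 1 as the last
 argument skips the shortcut for mixed charsets and falls through to the
 code point comparison, which finds them equal.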

 But now there's a different failure -- in larger hashes, logically
 equivalent hash keys are sometimes treated as non-equivalent.  Here's a
 test program (added to t/op/stringu.t in r39284):

 {{{
 $ cat x.pir
 .sub 'main'
     .local string str0, str1
     str0 = unicode:"infix:\u00b1"
     str1 = iso-8859-1:"infix:\xb1"

     .local pmc hash
     hash = new 'Hash'
     hash[str0] = 'hello'

     $I0 = 0
   fill_loop:
     unless $I0 < 200 goto fill_done
     inc $I0
     $S0 = $I0
     $S0 = concat 'infix:', $S0
     hash[$S0] = 'hello'
     goto fill_loop
   fill_done:

     $I0 = iseq str0, str1
     print "iseq str0, str1               => "
     say $I0

     $S0 = hash[str0]
     $S1 = hash[str1]
     $I0 = iseq $S0, $S1
     print "iseq hash[str0], hash[str1]   => "
     say $I0
     say $S0
     say $S1
 .end

 $ ./parrot x.pir
 iseq str0, str1               => 1
 iseq hash[str0], hash[str1]   => 0
 hello

 $
 }}}

 This test is very sensitive to the size of the hash -- in fact, the
 failure appears only when the hash has more than 192 entries.  I'm not
 sure of the significance of 192 here, but I suspect it has something to
 do with bucket and/or key management in hashes.  (Change the "200" to
 "191" in the test script above, giving 191 loop keys plus str0 for 192
 entries in total, and it produces the correct output.)

 AFAICT this problem is the only major blocker to Rakudo being able to run
 a large number of spectests using fixed-width (iso-8859-1) strings, which
 would reduce the overall "make spectest" time by about 20%.

 Thanks,

 Pm

-- 
Ticket URL: <https://trac.parrot.org/parrot/ticket/24#comment:4>
Parrot <https://trac.parrot.org/parrot/>
Parrot Development
_______________________________________________
parrot-tickets mailing list
http://lists.parrot.org/mailman/listinfo/parrot-tickets
