> # New Ticket Created by "Clinton A. Pierce"
> # Please include the string: [perl #22718]
> # in the subject line of all future correspondence about this issue.
> # <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=22718 >
>
> In this code:
>
> .sub _main
>     .local string source
>     .local string lookfor
>
>     source="This is not quite right"
>     lookfor=" is "
>
>     index $I0, source, lookfor, 0
>     print "(expect 4) $I0 = "
>     print $I0 # you'll actually get -1
>
>     lookfor="is"
>     index $I0, source, lookfor, 0
>     print "\n(expect 2) $I0 = "
>     print $I0 # You will correctly get 2
>     end
> .end
>
> The string " is " (note the spaces) will not be found within "This is not
> quite right". Removing the spaces has the expected results. I looked
> briefly for an explanation in strings.c but my eyes glazed over at the
> sight of the multibyte and encoding stuff.

The bug wasn't in the multibyte part though ;-) I made a slight error in
implementing the algorithm. It must've been some strange coincidence that
all the tests passed anyway. Well, here's the new one:

Index: string.c
===================================================================
RCS file: /cvs/public/parrot/string.c,v
retrieving revision 1.132
diff -u -r1.132 string.c
--- string.c    22 Jun 2003 08:57:49 -0000      1.132
+++ string.c    23 Jun 2003 05:53:20 -0000
@@ -370,7 +370,7 @@
     const UINTVAL find_strlen = find->strlen;
     const UINTVAL str_strlen = str->strlen;
     const unsigned char* const lastmatch =
-        str_strstart + str_strlen - find_strlen;
+        str_strstart + str_strlen;
     UINTVAL* p;
     const unsigned char* cp;
     UINTVAL endct, pos;
@@ -393,9 +393,9 @@
     /* Perform the match */
     pos = start;
-    cp = str_strstart + start;
+    cp = str_strstart + start + find_strlen;
     while (cp <= lastmatch) {
-        register const unsigned char* sp = cp + find_strlen;
+        register const unsigned char* sp = cp;
         register const unsigned char* fp = find_strstart + find_strlen;
         while (fp > find_strstart) {
@@ -406,7 +406,7 @@
             return pos;
         }
         else {
-            register UINTVAL bsi = badshift[*sp];
+            register UINTVAL bsi = badshift[*(cp-1)];
             cp += bsi;
             pos += bsi;
         }
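
For anyone who wants to see the idea outside of string.c: after the patch, cp points one past the end of the current window, the comparison runs backwards from there, and on a mismatch the shift is looked up with the last byte of the window, *(cp-1). Below is a minimal standalone sketch of that scheme (it is essentially a Horspool-style bad-character search). It is not the Parrot code. The function name horspool_index and the way the shift table is built are my own assumptions, since the diff does not show how badshift is filled in, and it works on plain NUL-terminated single-byte C strings only.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Parrot: returns the byte offset of the
 * first occurrence of pat in text at or after start, or -1 if there is none
 * (an empty pattern is treated as "not found" here). */
long horspool_index(const char *text, const char *pat, size_t start)
{
    const unsigned char *t = (const unsigned char *)text;
    const unsigned char *p = (const unsigned char *)pat;
    size_t shift[UCHAR_MAX + 1];
    size_t n, m, i, end;

    assert(text != NULL && pat != NULL);
    n = strlen(text);
    m = strlen(pat);
    if (m == 0 || start > n || m > n - start)
        return -1;

    /* Assumed table layout: a byte that does not occur in the pattern
     * (ignoring the pattern's last byte) lets the window jump by m. */
    for (i = 0; i <= UCHAR_MAX; i++)
        shift[i] = m;
    /* Otherwise shift by the distance from the byte's rightmost
     * occurrence to the end of the pattern. */
    for (i = 0; i + 1 < m; i++)
        shift[p[i]] = m - 1 - i;

    /* end is one past the last byte of the current window, like cp in
     * the patched loop. */
    end = start + m;
    while (end <= n) {
        size_t k = m;
        /* Compare the window against the pattern from right to left. */
        while (k > 0 && p[k - 1] == t[end - m + k - 1])
            k--;
        if (k == 0)
            return (long)(end - m);
        /* Mismatch: shift by the entry for the window's LAST byte,
         * which corresponds to badshift[*(cp-1)] in the patch. */
        end += shift[t[end - 1]];
    }
    return -1;
}
```

With the strings from the ticket, horspool_index("This is not quite right", " is ", 0) gives 4 and horspool_index("This is not quite right", "is", 0) gives 2, which are the values the test program expects.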