Summary: Reading string into associative array key garbles
Created an attachment (id=363)
.tar.gz file with D1 code illustrating bug and one-line sample input text file
Either I'm doing something dumb, or I've found a bug where a string gets
trashed between storing it as a key in an associative array and then getting it
back out.
The weird thing is it only happens when the string is read in from a file.
Adding the same string as a literal doesn't trigger it.
The attached D1 code simply reads in each line from a BufferedFile, storing it
as a key in a uint[string] AA that counts how many times each line occurred. It
verifies that the line is valid UTF-8 going in. It then loops over the keys in
the AA, verifying that they're valid UTF-8 and printing them out. At that point
the string fails validation and gives an error if you try to print it out. I
don't think there's anything special about the particular string that I'm using.
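For readers without the attachment, here is a minimal sketch of the kind of
program described above. It is not the attached code: the variable names, the
use of foreach over the stream, and taking the input file name from the command
line are all assumptions made for illustration. It targets D1 with Phobos
(std.stream, std.utf, std.stdio).

```d
import std.stdio;
import std.stream;
import std.utf;

void main(string[] args)
{
    // Counts how many times each line occurred (names are hypothetical).
    uint[string] counts;

    // Assumption: the input file name is the first command-line argument.
    auto file = new BufferedFile(args[1]);
    foreach (char[] line; file)
    {
        validate(line);   // first check: throws if the line is not valid UTF-8
        counts[line]++;   // the line is stored as an AA key exactly as read
    }
    file.close();

    foreach (key; counts.keys)
    {
        validate(key);    // second check, on the key as stored in the AA
        writefln("%s", key);
    }
}
```

This sketch stores each line as a key exactly as the stream hands it over,
without copying it first. Whether the attached code does the same is not stated
here.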
I verified this with three compilers on two operating systems:
DMD 1.043 on Ubuntu 8.10 x86_64
gcc version 4.1.3 20070831 (prerelease gdc 0.25, using dmd 1.021) (Ubuntu
gdcmac trunk r229 (based on gcc 4.0.1) on Mac OS X 10.5.5 x86_64
Here is some sample output:
Matched bad input.
Read 1 lines, 1 unique (0 non-UTF).
2nd validate: string
didn't validate as UTF
Error: 4invalid UTF-8 sequence
The Unicode string printed out (as decimal chars) varies each time under Linux,
perhaps suggesting it's reading some memory it oughtn't?