Summary: Reading string into associative array key garbles
           Product: D
           Version: 1.043
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD

Created an attachment (id=363)
.tar.gz file with D1 code illustrating the bug and a one-line sample input text file

Either I'm doing something dumb, or I've found a bug where a string gets
trashed between storing it as a key in an associative array and then getting it
back out.

The weird thing is it only happens when the string is read in from a file. 
Adding the same string as a literal doesn't trigger it.  

The attached D1 code simply reads in each line from a BufferedFile, storing it
as a key in a uint[string] AA that counts how many times each line occurred.  It
verifies that the line is valid UTF-8 going in.  It then loops over the keys in
the AA, verifying that they're valid UTF-8 and printing them out.  However, the
key string fails validation and gives an error if you try to print it out.  I
don't think there's anything special about the particular string that I'm using.
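
For readers without the attachment, the steps described above can be sketched
roughly as follows.  This is a hypothetical reconstruction, not the attached
program: the file name "sample.txt", the variable names and the messages are
placeholders, and it assumes D1 Phobos (std.stream, std.utf, std.stdio).

```d
// Hypothetical reconstruction of the steps described in the report;
// NOT the attached program.  Assumes D1 with Phobos.
import std.stdio;
import std.stream;
import std.utf;

void main(char[][] args)
{
    // "sample.txt" is a placeholder for the one-line sample input file.
    char[] path = "sample.txt";
    if (args.length > 1)
        path = args[1];

    uint[char[]] counts;  // uint[string] in the report; string aliases char[] in D1
    uint lines = 0;

    auto file = new BufferedFile(path);
    foreach (char[] line; file)
    {
        validate(line);   // first check: throws if the line is not valid UTF-8
        counts[line]++;   // the slice itself becomes the key; nothing is copied
        lines++;
    }
    file.close();

    writefln("Read %d lines, %d unique.", lines, counts.length);

    foreach (char[] key; counts.keys)
    {
        validate(key);    // second check: the step the report says fails
        writefln("%s", key);
    }
}
```

Note that the key stored here is a slice of whatever memory the stream handed
out for the line.  If that memory is reused or released after the loop (an
assumption about std.stream, not something established in this report), then
storing line.dup instead of line would be one thing to try.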

I verified this with three compilers on two operating systems:
DMD 1.043 on Ubuntu 8.10 x86_64
gcc version 4.1.3 20070831 (prerelease gdc 0.25, using dmd 1.021) (Ubuntu
gdcmac trunk r229 (based on gcc 4.0.1) on Mac OS X 10.5.5 x86_64 

Here is some sample output:

Reading data...
Matched bad input.
Read 1 lines, 1 unique (0 non-UTF).
2nd validate: string
didn't validate as UTF
Error: 4invalid UTF-8 sequence

The Unicode string printed out (as decimal chars) varies each time under Linux,
perhaps suggesting it's reading some memory it oughtn't?
