# New Ticket Created by Zefram
# Please include the string: [perl #128512]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/Ticket/Display.html?id=128512 >

A decode-then-encode cycle through the utf8-c8 encoding is meant to
round-trip an octet string. But if the input is the UTF-8 encoding of
a string that's not NFC normalised, the output ends up different, because
this normalisation got performed somewhere in the middle:

> Blob[uint8].new(101, 204, 129).decode("utf8-c8").encode("utf8-c8").perl
Blob[uint8].new(195,169)
> Blob[uint8].new(195, 169).decode("utf8-c8").encode("utf8-c8").perl
Blob[uint8].new(195,169)

This is of particular concern for things like access to command-line
arguments:

$ perl6 -e 'say @*ARGS[0].encode("utf8-c8")' $'e\xcc\x81'
Blob[uint8]:0x<c3 a9>
$ perl6 -e 'say @*ARGS[0].encode("utf8-c8")' $'\xc3\xa9'
Blob[uint8]:0x<c3 a9>

-zefram
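
For reference, the octet change reported above is exactly what an NFC normalisation step between decode and encode would produce. The following is a minimal Python sketch of that step, not Rakudo's or MoarVM's actual code path; the helper name `nfc_round_trip` is hypothetical, and it uses plain UTF-8 rather than utf8-c8.

```python
import unicodedata


def nfc_round_trip(octets: bytes) -> bytes:
    """Hypothetical helper: decode UTF-8, normalise to NFC, re-encode as UTF-8.

    This only mimics the suspected behaviour; it is not utf8-c8 itself.
    """
    text = octets.decode("utf-8")
    return unicodedata.normalize("NFC", text).encode("utf-8")


# "e" followed by U+0301 COMBINING ACUTE ACCENT (not NFC)
decomposed = bytes([101, 204, 129])
# U+00E9 LATIN SMALL LETTER E WITH ACUTE (already NFC)
precomposed = bytes([195, 169])

print(list(nfc_round_trip(decomposed)))   # octets after the normalising round trip
print(list(nfc_round_trip(precomposed)))  # already-NFC input is left unchanged
```

Both inputs come out as the same two octets, so the distinction between the two original octet strings is lost, which matches the symptom in the transcripts above.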