# New Ticket Created by Helmut Wollmersdorfer
# Please include the string: [perl #64918]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=64918 >
Hi,
rakudo$ ./perl6 -e '"\c[LATIN SMALL LETTER A WITH DIAERESIS,COMBINING CEDILLA]";'
Malformed UTF-8 string
current instr.: 'parrot;PAST;Compiler;escape' pc 9067 (src/POST/Node.pir:90)
called from Sub 'parrot;PAST;Compiler;escape' pc 1731 (src/PAST/Compiler.pir:188)
called from Sub 'parrot;PAST;Compiler;as_post' pc 8758 (src/PAST/Compiler.pir:2313)
called from Sub 'parrot;PAST;Compiler;post_children' pc 2185 (src/PAST/Compiler.pir:415)
called from Sub 'parrot;PAST;Compiler;as_post' pc 2600 (src/PAST/Compiler.pir:602)
called from Sub 'parrot;PAST;Compiler;post_children' pc 2185 (src/PAST/Compiler.pir:415)
called from Sub 'parrot;PAST;Compiler;as_post' pc 3633 (src/PAST/Compiler.pir:866)
called from Sub 'parrot;PAST;Compiler;post_children' pc 2185 (src/PAST/Compiler.pir:415)
called from Sub 'parrot;PAST;Compiler;pirop' pc 4256 (src/PAST/Compiler.pir:1044)
called from Sub 'parrot;PAST;Compiler;post_children' pc 2185 (src/PAST/Compiler.pir:415)
called from Sub 'parrot;PAST;Compiler;as_post' pc 3633 (src/PAST/Compiler.pir:866)
called from Sub 'parrot;PCT;HLLCompiler;compile' pc 428 (src/PCT/HLLCompiler.pir:301)
called from Sub 'parrot;PCT;HLLCompiler;eval' pc 920 (src/PCT/HLLCompiler.pir:519)
called from Sub 'parrot;PCT;HLLCompiler;command_line' pc 1510 (src/PCT/HLLCompiler.pir:798)
called from Sub 'parrot;Perl6;Compiler;main' pc 23985 (perl6.pir:164)
This works:
rakudo$ ./perl6 -e '"\c[LATIN CAPITAL LETTER D,COMBINING DOT BELOW,COMBINING DOT ABOVE,COMBINING HORN]";'
My versions:
rakudo$ ./perl6 -v
This is Rakudo Perl 6, revision 37980 built on parrot 1.0.0-devel
for i486-linux-gnu-thread-multi
rakudo$ icu-config --unicode-version
5.1
In comparison, the following Perl 5.10 script does not croak:
use strict;
use warnings;
use charnames qw(:full);
use Encode;

my $s = "\N{LATIN SMALL LETTER A WITH DIAERESIS}\N{COMBINING CEDILLA}";

# CHECK = 1 (Encode::FB_CROAK) makes encode/decode die on malformed data.
decode("utf8", encode("utf8", $s, 1), 1);
Helmut Wollmersdorfer