This is a bug report for perl from Niko Tyni <[email protected]>
generated with the help of perlbug 1.39 running under perl 5.11.4.
-----------------------------------------------------------------
The following script croaks with 'Wide character in subroutine entry'
although $1 should not have the utf8 flag set at all.
#!perl
use Digest::MD5 qw(md5_hex);
qq[\x{263a}] =~ /(.)/ and "$1"; # vivify $1 with utf8 flag
"\303\244" =~ /(.)/ and do {
# "$1"; #uncomment this and it goes away
print md5_hex($1), "\n";;
}
__END__
The exception is thrown from the SvPVbyte call in cpan/Digest-MD5/MD5.xs:726.
AIUI, SvPVbyte() looks at the utf8 flag before handling get magic for
$1 and therefore gets the old value.
I'm attaching a patch that adds a TODO test in XS-APItest for this.
Originally reported by Joey Hess as http://bugs.debian.org/376329 .
-----------------------------------------------------------------
---
Flags:
category=core
severity=low
---
Site configuration information for perl 5.11.4:
Configured by niko at Mon Jan 25 19:04:36 EET 2010.
Summary of my perl5 (revision 5 version 11 subversion 4) configuration:
Local Commit: 403e9a2445a629525fa5e35afa699e4140495dcf
Ancestor: fe61459e95657c432074058bd8854fec03559335
Platform:
osname=linux, osvers=2.6.32-trunk-amd64,
archname=x86_64-linux-gnu-thread-multi
uname='linux madeleine 2.6.32-trunk-amd64 #1 smp sun jan 10 22:40:40 utc
2010 x86_64 gnulinux '
config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN
-Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr
-Dprivlib=/usr/share/perl/5.11 -Darchlib=/usr/lib/perl/5.11 -Dvendorprefix=/usr
-Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5
-Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.11.4
-Dsitearch=/usr/local/lib/perl/5.11.4 -Dman1dir=/usr/share/man/man1
-Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1
-Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl
-Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm
-DDEBUGGING=both -Doptimize=-O0 -Dusedevel -Uuseshrplib -des'
hint=recommended, useposix=true, d_sigaction=define
useithreads=define, usemultiplicity=define
useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
use64bitint=define, use64bitall=define, uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -DDEBUGGING
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O0 -g',
cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -DDEBUGGING
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
ccversion='', gccversion='4.4.3 20100108 (prerelease)', gccosandvers=''
intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64
libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc -lgdbm_compat
perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
libc=/lib/libc-2.10.2.so, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version='2.10.2'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
cccdlflags='-fPIC', lddlflags='-shared -O0 -g -L/usr/local/lib
-fstack-protector'
Locally applied patches:
---
@INC for perl 5.11.4:
lib
/usr/local/lib/perl/5.11.4
/usr/local/share/perl/5.11.4
/usr/lib/perl5
/usr/share/perl5
/usr/lib/perl/5.11
/usr/share/perl/5.11
.
---
Environment for perl 5.11.4:
HOME=/home/niko
LANG=en_US.UTF-8
LANGUAGE (unset)
LC_CTYPE=fi_FI.UTF-8
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/home/niko/bin:/home/niko/bin:/home/niko/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/sbin:/usr/sbin:/sbin:/usr/sbin
PERL_BADLANG (unset)
SHELL=/bin/zsh
>From 8a0ae597ee3ca1671784bda0efa80ca1f22b0155 Mon Sep 17 00:00:00 2001
From: Niko Tyni <[email protected]>
Date: Tue, 26 Jan 2010 21:59:45 +0200
Subject: [PATCH] TODO test: SvPVbyte should handle get magic before checking the utf8 flag
When $1 had the utf8 flag set from a previous match, SvPVbyte
may croak with 'Wide character in subroutine entry' before
resetting the flag to its new value.
Add a support function and a TODO test for this in XS-APItest.
http://bugs.debian.org/376329
---
MANIFEST | 1 +
ext/XS-APItest/APItest.xs | 12 ++++++++++++
ext/XS-APItest/t/svpv_magic.t | 32 ++++++++++++++++++++++++++++++++
3 files changed, 45 insertions(+), 0 deletions(-)
create mode 100644 ext/XS-APItest/t/svpv_magic.t
diff --git a/MANIFEST b/MANIFEST
index 17056fc..2366de9 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -3232,6 +3232,7 @@ ext/XS-APItest/t/printf.t XS::APItest extension
ext/XS-APItest/t/push.t XS::APItest extension
ext/XS-APItest/t/rmagical.t XS::APItest extension
ext/XS-APItest/t/svpeek.t XS::APItest extension
+ext/XS-APItest/t/svpv_magic.t Test behaviour of SvPVbyte and get magic
ext/XS-APItest/t/svsetsv.t Test behaviour of sv_setsv with/without PERL_CORE
ext/XS-APItest/t/utf16_to_utf8.t Test behaviour of utf16_to_utf8{,reversed}
ext/XS-APItest/t/xs_special_subs_require.t for require too
diff --git a/ext/XS-APItest/APItest.xs b/ext/XS-APItest/APItest.xs
index ede6994..41b74c8 100644
--- a/ext/XS-APItest/APItest.xs
+++ b/ext/XS-APItest/APItest.xs
@@ -936,3 +936,15 @@ void
my_exit(int exitcode)
PPCODE:
my_exit(exitcode);
+
+U8
+first_byte(sv)
+ SV *sv
+ CODE:
+ char *s;
+ STRLEN len;
+ s = SvPVbyte(sv, len);
+ RETVAL = s[0];
+ OUTPUT:
+ RETVAL
+
diff --git a/ext/XS-APItest/t/svpv_magic.t b/ext/XS-APItest/t/svpv_magic.t
new file mode 100644
index 0000000..dd2af8c
--- /dev/null
+++ b/ext/XS-APItest/t/svpv_magic.t
@@ -0,0 +1,32 @@
+#!perl -w
+BEGIN {
+ require '../../t/test.pl';
+ plan(5);
+ use_ok('XS::APItest')
+};
+
+$b = "\303\244"; # or encode_utf8("\x{e4}");
+
+is(XS::APItest::first_byte($b), 0303,
+ "test function first_byte works");
+
+$b =~ /(.)/;
+is(XS::APItest::first_byte($1), 0303,
+ "matching works correctly");
+
+$a = qq[\x{263a}]; # utf8 flag is set
+
+$a =~ s/(.)/$1/; # $1 now has the utf8 flag set too
+$b =~ /(.)/; # $1 shouldn't have the utf8 flag anymore
+
+is(XS::APItest::first_byte("$1"), 0303,
+ "utf8 flag in match fetched correctly when stringified first");
+
+$a =~ s/(.)/$1/; # $1 now has the utf8 flag set too
+$b =~ /(.)/; # $1 shouldn't have the utf8 flag anymore
+
+TODO: {
+local $TODO = "SvPVbyte should handle get magic before checking the utf8 flag";
+is(eval { XS::APItest::first_byte($1) } || $@, 0303,
+ "utf8 flag fetched correctly without stringification");
+}
--
1.6.6