Package: perl
Version: 5.8.8-6
Severity: normal

Please excuse the bug title; after working on this for something like 5 hours, I cannot think clearly enough to write a short title describing this very weird bug. Let the code speak for me. I have attached a testcase; untar it and run the "repro" program:
[EMAIL PROTECTED]:~/tmp/repor/testcase>./repro
a
b
Wide character in subroutine entry at /usr/bin/markdown line 360.
zsh: exit 255 ./repro
Now, edit the repro file. There are 4 comments suggesting changes; if you make
any one of the changes, the wide character failure disappears.
Notice that several of the changes should not possibly affect anything,
but do. For example, uncommenting the s/// line should be a null change because
$mommy is otherwise utterly unused. But uncommenting that line "fixes"
the problem. This smells deeply of a perl bug to me. I boiled this test case
down from several thousand lines of code, dealing with many changes like this
that inexplicably hid the problem.
I should probably do a similar reduction on markdown and possibly
HTML::Scrubber, but it's getting late. Their versions are listed below.
Here's some analysis of what's going on inside markdown when it fails:
<paravoid> watch this:
<paravoid> print 'text is utf: ', utf8::is_utf8($text) ? 'yes' : 'no', "\n";
<paravoid> $text =~ s{
<paravoid>     (                    # save in $1
<paravoid>         ^                # start of line (with /m)
<paravoid>         <($block_tags_a) # start tag = $2
<paravoid>         \b               # word break
<paravoid>         (.*\n)*?         # any number of lines, minimally matching
<paravoid>         </\2>            # the matching end tag
<paravoid>         [ \t]*           # trailing spaces/tabs
<paravoid>     )
<paravoid> }{
<paravoid>     print '$1 is utf: ', utf8::is_utf8($1) ? 'yes' : 'no', "\n";
<paravoid>     my $key = md5_hex($1);
<paravoid>     $g_html_blocks{$key} = $1;
<paravoid>     "\n\n" . $key . "\n\n";
<paravoid> }egmx;
<paravoid> I added the two 'prints'
<paravoid> text is utf: no
<paravoid> $1 is utf: yes
<paravoid> that's freaking weird
<paravoid> the utf8 flag gets enabled after the regexp is run
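For anyone who just needs to get past the failure in the meantime, here is a minimal workaround sketch. It is not a fix for the underlying bug and not something markdown does itself. It assumes the error at line 360 comes from the md5_hex($1) call shown above: Digest::MD5 dies with "Wide character in subroutine entry" when handed a character above 255, so encoding the string to UTF-8 bytes first avoids the failure. The helper name safe_md5_hex is made up for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

# Hypothetical helper (not part of markdown): convert the string to
# UTF-8 bytes before hashing, so md5_hex never sees a wide character.
sub safe_md5_hex {
    my ($string) = @_;
    return md5_hex(encode_utf8($string));
}

# Sample input containing U+263A, a character above 255.
my $wide = "caf\x{e9} \x{263a}";

# Plain md5_hex croaks on this input; trap the error to show it.
my $ok = eval { md5_hex($wide); 1 };
print "plain md5_hex: ", ($ok ? "ok\n" : "died: $@");

# The wrapped version hashes the UTF-8 encoding of the same string.
print "safe_md5_hex:  ", safe_md5_hex($wide), "\n";
```

Note that for strings containing Latin-1 characters in the 128-255 range this hashes different bytes than a bare md5_hex would, so the keys change for such input; pure ASCII input hashes the same either way. Since markdown only uses the digest as an opaque placeholder key, that should not matter here, but I have not verified this against the full markdown source.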
Also note that paravoid had a version (much larger; a small modification to
ikiwiki) that reproduced the bug w/o HTML::Scrubber being loaded. As far as
I can guess, the HTML::Scrubber stuff doesn't really have any bearing on the bug
and is just one more mysterious thing that hides the bug if it's removed.
-- System Information:
Debian Release: testing/unstable
APT prefers unstable
APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.17-1-686
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Versions of packages perl depends on:
ii libc6 2.3.6-15 GNU C Library: Shared libraries
ii libdb4.4 4.4.20-6 Berkeley v4.4 Database Libraries [
ii libgdbm3 1.8.3-3 GNU dbm database routines (runtime
ii perl-base 5.8.8-6 The Pathologically Eclectic Rubbis
ii perl-modules 5.8.8-6 Core Perl modules
Versions of packages perl recommends:
ii perl-doc 5.8.8-6 Perl documentation
Other software:
ii markdown 1.0.1-3 Text-to-HTML conversion tool
ii libhtml-scrubb 0.08-2 Perl extension for scrubbing/sanitizing html
paravoid reproduced it using a similar test case on a system running sarge with:
<paravoid> ii perl 5.8.4-8sarge4 Larry Wall's Practical Extraction and Report
<paravoid> ii markdown 1.0.1-2 Text-to-HTML conversion tool
<paravoid> ii libhtml-scrubb 0.08-1 Perl extension for scrubbing/sanitizing html
--
see shy jo

