Hi David,
I see you've dual-lifed Pod-Html to CPAN. The RT queue at
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Pod-Html doesn't work, however,
so I'm contacting you directly. Please let me know if you'd prefer a
ticket at rt.perl.org or something like that.
As reported by Jakub Wilk in <http://bugs.debian.org/378327> (Cc'd as
[email protected]), pod2html doesn't handle multiple angled bracket
delimiters (C<< foo >>) quite correctly.
There's some misunderstanding about the problem in the bug log, so
just for the record: the perlpod documentation currently states:

    A more readable, and perhaps more "plain" way is to use an
    alternate set of delimiters that doesn’t require a single ">"
    to be escaped. With the Pod formatters that are standard starting
    with perl5.5.660, doubled angle brackets ("<<" and ">>") may be
    used if and only if there is whitespace right after the opening
    delimiter and whitespace right before the closing delimiter!

so I<< x >> is supposed to turn into <em>x</em> just like I<x>.
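To make the rule concrete, here is a minimal sketch of how the two delimiter forms are meant to agree. The inner_text() helper is hypothetical and not part of Pod::Html; it only looks at one complete formatting code and ignores nesting.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Pod::Html: return the text inside a
# single formatting code, or nothing if the delimiters are malformed.
sub inner_text {
    my ($code) = @_;

    # Extended form: two or more opening brackets, whitespace, content,
    # whitespace, and the same number of closing brackets.
    if ( $code =~ /\A[A-Z](<{2,})\s+(.*?)\s+(>{2,})\z/s
        and length($1) == length($3) )
    {
        return $2;
    }

    # Classic form: exactly one bracket on each side.
    return $1 if $code =~ /\A[A-Z]<([^<>]*)>\z/;

    return;
}

print inner_text('I<x>'), "\n";               # x
print inner_text('I<< x >>'), "\n";           # x
print inner_text('C<< <<html/>> >>'), "\n";   # <<html/>>
print defined( inner_text('I<<x>>') ) ? "ok" : "malformed", "\n";   # malformed
```

The last call shows the "if and only if there is whitespace" part: without whitespace inside the doubled brackets, the code is not a valid extended form.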
I'm attaching a patch against the CPAN 1.09_4 version that includes some
testcases failing with the current code and proposed fixes. The changes
are pretty straightforward: the _go_ahead() function already has the
needed logic for matching the right number of closing brackets.
I had to modify t/htmlvie?.html and t/torture.html because the expected
data in them was IMO wrong.
The t/03_output.t change is needed because the diagnostics for broken
markup changed a bit. YMMV here of course.
The greedy vs. non-greedy whitespace match change is just so that empty
delimited content works, particularly the Z<< >> test. It's IMO the only
slightly suspicious part in this but at least the other tests don't break :)
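A simplified illustration of why the greediness matters, assuming the closing delimiter pattern itself requires leading whitespace. The patterns below are cut-down stand-ins rather than the ones in Html.pm, and closes_at_once() is a hypothetical helper. In this model the difference only shows when two or more whitespace characters separate the delimiters.

```perl
use strict;
use warnings;

# Cut-down opener patterns (assumption: not the real Html.pm regexes).
my $greedy = qr/\A[BCEFILSXZ]<(?:<+[^\S\n]+)?/;
my $lazy   = qr/\A[BCEFILSXZ]<(?:<+[^\S\n]+?)?/;

# Assumed closer for a doubled delimiter: whitespace, then ">>".
my $close = qr/\A\s+>>/;

# Strip the opener, then report whether the closer follows immediately,
# which is the case for empty delimited content.
sub closes_at_once {
    my ( $opener, $text ) = @_;
    $text =~ s/$opener// or return 0;
    return $text =~ $close ? 1 : 0;
}

my $pod = 'Z<<  >>Zulu';    # two spaces between the delimiters
print "greedy: ", closes_at_once( $greedy, $pod ), "\n";    # greedy: 0
print "lazy:   ", closes_at_once( $lazy,   $pod ), "\n";    # lazy:   1
```

The greedy opener swallows all the whitespace and leaves none for the closer; the non-greedy one takes a single character and leaves the rest.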
The _depod1() function looks like it could have the same problem, but I
wasn't able to come up with a failing testcase, so I haven't touched that.
Please let me know what you think.
 Html.pm              |   23 +++++++++++--------
 MANIFEST             |    2 +
 t/01_core.t          |    4 ++-
 t/03_output.t        |    4 +--
 t/double-angled.html |   59 +++++++++++++++++++++++++++++++++++++++++++++++++++
 t/double-angled.pod  |   34 +++++++++++++++++++++++++++++
 t/htmlviec.html      |    4 +--
 t/htmlviei.html      |    4 +--
 t/htmlview.html      |    4 +--
 t/torture.html       |    2 -
Thanks for taking up Pod-Html maintenance,
--
Niko Tyni [email protected]
diff --git a/Html.pm b/Html.pm
index a3f13a9..7f4ab51 100644
--- a/Html.pm
+++ b/Html.pm
@@ -1343,7 +1343,8 @@ sub _process_text1($$;$$){
if( $func eq 'B' ){
# B<text> - boldface
- $res = '<strong>' . _process_text1( $lev, $rstr ) . '</strong>';
+ my $par = _go_ahead( $rstr, 'B', $closing );
+ $res = '<strong>' . _process_text( \$par ) . '</strong>';
} elsif( $func eq 'C' ){
# C<code> - can be a ref or <code></code>
@@ -1360,18 +1361,19 @@ sub _process_text1($$;$$){
} elsif( $func eq 'E' ){
# E<x> - convert to character
- $$rstr =~ s/^([^>]*)>//;
- my $escape = $1;
+ my $escape = _go_ahead( $rstr, 'E', $closing );
$escape =~ s/^(\d+|X[\dA-F]+)$/#$1/i;
$res = "&$escape;";
} elsif( $func eq 'F' ){
# F<filename> - italicize
- $res = '<em class="file">' . _process_text1( $lev, $rstr ) . '</em>';
+ my $par = _go_ahead( $rstr, 'F', $closing );
+ $res = '<em class="file">' . _process_text( \$par ) . '</em>';
} elsif( $func eq 'I' ){
# I<text> - italicize
- $res = '<em>' . _process_text1( $lev, $rstr ) . '</em>';
+ my $par = _go_ahead( $rstr, 'I', $closing );
+ $res = '<em>' . _process_text( \$par ) . '</em>';
} elsif( $func eq 'L' ){
# L<link> - link
@@ -1494,21 +1496,22 @@ sub _process_text1($$;$$){
} elsif( $func eq 'S' ){
# S<text> - non-breaking spaces
- $res = _process_text1( $lev, $rstr );
+ my $par = _go_ahead( $rstr, 'S', $closing );
+ $res = _process_text( \$par );
$res =~ s/ /&nbsp;/g;
} elsif( $func eq 'X' ){
# X<> - ignore
- warn "$0: $Podfile: invalid X<> in paragraph $Paragraph.\n"
- unless $$rstr =~ s/^[^>]*>// or $Quiet;
+ my $par = _go_ahead( $rstr, 'X', $closing );
} elsif( $func eq 'Z' ){
# Z<> - empty
+ my $par = _go_ahead( $rstr, 'Z', $closing );
warn "$0: $Podfile: invalid Z<> in paragraph $Paragraph.\n"
- unless $$rstr =~ s/^>// or $Quiet;
+ unless $par eq '' or $Quiet;
} else {
my $term = _pattern $closing;
- while( $$rstr =~ s/\A(.*?)(([BCEFILSXZ])<(<+[^\S\n]+)?|$term)//s ){
+ while( $$rstr =~ s/\A(.*?)(([BCEFILSXZ])<(<+[^\S\n]+?)?|$term)//s ){
# all others: either recurse into new function or
# terminate at closing angle bracket(s)
my $pt = $1;
diff --git a/MANIFEST b/MANIFEST
index c121b61..9702a2d 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -9,6 +9,8 @@ t/02_exception.t
t/03_output.t
t/die-end.pod
t/die-unbal.pod
+t/double-angled.html
+t/double-angled.pod
t/f-basic.html
t/f-head.html
t/f-html.html
diff --git a/t/01_core.t b/t/01_core.t
index a27f065..9fc54e5 100644
--- a/t/01_core.t
+++ b/t/01_core.t
@@ -1,7 +1,7 @@
#!/usr/bin/perl -w
use strict;
-use Test::More tests => 26;
+use Test::More tests => 27;
use Cwd;
use Pod::Html;
@@ -113,6 +113,8 @@ convert_ok("htmlview.pod", "htmlviei.html", "html view noindex", [qw[--noindex]]
convert_ok("htmlview.pod", "htmlviec.html", "html view noindex title",
[qw[--css=/nullcss.css --title=PodPageTitle]]);
+convert_ok("double-angled.pod", "double-angled.html", "handle << >> delimiters");
+
TODO: {
local $TODO = 'blank lines mangled in explicit HTML blocks';
convert_ok("rt-9385.pod", "rt-9385.html", "RT #9385");
diff --git a/t/03_output.t b/t/03_output.t
index 828194b..3f6c716 100644
--- a/t/03_output.t
+++ b/t/03_output.t
@@ -31,7 +31,7 @@ stderr_is(
stderr_is(
sub {pod2html('--infile=t/unclosed.pod', '--outfile=t/unclosed.out')},
<<EOM, 'unclosed'
-t/03_output.t: t/unclosed.pod: undelimited <> in paragraph 3: 'bold'.
+t/03_output.t: t/unclosed.pod: undelimited B<> in paragraph 3 (_go_ahead): 'bold'.
t/03_output.t: t/unclosed.pod: undelimited C<> in paragraph 4 (_go_ahead): 'code'.
EOM
);
@@ -139,7 +139,7 @@ my $error_message = <<EOM;
t/03_output.t: t/torture.pod: invalid Z<> in paragraph $para[0].
t/03_output.t: t/torture.pod: unknown pod directive 'bogus' in paragraph $para[1]. ignoring.
t/03_output.t: t/torture.pod: unexpected =back directive in paragraph $para[2]. ignoring.
-t/03_output.t: t/torture.pod: invalid X<> in paragraph $para[3].
+t/03_output.t: t/torture.pod: undelimited X<> in paragraph $para[3] (_go_ahead): ''.
t/03_output.t: t/torture.pod: unknown pod directive 'comment' in paragraph $para[4]. ignoring.
t/03_output.t: t/torture.pod: unexpected =item directive in paragraph $para[5]. ignoring.
t/03_output.t: t/torture.pod: cannot resolve L<impossible> in paragraph $para[6].
diff --git a/t/double-angled.html b/t/double-angled.html
new file mode 100644
index 0000000..a0dfcf3
--- /dev/null
+++ b/t/double-angled.html
@@ -0,0 +1,59 @@
+<?xml version="1.0" ?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+<title>Double angled brackets</title>
+<meta http-equiv="content-type" content="text/html; charset=utf-8" />
+<link rev="made" href="mailto:r...@localhost" />
+</head>
+
+<body style="background-color: white">
+
+
+<!-- INDEX BEGIN -->
+<div name="index">
+<p><a name="__index__"></a></p>
+
+<ul>
+
+ <li><a href="#double_angled_brackets">Double angled brackets</a></li>
+ <ul>
+
+ <li><a href="#lima">Lima</a></li>
+ </ul>
+
+ <li><a href="#more_angled_brackets">More angled brackets</a></li>
+</ul>
+
+<hr name="index" />
+</div>
+<!-- INDEX END -->
+
+<p>
+</p>
+<h1><a name="double_angled_brackets">Double angled brackets</a></h1>
+<p><strong>Bravo</strong></p>
+<p><em class="file">Foxtrot</em></p>
+<p><em>India</em></p>
+<p><a href="#lima">Lima</a></p>
+<p>Sierra&nbsp;Nevada</p>
+<p></p>
+<p>Zulu</p>
+<p>&verbar;</p>
+<p>&gt;</p>
+<p>&lt;</p>
+<p><code>&lt;&lt;html/&gt;&gt;</code></p>
+<p>
+</p>
+<h2><a name="lima">Lima</a></h2>
+<p>Test section.</p>
+<p>
+</p>
+<hr />
+<h1><a name="more_angled_brackets">More angled brackets</a></h1>
+<p><strong>Bravo thrice</strong></p>
+<p><em class="file">Foxtrot five times</em></p>
+
+</body>
+
+</html>
diff --git a/t/double-angled.pod b/t/double-angled.pod
new file mode 100644
index 0000000..055ef86
--- /dev/null
+++ b/t/double-angled.pod
@@ -0,0 +1,34 @@
+=head1 Double angled brackets
+
+B<< Bravo >>
+
+F<< Foxtrot >>
+
+I<< India >>
+
+L<< /Lima >>
+
+S<< Sierra Nevada >>
+
+X<< X-ray >>
+
+Z<< >>Zulu
+
+E<< verbar >>
+
+E<< gt >>
+
+E<< lt >>
+
+C<< <<html/>> >>
+
+=head2 Lima
+
+Test section.
+
+=head1 More angled brackets
+
+B<<< Bravo thrice >>>
+
+F<<<<< Foxtrot five times >>>>>
+
diff --git a/t/htmlviec.html b/t/htmlviec.html
index d975bee..18fae37 100644
--- a/t/htmlviec.html
+++ b/t/htmlviec.html
@@ -60,8 +60,8 @@
like an <html> tag. This is some <code>$code($arg1)</code>.</p>
<p>This <code>text contains embedded bold and italic tags</code>. These can
be nested, allowing <strong>bold and <em>bold & italic</em> text</strong>. The module also
-supports the extended <strong>syntax </strong>> and permits <em>nested tags &
-other <strong>cool </strong></em>> stuff >></p>
+supports the extended <strong>syntax</strong> and permits <em>nested tags &
+other <strong>cool</strong> stuff</em></p>
<p>
</p>
<hr />
diff --git a/t/htmlviei.html b/t/htmlviei.html
index 510f42c..7a13a64 100644
--- a/t/htmlviei.html
+++ b/t/htmlviei.html
@@ -62,8 +62,8 @@
like an <html> tag. This is some <code>$code($arg1)</code>.</p>
<p>This <code>text contains embedded bold and italic tags</code>. These can
be nested, allowing <strong>bold and <em>bold & italic</em> text</strong>. The module also
-supports the extended <strong>syntax </strong>> and permits <em>nested tags &
-other <strong>cool </strong></em>> stuff >></p>
+supports the extended <strong>syntax</strong> and permits <em>nested tags &
+other <strong>cool</strong> stuff</em></p>
<p>
</p>
<hr />
diff --git a/t/htmlview.html b/t/htmlview.html
index e22fb67..ead6d7e 100644
--- a/t/htmlview.html
+++ b/t/htmlview.html
@@ -59,8 +59,8 @@
like an <html> tag. This is some <code>$code($arg1)</code>.</p>
<p>This <code>text contains embedded bold and italic tags</code>. These can
be nested, allowing <strong>bold and <em>bold & italic</em> text</strong>. The module also
-supports the extended <strong>syntax </strong>> and permits <em>nested tags &
-other <strong>cool </strong></em>> stuff >></p>
+supports the extended <strong>syntax</strong> and permits <em>nested tags &
+other <strong>cool</strong> stuff</em></p>
<p>
</p>
<hr />
diff --git a/t/torture.html b/t/torture.html
index d39daac..6d18a53 100644
--- a/t/torture.html
+++ b/t/torture.html
@@ -48,7 +48,7 @@ One line with tabs.
A second line with no tabs.</p>
<p><table cellspacing="0" cellpadding="0"><tr><td>A line with no tabs.
<tr><td>And<td>a<td>second<td>line<td>with<td>tabs.</table></p>
-<p>Obsolete para and a another> one.<code></code><code>
+<p>Obsolete para and a one.<code></code><code>
</code></p>
<p>
</p>