Package: po4a
Version: 0.36.1-1
Severity: normal
Tags: patch
Hi,
As the subject says, attached is a patch for improved Markdown support in
the text module.
If needed I can distill my evolutionary notes. You can also see for
yourself with the following commands:
git clone git://source.jones.dk/ikiwiki
cd ikiwiki
git log d0c079.. -- perl/Locale/Po4a/Text.pm
git log -p d0c079.. -- perl/Locale/Po4a/Text.pm
(the last command shows the progressive patches, in case you don't know Git)
Please apply this for more reliable l10n handling in the upcoming po plugin
for ikiwiki.
Kind regards,
- Jonas
-- System Information:
Debian Release: squeeze/sid
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.30-rc5-amd64 (SMP w/2 CPU cores)
Locale: LANG=da_DK.UTF-8, LC_CTYPE=da_DK.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages po4a depends on:
ii gettext 0.17-6 GNU Internationalization utilities
ii libsgmls-perl 1.03ii-32 Perl modules for processing SGML p
ii perl 5.10.0-22 Larry Wall's Practical Extraction
ii perl-modules 5.10.0-22 Core Perl modules
ii sp 1.3.4-1.2.1-47 James Clark's SGML parsing tools
Versions of packages po4a recommends:
ii liblocale-gettext-perl 1.05-4 Using libc functions for internati
ii libterm-readkey-perl 2.30-4 A perl module for simple terminal
ii libtext-wrapi18n-perl 0.06-6 internationalized substitute of Te
po4a suggests no packages.
-- no debconf information
--- /home/jonas/src/tmp/IKIWIKI/po4a-0.36.1/lib/Locale/Po4a/Text.pm 2009-04-05 14:10:21.000000000 +0200
+++ Text.pm 2009-05-25 22:40:39.000000000 +0200
@@ -143,6 +143,7 @@
my $paragraph="";
my $wrapped_mode = 1;
my $expect_header = 1;
+ my $end_of_paragraph = 0;
($line,$ref)=$self->shiftline();
my $file = $ref;
$file =~ s/:[0-9]+$//;
@@ -152,6 +153,8 @@
$file = $1;
do_paragraph($self,$paragraph,$wrapped_mode);
$paragraph="";
+ $wrapped_mode = 1;
+ $expect_header = 1;
}
chomp($line);
@@ -464,37 +467,53 @@
$self->{indent} = $indent;
$self->{bullet} = "";
}
- } elsif ( $line =~ /^=*$/
- or $line =~ /^_*$/
- or $line =~ /^-*$/) {
+ } elsif ($line =~ /^-- $/) {
+ # Break paragraphs on email signature hint
+ do_paragraph($self,$paragraph,$wrapped_mode);
+ $paragraph="";
+ $wrapped_mode = 1;
+ $self->pushline($line."\n");
+ } elsif ( $line =~ /^=+$/
+ or $line =~ /^_+$/
+ or $line =~ /^-+$/) {
$wrapped_mode = 0;
$paragraph .= $line."\n";
do_paragraph($self,$paragraph,$wrapped_mode);
$paragraph="";
$wrapped_mode = 1;
} elsif ($markdown and
+ ( $line =~ /^\s*\[\[\!\S+\s*$/ # macro begin
+ or $line =~ /^\s*"""\s*\]\]\s*$/)) { # """ textblock inside macro end
+ # Avoid translating Markdown lines containing only markup
+ do_paragraph($self,$paragraph,$wrapped_mode);
+ $paragraph="";
+ $wrapped_mode = 1;
+ $self->pushline("$line\n");
+ } elsif ($markdown and
( $line =~ /^#/ # headline
or $line =~ /^\s*\[\[\!\S[^\]]*\]\]\s*$/)) { # sole macro
- # Found Markdown markup that should be preserved as a single line
+ # Preserve some Markdown markup as a single line
do_paragraph($self,$paragraph,$wrapped_mode);
$paragraph="$line\n";
$wrapped_mode = 0;
+ $end_of_paragraph = 1;
+ } elsif ($markdown and
+ ( $line =~ /^"""/)) { # """ textblock inside macro end
+ # Markdown markup needing separation _before_ this line
do_paragraph($self,$paragraph,$wrapped_mode);
+ $paragraph="$line\n";
$wrapped_mode = 1;
- $paragraph="";
- } elsif ($markdown and
- ( $paragraph =~ m/^>/ # blockquote
- or $paragraph =~ m/[<>]/ # maybe html
- or $paragraph =~ m/^"""/ # textblock inside macro end
- or $paragraph =~ m/"""$/)) { # textblock inside macro begin
- # Found Markdown markup that might not survive wrapping
- $wrapped_mode = 0;
- $paragraph .= $line."\n";
} else {
if ($line =~ /^\s/) {
# A line starting by a space indicates a non-wrap
# paragraph
$wrapped_mode = 0;
+ }
+ if ($markdown and
+ ( $line =~ /\S $/ # explicit newline
+ or $line =~ /"""$/)) { # """ textblock inside macro begin
+ # Markdown markup needing separation _after_ this line
+ $end_of_paragraph = 1;
} else {
undef $self->{bullet};
undef $self->{indent};
@@ -510,7 +529,24 @@
# (more than 3)
# are considered as verbatim paragraphs
$wrapped_mode = 0 if ( $paragraph =~ m/^(\*|[0-9]+[.)] )/s
- or $paragraph =~ m/[ \t][ \t][ \t]/s);
+ or $paragraph =~ m/[ \t][ \t][ \t]/s);
+ if ($markdown) {
+ # Some Markdown markup can (or might) not survive wrapping
+ $wrapped_mode = 0 if (
+ $paragraph =~ /^>/ms # blockquote
+ or $paragraph =~ /^( {8}|\t)/ms # monospaced
+ or $paragraph =~ /^\$(\S+[{}]\S*\s*)+/ms # Xapian macro
+ or $paragraph =~ /<(?![a-z]+[:@])/ms # maybe html (tags but not wiki <URI>)
+ or $paragraph =~ /^[^<]+>/ms # maybe html (tag with vertical space)
+ or $paragraph =~ /\[\[\!\S[^\]]+$/ms # macro begin
+ );
+ }
+ if ($end_of_paragraph) {
+ do_paragraph($self,$paragraph,$wrapped_mode);
+ $paragraph="";
+ $wrapped_mode = 1;
+ $end_of_paragraph = 0;
+ }
($line,$ref)=$self->shiftline();
}
if (length $paragraph) {
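
For anyone wanting to try the new no-wrap heuristics outside of po4a, here is a minimal standalone sketch. It copies the six Markdown patterns from the last hunk into a helper; the name markdown_needs_nowrap and the sample strings are made up for illustration and are not part of po4a or of the patch.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of po4a): returns 1 when a paragraph matches
# one of the patterns the patch uses to switch off wrapping in Markdown mode.
sub markdown_needs_nowrap {
    my ($paragraph) = @_;
    return 1 if $paragraph =~ /^>/ms;                   # blockquote
    return 1 if $paragraph =~ /^( {8}|\t)/ms;           # monospaced
    return 1 if $paragraph =~ /^\$(\S+[{}]\S*\s*)+/ms;  # Xapian macro
    return 1 if $paragraph =~ /<(?![a-z]+[:@])/ms;      # maybe html (tags but not wiki <URI>)
    return 1 if $paragraph =~ /^[^<]+>/ms;              # maybe html (tag with vertical space)
    return 1 if $paragraph =~ /\[\[\!\S[^\]]+$/ms;      # macro begin
    return 0;
}

# Illustrative inputs only; they are not taken from the ikiwiki sources.
my @samples = (
    "> quoted text\n",
    "plain text\n",
    "see <http://example.org/>\n",
    "<b>bold</b>\n",
);

for my $sample (@samples) {
    (my $shown = $sample) =~ s/\n/\\n/g;
    printf "%-32s %s\n", $shown,
        markdown_needs_nowrap($sample) ? "keep verbatim" : "may be wrapped";
}
```

Note that the helper only mirrors the regular expressions; the surrounding state handling ($wrapped_mode, $end_of_paragraph, do_paragraph) stays in Text.pm as shown in the patch.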