markus schnalke <[email protected]> writes:
>> /usr/share/man/man1/rcsintro.1.gz:.TH RCSINTRO 1 \*(Dt GNU
>> /usr/share/man/man1/saidar.1.gz:.TH saidar 1 $Date:\ 2006/11/30\
>> 23:42:42\ $ i\-scream
>>
> The last line is such a case.
Handled in the patch.
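As a rough illustration of the cleanup idea (a standalone sketch, not the patch code; the helper name normalize_th is made up here and the regexes are written slightly differently from the ones in the diff):

```perl
use strict;
use warnings;

# Hypothetical helper, not part of roffit: clean up the argument string
# of a .TH line before the individual fields are extracted.
sub normalize_th {
    my ($rest) = @_;
    $rest =~ s/\\\*\(Dt//g;                  # drop the \*(Dt string reference
    $rest =~ s/\\//g;                        # drop remaining backslashes
    $rest =~ s/\$Date:\s*(.*?)\s*\$/$1/g;    # unwrap an RCS "$Date: ... $" keyword
    return $rest;
}

my $saidar = normalize_th('saidar 1 $Date:\ 2006/11/30\ 23:42:42\ $ i\-scream');
print "$saidar\n";    # prints: saidar 1 2006/11/30 23:42:42 i-scream
```

After this step the date is still unquoted, so a later match has to accept a date made of digits, spaces, colons, slashes and dashes.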
> If you parse it char for char, then you can parse it
I meant that you can't read fields from space-delimited text when the
fields mean different things. Each field needs a quote to mark BEGIN
and a quote to mark END for:
NAME SECTION DATE VERSION MANUAL
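To make the ambiguity concrete, here is a minimal sketch (the helper split_th_args is hypothetical and only for illustration, it is not in roffit) that splits the arguments on whitespace while keeping double-quoted fields together:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of roffit: split .TH arguments on
# whitespace, but keep a "double-quoted" field together as one argument.
sub split_th_args {
    my ($rest) = @_;
    my @args;
    while ( $rest =~ /\G\s*(?:"([^"]*)"|(\S+))/gc ) {
        push @args, defined $1 ? $1 : $2;
    }
    return @args;
}

my @quoted   = split_th_args('curl 1 "22 Oct 2003" "Curl 7.10.8" "Curl Manual"');
my @unquoted = split_th_args('curl 1 22 Oct 2003 Curl 7.10.8 Curl Manual');

print scalar(@quoted),   " fields: ", join(' | ', @quoted),   "\n";
print scalar(@unquoted), " fields: ", join(' | ', @unquoted), "\n";
```

With quotes the line yields the 5 expected fields. Without them it yields 9 words, and nothing says where DATE ends and VERSION begins.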
> The most important thing is detecting the first two parameters
> ... First detect the first two arguments, which will succeed almost
> always.
Added final ELSIF case. Daniel, use this.
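The idea behind the last fallback, sketched standalone (th_name_section is a made-up name and the regex is an assumption of mine, not the exact one in the patch): take only NAME and SECTION and ignore whatever follows.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of roffit: extract only NAME and SECTION
# from the .TH arguments and ignore the rest of the line.
sub th_name_section {
    my ($rest) = @_;
    # NAME is a run of non-blanks; SECTION is a digit with an optional
    # suffix (such as 1 or 3C), optionally wrapped in double quotes.
    if ( $rest =~ /^\s*(\S+)\s+"?(\d\w*)"?(?:\s|$)/ ) {
        return ( $1, $2 );
    }
    return;    # empty list: not even NAME and SECTION could be found
}

my ( $name, $section ) = th_name_section('saidar 1 2006/11/30 23:42:42 i-scream');
print "$name($section)\n";    # prints: saidar(1)
```

A caller can try the stricter patterns first and use this only when they all fail.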
Jari
From 5675160c2b879b9d4b9b29e16224a8090ce32b0a Mon Sep 17 00:00:00 2001
From: Jari Aalto <[email protected]>
Date: Fri, 4 Jun 2010 10:12:23 +0300
Subject: [PATCH] roffit: improve TH handling
Organization: Private
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Signed-off-by: Jari Aalto <[email protected]>
---
roffit | 52 +++++++++++++++++++++++++++++++++++++++++++++-------
1 files changed, 45 insertions(+), 7 deletions(-)
diff --git a/roffit b/roffit
index 3149f37..ae55406 100755
--- a/roffit
+++ b/roffit
@@ -203,23 +203,61 @@ sub parsefile {
$out = "";
# cut off initial spaces
- $rest =~ s/^ +//g;
+ $rest =~ s/^\s+//;
- if($keyword eq "\\\"") {
+ if ( $keyword eq q(\\") ) {
# this is a comment, skip this line
}
- elsif($keyword =~ /^TH$/) {
+ elsif ( $keyword eq "TH" ) {
# man page header:
# curl 1 "22 Oct 2003" "Curl 7.10.8" "Curl Manual"
+
+ # Treat pages that have "*(Dt":
+ # .TH IDENT 1 \*(Dt GNU
+
+ $rest =~ s,\Q\\*(Dt,,g;
+
+ # Delete backslashes
+
+ $rest =~ s,\\,,g;
+
+ # Delete old RCS tags
+ # .TH saidar 1 $Date:\ 2006/11/30\ 23:42:42\ $ i\-scream
+
+ $rest =~ s,\$Date:\s+(.*?)\s+\$,$1,g;
+
# NAME SECTION DATE VERSION MANUAL
- if($rest =~ /([^ ]*) (\d+) \"([^\"]*)\" \"([^\"]*)\"(\"([^\"]*)\")?/) {
+ # section can be: 1 or 3C
+
+ if ( $rest =~ /(\S+)\s+\"?(\d\S?+)\"?\s+\"([^\"]*)\" \"([^\"]*)\"(\"([^\"]*)\")?/ ) {
# strict matching only so far
- $manpage{'name'} = $1;
+ $manpage{'name'} = $1;
$manpage{'section'} = $2;
- $manpage{'date'} = $3;
+ $manpage{'date'} = $3;
$manpage{'version'} = $4;
- $manpage{'manual'} = $6;
+ $manpage{'manual'} = $6;
}
+ # .TH html2text 1 2008-09-20 HH:MM:SS
+ elsif ( $rest =~ m, (\S+) \s+ \"?(\d\S?+)\"? \s+ \"?([ \d:/-]+)\"? \s* (.*) ,x )
+ {
+ $manpage{'name'} = $1;
+ $manpage{'section'} = $2;
+ $manpage{'date'} = $3;
+ $manpage{'manual'} = $4;
+ }
+ # .TH program 1 description
+ elsif ( $rest =~ /(\S+) \s+ \"?(\d\S?+)\"? \s+ (.+)/x )
+ {
+ $manpage{'name'} = $1;
+ $manpage{'section'} = $2;
+ $manpage{'manual'} = $3;
+ }
+ # .TH program 1
+ elsif ( $rest =~ /(\S+) \s+ \"?(\d\S?+)\"? /x )
+ {
+ $manpage{'name'} = $1;
+ $manpage{'section'} = $2;
+ }
}
elsif($keyword =~ /^S[HS]$/) {
# SS is treated the same as SH
--
1.7.1