Sean M Burke <[EMAIL PROTECTED]> writes:
> OK, your mission, should you choose to accept it, is to peruse this archive:
> http://www.interglacial.com/temp/simplifex.tar.bz2
> The archive is about 4MB.
I've looked this over and the output looks pretty solid to me. There are
just a few things that seem problematic in the output, and they're mostly
fairly minor.
The code is another matter and will take me a while to understand. More
comments and explanation would be really appreciated in the areas
surrounding treelets and the like (and I'll have to do a lot of
documentation reading). But it doesn't look too bad, and I expect that
after a day or so of going through it and working on it, I'll figure out
how it works. And I do like how you broke up the initialization code;
that's a definite improvement (as is making magic optional; that's
something I've been thinking about for a while).
I know that the existing code was rather poorly documented as well. As
part of this, I'll likely write up some sort of extensive document on how
the code is structured and what decisions nare made.
Hope you won't mind me reformatting if I'm going to maintain it; I don't
mind working in other people's coding style when I'm contributing, but if
I'm maintaining the code, I'm used to reading it in a particular
structure.
> ./corpus_skinny
> Small files demonstrating some various pod constructs, typically
> problematic ones. When I ran into a diff in the corpus_fat directory, I
> tried to reproduce the problem in as small a file as possible, and then
> put it in here. Unless I missed something, every behavioral difference
> between Pod::Man and Pod::Simple::Man that is evidenced in corpus_fat is
> reflected in here in a more concise way.
This is really excellent; thank you for doing this work! This makes it
far easier for me to see what's going on.
Here are my comments on the individual files here:
agrave Good, thank you.
bulletty Good, thank you.
cplusplus Unimportant difference.
guessing_around_z Good, thank you.
item30 Improvement to variable detection. Missing periods.
item31magic Something weird going on here in both versions.
item31spacing Bug in the new version; spacing should be preserved.
item40 New feature. Indexing is odd.
items10 More odd indexing.
links Improvement to quoting. Good.
magic_around_s New version is buggy here, but not too important.
magic_money Definite improvement here.
ppnameweird Old version is correct, but difference unimportant.
quote_weirdness Good, thank you.
shy Good, thank you.
smallcapsey More odd indexing.
snippet More odd indexing.
whitespace_around_x Unimportant difference.
z_oddity Unimportant difference.
The two things that jump out to me as actual problems are that the new
version does not preserve the periods in cases like:
=item 1.
I think this is a noticable regression. The output looks much better with
the period.
The other is that it looks like the new version is smashing spacing, and
in particular the two spaces after periods and question marks, in headings
and items. I think it should be up to the author whether to use one space
or two spaces and the choice of the author should be respected, since it
is reflected in the output.
The other changes are unimportant or are real bug fixes, with the possible
exception that the new version seems very .IX-happy with the first line of
blocks marked with =item *. I would prefer that it not do this; most of
that is not really index-worthy, and it bloats the size of the output.
But few (probably none) use the .IX feature, so it likely doesn't matter.
I was wonder why, with things like:
=item 1. Some text
the new parser doesn't pull the text down into the body of the item and
still leaves it in the tag. This fix was made for bulletted lists, but
not for numbered lists, and I think it would make more sense to do it for
both types of list.
Note that the current version turns off guesswork between =head1 NAME and
the next =head1; it's intentional that it's not deactivated by a =head2.
But since a man page shouldn't have a =head2 in the NAME section anyway,
the difference is unimportant.
> ./musings
> A directory of files that show potentially interesting differences between
> Pod::Man and Pod::Simple::Man, but whose diff output didn't seem important
> to me. Basically all just odd corner cases.
The bullet transformation and the space smashing mentioned above account
for a lot of these differences. Other than the yen thing, which I think I
want to look at a bit more, I agree that these are fine.
The internationalization issues that this raises are ones that I think
I'll want to deal with for a Pod::Simple-based version. Now that you have
a framework for handling the hard problems there, I think the right thing
to do is for the output of pod2man to assume a particular character set
(probably ISO 8859-1) and optionally produce pure-ASCII or other character
sets. I think that I can do that given the Pod::Simple framework, if I'm
following recent list discussion correctly.
The current way of handling non-ASCII characters produces really bad
results when reading on a text console, so dealing with that before
replacing Pod::Man is probably good given the number of documents you
turned up that were using literal non-ASCII characters.
> 2) P'S and P'S'M do have some differences, but (as you see from skimming
> the .diff files), they are pretty minor -- unless I missed some
> interesting ones, which is quite possible, as I'm inexperienced with *roff.
Nope, I think you did a really great job.
Thank you very much! This was an obvious lot of work on your part and I
really appreciate it. I'm happy to take this code if you'd like me to
maintain it, study and understand it, and release it as the next major
version of Pod::Man and maintain it going forward.
Please don't feel like you have to fix the problems noted above, unless
you feel like they're things that can't be fixed in Pod::Man itself; I
just wanted to note them as things that I'd want to look at.
> * Z<> in P'M always produces \&, whereas Z<>'s in Pod::Simple (by default)
> are deleted at the parsing layer and so aren't even there for turning
> into anything interesting.
Since using:
=item *Z<>
to suppress turning something into a bulletted list still works, this is
fine with me; I just did it that way because it was easy and caused the
right things to happen.
--
Russ Allbery ([EMAIL PROTECTED]) <http://www.eyrie.org/~eagle/>