At 01:53 PM 2003-09-16, Russ Allbery wrote:
> Once I'm thru messing with it in the next few days, I can cook up an
> archive of the code, and the test corpus, and all the differences
> (hopefully insignificant) between the output it makes and the output
> Pod::Man makes, given the same input.
That would be great.

OK, your mission, should you choose to accept it, is to peruse this archive: http://www.interglacial.com/temp/simplifex.tar.bz2 The archive is about 4MB. In it you'll find a bunch of stuff:

./lib/
The perl libraries needed for this, beyond the most recent Pod-Simple in CPAN.
(The Pod/Man.pm in there is current from CPAN, plus a line I added to make it strip \n's from paragraphs, so things would diff nicer.)


./corpus_skinny
Small files demonstrating various Pod constructs, typically problematic ones. When I ran into a diff in the corpus_fat directory, I tried to reproduce the problem in as small a file as possible, and then put it in here. Unless I missed something, every behavioral difference between Pod::Man and Pod::Simple::Man that is evidenced in corpus_fat is reflected in here in a more concise way.


./corpus_fat
A whole bunch of large Pod documents (mostly just copied out of the Perl 5.8.0 install I had sitting around), and the results of diffing their Pod::Man and Pod::Simple::Man renderings.


./musings
A directory of files that show potentially interesting differences between Pod::Man and Pod::Simple::Man, but whose diff output didn't seem important to me. Basically all just odd corner cases.


./roffgress
A program which regression-tests Pod::Simple::Man by taking an input file (or all ./*corpus*/*.txt files if none are specified), and diffing the output of Pod::Man on it and Pod::Simple::Man on it.


./roffgressall
Runs roffgress on all the corpus files and saves the output to alldiff.log.

./alldiff.log
A summary (but omitting the data*.txt corpus items) of which files show no regression differences and which show some.




In the two ./corpus_* directories, here's what happened:

My roffgress program reads each input file and writes a copy to inputfilename.tmp, which is just like the original except that no two verbatim sections are separated by only a blank line, since that construct was the cause of many distracting and uninteresting differences. So verbatim-blank-verbatim gets turned into verbatim-blank-"Then"-blank-verbatim in the .tmp file. Pod::Man's rendering of that file is then saved to inputfilename.1.man, and Pod::Simple::Man's rendering of it is saved to inputfilename.2.man. The two are diffed to inputfilename.diff. If there are no differences (ignoring comment lines in the *.man files), then all these files (except the original) are deleted.

But if there are differences, then the roffgress program leaves around all these files. So basically, skim the foo.diff files, and skim the foo.txt for each, and that's that.
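
For anyone who wants the gist of those two steps without opening roffgress, here is a rough Python sketch of them. It is not the roffgress code itself (that is Perl, and it's in the archive). The function names are made up for illustration. I'm also assuming that a "verbatim section" is simply any paragraph whose first line starts with a space or tab, and that a "comment line" in a .man file is one starting with .\" or '\" -- both are simplifications.

```python
import difflib
import re

FILLER = "Then"  # the filler paragraph named in the message


def separate_verbatim_runs(pod_text, filler=FILLER):
    """Insert a filler paragraph between adjacent verbatim paragraphs."""
    # Paragraphs are separated by one or more blank (or whitespace-only) lines.
    paragraphs = re.split(r"\n(?:[ \t]*\n)+", pod_text.strip("\n"))
    out = []
    prev_verbatim = False
    for para in paragraphs:
        # Assumption: a paragraph is verbatim if it starts with a space or tab.
        is_verbatim = para[:1] in (" ", "\t")
        if is_verbatim and prev_verbatim:
            out.append(filler)
        out.append(para)
        prev_verbatim = is_verbatim
    return "\n\n".join(out) + "\n"


def is_roff_comment(line):
    """True for a line that is entirely a roff comment (assumed forms only)."""
    return line.startswith('.\\"') or line.startswith("'\\\"")


def diff_ignoring_comments(man1, man2):
    """Unified diff of two roff renderings, skipping comment lines.

    An empty list means "no differences", which is the case where the
    temporary files would be cleaned up.
    """
    a = [ln for ln in man1.splitlines() if not is_roff_comment(ln)]
    b = [ln for ln in man2.splitlines() if not is_roff_comment(ln)]
    return list(difflib.unified_diff(a, b, "1.man", "2.man", lineterm=""))
```

The comment filtering matters because the two formatters presumably stamp their output with different header comments; without it, every file would show a difference.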


Obviously, these are the points of this:


1) To see that, yup, Pod::Man and Pod::Simple::Man behave mostly the same, across a whole lot of input.

2) P'M and P'S'M (that is, Pod::Man and Pod::Simple::Man) do have some differences, but (as you see from skimming the .diff files), they are pretty minor -- unless I missed some interesting ones, which is quite possible, as I'm inexperienced with *roff.

3) Where there are differences, ideally they should be a matter of P'S'M getting something right where P'M doesn't, or of the difference being uninteresting. If/when there's a case of P'S'M getting something wrong where P'M gets it right, that means some Perl code needs fixing!


You'll notice that some of the same differences crop up over and over, of which I recall these:


* mostly just various little quirks with =item formatting.

* Z<> in P'M always produces \&, whereas Z<>'s in Pod::Simple (by default) are deleted at the parsing layer and so aren't even there for turning into anything interesting.

* A few minor differences in fancy magic formatting, like turning " into smart quotes. In almost all cases, there is no such difference; but in some cases that I consider rather anomalous (but largely uninteresting), there is a difference or two.

I'm sure this explanation is hazy, so let me know if anything in the archive puzzles you.

--
Sean M. Burke    http://search.cpan.org/~sburke/


