In the course of assembling DOSSIER (www.ptf.com/dossier) volumes, I
find myself hand-editing an "executable table of contents" (i.e., a
shell script) to drive the production of each volume.  Frequently,
the entries are Unix "man" pages (e.g., "intro", "a.out", "renice").
I try to keep these in order but, all too often, I make mistakes.
So, I wrote some code to detect sequence errors.  Here are the rules:

   Within each section of the man pages, the entries are listed in
   (approximately) ASCII sort order.  The exceptions are as follows:

   * Each section begins with the "intro" man page.

   * Sorting of letters is case-insensitive unless a cross-case match
     occurs (i.e., two titles differ only in case); in that case, the
     matching items are sorted in case-sensitive order.

   * Punctuation marks come first, followed by digits, then letters.
     If two items differ only by a pair of punctuation marks, ASCII
     sort order is used to resolve the ambiguity.

The checking is done by some Perl code, within a "shell function".
The shell variables "S" and "T" are, respectively, the current man
page section (e.g., 1) and title (e.g., "intro"); "OldS" and "OldT"
hold the values from the previous call.  Handling the "intro"
special case is pretty simple, if a bit wonky.

I had to think a bit, however, about how to handle the rest of the
sorting.  My first thought was to split both $T and $OldT into lists
of characters, then walk the lists in parallel, checking the rules.
This would certainly have worked, but it didn't seem, erm, Perlish.
So, I came up with the following code, which tickled my funnybone:

tchk () {
#
# tchk - Check $T against $OldT for correct sort order.
#
# Note: Neither $T nor $OldT may contain spaces!
#
# Usage: tchk

   if [ $# -ne 0 ]; then
     echo "!!! tchk: usage error"
     return 1
   fi

   perl -e '
     {
       $OldS="'"$OldS"'";  $S="'"$S"'";
       $OldT="'"$OldT"'";  $T="'"$T"'";

       # Handle "intro" weirdness.

       if ($S ne $OldS) {                # new section
         if ($T ne "intro") {
           print "!!! missing intro: S=$S, T=$T\n";
         }
         exit;
       }

       if ($T eq "intro") {
         print "!!! misplaced intro: S=$S, T=$T\n";
         exit;
       }

       $OldT = "" if ($OldT eq "intro");

       # Now do simple (:-) matching.

       if (lc($T) eq lc($OldT)) {        # cross-case match
         if ($T lt $OldT) {
           print "!!! broken cross-case match: S=$S, T=$T, OldT=$OldT\n";
           exit;
         }
       } else {
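         # Fold case (letter order is case-insensitive here), then
         # put a placeholder space in front of every character.
         ($Tst    = lc($T))    =~ s|(.)| $1|g;
         ($OldTst = lc($OldT)) =~ s|(.)| $1|g;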

         $Tst              =~ s| (\d)|1$1|g;          # digits:   class 1
         $OldTst           =~ s| (\d)|1$1|g;

         $Tst              =~ s| ([A-Za-z])|2$1|g;    # letters:  class 2
         $OldTst           =~ s| ([A-Za-z])|2$1|g;

         $Tst              =~ s| |0|g;                # all else: class 0
         $OldTst           =~ s| |0|g;

         if ($Tst lt $OldTst) {
           print "!!! sequence error: S=$S, T=$T, OldT=$OldT\n";
           exit;
         }
       }
     }
   '

   OldS=$S
   OldT=$T
}
-- 
email: [EMAIL PROTECTED]; phone: +1 650-873-7841
http://www.cfcl.com/rdm    - my home page, resume, etc.
http://www.cfcl.com/Meta   - The FreeBSD Browser, Meta Project, etc.
http://www.ptf.com/dossier - Prime Time Freeware's DOSSIER series
http://www.ptf.com/tdc     - Prime Time Freeware's Darwin Collection
