Samuel wrote:

> I see the need for creation of a search capability for the Perl man pages.
> There is no index, is there, such as other documentation has?
> 
> I thought Perl would be easier than most other languages to generate a list
> (index) of all words in the Perl man pages and also an application to view
> the index and navigate to references. Since it is not done perhaps it is not
> as easy as I thought it would be to do in Perl. For the Perl man pages in
> HTML format, would it be possible to use one of the internet-style search
> facilities?


I can't imagine using HTML when you have a good text editor available like vim
or emacs to search your docs.


I took five minutes out (actually the ignore part took longer :) and wrote a
script that will read my concatenated man pages and produce an index into them:

use strict;

my $LINE_LENGTH = 79;
my $fold_case = 1;

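# STDERR goes to ./cerr.tmp; the index loop below logs any word with more
# than 100 hits there
BEGIN { open STDERR, '>', './cerr.tmp' or die "Can't write cerr.tmp: $!"; }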

# words to ignore - modify to suit

my @ignore = qw(a about above across actually after again against all already
   also always an and another any anything are around as at AUTHOR available
   be because been before being below better between both but by can't can
   cannot code consider contains context could CPAN create created current
   currently data DESCRIPTION details different does doesn't don't done either
   elsewhere every example examples except following foo for found from
   function functions general get given good got had has have here's here how
   however in instead into is isn't it's it its itself just know later less
   let's like line lines make makes manpage many may means might more most much
   must my name need never no not note nothing now number numbers of off often
   ok on once one only or other otherwise out output outside over perl's perl
   possible probably problem process program programming programs provide
   provides pub rather real really same say section see SEE set sets several
   should similar simple since so some something sometimes such support
   supported sure SV SYNOPSIS system systems take than that's that the their
   them then there's there these they thing things this those though through
   throughout thus to too toward under until up use used using usr usually
   value values variable variables very was way we'll we well were what when
   where whether which who will with within without work works would you'll
   you're you your
);
my %ignore = map { $_ => 1 } @ignore;    # hash for fast lookup
my %index = ();

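# split each line into words and add the line number ($.) to a hash of arrays

while (<>) {

        # a word starts with a letter and continues with word characters,
        # apostrophes or hyphens, so one-letter words are never indexed
        while (/(\b[^\W_\d][\w'-]+\b)/g ) {

                my $wd = $1;
                next if exists $ignore{$wd};
                # also skip capitalized forms of ignored words, e.g. "The"
                next if exists $ignore{lcfirst $wd} and $wd =~ /^[A-Z][a-z]/;
                if ($fold_case) {
                        # file "Word" under "word"; all-caps words are left alone
                        $wd = lcfirst $wd if $wd =~ /^[A-Z][a-z]/;
                }
                push @{$index{$wd}}, $.;
        }
}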

print "Words found ", scalar keys %index, "\n";

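# print index: each word, then its line numbers wrapped at $LINE_LENGTH columns

foreach my $word (sort keys %index) {

        print "$word:\n    ";
        my $cnt = 4;    # current column; number lines are indented four spaces
        my $num = @{$index{$word}};
        # log very common words (more than 100 hits) to STDERR
        print STDERR "$word: $num\n" if $num > 100;
        foreach my $line_no (@{$index{$word}}) {

                my $len = length $line_no;
                if ($cnt + $len > $LINE_LENGTH) {
                        print "\n    ";
                        $cnt = 4;
                }
                print "$line_no ";
                $cnt += $len + 1;
        }
        print "\n";
}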

__END__
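
The script only writes the index; it doesn't give you a way to jump to the
references. As a rough sketch of that second half (not part of the script
above, and only lightly thought through), something like the following could
read the index text back in and print the matching lines. The names
parse_index and lookup are made up for illustration, and the sample text and
index are invented; in practice you would slurp the concatenated man pages and
the saved index output instead.

```perl
use strict;
use warnings;

# Parse index text in the format printed above ("word:" on its own line,
# then indented line numbers) into a hash of word => [line numbers].
sub parse_index {
    my ($text) = @_;
    my (%index, $word);
    for my $line (split /\n/, $text) {
        if ($line =~ /^(\S+):$/) {
            $word = $1;                       # start of a new entry
        }
        elsif (defined $word and $line =~ /^\s+\d/) {
            # split ' ' skips the leading indent; wrapped lines land here too
            push @{ $index{$word} }, split ' ', $line;
        }
    }
    return \%index;
}

# Return "lineno: text" strings for every line the word was indexed on.
# Line numbers are 1-based ($.), array indices are 0-based.
sub lookup {
    my ($index, $lines, $word) = @_;
    return map { "$_: $lines->[$_ - 1]" } @{ $index->{$word} || [] };
}

# Invented sample data standing in for the man pages and the saved index.
my @lines = ("The splice function", "removes elements", "see splice and push");
my $index_text = "Words found 2\npush:\n    3 \nsplice:\n    1 3 \n";

my $index = parse_index($index_text);
print "$_\n" for lookup($index, \@lines, 'splice');
# prints:
# 1: The splice function
# 3: see splice and push
```

Since $. keeps counting across files when reading with <>, the line numbers
only make sense against the same concatenated input the index was built from.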


-- 
   ,-/-  __      _  _         $Bill Luebkert   ICQ=14439852
  (_/   /  )    // //       DBE Collectibles   Mailto:[EMAIL PROTECTED]
   / ) /--<  o // //      http://dbecoll.tripod.com/ (Free site for Perl)
-/-' /___/_<_</_</_     Castle of Medieval Myth & Magic http://www.todbe.com/

_______________________________________________
ActivePerl mailing list
[EMAIL PROTECTED]
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs