On 2009-10-18 at 3:37 PM, [email protected] (le...@gmail) wrote:

>
>I have a fairly long fortune file (
>http://home.kreme.com/mysigs.txt ) that I would like to sort
>in alphabetic order, excepting leading punctuation or super
>common words like 'a', 'the', 'I', &c.
>
>I have two problems, one of which I think is trivial, so let's
>cover that one first.
>
>I need to capture the entire 'fortune' from % to %, but fortune
>files don't have a leading % or a closing % at the file
>boundaries. Many of the fortunes are multiple lines, and I do
>not want to muck up the formatting when I sort.
>
>The second problem, of course, is excluding a list of words I
>consider 'common' from the sort. For example,
>
>"May the forces of evil become confused on the way to your house"
>
>I would have both 'the' and 'may' in my exclusion list, so that
>should sort based on 'forces'.
>
>"If a pig loses its voice, is it disgruntled"
>
>would be sorted under 'pig'
>
>Ideas?  Probably a perl script, huh?
>

Try this untested script (provide filenames and add your own 
list of skipped words below __END__):

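#!/usr/bin/perl
use strict;
use warnings;

# provide your filenames here:
my $file_to_sort = 'path/to/file_to_sort';
my $sorted_file  = 'path/to/sorted_file';

# Load the skipped words listed below __END__, one per line.
my %exclusions;
while (my $word = <DATA>) {
    chomp $word;
    $exclusions{lc $word} = 1 if length $word;
}

# Read the whole file, then split on lines that hold only '%', so
# multi-line fortunes stay intact and no leading or trailing '%'
# is needed.
open my $in, "<", $file_to_sort
    or die "Can't open $file_to_sort: $!";
my $text = do { local $/; <$in> };
close $in;

my %sort_buckets;
foreach my $fortune (split /^%[ \t]*\n/m, $text) {
    next unless $fortune =~ /\S/;
    $fortune .= "\n" unless $fortune =~ /\n\z/;

    # Sort key: lowercased words, punctuation trimmed from both
    # ends, skipped words dropped.
    my @words;
    foreach my $word (split " ", lc $fortune) {
        $word =~ s/^\W+|\W+$//g;
        next if !length $word or exists $exclusions{$word};
        push @words, $word;
    }

    # Fortunes with identical keys share a bucket instead of
    # overwriting each other.
    push @{ $sort_buckets{ join " ", @words } }, $fortune;
}

open my $out, ">", $sorted_file
    or die "Can't open $sorted_file: $!";
print $out join "%\n", map { @{ $sort_buckets{$_} } } sort keys %sort_buckets;
close $out
    or die "Can't write $sorted_file: $!";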

__END__
a
the
this
that
you
when
is
may
be
if
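
To check the sort key against the two examples from the question, here
is a small standalone sketch. It is not part of the script above: the
helper name sort_key, the four-word skip list, and the inline sample
text are made up for illustration, and it assumes fortunes are
separated by a line holding only '%'.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical skip list, for illustration only.
my %skip = map { $_ => 1 } qw(a the may if);

# sort_key() is an assumed helper name: lowercase the fortune, trim
# punctuation from both ends of each word, drop the skipped words.
sub sort_key {
    my ($fortune) = @_;
    my @words;
    foreach my $word (split " ", lc $fortune) {
        $word =~ s/^\W+|\W+$//g;
        next if !length $word or $skip{$word};
        push @words, $word;
    }
    return join " ", @words;
}

# Two sample fortunes, one of them spanning two lines, with no
# leading or trailing '%'.
my $text = <<'END';
"If a pig loses its voice, is it disgruntled"
%
"May the forces of evil become confused
on the way to your house"
END

my @fortunes = split /^%[ \t]*\n/m, $text;
my @sorted   = sort { sort_key($a) cmp sort_key($b) } @fortunes;

# Rejoin with '%' lines between fortunes only.
print join "%\n", @sorted;
```

With that skip list the 'forces' fortune should come out ahead of the
'pig' one, and the two-line fortune keeps its line break.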


    - Bruce

_bruce__van_allen__santa_cruz_ca_

