On 2009-10-18 at 3:37 PM, [email protected] (le...@gmail) wrote:
>
>I have a fairly long fortune file (
>http://home.kreme.com/mysigs.txt ) that I would like to sort
>in alphabetic order, excepting leading punctuation or super
>common words like 'a', 'the', 'I', &c.
>
>I have two problems, one of which I think is trivial, so let's
>cover that one first.
>
>I need to capture the entire 'fortune' between % lines, but fortune
>files don't have a leading % or a closing % at the file
>boundaries. Many of the fortunes are multiple lines, and I do
>not want to muck up the formatting when I sort.
>
>The second problem, of course, is excluding a list of words I
>consider 'common' from the sort. For example,
>
>"May the forces of evil become confused on the way to your house"
>
>I would have both 'the' and 'may' in my exclusion list, so that
>should sort based on 'forces'.
>
>"If a pig loses its voice, is it disgruntled"
>
>would be sorted under 'pig'
>
>Ideas? Probably a perl script, huh?
>
Try this untested script (provide filenames and add your own
list of skipped words below __END__):
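#!/usr/bin/perl
use strict;
use warnings;
my %exclusions;
# provide your filenames here:
my $file_to_sort = 'path/to/file_to_sort';
my $sorted_file = 'path/to/sorted_file';
# load the skipped words, one per line, from below __END__
while (<DATA>) {
    chomp;
    $exclusions{lc $_} = 1 if /\S/;
}
# read the whole file, then split it into fortunes at the lines
# that hold only a '%', so multi-line fortunes stay intact
open my $in, "<", $file_to_sort
    or die "Can't open $file_to_sort: $!";
my $text = do { local $/; <$in> };
close $in;
my @fortunes = grep { /\S/ } split /^%[ \t]*\n/m, $text;
s/\n*\z/\n/ for @fortunes;    # end every fortune with one newline
# sort key: the fortune in lowercase, minus punctuation and skipped words
sub sort_key {
    my ($fortune) = @_;
    my @words;
    foreach my $word (split " ", lc $fortune) {
        $word =~ s/[^a-z0-9]//g;    # drop punctuation
        push @words, $word if length($word) && !$exclusions{$word};
    }
    return join " ", @words;
}
# compute each key once, sort on it, and write the fortunes back
# out with a '%' line between them
my @sorted = map { $_->[1] }
    sort { $a->[0] cmp $b->[0] }
    map { [ sort_key($_), $_ ] } @fortunes;
open my $out, ">", $sorted_file
    or die "Can't open $sorted_file: $!";
print $out join("%\n", @sorted);
close $out;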
__END__
a
the
this
that
you
when
is
may
be
if
- Bruce
_bruce__van_allen__santa_cruz_ca_