On Fri, Dec 12, 2003 at 01:48:12AM -0600, Ronn!Blankenship wrote:
> and three, four, . . . is even worse. Any suggestions?
All emails are supposed to have a unique ID. For example, yours is
Message-Id: <[EMAIL PROTECTED]>
All you have to do is search through and discard all messages except the
first one you encounter with each unique ID.
I wrote a quick and dirty Perl script a while ago for a similar task
(filtering duplicate poker hand histories). With three small
modifications (how it detects the beginning of an email, the Message-ID:
line, and the end of an email), which you can probably make in a few
minutes, it would work for your task. Here's the script, which you
should feel free to use or modify:
#!/usr/bin/perl -w
#
# filter duplicate hands from a Party style hand file
use strict;
use warnings;

my %seen   = ();  # hash (look-up table) to record already seen hands
my $inhand = 0;   # whether we are currently inside a hand record
my $hand   = 0;   # hand number of the hand we are currently inside
my $line;         # line we are currently parsing
my @aline  = ();  # line array so we can join lines if we find an = on the end

# take line input from command line filename argument or STDIN automagically
while ( $line = <> ) {
    # look for beginning of another hand (the patterns here are constant,
    # so perl compiles them only once and no /o modifier is needed)
    if ( $line =~ m{Hand History for Game\s+(\d+)\s+\*\*} ) {
        $seen{$hand}++;  # mark previous hand seen
        $hand   = $1;    # current hand number from pattern match above
        $inhand = 1;     # record that we are currently in a hand
    }
    if ( $inhand ) {
        # print the filtered line unless we've already seen the hand
        unless ( $seen{$hand} ) {
            $line =~ s{=20$}{ };        # convert =20 (quoted-printable space) to a space
            if ( $line =~ m{=$} ) {     # line continuation mark = found
                @aline = ();
                $line =~ s{=$}{};
                push @aline, $line;
                # keep reading while the lines still end in a continuation mark
                while ( $line = <> ) {
                    unless ( $line =~ m{=$} ) {
                        last;
                    }
                    $line =~ s{=$}{};
                    push @aline, $line;
                }
                chomp @aline;           # drop newlines from the continued lines
                # the last line keeps its newline; it is undefined if the
                # input ended in the middle of a continuation
                push @aline, $line if defined $line;
                $line = join '', @aline;
            }
            print $line;
        }
        # look for a blank or spaces only line to end hand
        if ( $line =~ m{^\s*$} ) {
            $inhand = 0;  # found a blank line so record not in hand
        }
    }
}
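For the mail case, the three changes could look roughly like the sketch
below. It is not the script above but a fresh minimal version, and it
assumes things this thread does not confirm: that the messages sit in an
mbox-style file where each message starts with a line beginning "From ",
and that each header block ends at the first blank line. The helper names
(message_id, split_mbox, dedupe) are made up for illustration. Messages
with no Message-ID are kept rather than guessed at, and folded Message-ID
headers are not handled.

```perl
#!/usr/bin/perl
# Sketch only: keep the first message for each Message-ID in an mbox file.
# ASSUMPTION: mbox layout, i.e. every message begins with a "From " line.
use strict;
use warnings;

# Return the Message-ID found in the headers of one message, or undef.
sub message_id {
    my ($msg) = @_;
    # ASSUMPTION: the headers end at the first blank line
    my ($headers) = split /\n[ \t\r]*\n/, $msg, 2;
    if ( defined $headers && $headers =~ m{^Message-ID:[ \t]*(\S+)}mi ) {
        return $1;
    }
    return;
}

# Split mbox text into messages at each line that starts with "From ".
sub split_mbox {
    my ($text) = @_;
    return grep { length } split /^(?=From )/m, $text;
}

# Return the messages in their original order, minus repeated Message-IDs.
sub dedupe {
    my @messages = @_;
    my %seen;    # Message-IDs already encountered
    my @kept;
    for my $msg (@messages) {
        my $id = message_id($msg);
        # skip only when this ID has been seen before; no ID means keep
        next if defined $id && $seen{$id}++;
        push @kept, $msg;
    }
    return @kept;
}

# Only runs when file names are given on the command line.
if (@ARGV) {
    local $/;                   # slurp mode: read each file in one piece
    my $text = join '', <>;
    print dedupe( split_mbox($text) );
}
```

Saved under a name such as dedupe-mbox.pl (hypothetical), it would be run
as "perl dedupe-mbox.pl inbox.mbox > inbox.dedup.mbox". It reads the whole
file into memory, which keeps the code short but only suits mailboxes of
modest size.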
_______________________________________________
http://www.mccmedia.com/mailman/listinfo/brin-l