On May 10, Hughes, Andrew said:

>I am using it as a mailing list. However, over the last half year, it has
>gotten pretty big, and has some duplicates in it. For reporting sake, I
>would like to delete the duplicate records based on email addresses. If you
>sign up three times, I only want to keep your first record in there.

The simplest course of action is to use a hash keyed on the email address,
so that the address alone decides uniqueness. The first record seen for each
address is written to a new file, and any later record with the same address
is skipped.

    # $db holds the path to the database file. Records are pipe-delimited,
    # with the email address in the fourth field (index 3).
    open ORIG, "< $db" or die "can't read $db: $!";
    open NEW, "> $db.new" or die "can't write $db.new: $!";

    my %seen;
    while (<ORIG>) {
      my ($email) = (split /\|/)[3];
      # $seen{$email}++ is false only the first time an address appears,
      # so only the first record for each address is printed.
      print NEW if !$seen{$email}++;
    }

    close ORIG;
    close NEW;

    # Replace the original file with the de-duplicated copy.
    rename "$db.new" => $db
      or die "can't rename $db.new to $db: $!";

-- 
Jeff "japhy" Pinyan
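
To see the hash technique work without touching a file, here is a minimal sketch that runs on in-memory records. The helper name dedupe_by_email and the sample data are made up for illustration. Lowercasing and trimming the address, and keeping records that have no email field, are assumptions that go beyond the code above; drop them if an exact match is what you want.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the records
# with only the first occurrence of each email address kept.
# $field is the zero-based index of the email column in a
# pipe-delimited record.
sub dedupe_by_email {
    my ($field, @records) = @_;
    my (%seen, @kept);
    for my $record (@records) {
        my $email = (split /\|/, $record)[$field];
        # Assumption: a record with no email field is kept untouched.
        unless (defined $email) {
            push @kept, $record;
            next;
        }
        # Assumption: ignore surrounding whitespace and letter case.
        $email =~ s/^\s+|\s+$//g;
        push @kept, $record if !$seen{lc $email}++;
    }
    return @kept;
}

# Made-up sample data, with the email address in the fourth field.
my @records = (
    '1|Ann|Lee|ann@example.com',
    '2|Bob|Ray|bob@example.com',
    '3|Ann|Lee|ANN@example.com',
    '4|Bob|Ray|bob@example.com',
);

# Prints records 1 and 2 only.
print "$_\n" for dedupe_by_email(3, @records);
```

The sample strings are single-quoted on purpose: inside double quotes, Perl would try to interpolate @example as an array.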