On May 10, Jeff 'japhy' Pinyan said:

>On May 10, Hughes, Andrew said:
>
>>I am using it as a mailing list.  However, over the last half year, it has
>>gotten pretty big, and has some duplicates in it.  For reporting sake, I
>>would like to delete the duplicate records based on email addresses.  If you
>>sign up three times, I only want to keep your first record in there.
>
>The simplest course of action is to use a hash, and use the email address
>as the deciding factor of uniqueness.

I meant to convert the email address to lowercase...

>  open ORIG, "< $db" or die "can't read $db: $!";
>  open NEW, "> $db.new" or die "can't write $db.new: $!";
>
>  my %seen;
>
>  while (<ORIG>) {
>    my ($email) = (split /\|/)[3];

    print NEW if !$seen{lc $email}++;

>  }
>
>  close ORIG;
>  close NEW;
>
>  rename "$db.new" => $db or die "can't rename $db.new to $db: $!";

-- 
Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
<stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.
[  I'm looking for programming work.  If you like my work, let me know.  ]
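
For anyone who wants to try the lowercase-keyed %seen idiom without touching the real file, here is a minimal sketch that runs on in-memory records. The helper name dedupe_by_email and the sample records are made up for illustration. The only things taken from the code above are the pipe delimiter and the address being the fourth field. Unlike the loop above, this sketch keeps any record that has no fourth field, which is an assumption on my part.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): keep the first record
# seen for each email address, comparing addresses case-insensitively.
# Assumes pipe-delimited records with the address in the fourth field.
sub dedupe_by_email {
    my @records = @_;
    my %seen;
    return grep {
        my $email = (split /\|/)[3];
        # Records with no fourth field are kept as-is (an assumption);
        # otherwise keep a record only the first time its address appears.
        !defined $email || !$seen{lc $email}++;
    } @records;
}

# Made-up sample data; the other fields are placeholders.
my @sample = (
    '1|Ann|Lee|ann@example.com|2001-01-05',
    '2|Bob|Ray|bob@example.com|2001-02-11',
    '3|Ann|Lee|ANN@Example.com|2001-03-20',
);

# Prints records 1 and 2; record 3 differs from record 1 only in case.
print "$_\n" for dedupe_by_email(@sample);
```

The grep form holds every record in memory at once, so for a large file the line-by-line while loop above is likely the better fit.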