On Mon, 30 Sep 2002, Paul and Joann Van Dalen wrote:

> Hi all,
>
> Given an input file that has the following records:
>
> 123 ABC XX112 YYYY Zzzzzzzzzz
> 123 DEF XX113 WWWW Zzzzzzzzz
> 123 EEF XX112 YYYY Zzzzzzzzzz
> 444 ccc vvbfd QQQQ ccccccccc
> 444 CCd vvbfd QQQQ ccccccccc
> 444 ddd ssddd QQQQ xxxxxxxx
>
> I need to focus on the first column (the input file is already sorted on
> that field) and, grouped by the first column, pull out the first record
> of each group. E.g., I would need the following from the above as output:
>
> 123 ABC XX112 YYYY Zzzzzzzzzz
> 444 ccc vvbfd QQQQ ccccccccc
>
> I believe I'd need something like a hash, where for every record within
> a group defined by the common value of the first column, I take the
> numerically first occurrence of that group, but I don't know how to do
> that in Perl. Would it take a loop for each group within the loop for
> the entire file?
You are on the right track; you will have to do something like this:

open (INPUTFILE, $your_input_file)
    or die "Error opening $your_input_file: $!\n";

my %uniq_hash;
while (<INPUTFILE>) {
    my $first_field = (split)[0];
    unless ($uniq_hash{$first_field}++) {
        # The first time a key is seen its value is false, so this
        # block runs and the record is printed; the post-increment
        # makes every later check for that key true, so duplicates
        # of the first column are skipped.
        print;
    }
}
close (INPUTFILE);

In principle this is quite similar to

    perldoc -q 'How can I remove duplicate elements from a list or array?'

> Thanks very much,
> Paul
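If your fields really are whitespace-separated, as in your sample, the same idea also fits in a command-line one-liner. This is just a sketch of the same technique, not something I've run against your actual data; "your_input_file" stands in for your real file name:

# Same dedup-on-first-field idea as the script above. Assumes
# whitespace-separated fields; your_input_file is a placeholder.
# -n wraps the loop over input lines, -a autosplits each line into @F.
perl -ane 'print unless $seen{$F[0]}++' your_input_file

Here %seen plays the same role as %uniq_hash in the script: $F[0] is the first column, and the line is printed only the first time that value turns up.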