On Mon, 30 Sep 2002, Paul and Joann Van Dalen wrote:

> Hi all,
> 
> Given an input file that has the following records:
> 
> 123    ABC    XX112    YYYY    Zzzzzzzzzz
> 123    DEF    XX113    WWWW    Zzzzzzzzz
> 123    EEF    XX112    YYYY     Zzzzzzzzzz
> 444    ccc    vvbfd    QQQQ    ccccccccc
> 444    CCd    vvbfd    QQQQ    ccccccccc
> 444    ddd    ssddd    QQQQ    xxxxxxxx
> 
> I need to focus on the first column (the input file is already sorted on
> that field) and, grouped by the first column, pull out
> the first record of that group.
> e.g., I would need to have the following from the above as output:
> 
> 123    ABC    XX112    YYYY    Zzzzzzzzzz
> 444    ccc    vvbfd    QQQQ    ccccccccc
> 
> I believe I'd need something like a hash, where for every record within
> a group defined by the common value of the first column, I take the
> numerically first occurrence of that group, but I don't know how to do
> that in Perl.  Would it take a loop for each group within the loop for
> the entire file??

You are on the right track; you will need to do something like this:

open (my $input_fh, '<', $your_input_file)
    or die "Error opening $your_input_file: $!\n";
my %uniq_hash;

while (<$input_fh>) {
    my ($first_field) = (split)[0];
    unless ($uniq_hash{$first_field}++) {
        # The first record for each key passes this check (and the count
        # is incremented); all later records with the same key fail it.
        print;
    }
}
close ($input_fh);
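
If the fields are whitespace-delimited, the same idea also fits in a
one-liner (a rough sketch; "data.txt" is just a stand-in for your actual
file name):

perl -ane 'print unless $seen{$F[0]}++' data.txt

The -a switch autosplits each input line into @F, so $F[0] is the first
column.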

In principle this is quite similar to
perldoc -q 'How can I remove duplicate elements from a list or array?'
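
The idiom shown there is essentially the same hash trick, applied to a
list in memory instead of a file (a minimal sketch; @list stands for
whatever array you have):

my %seen;
my @unique = grep { !$seen{$_}++ } @list;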

> 
> Thanks very much,
> Paul
> 

