At 4:19 PM -0800 11/14/01, ehughes wrote:
>Hello All,
>
>I have some simple files to parse. The data looks like this:
>
>54            6342059
><snip>
>80            6713793
>
>The first column is an activity code and the second column is a client id
>number.
>
>Here is my task. I need to find the number of unique individuals that
>participated in each activity. Now the boring, tedious way would be to write
>something like this:
><snip>
>Surely, there is something that will capture the data a little easier than
>defining a variable for each activity. There are more than 30 activities.
>

See, any time I see something like "define a variable for each x", I 
think about using hashes, arrays, or lists.

You also said:
>It might work, but, I forgot to point out that some people repeat activities
>and should only be counted once.

So, I'd use a hash of hashes, something similar to this:
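while (<>) {
   chomp;
   # split ' ' breaks on runs of whitespace, so the wide gap between columns is fine
   ($code, $person) = split(' ');
   $codeCount{$code}{$person} = 1;   # a repeat just sets the same key again
}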

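# print the number of unique people for each activity
foreach $code (sort keys %codeCount) {
   # a function call does not interpolate inside a string, so pass it separately
   print "$code:\t", scalar(keys %{ $codeCount{$code} }), "\n";
}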

This is untested code.  The idea is that you create a hash of 
activity hashes, where each activity is a hash keyed by client id.  
A client who repeats an activity just sets the same key again, so the 
number of keys in each activity hash is the number of unique people 
taking that activity.
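
Here is a slightly fuller sketch of the same idea with strict and 
warnings turned on.  The subroutine name count_unique_clients and the 
sample rows are made up for illustration; only the two-column layout 
comes from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of "code  client_id" lines and returns
# a hash reference mapping each activity code to its unique-client count.
sub count_unique_clients {
    my @lines = @_;
    my %seen;    # $seen{$code}{$person} exists once per distinct pair

    for my $line (@lines) {
        my ($code, $person) = split ' ', $line;
        next unless defined $person;    # skip blank or malformed lines
        $seen{$code}{$person} = 1;
    }

    my %count;
    for my $code (keys %seen) {
        $count{$code} = scalar keys %{ $seen{$code} };
    }
    return \%count;
}

# Made-up sample data: client 6342059 repeats activity 54.
my @sample = (
    "54            6342059",
    "54            6342059",
    "54            6713793",
    "80            6713793",
);

my $count = count_unique_clients(@sample);
for my $code (sort keys %$count) {
    print "$code:\t$count->{$code}\n";
}
```

With that sample, activity 54 should come out as 2 and activity 80 as 1, 
because the repeated 54/6342059 line only adds one key.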

-Jeff Lowrey
