At 4:19 PM -0800 11/14/01, ehughes wrote:
>Hello All,
>
>I have some simple files to parse. The data looks like this:
>
>54 6342059
><snip>
>80 6713793
>
>The first column is an activity code and the second column is a client id
>number.
>
>Here is my task. I need to find the number of unique individuals that
>participated in each activity. Now the boring, tedious way would be to write
>something like this:
><snip>
>Surely, there is something that will capture the data a little easier than
>defining a variable for each activity. There are more than 30 activities.
>
See, any time I see something like "define a variable for each x", I
think about using hashes, arrays, or lists.
You also said:
>It might work, but, I forgot to point out that some people repeat activities
>and should only be counted once.
So, I'd use a hash of hashes, something similar to:
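while (<>) {
    chomp;
    # split ' ' splits on any run of whitespace and ignores leading whitespace
    ($code, $person) = split ' ';
    next unless defined $person;    # skip blank or malformed lines
    $codeCount{$code}{$person} = 1;
}
# print them out
foreach $code (sort keys %codeCount) {
    # a function call does not interpolate inside a string, so pass it as
    # a separate argument; keys needs the inner hash dereferenced
    print "$code:\t", scalar(keys %{ $codeCount{$code} }), "\n";
}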
This is untested code. The idea is that you create a hash of
activity hashes, where each activity's hash is keyed by client id.
A hash key can exist only once, so a person who repeats an activity
is stored only once, and the number of keys in each activity hash is
the number of unique people taking that activity.
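
If it helps, here is the same idea as a small self-contained sketch
that runs under strict and warnings. The sub name count_unique_clients
and the sample lines are made up for illustration; they are not from
your data or your script.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a reference to a list of "code client" lines
# and returns a hash ref mapping activity code => number of unique clients.
sub count_unique_clients {
    my ($lines) = @_;
    my %seen;    # $seen{$code}{$person} exists once per distinct pair
    for my $line (@$lines) {
        my ($code, $person) = split ' ', $line;
        next unless defined $person;    # skip blank or malformed lines
        $seen{$code}{$person} = 1;
    }
    my %count;
    $count{$_} = scalar keys %{ $seen{$_} } for keys %seen;
    return \%count;
}

# Made-up sample data: client 6342059 repeats activity 54.
my @sample = ("54 6342059", "54 6342059", "54 7000001", "80 6713793");
my $count  = count_unique_clients(\@sample);
print "$_:\t$count->{$_}\n" for sort { $a <=> $b } keys %$count;
```

With that sample, activity 54 reports 2 and activity 80 reports 1,
since the repeated client is counted only once. The numeric sort is
an assumption that your activity codes are always numbers.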
-Jeff Lowrey