At 4:19 PM -0800 11/14/01, ehughes wrote:
>Hello All,
>
>I have some simple files to parse. The data looks like this:
>
>54 6342059
><snip>
>80 6713793
>
>The first column is an activity code and the second column is a client id
>number.
>
>Here is my task. I need to find the number of unique individuals that
>participated in each activity. Now the boring, tedious way would be to write
>something like this:
><snip>
>Surely, there is something that will capture the data a little easier than
>defining a variable for each activity. There are more than 30 activities.
>

See, any time I see something like "define a variable for each x", I think
about using hashes, arrays, or lists.

You also said:
>It might work, but, I forgot to point out that some people repeat activities
>and should only be counted once.

So, I'd use a hash of hashes, something similar to

    my %codeCount;
    while (<>) {
        chomp;
        # split ' ' splits on any run of whitespace
        my ($code, $person) = split ' ';
        next unless defined $person;    # skip blank or short lines
        # a repeat visit just sets the same key again
        $codeCount{$code}{$person} = 1;
    }

    # print them out
    foreach my $code (sort keys %codeCount) {
        # keys needs the inner hash dereferenced, and a function call
        # does not interpolate inside a double-quoted string
        print "$code:\t", scalar(keys %{ $codeCount{$code} }), "\n";
    }

This is untested code.

The idea is that you create a hash of activity hashes, where each activity
is a hash of person codes. The number of keys in each activity hash is the
number of unique people taking that activity...

-Jeff Lowrey
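
P.S. For anyone who wants something to paste and run, here is a minimal
sketch of the same hash-of-hashes idea with the data built in. The helper
name count_unique is made up for illustration, and only the first and last
sample rows come from the original post; the two middle rows are invented to
show a repeat visit being counted once.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): takes lines of
# "activity_code client_id" and returns a hash ref mapping each
# activity code to its number of distinct client ids.
sub count_unique {
    my @lines = @_;
    my %seen;                              # $seen{$code}{$person} = 1
    for my $line (@lines) {
        my ($code, $person) = split ' ', $line;
        next unless defined $person;       # skip blank or short lines
        $seen{$code}{$person} = 1;         # repeats set the same key again
    }
    my %count;
    for my $code (keys %seen) {
        $count{$code} = scalar keys %{ $seen{$code} };
    }
    return \%count;
}

# Sample data: first and last rows are from the post, the middle two
# are invented for this example.
my @sample = (
    "54 6342059",
    "54 6342059",    # invented: same person repeats activity 54
    "54 1111111",    # invented: a second person in activity 54
    "80 6713793",
);

my $count = count_unique(@sample);
for my $code (sort { $a <=> $b } keys %$count) {
    print "$code:\t$count->{$code}\n";
}
```

To run it against a real file instead, you could read the lines with
something like count_unique(<>) and pass the file name on the command line.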