I am attempting to create a frequency chart based on pipe-delimited
database output. Following is what I have come up with. The hash created by
this loop is then output to another file. It seems to be very slow on files in
excess of ~10000 lines. This will be used on ~500000 line files and needs to
be as efficient as possible. If anyone knows a better way to do this, it
would be extremely helpful. Thank you.  -Jess

$table is the number of the field on which the frequency is calculated,
e.g. for a|b|c|d| with $table = 3, c is used.
@tmp is the array produced by splitting the input line.
%freqidx is the hash with field values as keys and frequency counts as values.
$tot is used for the percentage calculation after this loop; it is basically
a line count of INFILE.

while( <INFILE> ) {
        @tmp = split( /\|/ );
        $x = 0;
        foreach( keys( %freqidx ) ) {
                if( $tmp[$table] ne $_ ) {
                        $x = 1;
                } else {
                        $x = 0;
                        last;
                }
        }
        if( $x == 1 ) {
                $freqidx{$tmp[$table]} = 0;
        }
        $freqidx{$tmp[$table]}++;
        $tot++;
}
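
For comparison, here is a minimal sketch of the same count without the inner
loop over keys( %freqidx ). That loop visits the existing keys on every input
line, which is probably where most of the time goes on large files.
Incrementing a hash element that does not exist yet creates it, so no search
pass should be needed. The sub name count_field, the lexical filehandle and
the in-memory sample data are assumptions for illustration and are not part
of the code above. $table is treated here as a 1-based field number to match
the a|b|c|d| example, whereas the loop above indexes $tmp[$table] directly.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): count how often each
# value occurs in one pipe-delimited field of every line read from $fh.
# Assumption: $table is a 1-based field number.
sub count_field {
    my ( $fh, $table ) = @_;
    my %freqidx;
    my $tot = 0;
    while ( my $line = <$fh> ) {
        chomp $line;
        # Stop splitting once the wanted field has been isolated.
        my @tmp = split /\|/, $line, $table + 1;
        # Lines with too few fields are counted under the empty string.
        my $val = defined $tmp[ $table - 1 ] ? $tmp[ $table - 1 ] : '';
        $freqidx{$val}++;    # creates the key on first use
        $tot++;
    }
    return ( \%freqidx, $tot );
}

# Sample run on in-memory data (made up for this sketch).
my $sample = "a|b|c|d|\nx|y|c|z|\nq|r|s|t|\n";
open my $in, '<', \$sample or die "cannot open sample data: $!";
my ( $freq, $total ) = count_field( $in, 3 );
close $in;

# Print value, count and percentage for each distinct field value.
for my $key ( sort keys %$freq ) {
    printf "%s\t%d\t%.1f%%\n", $key, $freq->{$key},
        100 * $freq->{$key} / $total;
}
```

Each line then costs one split and one hash update, no matter how many
distinct values have already been seen.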
