On Mon, 17 Sep 2012 18:29:34 +0000 "Wang, Li" <li.w...@ttu.edu> wrote:
> Dear List members
>
> I have three columns of a table. The example is as follows:
>
> DBS R^2 genename
> 801 0.27807486057281494 POPTR_0002s00200
> 1903 1.0 POPTR_0002s00200
> 1103 0.25852271914482117 POPTR_0002s00200
> 3215 0.03134157508611679 POPTR_0002s00200
> 2415 0.010018552653491497 POPTR_0002s00200
> 1313 0.03134157508611679 POPTR_0002s00200
> 3442 1.0 POPTR_0002s00200
> 2642 0.25852271914482117 POPTR_0002s00200
> 1540 1.0 POPTR_0002s00200
> 228 0.03134157508611679 POPTR_0002s00200
> 3099 0.026160990819334984 POPTR_0002s00210
> 7555 0.800000011920929 POPTR_0002s00210
> 4457 0.014814814552664757 POPTR_0002s00210
> 7564 5.232862313278019E-4 POPTR_0002s00210
> 4466 0.0018315018387511373 POPTR_0002s00210
> 10 0.0036630036775022745 POPTR_0002s00210
> 7565 5.232862313278019E-4 POPTR_0002s00210
> 4467 0.0018315018387511373 POPTR_0002s00210
> 11 0.0036630036775022745 POPTR_0002s00210
> 2 1.0 POPTR_0002s00210
>
> I would like to calculate the average value of column 2 while the
> content of column three is the same. In this case, I would like the
> output of my result be as follows:
>
> R^2 genename
> 0.3899163577 POPTR_0002s00200
> 0.2314956035 POPTR_0002s00210
>
> I donot know how to deal with columns in Perl. I thought about using
> the idea of hash. But the key of a hash could not be the same.

You'd probably want to use a hash where the key is the genename in the
third column of your input, and the value is an arrayref of each value
you saw - so you can collate values against genenames, then calculate
the average at the end.

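
On the worry that hash keys can't repeat: with an arrayref as the value,
a repeated genename just means another push onto the same array. Here is
a minimal sketch of only that step, assuming a hand-picked @rows list
(three rows copied from your table, purely for illustration, with no
file reading involved):

```perl
use strict;
use warnings;

# Illustrative sample only: [ R^2 value, genename ] pairs copied from
# three rows of the table in the question.
my @rows = (
    [ 0.27807486057281494, 'POPTR_0002s00200' ],
    [ 1.0,                 'POPTR_0002s00200' ],
    [ 0.800000011920929,   'POPTR_0002s00210' ],
);

my %values_by_genename;
for my $row (@rows) {
    my ($r2, $genename) = @$row;
    # The first push for a genename creates the array (autovivification);
    # later pushes for the same key append to that same array.
    push @{ $values_by_genename{$genename} }, $r2;
}

# Show how many values ended up under each key.
for my $genename (sort keys %values_by_genename) {
    my $count = scalar @{ $values_by_genename{$genename} };
    print "$genename has $count value(s)\n";
}
```

So each key still appears only once in the hash, but it can hold as many
values as there are rows for that gene.
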
Something along the lines of:

    use strict;
    use warnings;
    use List::Util qw(sum);

    # For each row, record the value against the genename in question:
    my %values_by_genename;
    while (my $line = <>) {
        chomp $line;
        next unless $line =~ /\S/;          # skip blank lines
        # split ' ' splits on whitespace and ignores leading whitespace
        my ($dbs, $r2, $genename) = split ' ', $line;
        next if $genename eq 'genename';    # skip header row
        push @{ $values_by_genename{$genename} }, $r2;
    }

    # Now, for each genename, calculate the average value
    for my $genename (sort keys %values_by_genename) {
        my @values = @{ $values_by_genename{$genename} };
        my $avg = sum(@values) / scalar @values;
        print "$avg,$genename\n";
    }

Of course, you'd be better off parsing the input using Text::CSV
(configured for your file's actual separator), but the above should
give you something to start from.

-- 
David Precious ("bigpresh") <dav...@preshweb.co.uk>
http://www.preshweb.co.uk/