Wijaya Edward wrote:
>
> From: Rob Dixon
>>
>> use strict;
>> use warnings;
>>
>> my %pwm;
>>
>> while (<DATA>) {
>>    my $col = 0;
>>    foreach my $c (/\S/g) {
>>      $pwm{$c}[$col++]++;
>>    }
>> }
>>
>> foreach my $freq (values %pwm) {
>>    $_ = $_ ? $_ / keys %pwm : 0 foreach @$freq;
>> }
>>
>> use Data::Dumper;
>> print Dumper \%pwm;
>>
>>
>> __END__
>> AAA
>> ATG
>> TTT
>> GTC
>
> I was trying your script with this set of strings:
>
> __DATA__
> CAGGTG
> CAGGTG
>
> But how come it returns:
>
> $VAR1 = {
>     'A' => [ 0, '0.5' ],
>     'T' => [ 0, 0, 0, 0, '0.5' ],
>     'C' => [ '0.5' ],
>     'G' => [ 0, 0, '0.5', '0.5', 0, '0.5' ]
> };
>
> Instead of the correct:
>
> $VAR1 = {
>     'A' => [ '0', '1', '0', '0', '0', '0' ],
>     'T' => [ '0', '0', '0', '0', '1', '0' ],
>     'C' => [ '1', '0', '0', '0', '0', '0' ],
>     'G' => [ '0', '0', '1', '1', '0', '1' ]
> };

Hi Edward

(Please bottom-post your replies)

Yes, I'm sorry. Ruud pointed out that I was dividing by the number of distinct
symbols instead of the number of rows to get the probabilities, which is why
your two identical rows gave 0.5 (2/4) rather than 1. You have also found that
each symbol's array ended at the last column where that symbol occurred, so
later columns had no value at all. This revision fixes both of these problems.
I've also borrowed Ruud's neater construct to normalise the hash values in the
final loop. Thanks Ruud!

use strict;
use warnings;

my %pwm;
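my ($rows, $cols) = (0, 0);

while (<DATA>) {
  my $col = 0;
  $rows++ if /\S/;                  # count only non-blank rows
  foreach my $c (/\S/g) {
    $pwm{$c}[$col++]++;             # tally symbol $c at this column
    $cols = $col if $col > $cols;   # remember the widest row seen
  }
}

# For every column up to $cols, ($_ |= 0) turns an undefined slot into 0;
# the result is then divided by the number of rows.
foreach my $freq (values %pwm) {
  ($_ |= 0) /= $rows foreach @$freq[0 .. $cols - 1];
}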

use Data::Dumper;
print Dumper \%pwm;


__DATA__
CAGGTG
CAGGTG

*OUTPUT*

$VAR1 = {
          'A' => [
                   '0',
                   '1',
                   '0',
                   '0',
                   '0',
                   '0'
                 ],
          'T' => [
                   '0',
                   '0',
                   '0',
                   '0',
                   '1',
                   '0'
                 ],
          'C' => [
                   '1',
                   '0',
                   '0',
                   '0',
                   '0',
                   '0'
                 ],
          'G' => [
                   '0',
                   '0',
                   '1',
                   '1',
                   '0',
                   '1'
                 ]
        };
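
For anyone who finds the compact normalising line hard to read, here is a
longer sketch of the same two fixes (pad every symbol out to the full number
of columns, then divide by the number of rows). The sub name
position_frequencies is only something I made up for this illustration and it
takes its rows as arguments instead of reading <DATA>; treat it as a rough
equivalent, not a replacement for the script above.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this example): returns a hash ref
# mapping each symbol to an array of per-column frequencies.
sub position_frequencies {
    my @seqs = grep { /\S/ } @_;    # skip blank rows, as the script does
    my $rows = @seqs;               # number of non-blank rows
    my $cols = 0;
    my %pwm;

    for my $seq (@seqs) {
        my $col = 0;
        for my $c ($seq =~ /\S/g) {
            $pwm{$c}[$col++]++;             # tally symbol at this column
            $cols = $col if $col > $cols;   # track the widest row
        }
    }

    for my $freq (values %pwm) {
        for my $i (0 .. $cols - 1) {
            # Fix 2: columns where the symbol never occurred become 0
            my $count = defined $freq->[$i] ? $freq->[$i] : 0;
            # Fix 1: divide by the number of rows, not the number of symbols
            $freq->[$i] = $count / $rows;
        }
    }

    return \%pwm;
}

my $pwm = position_frequencies(qw(CAGGTG CAGGTG));
print join(' ', @{ $pwm->{G} }), "\n";    # prints "0 0 1 1 0 1"
```

With the four rows from my first post (AAA, ATG, TTT, GTC) the same sub gives
0.5, 0.25, 0.25 for 'A', since there are four rows and 'A' appears twice in
the first column and once in each of the others.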
