On Fri, May 13, 2011 at 11:46, Nathalie Conte <n...@sanger.ac.uk> wrote: > I have a file with sequences each sequence is 200 pb long and I have 30K > lines > > ATGGATAGATA\n > TTCGATTCATT\n > GCCTAGACAT\n > TTGCATAGACTA\n > I want to calculate the AT ratio of each base based on their position > (3/4) for the 1st position, 3/4 on the second, (0/4) on the 3rd... [ snip ] > foreach my $line (@list){ > chomp $line; > my @pb = split(/\d/, $line); > my @position = $pb[0]; for the fisrt position > $line++; > > do that in a loop 200 times ( as we have 200 pb per sequence) which will > create 200 arrays with 30K digits in them. I would need an array of all > arrays at that point???
You don't need to do it 200 times; you can loop over the file once. Something like this (caveat, untested): my( @AT_count , $total ); open( my $FH , '<' , $file ) or die( $! ); while( <$FH> ) { my @bases = split // , $_; foreach my $idx ( 0 .. $#bases ) { $AT_count[$idx]++ if $bases[$idx] eq 'A' or $bases[$idx] eq 'T'; $total++; } } close( $FH ); foreach my $position ( 0 .. $#AT_count) { printf "%.2f%% AT at position %d\n" , $AT_count[$position] / $total * 100 , $position + 1; } Basically you're going to keep an array where each element corresponds to the number of A or T residues at a given position across all your sequences. So the first element is the total number of [AT] residues in the first column, second element is [AT] in the second column, and so on. Read each line in the file, split into individual bases, loop over each base, and if it is an A or a T, increment the corresponding position in the array. At the end, loop over your count array and convert each element into a percentage and print it out. NOTE: If some of the sequences are different lengths, you'll need to store the total number of positions in an array too. You said they were all 200bp long, so I only used a single counter for the total. chrs, john. -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/