--On Monday, August 25, 2003 6:50 PM -0400 Mike Robeson <[EMAIL PROTECTED]> wrote:
OK, I feel like an idiot. I just realized that when I initially asked for help with this, I forgot two little details: I was supposed to add the number of sequences as well as the length of the sequences at the top of the output file.
That is this file:
> dog
agatagatcgcatcga
> cat
acgcttcgatacgctagctta
> mouse
agatatacgggtt
is really supposed to be:
3 22
a g a t a g a t c g c a t c g a - - - - - - dog
a c g c t t c g a t a c g c t a g c t t a - cat
a g a t a t a c g g g t t - - - - - - - - - mouse
The '3' represents the number of individual sequences in the file (i.e. dog, cat, mouse). And the 22 is the number of letters and dashes there are. The length is already in the script as $len. I am able to get the length listed at the top. However, I cannot find a way to have the number of sequences (the 3 in this case) printed to the top.
Here's one way (slightly altering John's solution), but it will use lots of memory if the sequences are long.
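#!/usr/bin/perl
use warnings;
use strict;

my ( $name, $num_seq, @seq );
my $len = 30;

while ( <DATA> ) {
    next if /^\s*$/;                  # skip blank lines
    if ( /^\s*>\s*(\S+)/ ) {          # header line: remember the name
        $name = $1;
        next;
    }
    # sequence line: keep the bases, pad with '-' out to $len columns
    my @char = ( /[acgt]/g, ( '-' ) x $len )[ 0 .. $len - 1 ];
    push @seq, "@char $name";
    $num_seq++;
}

{
    local $" = "\n";                  # join @seq with newlines when interpolated
    # The archive obscured this line; printing the count and length,
    # then one row per sequence, is assumed from the stated goal.
    print "$num_seq $len\n@seq\n";
}

__DATA__
> dog
agatagatcgcatcga
> cat
acgcttcgatacgctagctta
> mouse
agatatacgggt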
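If memory does become a problem, one option is to read the input twice instead of collecting every row in an array: a first pass that only counts the sequences, then a second pass that prints each row as it is read. The following is only a minimal sketch under some assumptions: the input is a seekable filehandle (a regular file or an in-memory string, not a pipe), each sequence sits on a single line, and `print_alignment` and `format_row` are hypothetical names made up for this example, not part of John's solution.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: pad (or truncate) one sequence line to $len columns.
sub format_row {
    my ( $name, $line, $len ) = @_;
    my @char = ( $line =~ /[acgt]/g, ( '-' ) x $len )[ 0 .. $len - 1 ];
    return "@char $name";
}

# Hypothetical helper: two passes over a seekable input filehandle.
# Returns the number of sequences found.
sub print_alignment {
    my ( $in, $out, $len ) = @_;

    # Pass 1: count the sequence lines (anything not blank and not a header).
    my $num_seq = 0;
    while ( <$in> ) {
        $num_seq++ unless /^\s*$/ or /^\s*>/;
    }
    print {$out} "$num_seq $len\n";

    # Pass 2: rewind and print one row at a time, so nothing accumulates.
    seek $in, 0, 0 or die "Cannot rewind input: $!";
    my $name = '';
    while ( <$in> ) {
        next if /^\s*$/;
        if ( /^\s*>\s*(\S+)/ ) { $name = $1; next; }
        print {$out} format_row( $name, $_, $len ), "\n";
    }
    return $num_seq;
}

# Usage: the three sequences from the question, read from an in-memory string.
my $fasta = ">dog\nagatagatcgcatcga\n"
          . ">cat\nacgcttcgatacgctagctta\n"
          . ">mouse\nagatatacgggtt\n";
open my $in, '<', \$fasta or die "Cannot open in-memory input: $!";
print_alignment( $in, \*STDOUT, 22 );
```

With a real file you would open it by name instead of using the string. The trade-off is reading the input twice in exchange for holding only one line in memory at a time.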