Pedro A Reche wrote:
>
> Hi all,

Hello,

> I have a file that looks as follows:
>
> >name1
> ADASDADFSDF
> ADASDADFSDF
> SDFDSFDSDFDF
> >name2
> ASDFDFDFFDF
> ADFEERERREWR
> ADFADFQERQEWR
> >name1
> ADASDADFSDF
> SDFDSFADFDF
> SDFDSFDSDFDF
> >name3
> SDAFSDFDFF
> WERWERER
> WERWERER
>
> and I want to have something like this:
>
> >name1
> ADASDADFSDFADASDADFSDFSDFDSFDSDFDF
> >name2
> ADASDADFSDFSDFDSFADFDFSDFDSFDSDFDF
> >name3
> SDAFSDFDFFWERWERERWERWERER
>
> Note that ">name1" is repeated in the input but not in the output.
>
> With the script below I can put everything under ">anyname"
> on one line. However, if ">anyname" is repeated I get a
> concatenation, and I do not want that.
> Any help is welcome, and thanks in advance.
> Cheers
>
> #!/usr/sbin/perl
> if (!@ARGV) {
>     print STDERR "usage: $0 fasta_file \n";
>
>     exit 0;
> }
> my $FILE = shift @ARGV;
> my @ID;
> my %SEQ;
>
> read_alignment($FILE);
> foreach my $key ( keys (%SEQ)){ #defines key for each key
>
>     printf "%s\n%s\n", $key, $SEQ{$key};
> }
> sub read_alignment {
>     my $line;
>     my ($file) = @_;
>     #local (*TMP);
>     open(TMP, $file) or die "can't open file '$file'\n";
>     while ( $line = <TMP> ) {
>         chomp($line);
>         if ($line =~ /(>\S+)\s*/) {#&& (! $SEQ{$1})) {
>             push (@ID, $1);
>         }
>         else {
>             $SEQ{$1}.= $line;
>
>         }
>     }
>     close TMP
> }

#!/usr/sbin/perl -w
use strict;

die "usage: $0 fasta_file\n" unless @ARGV;

my %seq;
$/ = '>';    # read one whole FASTA record at a time

while ( <> ) {
    chomp;               # removes the trailing '>' record separator
    next unless /\S/;    # skip the empty chunk before the first '>'
    my ( $key, $data ) = split /\n/, $_, 2;
    $data =~ s/\s+//g;   # join the sequence lines into one string
    $seq{$key} .= $data; # a repeated header appends to the existing entry
    }

for my $key ( keys %seq ) {
    print ">$key\n$seq{$key}\n";
    }

__END__

John
-- 
use Perl;
program
fulfillment
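
The script in the reply appends with `.=`, so a repeated header such as >name1 ends up with both of its records joined, and `keys %seq` returns the headers in no particular order. If the goal is to keep only the first record for each header, one possible variation is sketched below. The helper name `first_records` and the embedded sample are illustrative assumptions, not something from the thread, and the sketch also assumes that headers should be printed in the order they first appear.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read FASTA records from a
# filehandle and keep only the first record seen for each header.
# Returns a list of [ name, sequence ] pairs in input order.
sub first_records {
    my ($fh) = @_;
    my ( @order, %seq );
    my $keep;    # header being collected; undef while skipping a repeat

    while ( my $line = <$fh> ) {
        $line =~ s/\s+\z//;          # strip newline and trailing whitespace
        next unless length $line;    # ignore blank lines
        if ( $line =~ /^>(\S+)/ ) {
            if ( exists $seq{$1} ) {
                undef $keep;         # repeated header: skip its sequence lines
            }
            else {
                $keep = $1;
                $seq{$keep} = '';
                push @order, $keep;  # remember first-seen order
            }
        }
        elsif ( defined $keep ) {
            $line =~ s/\s+//g;       # drop any embedded whitespace
            $seq{$keep} .= $line;    # join sequence lines of this record
        }
    }
    return map { [ $_, $seq{$_} ] } @order;
}

# Small in-memory sample based on the data in the question.
my $sample = <<'END';
>name1
ADASDADFSDF
ADASDADFSDF
SDFDSFDSDFDF
>name2
ASDFDFDFFDF
ADFEERERREWR
ADFADFQERQEWR
>name1
ADASDADFSDF
SDFDSFADFDF
SDFDSFDSDFDF
>name3
SDAFSDFDFF
WERWERER
WERWERER
END

open my $fh, '<', \$sample or die "can't open in-memory sample: $!\n";
print ">$_->[0]\n$_->[1]\n" for first_records($fh);
close $fh;
```

To run this against a real file, the in-memory open could be replaced with `open my $fh, '<', $ARGV[0]`. The second >name1 block is skipped entirely, so each header is printed once with only the sequence lines from its first occurrence.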