Hi all,
I have a file that looks as follows:

>name1
ADASDADFSDF
ADASDADFSDF
SDFDSFDSDFDF
>name2
ASDFDFDFFDF
ADFEERERREWR
ADFADFQERQEWR
>name1
ADASDADFSDF
SDFDSFADFDF
SDFDSFDSDFDF
>name3
SDAFSDFDFF
WERWERER
WERWERER

and I want to have something like this:
>name1
ADASDADFSDFADASDADFSDFSDFDSFDSDFDF
>name2
ASDFDFDFFDFADFEERERREWRADFADFQERQEWR
>name3
SDAFSDFDFFWERWERERWERWERER

Note that ">name1" is repeated in the input but not in the output:
only its first block is kept.

With the script below I can put everything under ">anyname"
on one line. However, if ">anyname" is repeated, the repeated blocks
are concatenated into a single sequence, and I do not want that.
Any help is welcome. Thanks in advance.
Cheers


#!/usr/sbin/perl
use strict;
use warnings;

if (!@ARGV) {
    print STDERR "usage: $0 fasta_file\n";
    exit 1;
}
my $FILE = shift @ARGV;
my @ID;     # header names, in order of appearance
my %SEQ;    # header name => concatenated sequence

read_alignment($FILE);
foreach my $key (keys %SEQ) {    # note: hash order, not input order
    printf "%s\n%s\n", $key, $SEQ{$key};
}

sub read_alignment {
    my ($file) = @_;
    my $name;    # most recent header seen
    open(my $fh, '<', $file) or die "can't open file '$file': $!\n";
    while (my $line = <$fh>) {
        chomp($line);
        if ($line =~ /^(>\S+)/) {
            $name = $1;
            push(@ID, $name);
        }
        elsif (defined $name) {
            # a repeated header appends to the same entry
            $SEQ{$name} .= $line;
        }
    }
    close $fh;
}
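
One possible way to keep only the first block for each name, as a
minimal sketch rather than a definitive solution: remember which names
have already been seen and skip the lines of any later block with the
same name. The sub name read_first_blocks and the variable $current are
made up for illustration. The sketch assumes that every header starts
with ">" at the beginning of a line and that the first block is the one
to keep.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Read FASTA-like records from a filehandle. Only the first block seen
# for each ">name" header is kept; lines of any later block with the
# same name are skipped. Returns the names in order of first appearance
# and a hash mapping each name to its joined sequence.
sub read_first_blocks {
    my ($fh) = @_;
    my (@ids, %seq);
    my $current;    # name being collected; undef while skipping a repeat
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;       # ignore blank lines
        if ($line =~ /^(>\S+)/) {
            if (exists $seq{$1}) {
                $current = undef;       # repeated name: skip its lines
            }
            else {
                $current = $1;
                $seq{$current} = '';
                push @ids, $current;
            }
        }
        elsif (defined $current) {
            $seq{$current} .= $line;    # join sequence lines
        }
    }
    return (\@ids, \%seq);
}

# Process a file only when one is given on the command line.
if (@ARGV) {
    my $file = shift @ARGV;
    open(my $fh, '<', $file) or die "can't open file '$file': $!\n";
    my ($ids, $seq) = read_first_blocks($fh);
    close $fh;
    print "$_\n$seq->{$_}\n" for @$ids;
}
```

Printing from the @ids list instead of keys(%seq) keeps the names in the
order in which they first appear in the file, which a plain hash does
not guarantee.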
However, I find that


*******************************************************************
PEDRO A. RECHE, PhD             TL: 617 632 3824
Dana-Farber Cancer Institute,   FX: 617 632 4569
Harvard Medical School,         EM: [EMAIL PROTECTED]
44 Binney Street, D1510A,       EM: [EMAIL PROTECTED]
Boston, MA 02115                URL:
http://www.reche.org
*******************************************************************
