On Thu, May 31, 2012 at 11:37 AM, nathalie <n...@sanger.ac.uk> wrote:

>
>
> Hi
> I have this format of file: (see attached example)
> 1       3206102-3207048 3411782-3411981 3660632-3661428
> 2       4481796-4482748 4483180-4483486
>
>
> and I would like to change it to this
> 1       3206102-3207048
> 1       3411782-3411981
> 1       3660632-3661428
> 2       4481796-4482748
> 2       4483180-4483486 .....
>
>
> I have tried with this script to create an array for each line, and to
> print the first element (1 or 2) with the rest of the line, but the output
> doesn't seem to be right. Could you please advise?
> #!/software/bin/perl
> use warnings;
> use strict;
> my $file="example.txt";
> my $in;
> open(  $in , '<' , $file ) or die( $! );
> #open(  $out, ">>txtout");
>
>
> while (<$in>){
>    next if /^#/;
>    my @lines=split(/\t/);
>    chomp;
> for (@lines) { print $lines[0],"\t",$_,"\n"; };
>
>
> output
> 1       1  i don't want this
> 1       3206102-3207048
> 1       3411782-3411981
> 1       3660632-3661428
> 1       i don't want this
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
>
> 1       1
> 1       4334680-4340171
> 1       4341990-4342161
> 1       4342282-4342905
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
>
> 1       1
> 1       4481796-4482748
> 1       4483180-4483486
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
>
> 1       1
> 1       4797994-4798062
> 1       4798535-4798566
> 1       4818664-4818729
> 1       4820348-4820395
> 1       4822391-4822461
> 1       4827081-4827154
> 1       4829467-4829568
> 1       4831036-4831212
> 1       4835043-4835096
>
> many thanks
> Nathalie
>
>
>
>
>
Hi Nathalie,

Instead of using the split function I would personally go for a regular
expression as it allows for a lot more control over what you want to find.
Here is my solution...

#!/usr/local/bin/perl

use strict;
use warnings;
use Data::Dumper;

# Print hash keys in sorted order so the dump is reproducible.
$Data::Dumper::Sortkeys = 1;

my %results;

open ( my $fh, '<', 'temp.txt' ) or die $!;
while ( <$fh> ) {
 chomp;

 # The row number is the leading run of digits, however long it is.
 my ($rownum) = /^(\d+)/ or next;

 # Collect every start-end pair on the line; append rather than assign
 # so that a row number seen on several lines keeps all of its ranges.
 while ( /(\d+-\d+)/g ) {
  push ( @{ $results{$rownum} }, $1 );
 }
}
close $fh;

print Dumper \%results;

This should print the results below:

$VAR1 = {
          '1' => [
                   '3206102-3207048',
                   '3411782-3411981',
                   '3660632-3661428'
                 ],
          '2' => [
                   '4481796-4482748',
                   '4483180-4483486'
                 ]
        };
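
If the two-column layout from the original question is the goal, the stored structure can be flattened again by walking the hash. The following is only a minimal sketch: it assumes %results maps each row number to an array reference, as in the script above. The sample data is hard-coded so the snippet runs on its own, flatten_results is a name made up for this example, and the numeric sort assumes the keys are plain integers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data in the same shape the script builds (hard-coded for illustration).
my %results = (
    1 => [ '3206102-3207048', '3411782-3411981', '3660632-3661428' ],
    2 => [ '4481796-4482748', '4483180-4483486' ],
);

# Hypothetical helper: build one "row<TAB>range" string per stored range.
# Sorting numerically assumes the keys are plain integers.
sub flatten_results {
    my ($results) = @_;
    my @lines;
    for my $rownum ( sort { $a <=> $b } keys %$results ) {
        push @lines, "$rownum\t$_" for @{ $results->{$rownum} };
    }
    return @lines;
}

print "$_\n" for flatten_results( \%results );
```

Keeping the flattening in a small sub leaves the hash available for any further processing before printing.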

And this is, I believe, where you wanted to go. Of course, you could just
print it directly without the need for the temp variables etc., but I assume
that you want to do something more with the found values than just dump
them on your screen.
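
For reference, the direct-print variant mentioned above might look like the sketch below. It reads from an in-memory string rather than temp.txt so that it runs as-is, and the helper name split_row is made up for this example. It also appears to sidestep what produced the unwanted lines in the original script: the loop there printed $lines[0] against itself, and because chomp ran after split, the trailing tabs and newline were kept as extra fields.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the leading row number followed by every
# start-end range found on the line, or an empty list if the line does
# not start with a number (comments, blank lines).
sub split_row {
    my ($line) = @_;
    my ($rownum) = $line =~ /^(\d+)/ or return;
    my @ranges = $line =~ /(\d+-\d+)/g;
    return ( $rownum, @ranges );
}

# In-memory stand-in for the input file; the trailing tabs are deliberate.
my $input = "1\t3206102-3207048\t3411782-3411981\t3660632-3661428\t\t\n"
          . "2\t4481796-4482748\t4483180-4483486\n";

open my $fh, '<', \$input or die $!;
while ( my $line = <$fh> ) {
    my ( $rownum, @ranges ) = split_row($line) or next;
    print "$rownum\t$_\n" for @ranges;
}
close $fh;
```

Because the regular expression only picks out digit-hyphen-digit groups, empty fields and the trailing newline never reach the print statement, so no chomp is needed here.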

Regards,

Rob
