On Thu, May 31, 2012 at 11:37 AM, nathalie <n...@sanger.ac.uk> wrote:
> Hi
> I have this format of file: (see attached example)
> 1 3206102-3207048 3411782-3411981 3660632-3661428
> 2 4481796-4482748 4483180-4483486
>
> and I would like to change it to this
> 1 3206102-3207048
> 1 3411782-3411981
> 1 3660632-3661428
> 2 4481796-4482748
> 2 4483180-4483486 .....
>
> I have tried with this script to create an array for each line, and to
> print the first element (1 or 2) with the rest of the line but the output
> don't seem to be right, could you please advise?
>
> #!/software/bin/perl
> use warnings;
> use strict;
> my $file="example.txt";
> my $in;
> open( $in , '<' , $file ) or die( $! );
> #open( $out, ">>txtout");
>
> while (<$in>){
> next if /^#/;
> my @lines=split(/\t/);
> chomp;
> for (@lines) { print $lines[0],"\t",$_,"\n"; };
>
> ouput
> 1 1 i don't want this
> 1 3206102-3207048
> 1 3411782-3411981
> 1 3660632-3661428
> 1 i don't want this
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
>
> 1 1
> 1 4334680-4340171
> 1 4341990-4342161
> 1 4342282-4342905
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
>
> 1 1
> 1 4481796-4482748
> 1 4483180-4483486
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
>
> 1 1
> 1 4797994-4798062
> 1 4798535-4798566
> 1 4818664-4818729
> 1 4820348-4820395
> 1 4822391-4822461
> 1 4827081-4827154
> 1 4829467-4829568
> 1 4831036-4831212
> 1 4835043-4835096
>
> many thanks
> Nathalie

Hi Nathalie,

Instead of using the split function I would personally go for a regular
expression, as it allows a lot more control over what you want to find.
Here is my solution...

    #!/usr/local/bin/perl
    use strict;
    use warnings;

    my $fh;
    my %results;
    open ( $fh, '<', 'temp.txt' ) or die $!;
    while ( <$fh> ) {
      chomp;
      my $line = $_;
      # Row label: first character only, so this assumes single-digit labels
      my $rownum = substr($line, 0, 1);
      my @othernumbers;
      # Collect every 7-digit "start-end" range found on the line
      while ( /(\d{7}-\d{7})/g ) {
        push ( @othernumbers, $1 );
      }
      $results{$rownum} = \@othernumbers;
    }
    close $fh;

    use Data::Dumper;
    print Dumper %results;

This should print the results below (hash order is not guaranteed, so the
pairs may come out in a different order):

    $VAR1 = '1';
    $VAR2 = [
              '3206102-3207048',
              '3411782-3411981',
              '3660632-3661428'
            ];
    $VAR3 = '2';
    $VAR4 = [
              '4481796-4482748',
              '4483180-4483486'
            ];

And this, I believe, is where you wanted to go. Of course you could just
print it directly without the need for the temp variables etc., but I assume
that you want to do something more with the found values than just dump
them on your screen.

Regards,

Rob
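
For the exact output format asked for in the question (one "row, range" pair per line), a minimal sketch follows. In the original script the loop also visits the first field, which gives the "1 1" lines, and chomp runs after split, so the newline stays in the last field and any empty trailing columns are kept. The sub name expand_line and the inline sample data are hypothetical, chosen only for illustration. The sketch assumes the row label is the first whitespace-separated field and every range looks like digits-digits.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn one input line into a list of "row<TAB>range" strings.
# Assumption: the row label is the first whitespace-separated field.
sub expand_line {
    my ($line) = @_;
    chomp $line;                                  # strip the newline BEFORE splitting
    my ( $row, @ranges ) = split ' ', $line;      # split ' ' never yields empty fields
    return () unless defined $row;                # blank line: nothing to emit
    # Keep only fields that look like "start-end"
    return map { "$row\t$_" } grep { /^\d+-\d+$/ } @ranges;
}

# Sample lines taken from the question, with extra trailing tabs on the first
# one; a real script would read them from example.txt instead.
my @sample = (
    "1\t3206102-3207048\t3411782-3411981\t3660632-3661428\t\t\n",
    "2\t4481796-4482748\t4483180-4483486\n",
);

for my $line (@sample) {
    next if $line =~ /^#/;                        # skip comment lines, as in the question
    print "$_\n" for expand_line($line);
}
```

Taking the row label off the front of the list before looping is what avoids the "1 1" line, and splitting on whitespace after chomp is what drops the empty columns.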