> From: Nathalie Conte [mailto:n...@sanger.ac.uk]
> Sent: Friday, September 30, 2011 9:38 AM
> To: beginners@perl.org
> Subject: parsing script removing some lines help please
>
> Hi,
> I am lost in my script, and would need to basic help please.
> I have got a file , separated by tabs, and the first column contain a
> chromosome number, then several other column with different infos.
> Basically I am trying to created a script that would take a file(see
> example), parse line by line, and when the first column start by any of
> the chromosomes I don't want (6,8,14,16,18,Y), go the next line, and if
> it doesn't start by the bad chromosomes , print all the line to a new
> output file.
> the script below, just reprint the same original file :(
> thanks for any clues
> Nat
>
> #!/software/bin/perl
> use warnings;
> use strict;
> open(IN, "<example.txt") or die( $! );
> open(OUT, ">>removed.txt") or die( $! );
> my @bad_chromosome=(6,8,14,16,18,Y);
> while(<IN>){
>     chomp;
>     my @column=split /\t/;
>     foreach my $chr_no(@bad_chromosome){
>         if ($column[0]==$chr_no){
>             next;
>         }
>     }
>     print OUT
>     $column[0],"\t",$column[1],"\t",$column[2],"/",$column[3],"\t",
>     $column[4],"\t",$column[5],"\t",$column[6],"\t",$column[7],"\t",
>     $column[8],"\t",$column[9],"\t",$column[10],"\t",$column[11],"\t",
>     $column[12],"\t",$column[13],"\t",$column[14],"\n";
> }
>
> close IN; close OUT;
>

John has provided good advice on this problem, but I wanted to add a couple of things. To avoid explicitly coding the foreach loop for @bad_chromosome, you could use the 'grep' function. Also, if you are just reprinting the input line, print $_.

    unless ( grep { $column[0] eq $_ } @bad_chromosome ) {
        print OUT "$_\n";    # or print $OUT "$_\n" if declared as John suggested
    }

In boolean (scalar) context, the grep call returns the number of elements of @bad_chromosome that $column[0] matched. Thus, if there is a match, the grep call evaluates to true; otherwise, it evaluates to false.

Using grep does have a drawback, though it hardly matters unless you have a lot of values in @bad_chromosome: it checks every value of @bad_chromosome for a match, whereas the 'if ... next' approach stops looking as soon as a match is found.

If you wonder about the use of $_ in the grep block: there it is localized and aliased to each element of @bad_chromosome in turn, so it does not affect the outer $_ that contains the data read from the file.

If you are using Perl 5.10 or higher, you can use the 'smart match' operator (~~) instead of grep.

HTH,
Ken
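
For reference, here is a minimal, self-contained sketch of the grep-based filter described above. The sample lines and the helper names is_bad_grep and is_bad_first are made up for illustration and do not come from the original script. The second helper uses first() from the core List::Util module, a swapped-in alternative to grep that stops at the first match, which addresses the drawback mentioned above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(first);

# Chromosomes to drop. 'Y' is a string, so it is quoted (via qw) and
# every comparison below uses eq rather than ==.
my @bad_chromosome = qw(6 8 14 16 18 Y);

# Hypothetical helper: true if the first tab-separated field of $line
# is an unwanted chromosome. grep always scans the whole list.
sub is_bad_grep {
    my ($line) = @_;
    my ($chr) = split /\t/, $line;
    return scalar( grep { $chr eq $_ } @bad_chromosome );
}

# Same check, but first() returns as soon as one element matches.
sub is_bad_first {
    my ($line) = @_;
    my ($chr) = split /\t/, $line;
    return defined( first { $chr eq $_ } @bad_chromosome );
}

# Made-up sample lines standing in for example.txt.
my @lines = (
    "1\tgeneA\t100",
    "6\tgeneB\t200",
    "Y\tgeneC\t300",
    "16\tgeneD\t400",
    "X\tgeneE\t500",
);

# The outer $_ (the current line) is untouched by the grep inside
# is_bad_grep, because $_ is localized within that inner block.
my @kept = grep { !is_bad_grep($_) } @lines;
print "$_\n" for @kept;
```

With this sample data, only the lines for chromosomes 1 and X are printed. In the real script the same test would sit inside the while loop, printing $_ to the output filehandle instead of collecting lines in an array.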