At 5:04 pm -0200 17/1/05, Marco Takita wrote:

What I need is to count how many times either CR05-C1-102 or CR05-C1-103 appears in the text, which I was able to do:

#!/usr/bin/perl

while (<>) {

....



My problem is that I cannot do that for individual blocks like:

Sequence Contig3772
Assembled_from  CR05-C1-102-004-_A01_-CT.F_008.ab1  -40  955
Assembled_from  CR05-C1-102-006-_E05_-CT.F_035.ab1  -40  972
Assembled_from  CR05-C1-102-004-_B01_-CT.F_007.ab1  -32  1007
Assembled_from  CR05-C1-103-033-_G08_-CT.F_026.ab1  397  1400
Assembled_from  CR05-C1-102-060-_D07_-CT.F_029.ab1  403  1450
Assembled_from  CR05-C1-102-008-_G03_-CT.F_010.ab1  404  1427
Assembled_from  CR05-C1-102-065-_F12_-CT.F_043.ab1  406  1498


There are far shorter ways of doing it than I show here, but since you say you're new to Perl, I'll make it as long as I can:


#!/usr/bin/perl -w
use strict;
my ($i, $line, @lines, $text);
$i = 0;   # start at 0 so print never sees an undefined value
$text = <<'EOT';
Sequence Contig3772
Assembled_from  CR05-C1-102-004-_A01_-CT.F_008.ab1  -40  955
Assembled_from  CR05-C1-102-006-_E05_-CT.F_035.ab1  -40  972
Assembled_from  CR05-C1-102-004-_B01_-CT.F_007.ab1  -32  1007
Assembled_from  CR05-C1-103-033-_G08_-CT.F_026.ab1  397  1400
Assembled_from  CR05-C1-102-060-_D07_-CT.F_029.ab1  403  1450
Assembled_from  CR05-C1-102-008-_G03_-CT.F_010.ab1  404  1427
Assembled_from  CR05-C1-102-065-_F12_-CT.F_043.ab1  406  1498
EOT
# Split on LF, CR or U+2029. No spaces inside the [...] class:
# even under the x modifier a space there would be matched literally.
@lines = split m~[\012\015\x{2029}]~, $text;
foreach $line ( @lines ) {
    # x lets us space out the alternatives; i ignores case
    $i++ if $line =~ m~CR05-C1-102 | CR05-C1-103~ix;
}
print "$i\n";

$text is some text whose lines end in one of 3 kinds of separator (LF, CR or the Unicode paragraph separator U+2029); which of them is irrelevant in this case. We split the $calar $text into an @rray of lines. We then loop through @lines, adding 1 to $i (initially 0/undefined) each time a match (m) is found in $line for ..102 or (|) ..103. The x modifier lets the pattern contain spaces for readability, and i makes the match case-insensitive.
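
To get a separate count for each block, which is what the question asks for, one possible approach is to remember the name on the most recent "Sequence" line and keep a hash of counts keyed by that name. What follows is only a minimal sketch. It assumes every block starts with a line of the form "Sequence <name>", and the sub name count_per_block is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the lines mentioning CR05-C1-102 or CR05-C1-103, separately for
# each "Sequence <name>" block. Returns a hash ref: block name => count.
# (count_per_block is a hypothetical helper, not part of the original post.)
sub count_per_block {
    my ($text) = @_;
    my (%count, $block);
    foreach my $line (split /\r\n|\n|\r/, $text) {
        if ($line =~ /^Sequence\s+(\S+)/) {
            $block = $1;            # a new block starts here
            $count{$block} = 0;
        }
        elsif (defined $block && $line =~ /CR05-C1-10[23]/) {
            $count{$block}++;       # one more hit for the current block
        }
    }
    return \%count;
}

my $text = <<'EOT';
Sequence Contig3772
Assembled_from  CR05-C1-102-004-_A01_-CT.F_008.ab1  -40  955
Assembled_from  CR05-C1-102-006-_E05_-CT.F_035.ab1  -40  972
Assembled_from  CR05-C1-102-004-_B01_-CT.F_007.ab1  -32  1007
Assembled_from  CR05-C1-103-033-_G08_-CT.F_026.ab1  397  1400
Assembled_from  CR05-C1-102-060-_D07_-CT.F_029.ab1  403  1450
Assembled_from  CR05-C1-102-008-_G03_-CT.F_010.ab1  404  1427
Assembled_from  CR05-C1-102-065-_F12_-CT.F_043.ab1  406  1498
EOT

my $count = count_per_block($text);
print "$_: $count->{$_}\n" for sort keys %$count;   # prints "Contig3772: 7"
```

With a file containing many blocks, the same sub would return one entry per Contig. Lines that appear before the first "Sequence" line are ignored.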

JD







