I'm really new to Perl; in fact, I'm trying to write my very first script. Let me try to explain what I need. I have a large text file that looks basically like this:
Sequence Contig3772
Assembled_from CR05-C1-102-004-_A01_-CT.F_008.ab1 -40 955
Assembled_from CR05-C1-102-006-_E05_-CT.F_035.ab1 -40 972
Assembled_from CR05-C1-102-004-_B01_-CT.F_007.ab1 -32 1007
Assembled_from CR05-C1-103-033-_G08_-CT.F_026.ab1 397 1400
Assembled_from CR05-C1-102-060-_D07_-CT.F_029.ab1 403 1450
Assembled_from CR05-C1-102-008-_G03_-CT.F_010.ab1 404 1427
Assembled_from CR05-C1-102-065-_F12_-CT.F_043.ab1 406 1498
Sequence Contig3773
Assembled_from CR05-C1-103-041-_E11_-CT.F_044.ab1 -694 275
Assembled_from CR05-C1-102-019-_A11_-CT.F_048.ab1 -626 289
Assembled_from CR05-C1-102-019-_D03_-CT.F_013.ab1 -625 314
Assembled_from CR05-C1-102-019-_B11_-CT.F_047.ab1 -733 185
Sequence Contig3774
and so on.
What I need is to count how many times either CR05-C1-102 or CR05-C1-103 appears in the text, which I was able to do:
#!/usr/bin/perl
use strict;
use warnings;
# Quote the prefixes: unquoted, Perl treats the hyphens as subtraction.
my @text = ('CR05-C1-102', 'CR05-C1-103');
my ($score, $res) = (0, 0);
while (<>) {
    chomp;
    foreach my $wd (split) {
        $score++ if $wd =~ /\Q$text[0]\E/;
        $res++   if $wd =~ /\Q$text[1]\E/;
    }
}
print " CR05-C1-102 $score CR05-C1-103 $res \n\n";
My problem is that I cannot do that for individual blocks like:
Sequence Contig3772
Assembled_from CR05-C1-102-004-_A01_-CT.F_008.ab1 -40 955
Assembled_from CR05-C1-102-006-_E05_-CT.F_035.ab1 -40 972
Assembled_from CR05-C1-102-004-_B01_-CT.F_007.ab1 -32 1007
Assembled_from CR05-C1-103-033-_G08_-CT.F_026.ab1 397 1400
Assembled_from CR05-C1-102-060-_D07_-CT.F_029.ab1 403 1450
Assembled_from CR05-C1-102-008-_G03_-CT.F_010.ab1 404 1427
Assembled_from CR05-C1-102-065-_F12_-CT.F_043.ab1 406 1498
I was not able to isolate this block from the rest of the text.
Any idea how to do that?
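One possible direction, offered only as a rough sketch: treat the word "Sequence" as the start of a new block, remember the contig name that follows it, and keep a separate pair of counters for each contig. The helper name count_per_contig and the shortened sample data are made up for illustration; the sketch assumes every block starts with "Sequence" followed by the contig name, and it should not matter whether a block sits on one line or spans several.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): reads a filehandle
# word by word and tallies, per contig, how many words start with each prefix.
sub count_per_contig {
    my ($fh, @prefixes) = @_;
    my (%counts, @order, $contig);
    my $expect_name = 0;    # true right after the word "Sequence"
    while (my $line = <$fh>) {
        for my $wd (split ' ', $line) {
            if ($expect_name) {
                # This word is the contig name: a new block starts here.
                $contig      = $wd;
                $expect_name = 0;
                unless (exists $counts{$contig}) {
                    push @order, $contig;
                    $counts{$contig} = { map { $_ => 0 } @prefixes };
                }
                next;
            }
            if ($wd eq 'Sequence') {
                $expect_name = 1;
                next;
            }
            next unless defined $contig;    # ignore text before the first block
            for my $p (@prefixes) {
                $counts{$contig}{$p}++ if index($wd, $p) == 0;
            }
        }
    }
    return (\%counts, \@order);
}

# Shortened sample taken from the data above (assumed layout).
my $sample = <<'END';
Sequence Contig3772
Assembled_from CR05-C1-102-004-_A01_-CT.F_008.ab1 -40 955
Assembled_from CR05-C1-103-033-_G08_-CT.F_026.ab1 397 1400
Sequence Contig3773
Assembled_from CR05-C1-103-041-_E11_-CT.F_044.ab1 -694 275
Assembled_from CR05-C1-102-019-_A11_-CT.F_048.ab1 -626 289
Assembled_from CR05-C1-102-019-_D03_-CT.F_013.ab1 -625 314
END

my @prefixes = ('CR05-C1-102', 'CR05-C1-103');
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my ($counts, $order) = count_per_contig($fh, @prefixes);
close $fh;

# One output line per contig, in the order the contigs appeared.
for my $contig (@$order) {
    print join(' ', $contig, map { "$_ $counts->{$contig}{$_}" } @prefixes), "\n";
}
```

On the shortened sample this prints "Contig3772 CR05-C1-102 1 CR05-C1-103 1" and "Contig3773 CR05-C1-102 2 CR05-C1-103 1". For the real file, passing \*STDIN instead of the in-memory handle and running the script as "script.pl < file.txt" should work the same way, since the input is read one line at a time rather than loaded whole.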
Thanks a lot
Dr. Marco Aurélio Takita, Ph.D. Centro APTA Citros Sylvio Moreira Rodovia Anhanguera Km 158 Caixa Postal 04 13490-970 Cordeirópolis - SP, BRAZIL Tel.: 55-19-35461399