stupid newbie question

Marco Takita Mon, 17 Jan 2005 11:06:01 -0800

Hi guys, sorry that this question is not directly related to Mac OS X, but it is the OS I work with and I know that you guys are really helpful.

I'm really new to Perl. Actually, I'm trying to write my very first script. Let me try to explain what I need. I have a large text file that looks basically like this:

Sequence Contig3772
Assembled_from  CR05-C1-102-004-_A01_-CT.F_008.ab1  -40  955
Assembled_from  CR05-C1-102-006-_E05_-CT.F_035.ab1  -40  972
Assembled_from  CR05-C1-102-004-_B01_-CT.F_007.ab1  -32  1007
Assembled_from  CR05-C1-103-033-_G08_-CT.F_026.ab1  397  1400
Assembled_from  CR05-C1-102-060-_D07_-CT.F_029.ab1  403  1450
Assembled_from  CR05-C1-102-008-_G03_-CT.F_010.ab1  404  1427
Assembled_from  CR05-C1-102-065-_F12_-CT.F_043.ab1  406  1498


Sequence Contig3773
Assembled_from  CR05-C1-103-041-_E11_-CT.F_044.ab1  -694  275
Assembled_from  CR05-C1-102-019-_A11_-CT.F_048.ab1  -626  289
Assembled_from  CR05-C1-102-019-_D03_-CT.F_013.ab1  -625  314
Assembled_from  CR05-C1-102-019-_B11_-CT.F_047.ab1  -733  185

Sequence  Contig3774

and so on.

What I need is to count how many times either CR05-C1-102 or CR05-C1-103 appears in the text, which I was able to do:

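#!/usr/bin/perl
use strict;
use warnings;

# Prefixes to count. They must be quoted: unquoted, Perl evaluates
# CR05-C1-102 as a subtraction and yields -102.
my @text = ('CR05-C1-102', 'CR05-C1-103');
my ($score, $res) = (0, 0);

while (<>) {
    chomp;
    foreach my $wd (split) {
        # \Q...\E matches the prefix literally, anchored at the start of the word
        $score++ if $wd =~ /^\Q$text[0]\E/;
        $res++   if $wd =~ /^\Q$text[1]\E/;
    }
}

print " CR05-C1-102 $score CR05-C1-103 $res \n\n";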

My problem is that I cannot do that for individual blocks like:

Sequence Contig3772
Assembled_from  CR05-C1-102-004-_A01_-CT.F_008.ab1  -40  955
Assembled_from  CR05-C1-102-006-_E05_-CT.F_035.ab1  -40  972
Assembled_from  CR05-C1-102-004-_B01_-CT.F_007.ab1  -32  1007
Assembled_from  CR05-C1-103-033-_G08_-CT.F_026.ab1  397  1400
Assembled_from  CR05-C1-102-060-_D07_-CT.F_029.ab1  403  1450
Assembled_from  CR05-C1-102-008-_G03_-CT.F_010.ab1  404  1427
Assembled_from  CR05-C1-102-065-_F12_-CT.F_043.ab1  406  1498

I was not able to isolate this block from the rest of the text.

Any idea how to do that?
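
One possible way to get per-block counts is sketched below. It is only a sketch and rests on an assumption: every block starts with a line beginning with "Sequence" followed by the contig name, and every read line starts with "Assembled_from". The helper name count_per_contig is made up for illustration, and the sample data is the excerpt shown above rather than a real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): for each contig,
# count how many Assembled_from reads start with each prefix.
# Assumes each block begins with a line "Sequence <name>".
sub count_per_contig {
    my ($fh, @prefixes) = @_;
    my (@order, %counts, $contig);
    while (my $line = <$fh>) {
        if ($line =~ /^Sequence\s+(\S+)/) {
            # A new block starts: remember its name and zero its counters
            $contig = $1;
            push @order, $contig;
            $counts{$contig}{$_} = 0 for @prefixes;
        }
        elsif (defined $contig && $line =~ /^Assembled_from\s+(\S+)/) {
            my $read = $1;
            for my $prefix (@prefixes) {
                # index(...) == 0 means the read name begins with the prefix
                $counts{$contig}{$prefix}++ if index($read, $prefix) == 0;
            }
        }
    }
    return (\@order, \%counts);
}

# Sample data copied from the excerpt above. For a real file, open it with
# open(my $fh, '<', $path) instead of reading from a string.
my $sample = <<'END';
Sequence Contig3772
Assembled_from  CR05-C1-102-004-_A01_-CT.F_008.ab1  -40  955
Assembled_from  CR05-C1-102-006-_E05_-CT.F_035.ab1  -40  972
Assembled_from  CR05-C1-102-004-_B01_-CT.F_007.ab1  -32  1007
Assembled_from  CR05-C1-103-033-_G08_-CT.F_026.ab1  397  1400
Assembled_from  CR05-C1-102-060-_D07_-CT.F_029.ab1  403  1450
Assembled_from  CR05-C1-102-008-_G03_-CT.F_010.ab1  404  1427
Assembled_from  CR05-C1-102-065-_F12_-CT.F_043.ab1  406  1498


Sequence Contig3773
Assembled_from  CR05-C1-103-041-_E11_-CT.F_044.ab1  -694  275
Assembled_from  CR05-C1-102-019-_A11_-CT.F_048.ab1  -626  289
Assembled_from  CR05-C1-102-019-_D03_-CT.F_013.ab1  -625  314
Assembled_from  CR05-C1-102-019-_B11_-CT.F_047.ab1  -733  185

Sequence  Contig3774
END

open(my $fh, '<', \$sample) or die "Cannot open in-memory file: $!";
my @prefixes = ('CR05-C1-102', 'CR05-C1-103');
my ($order, $counts) = count_per_contig($fh, @prefixes);
close $fh;

# Print one line per contig, in the order the contigs appeared
for my $contig (@$order) {
    print $contig;
    print " $_ $counts->{$contig}{$_}" for @prefixes;
    print "\n";
}
```

Keying on the "Sequence" lines rather than on the blank lines between blocks means the number of blank lines separating the blocks does not matter. On the excerpt above, this should report 6 and 1 for Contig3772 and 3 and 1 for Contig3773.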

Thanks a lot

Dr. Marco Aurélio Takita, Ph.D.
Centro APTA Citros Sylvio Moreira
Rodovia Anhanguera Km 158
Caixa Postal 04
13490-970 Cordeirópolis - SP, BRAZIL
Tel.: 55-19-35461399
