In article <[EMAIL PROTECTED]>, Janek Schleicher 
wrote:

> Robin Garbutt wrote at Mon, 23 Jun 2003 11:40:47 +0100:
> 
>> I have a string that is a random sequence like the following:-
>> 
>> ACGTCGTCGTCACACACACGCGTCTCTATACGCG
>> 
>> I want to be able to parse the string, picking out any TATA sequences,
>> colour them in red and make a note of where they lie in the sequence.
>> 
>> Is this possible with perl?
> 
> Yes, but you have to explain in what manner you want to colorize:
> as output in a terminal window, as html/xml, as a picture, as a word
> document ...
> 
> If you wanted it in pseudo-xml with the tag <red>...</red>,
> you could perhaps do it as:
> 
> $string =~ s/(TATA)/<red>$1</red>/g;

Here is my script using the regex substitution:

(I wonder if there is a way to report the starting char position for regex 
matches like this?)

#!/usr/bin/perl
use warnings;
use strict;

# find_substring2

# I have a string that is a random sequence like the following:-
#
# ACGTCGTCGTCACACACACGCGTCTCTATACGCG
#
# I want to be able to parse the string, picking out any TATA sequences,
# colour them in red and make a note of where they lie in the sequence.

while (@ARGV) {

   my $sequence = 'TATA';                      # what we are looking for
   my $data = shift;

   open my $fh, '<', $data
     or die "Couldn't open datafile $data for reading: $!\n";

   while (<$fh>) {
      chomp;
      print matches($., $_, $sequence);
   }

   close $fh;                                  # resets $. for the next file
}

# end  main #
# begin sub #

sub matches{
   my @matches;
   my ($line_nbr, $line, $seq) = @_;

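   my $start_tag = "\e[31;1m";                 # ANSI escape: bold red
   my $stop_tag  = "\e[0m";                    # ANSI escape: reset attributes

   # \Q...\E makes sure $seq is matched literally, not as a pattern
   $line =~ s/(\Q$seq\E)/$start_tag$1$stop_tag/g;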

   push @matches, sprintf "%5d : %s\n", $line_nbr, $line;

   return @matches;
}
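
On the question above about reporting the starting character position:
one possible approach is to match with /g in a loop and read the offset
from the built-in @- array ($-[0] holds the 0-based start of the last
successful match). This is only a sketch, and it assumes that 1-based
positions and non-overlapping matches are what is wanted. The helper
name match_positions is made up for this illustration and is not part of
the script above.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# match_positions() is a hypothetical helper for illustration only.
# It returns the 1-based start position of every non-overlapping
# occurrence of $seq in $line.
sub match_positions {
   my ($line, $seq) = @_;
   my @positions;

   # In scalar context, m//g finds one match per iteration and
   # sets $-[0] to the 0-based offset where that match started.
   while ($line =~ /\Q$seq\E/g) {
      push @positions, $-[0] + 1;
   }

   return @positions;
}

my $line  = 'ACGTCGTCGTCACACACACGCGTCTCTATACGCG';
my @found = match_positions($line, 'TATA');
print "TATA starts at: @found\n";
```

Because /g resumes after the end of the previous match, overlapping
occurrences (as in TATATA) are reported only once. The positions must
also be collected from the original line, before the colour codes are
substituted in, since the escape sequences shift the offsets.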


-- 
Kevin Pfeiffer
International University Bremen
