Wijaya Edward wrote:
>
> I have the following problem.
> I am trying to put brackets in a string, given a set of its substrings.
> The bracketed regions are "bounded" by the given substrings.
> For example, given the input "String" and its "Substrings":
>
> String
> 1. CCCATCTGTCCTTATTTGCTG
> 2. ACCCATCTGTCCTTGGCCAT
> 3. CCACCAGCACCTGTC
> 4. CCCAACACCTGCTGCCT
> 5. CTGGGTATGGGT
> 6. AGGAACTTGCCTGTACCACAGGAAG
>
> Substrings:
> 1. ATCTG ATTTG
> 2. CCATC
> 3. CCACC CCAGC GCACC
> 4. CCAAC ACACC
> 5. GTATG TGGGT
> 6. CAGGA AGGAA
>
> The desired answers are:
> 1. CCC[ATCTG]TCCTT[ATTTG]CTG
> 2. AC[CCATC]TGTCCTTGGCCAT
> 3. [CCACCAGCACC]TGTC *
> 4. C[CCAACACC]TGCTGCCT *
> 5. CTGG[GTATGGGT] **
> 6. AGGAACTTGCCTGTACCA[CAGGAA]G **
>
> Please note that in examples 3 and 4 the substrings are "overlapping".
> Note also that in examples 5 and 6 there are substrings that occur
> twice. So the answers for examples 5 and 6 are NOT
>
> 5. C[TGGGTATGGGT] ----this is wrong
> 6. [AGGAA]CTTGCCTGTACCA[CAGGAA]G ----this is wrong
>
> since these do not follow the order of the given substrings (array -- see my
> code).
> Below is my code. It only works for examples 1 and 2.
> How can I approach this problem so that it can handle all those cases?
>
>
> __BEGIN__
> #!/usr/bin/perl -w
> use strict;
>
> my $s1 ='CCCATCTGTCCTTATTTGCTG'; my @a1 = qw(ATCTG ATTTG);
> my $s2 ='ACCCATCTGTCCTTGGCCAT'; my @a2 = qw(CCATC);
> my $s3 ='CCACCAGCACCTGTC'; my @a3 = qw(CCACC CCAGC GCACC);
> my $s4 ='CCCAACACCTGCTGCCT'; my @a4 = qw(CCAAC ACACC);
> my $s5 ='CTGGGTATGGGT'; my @a5 = qw(GTATG TGGGT);
> my $s6 = 'AGGAACTTGCCTGTACCACAGGAAG'; my @a6 = qw( CAGGA AGGAA );
>
> #These two work fine.
> put_bracket($s1,\@a1);
> put_bracket($s2,\@a2);
>
> #But the rest don't work
> put_bracket($s3,\@a3);
> put_bracket($s4,\@a4);
> put_bracket($s5,\@a5);
> put_bracket($s6,\@a6);
>
> sub put_bracket
> {
>     my ($str,$ar) = @_;
>     my $bstr;
>     my $slen = length $ar->[0];
>
>     foreach my $subs ( @$ar )
>     {
>         my $idx = index($str,$subs);
>         my $bgn = $idx;
>         my $end = $idx+$slen+1;
>         substr($str,$bgn,0,"[");
>         substr($str,$end,0,"]");
>     }
>     print "$str\n";
>     return ;
> }
>
> __END__
>
> Really hope to hear from you again.

This appears to do what you want:

sub put_bracket {
    my ( $str, $ar ) = @_;
    my $x = 0;    # offset at which the search for the next substring starts
    for my $subs ( @$ar ) {
        if ( substr( $str, $x ) =~ /$subs/i ) {
            $x += $-[ 0 ];    # absolute start of this match in $str
            # mark the matched region by lowercasing it
            substr( $str, $x, length $subs ) =~ tr/A-Z/a-z/;
        }
    }
    # bracket each lowercase run and restore it to upper case
    $str =~ s/([a-z]+)/[\U$1\E]/g;
    print "$str\n";
    return;
}


John
-- 
use Perl;
program
fulfillment
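
For comparison, here is a rough sketch of the same idea using explicit start/end offsets instead of lowercase marking. It is not from the thread. The name bracket_regions and the choice to return the string rather than print it are assumptions made for illustration. Unlike the version above it matches case-sensitively, so it assumes the input is already upper case, and it merges regions that overlap or touch, which is also what the lowercase-run approach does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an illustration only, not code from the thread):
# returns $str with each region covered by the substrings in @$subs
# wrapped in brackets. The substrings are searched for in the given order.
sub bracket_regions {
    my ( $str, $subs ) = @_;
    my @spans;       # merged [start, end) regions, left to right
    my $from = 0;    # earliest position at which the next substring may start

    for my $sub (@$subs) {
        my $pos = index( $str, $sub, $from );
        next if $pos < 0;    # not found at or after $from: skip it
        my $end = $pos + length($sub);
        if ( @spans && $pos <= $spans[-1][1] ) {
            # overlaps or touches the previous region: extend that region
            $spans[-1][1] = $end if $end > $spans[-1][1];
        }
        else {
            push @spans, [ $pos, $end ];
        }
        $from = $pos;    # the next substring may overlap this one
    }

    # insert brackets from right to left so earlier offsets stay valid
    for my $span ( reverse @spans ) {
        substr( $str, $span->[1], 0, ']' );
        substr( $str, $span->[0], 0, '[' );
    }
    return $str;
}

print bracket_regions( 'CCACCAGCACCTGTC', [qw(CCACC CCAGC GCACC)] ), "\n";
print bracket_regions( 'CTGGGTATGGGT',    [qw(GTATG TGGGT)] ),       "\n";
```

Because each search starts at the position of the previous match, the second TGGGT in example 5 and the second AGGAA in example 6 are the ones that get picked, which gives the order-respecting answers asked for. Changing `<=` to `<` would keep touching regions in separate brackets.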