Dear John,
Thanks so much for your life-saving response.
There is one minor issue I still couldn't solve:
a substring that marks a bounded region
may occur more than once in the string.
(See examples 7 and 8 in my code below.)
To disambiguate, I can give each substring in the array
together with its index.
I tried to modify your code below to handle
this, but I still cannot solve it.
I think I'm almost there, but not quite.
Can you advise how I can go about it?
Thanks so much in advance. I really hope to hear
from you again.
__BEGIN__
my $t1 ='CCCATCTGTCCTTATTTGCTG'; my @ar1 = qw(ATCTG-3 ATTTG-13);
my $t2 ='ACCCATCTGTCCTTGGCCAT'; my @ar2 = qw(CCATC-2);
my $t3 ='CCACCAGCACCTGTC'; my @ar3 = qw(CCACC-0 CCAGC-3 GCACC-6);
my $t4 ='CCCAACACCTGCTGCCT'; my @ar4 = qw(CCAAC-1 ACACC-4);
my $t5 ='CTGGGTATGGGT'; my @ar5 = qw(GTATG-4 TGGGT-1);
my $t6 = 'AGGAACTTGCCTGTACCACAGGAAG'; my @ar6 = qw( CAGGA-18 AGGAA-19 );
# The examples above should yield the same results as before.
# These two examples below are the 'ambiguous' cases.
my $t7 = 'CAGGACTTGCCTGTACCACAGGAAG'; my @ar7 = qw( CAGGA-18 );
my $t8 = 'CAGGATTTGAGGAAGTACCACAGGAAG'; my @ar8 = qw( CAGGA-18 AGGAA-19 );
# Answer 7 -- CAGGACTTGCCTGTACCA[CAGGA]AG
#   instead of [CAGGA]CTTGCCTGTACCACAGGAAG
# Answer 8 -- CAGGATTTGAGGAAGTACCA[CAGGAA]G
#   instead of [CAGGA]TTTG[AGGAA]GTACCACAGGAAG
print put_bracket_jk_idx($t8, \@ar8), "\n";
sub put_bracket_jk_idx {
    my ( $str, $ar ) = @_;
    for my $subs ( @$ar ) {
        my ($sb,$id) = split("-",$subs);
        print "$sb $id\n";
        if ( substr( $str, $id ) =~ /$subs/i ) {
            $id += $-[ 0 ];
            substr( $str, $id, length $subs ) =~ tr/A-Z/a-z/;
        }
    }
    $str =~ s/([a-z]+)/[\U$1\E]/g;
    return $str;
}
print "\n";
__END__
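
A possible direction, offered only as a minimal sketch: in the sub above,
the regex and the length both use $subs, which still carries the "-18"
suffix, so the match never succeeds. The version below splits off the
number first and then searches from it with index(). It assumes the number
after the hyphen is a 0-based offset at which the search should start (not
necessarily the exact match position, since in example 8 CAGGA actually
begins at 20). The name put_bracket_by_offset and the @mark array are made
up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: bracket each substring at its first occurrence
# at or after the offset given after the hyphen (e.g. "CAGGA-18").
sub put_bracket_by_offset {
    my ( $str, $ar ) = @_;
    my @mark = (0) x length($str);    # 1 = position lies inside a region

    for my $item (@$ar) {
        my ( $sb, $from ) = split /-/, $item;
        $from = 0 unless defined $from;    # no offset given: search from start
        my $pos = index( uc($str), uc($sb), $from );
        next if $pos < 0;                  # not found at or after the offset
        @mark[ $pos .. $pos + length($sb) - 1 ] = (1) x length($sb);
    }

    # Open a bracket where a marked run starts and close it where the run
    # ends, so overlapping or adjacent regions end up in one bracket.
    my $out  = '';
    my $last = length($str) - 1;
    for my $i ( 0 .. $last ) {
        $out .= '[' if $mark[$i] && ( $i == 0 || !$mark[ $i - 1 ] );
        $out .= substr( $str, $i, 1 );
        $out .= ']' if $mark[$i] && ( $i == $last || !$mark[ $i + 1 ] );
    }
    return $out;
}

my $t8  = 'CAGGATTTGAGGAAGTACCACAGGAAG';
my @ar8 = qw( CAGGA-18 AGGAA-19 );
print put_bracket_by_offset( $t8, \@ar8 ), "\n";
```

With example 8 this prints CAGGATTTGAGGAAGTACCA[CAGGAA]G. Under the same
assumption, example 5 (TGGGT-1) would still pick the first TGGGT, so its
offset would have to lie between 2 and 7 to select the second one.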
--
Regards,
Edward WIJAYA
SINGAPORE
On Fri, 28 Oct 2005 18:42:11 +0800, John W. Krahn <[EMAIL PROTECTED]> wrote:
Wijaya Edward wrote:
I have the following problem.
I am trying to put brackets in a string, given a set of its
substrings.
The bracketed regions are "bounded" by the given substrings.
For example, given each input string and its substrings:
Strings:
1.CCCATCTGTCCTTATTTGCTG
2.ACCCATCTGTCCTTGGCCAT
3.CCACCAGCACCTGTC
4.CCCAACACCTGCTGCCT
5.CTGGGTATGGGT
6.AGGAACTTGCCTGTACCACAGGAAG
Substrings:
1. ATCTG ATTTG
2. CCATC
3. CCACC CCAGC GCACC
4. CCAAC ACACC
5. GTATG TGGGT
6. CAGGA AGGAA
The desired answers are:
1. CCC[ATCTG]TCCTT[ATTTG]CTG
2. AC[CCATC]TGTCCTTGGCCAT
3. [CCACCAGCACC]TGTC *
4. C[CCAACACC]TGCTGCCT *
5. CTGG[GTATGGGT] **
6. AGGAACTTGCCTGTACCA[CAGGAA]G **
Please note that in examples 3 and 4 the substrings are "overlapping".
Note also that in examples 5 and 6 there are substrings that
occur twice. So the answers for examples 5 and 6 are NOT
5. C[TGGGTATGGGT] ----this is wrong
6. [AGGAA]CTTGCCTGTACCA[CAGGAA]G ----this is wrong