Guang,
Well you are almost there ... you need fifo structure .... namely a pl/sql array
1. create a local pl/sql array to store the delimiter (store the ascii value of the
delimiter to be safe) my_array (varchar2(5))
2. as you find a delimiter insert into the first position in the array and replace the
delimiting character with #
3. lather.rinse.repeat.
when it is time to put it back
use a loop
nIndex := 0;
nPos := 0;
loop
npos := instr(my_str,'#',1);
exit when npos := 0;
nIndex := nindex + 1;
my_str := substr(my_str,1,nPos-1) || chr(my_array(nIndex)) || sybstr(my_str, nPos+1);
end loop;
something like this should help, proof-read though ...
Raj
--------------------------------------------------------------------------------
Rajendra dot Jamadagni at nospamespn dot com
All Views expressed in this email are strictly personal.
QOTD: Any clod can have facts, having an opinion is an art !
-----Original Message-----
Sent: Friday, November 21, 2003 11:44 AM
To: Multiple recipients of list ORACLE-L
Hi Stephane:
Thanks for your good suggestion. I compared the method you suggested and the orginal
one and it indeed boosted the performance (in my simple test). However the ONLY
problem I am having is that by doing TRANSLATE, I lost the original delimits. The new
method (you suggested) correctly "extract" the "words" (and sent for processing), But
after processing I need to put processed-words back to the orginal string with orginal
demilters un-changed. I tried to track to position of delimit from the orginal string
by doing
global_pos := global_pos + pos ;
in my "while" loop, but ltrim(substr(string, pos + 1), '#') will make "global_pos"
wrong when ltrim trims '#'. Any work-around?
TIA.
Guang
-----Original Message-----
Stephane Faroult
Sent: Friday, November 21, 2003 4:19 AM
To: Multiple recipients of list ORACLE-L
Guang,
I agree with your analysis, looping on characters is not the faster you can do,
simply because there is a significant overhead (compared to C code for instance) in a
language such as PL/SQL - which might be perfectly acceptable in some circumstances,
much less so in very repetitive tasks. 'Native compiling', ie turning PL/SQL in C,
might improve performance. However, in my view the best performance gains you may get
is by, so to speak, pushing the bulk of the processing deeper into the kernel (which
isn't by the way exclusive of native compiling). Using a function such as INSTR() will
be much more efficient than looping on characters.
I would suggest something such as :
- First use TRANSLATE() to replace all the characters you want to get rid of by a
single, well identified character, say # (use CHR() || ... for non printable
characters - you can build up the string of characters to translate in the
initialisation section of a package rather than typing it).
- Start with initializing your string to LTRIM(string, '#')
- Then as long as pos := INSTR(string, '#') isn't 0,
get your token as substr(string, 1, pos - 1) then assign ltrim(substr(string, pos
+ 1), '#') to string (very similar to what you were planning to do with owa).
This will be probably much faster than a character-by-character loop and calls to an
owa package.
HTH,
Stephane Faroult
>----- ------- Original Message ------- -----
>From: Guang Mei <[EMAIL PROTECTED]>
>To: Multiple recipients of list ORACLE-L
><[EMAIL PROTECTED]>
>Sent: Thu, 20 Nov 2003 19:39:55
>
>Hi:
>
>In my pl/sql program, I want to process each "word"
>in a string. The
>string is selected from a varchar2(300) column. The
>delimit that separates
>the words is not necessary space character. The
>definition of the delimit
>in this program is set as
>
>1. Any character that is NOT AlphaNumerical (0-9,
>A-Z,a-z)
>and
>2. the character is not one of these: '-.,/<*>_'
>
>Now my program is basically checking each
>character, find the delimit, and
>rebuild each word. After that I process each
>"word". The code looks like
>this:
>
>-------
>str := "This will be a long string with length
>upto 300 characters, it
>may contain some invisible characters';
>len := length(str)+1;
> for i in 1..len loop
> ch := substr(str,i,1);
> if (not strings.isAlnum(ch) and
>instr('-.,/<*>_', ch)<1) then
> if word is not null then
> -- do some processing to variable word !
> word := null; -- reset it
> end if;
> else
> word := word || ch; -- concat ch to word
> end if;
> end loop;
>
>-------
>
>I think It's taking too long because it loops
>through each characters. I
>hope I could find a way to speed it up. I don't
>have experiience in
>owa_pattern, but I thought there might be a way to
>do it here:
>
>----
>str := "This will be a long string with length
>upto 300 characters, it
>may contain some invisible characters';
>newstr := str;
>pos := 1;
>while pos != 0 loop
> pos := owa_pattern.amatch(newstr, 1, '\W');
>-- how can I mask out
>these '-.,/<*>_' ???
> word := substr(newstr, 1, pos-1);
> -- do some processing to variable word !
> if pos != 0 then
> newstr := substr(newstr, pos+1);
> end if;
>end loop;
>------
>
>My simple tests showed that owa_pattern call is
>much slower than direct
>string manupilation. But I would like to try it in
>this case if I could
>easily get the "wrods" from the string. Any
>suggestions?
>
>TIA.
>
>Guang
>
--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: Stephane Faroult
INET: [EMAIL PROTECTED]
Fat City Network Services -- 858-538-5051 http://www.fatcity.com
San Diego, California -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from). You may
also send the HELP command for other information (like subscribing).
--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: Guang Mei
INET: [EMAIL PROTECTED]
Fat City Network Services -- 858-538-5051 http://www.fatcity.com
San Diego, California -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from). You may
also send the HELP command for other information (like subscribing).
**************************************************************************************
This e-mail message is confidential, intended only for the named recipient(s) above
and may contain information that is privileged, attorney work product or exempt from
disclosure under applicable law. If you have received this message in error, or are
not the named recipient(s), please immediately notify corporate MIS at (860) 766-2000
and delete this e-mail message from your computer, Thank you.
**************************************************************************************5
--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: Jamadagni, Rajendra
INET: [EMAIL PROTECTED]
Fat City Network Services -- 858-538-5051 http://www.fatcity.com
San Diego, California -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from). You may
also send the HELP command for other information (like subscribing).