Hi!
----
Fresh from our bug database: The ${str:offset:size} operator seems to
scale poorly for large strings (e.g. with 4MB of characters in a string,
in a multibyte locale such as "en_US.UTF-8", it takes forever).
For example:
-- snip --
$ (for ((z=10 ; z < 16 ; z++ )) ; do
    printf "#### z=%d:\n" z ; export z
    timex ksh93 -c 'integer i len ; s="x"
      for ((i=0 ; i < z ; i++)) ; do s+="$s" ; done
      len=${#s} ; print "len=$len"
      for ((i=0 ; i < len ; i++ )) ; do buf="${s:i:2}" ; done'
  done)
#### z=10:
len=1024
real 0.23
user 0.15
sys 0.06
#### z=11:
len=2048
real 0.54
user 0.44
sys 0.09
#### z=12:
len=4096
real 1.71
user 1.62
sys 0.06
#### z=13:
len=8192
real 6.43
user 6.32
sys 0.06
#### z=14:
len=16384
real 28.19
user 27.63
sys 0.09
#### z=15:
len=32768
real 1:44.89
user 1:43.41
sys 0.22
-- snip --
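... and so on. Each doubling of the string length makes the run take
roughly three to four times as long, which looks quadratic, so for 4MB
of data it gets really, really nasty.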
I'm currently scratching my head over how to solve the problem - there is
no simple way to fetch character "x" from a multibyte string without
scanning the string from the beginning and using |mblen()| to walk over
the data.
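
To make the cost concrete, here is a minimal sketch of such a scan. The helper name |mb_char_offset()| is hypothetical and this is not the ksh93 code, only an illustration of why every lookup is linear in the character index (and why a loop over all indices becomes quadratic):

```c
#include <stdlib.h>   /* mblen */
#include <string.h>   /* strlen */

/*
 * Hypothetical helper (not taken from ksh93): return the byte offset of
 * character number `index` in the multibyte string `s`, or (size_t)-1 if
 * the string is shorter than that or holds an invalid sequence.
 * Every call rescans from byte 0, so the cost grows with `index`.
 */
size_t mb_char_offset(const char *s, size_t index)
{
    size_t nbytes = strlen(s);
    size_t off = 0;

    mblen(NULL, 0);                     /* reset mblen's shift state */
    while (index > 0) {
        int n;

        if (off >= nbytes)
            return (size_t)-1;          /* ran off the end of the string */
        n = mblen(s + off, nbytes - off);
        if (n <= 0)
            return (size_t)-1;          /* invalid or incomplete sequence */
        off += (size_t)n;               /* skip one whole character */
        index--;
    }
    return off;
}
```

The caller is assumed to have selected the locale with |setlocale(LC_ALL, "")| beforehand; without that, |mblen()| works in the "C" locale.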
Would it be possible to change the variable storage system a bit and
create an array of |wchar_t| on demand for string operators like
${str:offset:size} (the array is kept around as a "cache" until
someone writes to the variable)?
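
A rough sketch of that idea, under the assumption of a simplified variable record: |struct strvar| and the |strvar_*()| functions are made up for illustration and do not reflect how ksh93 actually stores variables. The wide copy is built once on first use and dropped on every write:

```c
#include <stdlib.h>   /* mbstowcs, malloc, free */
#include <string.h>   /* strlen, memcpy */
#include <wchar.h>    /* wchar_t, wmemcpy */

/* Hypothetical variable record; the real ksh93 storage looks different. */
struct strvar {
    char    *mb;     /* value as a multibyte string (owned) */
    wchar_t *wc;     /* lazily built wide copy, NULL when stale */
    size_t   wlen;   /* number of characters in wc */
};

/* Writing to the variable replaces the value and invalidates the cache. */
int strvar_set(struct strvar *v, const char *value)
{
    size_t n = strlen(value) + 1;
    char *copy = malloc(n);

    if (copy == NULL)
        return -1;
    memcpy(copy, value, n);
    free(v->mb);
    free(v->wc);
    v->mb = copy;
    v->wc = NULL;
    v->wlen = 0;
    return 0;
}

/* Build the wide cache on first use: one linear pass per write. */
static int strvar_fill_cache(struct strvar *v)
{
    size_t n;
    wchar_t *w;

    if (v->wc != NULL)
        return 0;                          /* cache is still valid */
    if (v->mb == NULL)
        return -1;                         /* variable was never set */
    n = mbstowcs(NULL, v->mb, 0);          /* POSIX: count characters only */
    if (n == (size_t)-1)
        return -1;                         /* invalid multibyte sequence */
    w = malloc((n + 1) * sizeof *w);
    if (w == NULL)
        return -1;
    mbstowcs(w, v->mb, n + 1);             /* convert, including the NUL */
    v->wc = w;
    v->wlen = n;
    return 0;
}

/*
 * Rough equivalent of ${str:offset:size}: copy up to `size` characters
 * starting at `offset` into `out` (room for size + 1 wide characters).
 * Returns the number of characters copied, or (size_t)-1 on error.
 */
size_t strvar_substr(struct strvar *v, size_t offset, size_t size,
                     wchar_t *out)
{
    if (strvar_fill_cache(v) != 0)
        return (size_t)-1;
    if (offset > v->wlen)
        offset = v->wlen;                  /* clamp to end of string */
    if (size > v->wlen - offset)
        size = v->wlen - offset;
    wmemcpy(out, v->wc + offset, size);    /* constant-time indexing */
    out[size] = L'\0';
    return size;
}
```

With this layout the loop from the example above would pay for one conversion and then index the array directly. The price is extra memory (|sizeof(wchar_t)| per character) for as long as the cache lives.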
----
Bye,
Roland
--
__ . . __
(o.\ \/ /.o) [EMAIL PROTECTED]
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 7950090
(;O/ \/ \O;)