Tex,
This approach would work only if all access to the string contents
went through a regimented API. Unfortunately, many third-party
extensions (and many bundled ones) simply change the contents of the
string directly via a pointer. I am not sure we could standardize
this.
-Andrei
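The concern above can be illustrated with a minimal sketch in C. The struct and function names here (cached_str, cached_str_set_unit, cached_str_is_consistent) are hypothetical and are not PHP's actual internals; the point is only that a write through the raw pointer leaves the cached index stale, while a write through an accessor can keep it in sync.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical UTF-16 string with cached metadata (not PHP's real layout). */
typedef struct {
    uint16_t *units;         /* UTF-16 code units, writable by anyone with the pointer */
    size_t len;              /* length in code units */
    size_t first_surrogate;  /* cached index of the first surrogate, or len if none */
} cached_str;

/* Scan for the first surrogate code unit; returns len if there is none. */
static size_t scan_first_surrogate(const uint16_t *units, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if (units[i] >= 0xD800 && units[i] <= 0xDFFF) {
            return i;
        }
    }
    return len;
}

/* "Regimented" write: updates the contents and refreshes the cached index.
 * A full rescan is the simplest correct choice for this sketch. */
void cached_str_set_unit(cached_str *s, size_t i, uint16_t u) {
    s->units[i] = u;
    s->first_surrogate = scan_first_surrogate(s->units, s->len);
}

/* Returns 1 if the cached index still matches the actual contents. */
int cached_str_is_consistent(const cached_str *s) {
    return s->first_surrogate == scan_first_surrogate(s->units, s->len);
}
```

An extension that assigns to s->units[i] directly never calls cached_str_set_unit, so nothing refreshes first_surrogate, and any indexing code that trusts the cached value can return wrong results.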
On Mar 8, 2006, at 1:35 AM, Tex Texin wrote:
Suggestion for improving the performance of indexing strings:
Associate with the string the index of the first code unit that is a
surrogate.
Since most strings will have no surrogates, these strings will have a
value greater than the length of the string, and this tells you that
you can index directly into the string. When there is a surrogate, you
can index directly, prior to the surrogate's index.
If there is a surrogate then you can consider the metadata for
remembering which chars used surrogates, to optimize indexing as was
proposed.
This is low cost, very efficient... Most strings won't have surrogates.
tex
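A minimal sketch of the suggestion above, in C. The u16str type and its functions are hypothetical names for illustration, not an existing PHP or ICU API. As an assumption of this sketch, the "no surrogates" sentinel is the string length itself rather than a value strictly greater than it; it serves the same purpose. Indexes below first_surrogate are served directly; anything at or past it falls back to a linear scan starting from the first surrogate.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read-only UTF-16 string carrying the first-surrogate index. */
typedef struct {
    const uint16_t *units;   /* UTF-16 code units */
    size_t len;              /* length in code units */
    size_t first_surrogate;  /* index of first surrogate code unit, or len if none */
} u16str;

static int is_high(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

/* Build the wrapper, scanning once for the first surrogate code unit. */
u16str u16str_make(const uint16_t *units, size_t len) {
    u16str s = { units, len, len };
    for (size_t i = 0; i < len; i++) {
        if (is_high(units[i]) || is_low(units[i])) {
            s.first_surrogate = i;
            break;
        }
    }
    return s;
}

/* Return the code point at code point index cp_index, or -1 if out of range. */
int32_t u16str_codepoint_at(const u16str *s, size_t cp_index) {
    /* Fast path: before the first surrogate, code point index == code unit index. */
    if (cp_index < s->first_surrogate) {
        return cp_index < s->len ? (int32_t)s->units[cp_index] : -1;
    }
    /* Slow path: walk forward from the first surrogate, counting pairs as one. */
    size_t i = s->first_surrogate;   /* code unit position */
    size_t cp = s->first_surrogate;  /* code point position (equal up to here) */
    while (i < s->len) {
        uint16_t u = s->units[i];
        int pair = is_high(u) && i + 1 < s->len && is_low(s->units[i + 1]);
        if (cp == cp_index) {
            if (pair) {
                return 0x10000 + (((int32_t)(u - 0xD800)) << 10)
                               + (s->units[i + 1] - 0xDC00);
            }
            return (int32_t)u;  /* BMP unit, or an unpaired surrogate returned as-is */
        }
        i += pair ? 2 : 1;
        cp++;
    }
    return -1;
}
```

For a string with no surrogates the slow path is never entered, so the only added cost is one comparison per lookup and one size_t per string.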
--
PHP Unicode & I18N Mailing List (http://www.php.net/)