mb_substr($str, $i, 1) is always going to be slow because it has to walk the string from the beginning to reach character $i, so the loop as a whole runs in O(N^2).
In theory, it should be much faster if you just pull the first character off the front each time, e.g.:

    $rest = $str;
    while (strlen($rest) > 0) {
        $char = mb_substr($rest, 0, 1);
        $rest = mb_substr($rest, 1);
    }

Finding the first character no longer depends on how far into the string you are, so this should be closer to O(N) in the length of the string. Each mb_substr($rest, 1) call still copies the remainder, though, so benchmark it before relying on it.

I also like Dan's idea of using preg_split.

Regards,
John Campbell

On Wed, Jan 13, 2010 at 10:02 AM, Rob Marscher <rmarsc...@beaffinitive.com> wrote:
> Hi all,
>
> I have a need to iterate through a multibyte string to process the string
> character by character. Hopefully in php6, this will work without any
> special work, but as we know we need to use special multibyte string
> functions in php5 to work with utf-8 characters. Here's an example that
> iterates my dilemma:
>
> <?php
> mb_internal_encoding("UTF-8");
>
> $str = "string with utf-8 chars åèö";
> $length = mb_strlen($str);
> $brokenStr = "";
> $preservedStr = "";
>
> for ($i = 0; $i < $length; $i++) {
>   $brokenStr .= $str[$i];
>   $preservedStr .= mb_substr($str, $i, 1);
> }
> echo "brokenStr = " . $brokenStr . "\n";
> echo "preservedStr = " . $preservedStr . "\n";
> ?>
>
> The array notation for string is the normal way to do this with regular
> strings: $str[$i]. I assume this will work for multibyte strings in php6.
>
> -- Is using mb_substr($str, $i, 1) the only way to get this to work in php5?
> That's my question.
>
> It seems like it's going to be many times slower according to some of the
> comments I've seen on the multibyte functions in the php manual.
>
> Thanks!!
> -Rob
>
> _______________________________________________
> New York PHP Users Group Community Talk Mailing List
> http://lists.nyphp.org/mailman/listinfo/talk
>
> http://www.nyphp.org/Show-Participation
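
For comparison, here is a rough sketch of the two alternatives mentioned in this reply: peeling characters off the front with mb_substr, and the preg_split idea. The function names chars_by_shifting and chars_by_split are made up for illustration, and the sketch assumes the mbstring and PCRE extensions are loaded and that the input is valid UTF-8 (with the /u modifier, preg_split returns false on invalid input). Neither version has been benchmarked here.

```php
<?php
// Hypothetical helper names, for illustration only.
// Assumes mbstring and PCRE are available and the input is valid UTF-8.
mb_internal_encoding("UTF-8");

// Peel one character off the front on each pass.
function chars_by_shifting($str) {
    $chars = array();
    $rest = $str;
    // strlen() rather than a bare truthiness test, so a remaining "0"
    // is not mistaken for an empty string.
    while (strlen($rest) > 0) {
        $chars[] = mb_substr($rest, 0, 1);
        $rest = mb_substr($rest, 1);
    }
    return $chars;
}

// Split the whole string once, using an empty pattern in UTF-8 mode.
function chars_by_split($str) {
    return preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
}

$str = "string with utf-8 chars åèö";
foreach (chars_by_split($str) as $char) {
    // process $char here; each element is one complete UTF-8 character
}
```

The preg_split version walks the string once and builds the whole array up front, which costs extra memory on long strings but avoids repeated mb_substr calls.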