You forgot mb_internal_encoding("UTF-8"); without it, mb_substr falls back to the default internal encoding and, if that is a single-byte encoding, behaves just like substr.
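To illustrate the difference, here is a minimal sketch (not part of mbtest.php). It assumes the mbstring extension is loaded, and it passes "ISO-8859-1" explicitly to stand in for a single-byte default, since the actual default depends on your PHP configuration. The variable names are just for illustration.

```php
<?php
// Sketch only: assumes the mbstring extension is loaded.
// The sample string is "åèö" written as explicit UTF-8 bytes so the
// result does not depend on how this file itself is saved.
$str = "\xC3\xA5\xC3\xA8\xC3\xB6";

// Under a single-byte encoding every byte counts as one character,
// so mb_substr cuts at byte offsets just like substr.
$asBytes = mb_substr($str, 1, 1, "ISO-8859-1");

// With the internal encoding set to UTF-8, offsets count characters.
mb_internal_encoding("UTF-8");
$asChars = mb_substr($str, 1, 1);

var_dump($asBytes === substr($str, 1, 1)); // bool(true)
var_dump($asChars === "\xC3\xA8");         // bool(true): the character "è"
var_dump(strlen($str), mb_strlen($str));   // int(6) and int(3)
```

So in the benchmark, the call belongs once at the top of the script, before any of the mb_* timing loops run.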
My results look like:

normal iteration took 0.64724087715149
mb_substr method took 16.471849918365
mb_substr method with shortening the string took 21.613878965378
preg_split method took 1.927277803421

Dan is the winner. preg_split always runs in linear time. Both of the mb_substr methods are O(N^2), because the first step in mb_substr is splitting the string into an array. It is not as intelligent as I initially assumed.

Regards,
John Campbell

On Wed, Jan 13, 2010 at 11:37 AM, Rob Marscher <rmarsc...@beaffinitive.com> wrote:
> OK. Here are the results of my rough benchmark. Every time I ran it, the
> results were within about .025 seconds of each other so it seems accurate.
> Surprisingly, my original mb_substr method won, with preg_split taking just a
> little bit longer. John's method of grabbing the first character and then
> removing it from the string actually seems to take almost exponentially more
> time based on how long the string is. I set $strSize to 1000 and had to kill
> it because I didn't want to wait so long. There must be something pretty
> inefficient going on in mb_substr to make that the case. I suppose we could
> look at the source to get to the bottom of it... but I think I've already
> spent as much time on this as I'm willing to. Thanks again to you guys.
>
> $ php mbtest.php
> normal iteration took 0.8041729927063
> mb_substr method took 1.7228858470917
> mb_substr method with shortening the string took 7.9840841293335
> preg_split method took 2.1547298431396
>
> $ cat mbtest.php
> <?php
>
> $strSize = 100;
> $repeats = 1000;
>
> // make the string somewhat large
> $str = '';
> for ($i = 0; $i < $strSize; $i++) {
>     $str .= "string with utf-8 chars\n åèö";
> }
>
> // non-multibyte iteration
> $start = microtime(true);
> for ($i = 0; $i < $repeats; $i++) {
>     $length = strlen($str);
>     $newStr = '';
>     for ($j = 0; $j < $length; $j++) {
>         $newStr .= $str[$j];
>     }
> }
> $end = microtime(true);
> echo "normal iteration took " . ($end - $start) . "\n";
>
> // mb_substr method
> $start = microtime(true);
> for ($i = 0; $i < $repeats; $i++) {
>     $length = mb_strlen($str);
>     $newStr = '';
>     $rest = $str;
>     for ($j = 0; $j < $length; $j++) {
>         $newStr .= mb_substr($rest, $j, 1);
>     }
> }
> $end = microtime(true);
> echo "mb_substr method took " . ($end - $start) . "\n";
>
> // mb_substr method, shortening string
> $start = microtime(true);
> for ($i = 0; $i < $repeats; $i++) {
>     $length = mb_strlen($str);
>     $newStr = '';
>     $rest = $str;
>     while ($rest) {
>         $newStr .= mb_substr($rest, 0, 1);
>         $rest = mb_substr($rest, 1);
>     }
> }
> $end = microtime(true);
> echo "mb_substr method with shortening the string took " . ($end - $start) .
>     "\n";
>
> // preg_split method
> $start = microtime(true);
> for ($i = 0; $i < $repeats; $i++) {
>     $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
>     $length = count($chars);
>     $newStr = '';
>     for ($j = 0; $j < $length; $j++) {
>         $newStr .= $chars[$j];
>     }
> }
> $end = microtime(true);
> echo "preg_split method took " . ($end - $start) . "\n";
>
> _______________________________________________
> New York PHP Users Group Community Talk Mailing List
> http://lists.nyphp.org/mailman/listinfo/talk
>
> http://www.nyphp.org/Show-Participation