From:             ed at grooveshark dot com
Operating system: 
PHP version:      5.4.4
Package:          I18N and L10N related
Bug Type:         Bug
Bug description:levenshtein returns bytes different, not characters different

Description:
------------
The PHP levenshtein() function, documented here:
http://php.net/manual/en/function.levenshtein.php
does not perform as stated for Unicode characters that take more than one byte to encode. The test script below prints a character difference of 3 when it should print 1. The two characters are arbitrary Japanese characters that each take 3 bytes in UTF-8, so the function appears to count differing bytes rather than differing characters. The same behavior can be seen when comparing an ASCII single quote to a Unicode right single quote, which also takes 3 bytes versus the single byte for the ASCII character.

Test script:
---------------
<?php
printf("%d\n", levenshtein("日", "語"));
?>

Expected result:
----------------
Expected Output: 1

Actual result:
--------------
Actual Output: 3

-- 
Edit bug report at https://bugs.php.net/bug.php?id=62466&edit=1
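
For illustration, the difference between counting bytes and counting characters can be shown with a userland function. This is only a minimal sketch of a possible workaround, not a proposed patch: the name utf8_levenshtein() is hypothetical and not part of PHP, and it assumes both inputs are valid UTF-8. It splits each string into code points with preg_split() and the /u modifier, then runs the usual dynamic-programming edit distance over those arrays. The example uses the single-quote case described in the report, written with byte escapes so it does not depend on the file encoding.

```php
<?php
// Hypothetical helper (not part of PHP): Levenshtein distance counted in
// Unicode code points instead of bytes. Assumes both inputs are valid UTF-8.
function utf8_levenshtein($a, $b)
{
    // Split each string into an array of code points; the /u modifier makes
    // PCRE treat the subject as UTF-8.
    $s = preg_split('//u', $a, -1, PREG_SPLIT_NO_EMPTY);
    $t = preg_split('//u', $b, -1, PREG_SPLIT_NO_EMPTY);
    if ($s === false || $t === false) {
        return -1; // preg_split failed, e.g. on invalid UTF-8
    }
    $m = count($s);
    $n = count($t);

    // $prev holds the previous row of the distance table: before row $i is
    // computed, $prev[$j] is the distance between the first $i - 1 code
    // points of $a and the first $j code points of $b.
    $prev = range(0, $n);
    for ($i = 1; $i <= $m; $i++) {
        $curr = array($i);
        for ($j = 1; $j <= $n; $j++) {
            $cost = ($s[$i - 1] === $t[$j - 1]) ? 0 : 1;
            $curr[$j] = min(
                $prev[$j] + 1,         // deletion
                $curr[$j - 1] + 1,     // insertion
                $prev[$j - 1] + $cost  // substitution
            );
        }
        $prev = $curr;
    }
    return $prev[$n];
}

$ascii_quote = "'";            // U+0027, 1 byte
$right_quote = "\xE2\x80\x99"; // U+2019, 3 bytes in UTF-8

printf("%d\n", levenshtein($ascii_quote, $right_quote));      // prints 3
printf("%d\n", utf8_levenshtein($ascii_quote, $right_quote)); // prints 1
```

Comparing code points is itself a simplification: a character built from a base letter plus combining marks would still count as more than one unit under this sketch.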