From:             ed at grooveshark dot com
Operating system: 
PHP version:      5.4.4
Package:          I18N and L10N related
Bug Type:         Bug
Bug description:levenshtein returns bytes different, not characters different

Description:
------------
The PHP levenshtein() function, documented here:

http://php.net/manual/en/function.levenshtein.php

does not perform as stated with Unicode characters that occupy more than
one byte. The code sample below prints a character difference of 3 when it
should print 1. The two characters in the sample are arbitrary Japanese
characters, each stored as 3 bytes in UTF-8, so the function appears to
count differing bytes rather than differing characters. The same behavior
can be seen when comparing an ASCII single quote with a Unicode right
single quotation mark, which also takes 3 bytes versus the single byte of
the ASCII character.

Test script:
---------------
<?php
printf("%d\n", levenshtein("日", "語"));
?>




Expected result:
----------------
Expected Output: 1

Actual result:
--------------
Actual Output:   3
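
For comparison, here is a minimal sketch of a character-based distance,
assuming both inputs are valid UTF-8 and that PCRE was built with UTF-8
support. The function name utf8_levenshtein is hypothetical and not part
of PHP. It splits each string into code points and runs the usual
dynamic-programming algorithm over the resulting arrays, so it illustrates
the expected result rather than proposing a fix for the built-in function.

```php
<?php
// Hypothetical helper, not part of PHP: Levenshtein distance counted in
// Unicode code points instead of bytes. Assumes valid UTF-8 input.
function utf8_levenshtein($a, $b)
{
    // The /u modifier makes PCRE treat the subject as UTF-8, so an empty
    // pattern splits the string between code points.
    $s = preg_split('//u', $a, -1, PREG_SPLIT_NO_EMPTY);
    $t = preg_split('//u', $b, -1, PREG_SPLIT_NO_EMPTY);
    if ($s === false || $t === false) {
        return -1; // input was not valid UTF-8
    }
    $m = count($s);
    $n = count($t);

    // $prev[$j] is the distance between the characters of $s processed
    // so far (initially none) and the first $j characters of $t.
    $prev = range(0, $n);
    for ($i = 1; $i <= $m; $i++) {
        $curr = array($i);
        for ($j = 1; $j <= $n; $j++) {
            $cost = ($s[$i - 1] === $t[$j - 1]) ? 0 : 1;
            $curr[$j] = min(
                $prev[$j] + 1,         // deletion
                $curr[$j - 1] + 1,     // insertion
                $prev[$j - 1] + $cost  // substitution
            );
        }
        $prev = $curr;
    }
    return $prev[$n];
}

printf("%d\n", utf8_levenshtein("日", "語")); // prints 1 (characters)
printf("%d\n", levenshtein("日", "語"));      // prints 3 (bytes)
```

This sketch counts code points, not grapheme clusters, so a base letter
followed by a combining mark still counts as two characters.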

-- 
Edit bug report at https://bugs.php.net/bug.php?id=62466&edit=1