From:             
Operating system: MS Windows XP
PHP version:      5.3.3
Package:          *URL Functions
Bug Type:         Bug
Bug description:  parse_url corrupts some UTF-8 strings

Description:
------------
I have tested this with PHP 5.2.9 and 5.3.3.

Some UTF-8 strings are not being processed correctly by parse_url.

In the given example, strings that contain the characters 'ם' or 'א'
come back corrupt, whereas the string 'מישהו' (which contains neither
character) is processed correctly.

The affected characters (in UTF-8) consist of the following bytes:

ם - d7|9d

א - d7|90

Each of them is converted to the two-byte sequence d7|5f. That is, the
second byte is replaced with 0x5f (an ASCII underscore), which leaves an
invalid UTF-8 sequence.
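
To make the byte values above easier to check, here is a minimal sketch. The
helper name hex_bytes() is made up for illustration and is not part of PHP;
the strings are written as byte escapes so the encoding of the source file
does not matter.

```php
<?php
// Byte sequences quoted in the report.
$finalMem = "\xD7\x9D"; // 'ם'
$alef     = "\xD7\x90"; // 'א'
$corrupt  = "\xD7\x5F"; // the sequence the report says comes back

// Hypothetical helper: formats a string as pipe-separated hex bytes.
function hex_bytes($s)
{
    return implode('|', str_split(bin2hex($s), 2));
}

echo hex_bytes($finalMem), "\n"; // d7|9d
echo hex_bytes($alef), "\n";     // d7|90
echo hex_bytes($corrupt), "\n";  // d7|5f

// preg_match() with the /u modifier returns false when the subject is not
// valid UTF-8, so it can serve as a quick validity check.
var_dump(preg_match('//u', $finalMem) === 1); // bool(true)
var_dump(preg_match('//u', $corrupt) === 1);  // bool(false)
```

The last line fails because 0xd7 opens a two-byte sequence and 0x5f is not a
continuation byte.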



In addition to ruining the URL, this byte sequence is not safe to pass to
preg_replace with the /u modifier.

Therefore, if we take the result of parse_url and then attempt to replace,
say, spaces with underscores using preg_replace, we get an empty result
instead of the string.
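
The preg_replace failure can be shown without involving parse_url at all. The
sketch below feeds the d7|5f byte pair straight into preg_replace; the path
string is made up for illustration.

```php
<?php
// "\xD7\x5F" is the byte pair described in the report; it is not valid UTF-8.
$path = "/he/\xD7\x5F/ByYear.html";

// With /u, PCRE validates the subject first and gives up on invalid UTF-8.
$result = preg_replace('/\s+/u', '_', $path);

var_dump($result);                                   // NULL
var_dump(preg_last_error() === PREG_BAD_UTF8_ERROR); // bool(true)

// Without /u the subject is treated as raw bytes and the call succeeds.
var_dump(preg_replace('/\s+/', '_', $path) === $path); // bool(true)
```

Since NULL prints as an empty string, this matches the blank output described
above.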



I believe that this is similar to bug #26391.

Test script:
---------------
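<?php
$url = 'http://www.mysite.org/he/פרויקטים/ByYear.html';

// The path contains 'ם' (d7|9d); after parse_url() $parts['path'] is corrupt.
$parts = parse_url($url);

// preg_replace() with /u rejects the invalid UTF-8 subject and returns NULL.
$path = preg_replace('/\s+/u', '_', $parts['path']);
var_dump($path);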
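
Until this is fixed, one possible workaround is to keep non-ASCII bytes away
from parse_url entirely. The sketch below is only an illustration:
parse_url_utf8() is a hypothetical helper, not a PHP function. It
percent-encodes every byte in the range 0x80-0xff, calls parse_url, and
decodes the components again. Note that it also decodes any percent-escapes
that were already present in the input, which may not be wanted.

```php
<?php
// Hypothetical helper, not part of PHP: percent-encodes every non-ASCII byte
// so parse_url() only sees ASCII, then decodes each string component again.
function parse_url_utf8($url)
{
    $encoded = preg_replace_callback(
        '/[\x80-\xFF]/',
        function ($m) { return rawurlencode($m[0]); },
        $url
    );

    $parts = parse_url($encoded);
    if ($parts === false) {
        return false;
    }

    foreach ($parts as $key => $value) {
        if (is_string($value)) {
            $parts[$key] = rawurldecode($value);
        }
    }
    return $parts;
}

$parts = parse_url_utf8('http://www.mysite.org/he/פרויקטים/ByYear.html');
var_dump($parts['path'] === '/he/פרויקטים/ByYear.html'); // bool(true)
```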

Expected result:
----------------
The path portion of the URL, unchanged: /he/פרויקטים/ByYear.html

Actual result:
--------------
Corrupt string (or blank after using preg_replace).

-- 
Edit bug report at http://bugs.php.net/bug.php?id=52923&edit=1