ID:               42101
 Updated by:       [EMAIL PROTECTED]
 Reported By:      mcorne at yahoo dot com
-Status:           Open
+Status:           Assigned
 Bug Type:         mbstring related
 Operating System: Linux x86-64
 PHP Version:      5.2.4RC2-dev
-Assigned To:      
+Assigned To:      hirokawa
 New Comment:

Assigned to the maintainer of the mbstring extension.


Previous Comments:
------------------------------------------------------------------------

[2007-08-15 06:45:07] mcorne at yahoo dot com

Same issue on the latest release.
Test done on:
PHP Version => 5.2.4RC2-dev
System => Linux durbatuluk 2.6.20-16-generic #2 SMP Thu Jun 7 19:00:28
UTC 2007 x86_64
Build Date => Aug 13 2007 21:59:11

------------------------------------------------------------------------

[2007-07-25 12:10:28] mcorne at yahoo dot com

Description:
------------
mb_substr("\x44\xCC\x87", 0, PHP_INT_MAX, 'UTF-8') returns only the
first character ("D", byte 0x44) on 64-bit Linux instead of the whole
string; the combining character U+0307 (bytes 0xCC 0x87) is dropped.
The same call returns the whole string on Windows XP and 32-bit Linux.
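
Possible workaround (a sketch only, not a fix for the underlying bug):
clamping the requested length to the character count reported by
mb_strlen() avoids passing PHP_INT_MAX to mb_substr(). The helper name
mb_substr_clamped() is hypothetical and not part of mbstring. It
assumes non-negative $start and $length, and that mb_strlen() itself is
not affected on 64-bit builds, which this report does not verify.

```php
<?php
// Hypothetical helper, not part of mbstring: limits $length to the
// number of characters available after $start before calling mb_substr().
// Assumes $start >= 0 and $length >= 0.
function mb_substr_clamped($string, $start, $length, $encoding = 'UTF-8')
{
    $total = mb_strlen($string, $encoding);   // length in characters
    $available = max(0, $total - $start);     // characters left after $start
    return mb_substr($string, $start, min($length, $available), $encoding);
}

$string = "\x44\xCC\x87";                     // U+0044 followed by U+0307
$whole = mb_substr_clamped($string, 0, PHP_INT_MAX);
echo bin2hex($whole), "\n";                   // 44cc87 when both characters are returned
```

With this approach the length handed to mb_substr() never exceeds the
string's own character count, so the result should not depend on how
very large lengths are handled internally.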

Reproduce code:
---------------
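<?php
// Returns the substring and its bytes in hex, so any truncation is visible.
function substring($string, $length)
{
    $substr = mb_substr($string, 0, $length, 'UTF-8');
    $bytes = strlen($substr); // byte count, not character count
    // unpack() keys the bytes as chars1, chars2, ...
    $chars = $bytes ? unpack("C{$bytes}chars", $substr) : array();
    $hex = array_map('dechex', $chars);
    return array($substr, $hex);
}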

$test['string'] = "\x44\xCC\x87";
$test['utf8'] = '\x44\xCC\x87';
$test['unicode'] = '\u0044\u0307';
$test['PHP_INT_MAX'] = PHP_INT_MAX;
$test['php_int_max'] = substring($test['string'], PHP_INT_MAX);
$test['9999'] = substring($test['string'], 9999);

print_r($test);


Expected result:
----------------
Array
(
    [string] => Ḋ
    [utf8] => \x44\xCC\x87
    [unicode] => \u0044\u0307
    [PHP_INT_MAX] => 2147483647
    [php_int_max] => Array
        (
            [0] => Ḋ
            [1] => Array
                (
                    [chars1] => 44
                    [chars2] => cc
                    [chars3] => 87
                )

        )

    [9999] => Array
        (
            [0] => Ḋ
            [1] => Array
                (
                    [chars1] => 44
                    [chars2] => cc
                    [chars3] => 87
                )

        )

)

Actual result:
--------------
Array
(
    [string] => Ḋ
    [utf8] => \x44\xCC\x87
    [unicode] => \u0044\u0307
    [PHP_INT_MAX] => 2147483647
    [php_int_max] => Array
        (
            [0] => D
            [1] => Array
                (
                    [chars1] => 44
                )

        )

    [9999] => Array
        (
            [0] => Ḋ
            [1] => Array
                (
                    [chars1] => 44
                    [chars2] => cc
                    [chars3] => 87
                )

        )

)


------------------------------------------------------------------------

