Edit report at http://bugs.php.net/bug.php?id=53596&edit=1
ID: 53596
User updated by: anton dot a dot minin at gmail dot com
Reported by: anton dot a dot minin at gmail dot com
Summary: Function iconv_mime_decode failed to decode utf-8 header
-Status: Open
+Status: Closed
Type: Bug
Package: ICONV related
Operating System: CentOS release 5.5
PHP Version: 5.3.4
Block user comment: N
Private report: N
New Comment:
With the php.ini option iconv.internal_encoding=utf-8 it works properly.
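As an alternative to changing php.ini, the target charset can be passed per call. The sketch below is only an illustration and assumes the header from the report, folded with CRLF plus a space; it relies on the optional third argument of iconv_mime_decode(), which names the charset used for the decoded result and falls back to iconv.internal_encoding when omitted.

```php
<?php
// Header from the report: two UTF-8 Q-encoded words on folded lines.
// The "\r\n " folding is an assumption about how the header was wrapped.
$header = "Subject: =?utf-8?Q?=D0=9F=D1=80=D0=B8=D0=B2=D0=B5=D1=82,=20=D0=9C=D0=B5?=\r\n"
        . " =?utf-8?Q?=D0=B4=D0=B2=D0=B5=D0=B4!=20(Hello,=20Bear!)?=";

// Third argument: charset in which the decoded header is returned.
// Without it, iconv.internal_encoding decides the output charset.
$decoded = iconv_mime_decode($header, 0, 'UTF-8');

if ($decoded === false) {
    echo "decode failed\n";
} else {
    echo $decoded, "\n";
}
```

The php.ini setting changes the default for every call, while the explicit argument affects only the call that passes it.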
Previous Comments:
------------------------------------------------------------------------
[2010-12-23 11:02:27] anton dot a dot minin at gmail dot com
Description:
------------
---
From manual page: http://www.php.net/function.iconv-mime-decode
---
iconv_mime_decode() cannot decode a header that contains non-ASCII
characters if the charset differs from ISO-8859-1.
For example, iconv_mime_decode() cannot decode the string
"Subject: =?utf-8?Q?=D0=9F=D1=80=D0=B8=D0=B2=D0=B5=D1=82,=20=D0=9C=D0=B5?=
=?utf-8?Q?=D0=B4=D0=B2=D0=B5=D0=B4!=20(Hello,=20Bear!)?="
Test script:
---------------
<?php
$plan = array(
    // Encoding non-ASCII text as ISO-8859-1 is wrong, but in this case
    // encode and decode are inverse functions,
    // i.e. $a == decode(encode($a))
    array(
        'description' => "Non-ASCII characters, ISO-8859-1 to ISO-8859-1 conversion",
        'subject' => "Привет, Медведь! (Hello, Bear!)",
        'prefs' => array(
            'input-charset' => 'iso-8859-1',
            'output-charset' => 'iso-8859-1',
        )
    ),
    // unfortunately fails
    array(
        'description' => "Non-ASCII characters and UTF-8",
        'subject' => "Привет, Медвед! (Hello, Bear!)",
        'prefs' => array(
            'input-charset' => 'utf-8',
            'output-charset' => 'utf-8',
        )
    ),
    array(
        'description' => "Only ASCII characters and UTF-8",
        'subject' => "Hello, Bear!",
        'prefs' => array(
            'input-charset' => 'utf-8',
            'output-charset' => 'utf-8',
        )
    ),
    array(
        'description' => "Only ASCII characters and Windows-1251 charset",
        'subject' => "Hello, Bear!",
        'prefs' => array(
            'input-charset' => 'utf-8',
            'output-charset' => 'windows-1251',
        )
    ),
    array(
        'description' => "Non-ASCII characters and Windows-1251 charset",
        'subject' => "Привет, Медведь! (Hello, Bear!)",
        'prefs' => array(
            'input-charset' => 'utf-8',
            'output-charset' => 'windows-1251',
        )
    )
);
foreach ($plan as $case) {
    printf("\n\nStart: %s\n%s\n", $case['description'], str_repeat("=", 80));
    $prefs = $case['prefs'];
    $prefs['scheme'] = 'Q';
    // Encode the subject, then check that decoding it does not fail.
    $subject_encoded = iconv_mime_encode('Subject', $case['subject'], $prefs);
    printf("Encoded subject: %s\n", var_export($subject_encoded, 1));
    if (!$subject_encoded) {
        $status = 'FAILED due to iconv_mime_encode';
    } else {
        $status = false === iconv_mime_decode($subject_encoded) ? 'FAILED' : 'PASSED';
    }
    printf("[%s] %s\n", $status, $case['description']);
}
echo "\n";
Expected result:
----------------
All tests should pass.
Actual result:
--------------
Start: Non-ASCII characters, ISO-8859-1 to ISO-8859-1 conversion
================================================================================
Encoded subject: 'Subject: =?iso-8859-1?Q?=D0=9F=D1=80=D0=B8=D0=B2=D0=B5=D1=82,?=
=?iso-8859-1?Q?=20=D0=9C=D0=B5=D0=B4=D0=B2=D0=B5=D0=B4=D1=8C!=20(Hello,?=
=?iso-8859-1?Q?=20Bear!)?='
[PASSED] Non-ASCII characters, ISO-8859-1 to ISO-8859-1 conversion
Start: Non-ASCII characters and UTF-8
================================================================================
Encoded subject: 'Subject: =?utf-8?Q?=D0=9F=D1=80=D0=B8=D0=B2=D0=B5=D1=82,=20=D0=9C=D0=B5?=
=?utf-8?Q?=D0=B4=D0=B2=D0=B5=D0=B4!=20(Hello,=20Bear!)?='
[FAILED] Non-ASCII characters and UTF-8
Start: Only ASCII characters and UTF-8
================================================================================
Encoded subject: 'Subject: =?utf-8?Q?Hello,=20Bear!?='
[PASSED] Only ASCII characters and UTF-8
Start: Only ASCII characters and Windows-1251 charset
================================================================================
Encoded subject: 'Subject: =?windows-1251?Q?Hello,=20Bear!?='
[PASSED] Only ASCII characters and Windows-1251 charset
Start: Non-ASCII characters and Windows-1251 charset
================================================================================
Encoded subject: 'Subject: =?windows-1251?Q?=CF=F0=E8=E2=E5=F2,=20=CC=E5=E4=E2=E5=E4=FC!=20(?=
=?windows-1251?Q?Hello,=20Bear!)?='
[FAILED] Non-ASCII characters and Windows-1251 charset
------------------------------------------------------------------------