ID:               43314
 Comment by:       d_kelsey at uk dot ibm dot com
 Reported By:      wiela at centras dot lt
 Status:           Open
 Bug Type:         ICONV related
 Operating System: Windows XP HE
 PHP Version:      5.2.5
 New Comment:

I encountered a similar problem with another UTF-8 string. Although
this may not be the best way to fix it, the following change provides a
workaround.

In iconv.c (line 1281 in PHP 5.2.5), the line
out_size -= ((nbytes_required - (char_cnt - 2)) + 1) / (3 - 1);

should be changed to
out_size -= ((nbytes_required - (char_cnt - 2)) + 1) / 3;

It looks like the code attempts to determine how many characters would
fit into the output buffer when converted (given that it has gone over
the limit), but it assumes that on average each character uses 2 bytes
(i.e. an even mixture of encoded and printable characters). Many strings
need more than this, so the amount subtracted from out_size can be
larger than out_size itself. Because out_size is unsigned, the result
wraps around to a very large positive number instead of going negative.
The workaround is to take the worst-case scenario and assume that every
character generates 3 bytes (i.e. all characters are encoded).
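
The wrap-around described above can be reproduced in isolation. The
following is only a sketch: shrink_out_size() is a hypothetical helper,
not a function from iconv.c, the use of size_t for every operand is an
assumption, and the input values in the note below are made up for
illustration rather than taken from a real failing call.

```c
#include <stddef.h>

/*
 * Hypothetical helper mirroring the arithmetic of the line quoted above.
 * Parameter names follow that line; treating all of them as size_t is an
 * assumption. Pass divisor = 2 for the original "(3 - 1)" and divisor = 3
 * for the proposed workaround.
 */
size_t shrink_out_size(size_t out_size, size_t nbytes_required,
                       size_t char_cnt, size_t divisor)
{
    size_t excess = ((nbytes_required - (char_cnt - 2)) + 1) / divisor;

    /* Unsigned subtraction wraps around instead of producing a
     * negative value when excess > out_size. */
    return out_size - excess;
}
```

With the made-up inputs out_size = 10, nbytes_required = 40 and
char_cnt = 12, a divisor of 2 subtracts 15 and the result wraps to
(size_t)-5, while a divisor of 3 subtracts 10 and leaves 0.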


Previous Comments:
------------------------------------------------------------------------

[2007-11-16 16:23:17] wiela at centras dot lt

Description:
------------
iconv_mime_encode() with the 'Q' encoding scheme isn't reliable and
sometimes (for particular character and/or string length combinations?)
raises "Notice: iconv_mime_encode(): Unknown error (7) in ..."
*without returning any result*.

This also applies to earlier PHP versions (tested with PHP 5.2.1).



Reproduce code:
---------------
$preferences = array(
    "input-charset" => "UTF-8",
    "output-charset" => "UTF-8",
    "line-length" => 76,
    "line-break-chars" => "\n",
    "scheme" => "Q"
);

// $str1 results in an error. It's a UTF-8 string; its base64_encode() is:
//'xIXEjcSZxJfEr8WhxbPFviDEr8SZxI3FocWzxJnEr8SFIMSNxJnFs8SFxaHFs8Wr'
$str1 = "ąčęėįšųž įęčšųęįą čęųąšųū";

// $str2 doesn't result in an error, although it's only one character
// shorter. It's a UTF-8 string; its base64_encode() is:
//'xIXEjcSZxJfEr8WhxbPFviDEr8SZxI3FocWzxJnEr8SFIMSNxJnFs8SFxaHFsw=='
$str2 = "ąčęėįšųž įęčšųęįą čęųąšų";

echo iconv_mime_encode("Subject", $str1, $preferences);
echo iconv_mime_encode("Subject", $str2, $preferences);


Expected result:
----------------
At the very least, *some* result is expected, without any
errors or warnings.

Expected for $str1:
Subject: =?UTF-8?Q?=C4=85=C4=8D=C4=99=C4=97=C4=AF=C5=A1=C5=B3?=
 =?UTF-8?Q?=C5=BE=20=C4=AF=C4=99=C4=8D=C5=A1=C5=B3=C4=99=C4=AF?=
 =?UTF-8?Q?=C4=85=20=C4=8D=C4=99=C5=B3=C4=85=C5=A1=C5=B3=C5=AB?=

Expected for $str2:
Subject: =?UTF-8?Q?=C4=85=C4=8D=C4=99=C4=97=C4=AF=C5=A1=C5=B3?=
 =?UTF-8?Q?=C5=BE=20=C4=AF=C4=99=C4=8D=C5=A1=C5=B3=C4=99=C4=AF?=
 =?UTF-8?Q?=C4=85=20=C4=8D=C4=99=C5=B3=C4=85=C5=A1=C5=B3?=

Actual result:
--------------
For $str1: 
FALSE with "Notice: iconv_mime_encode(): Unknown error (7) in ..."


For $str2:
Subject: =?UTF-8?Q?=C4=85=C4=8D=C4=99=C4=97=C4=AF=C5=A1=C5=B3?=
 =?UTF-8?Q?=C5=BE=20=C4=AF=C4=99=C4=8D=C5=A1=C5=B3=C4=99=C4=AF?=
 =?UTF-8?Q?=C4=85=20=C4=8D=C4=99=C5=B3=C4=85=C5=A1=C5=B3?=


------------------------------------------------------------------------

