Hi Moriyoshi,
OK, I thought the email was lost, so please ignore the one I just re-sent.
In general I like your patch and I would be glad to see it fixed.
I already tried to make some fixes.
See the attached patch.
Thanks. Dmitry.
On 03/02/2011 11:51 PM, Moriyoshi Koizumi wrote:
Hey,
I think I can fix it somehow. Please don't be hasty with it. I am
going to look into it.
Moriyoshi
On Tue, Mar 1, 2011 at 11:35 PM, Dmitry Stogov<dmi...@zend.com> wrote:
Hi,
I'm going to revert Moriyoshi's patch from December and some of the fixes that followed it.
I like the idea of the patch, but it just doesn't work as expected.
It breaks 10 tests, and more broadly it breaks most things related to Unicode
(the declare statement, multibyte scripts, exif support for Unicode, multibyte
POST requests).
I tried to fix it myself, but I just can't understand how it should work
(it's too big). It also has several places where integers are mixed up with
pointers, the old API is mixed with the new one, and so on.
I'm going to revert (apply the attached patch) on Thursday.
The following tests fail:
Shift_JIS request [tests/basic/029.phpt]
Testing declare statement with several type values [Zend/tests/declare_001.phpt]
Zend Multibyte and ShiftJIS [Zend/tests/multibyte/multibyte_encoding_001.phpt]
Zend Multibyte and UTF-8 BOM [Zend/tests/multibyte/multibyte_encoding_002.phpt]
Zend Multibyte and UTF-16 BOM [Zend/tests/multibyte/multibyte_encoding_003.phpt]
encoding conversion from script encoding into internal encoding [Zend/tests/multibyte/multibyte_encoding_005.phpt]
086: bracketed namespace with encoding [Zend/tests/ns_086.phpt]
Check for exif_read_data, Unicode user comment [ext/exif/tests/exif003.phpt]
Check for exif_read_data, Unicode WinXP tags [ext/exif/tests/exif004.phpt]
Test mb_get_info() function [ext/mbstring/tests/mb_get_info.phpt]
Thanks. Dmitry.
Index: ext/exif/exif.c
===================================================================
--- ext/exif/exif.c (revision 308813)
+++ ext/exif/exif.c (working copy)
@@ -2664,13 +2664,13 @@
decode = ImageInfo->decode_unicode_le;
}
if (zend_multibyte_encoding_converter(
- pszInfoPtr,
+ (unsigned char**)pszInfoPtr,
&len,
- szValuePtr,
+ (unsigned char*)szValuePtr,
ByteCount,
- ImageInfo->encode_unicode,
- decode
- TSRMLS_CC) != 0) {
+ zend_multibyte_fetch_encoding(ImageInfo->encode_unicode TSRMLS_CC),
+ zend_multibyte_fetch_encoding(decode TSRMLS_CC)
+ TSRMLS_CC) < 0) {
len = exif_process_string_raw(pszInfoPtr, szValuePtr, ByteCount);
}
return len;
@@ -2684,13 +2684,13 @@
szValuePtr = szValuePtr+8;
ByteCount -= 8;
if (zend_multibyte_encoding_converter(
- pszInfoPtr,
+ (unsigned char**)pszInfoPtr,
&len,
- szValuePtr,
+ (unsigned char*)szValuePtr,
ByteCount,
- ImageInfo->encode_jis,
- ImageInfo->motorola_intel ? ImageInfo->decode_jis_be : ImageInfo->decode_jis_le
- TSRMLS_CC) != 0) {
+ zend_multibyte_fetch_encoding(ImageInfo->encode_jis TSRMLS_CC),
+ zend_multibyte_fetch_encoding(ImageInfo->motorola_intel ? ImageInfo->decode_jis_be : ImageInfo->decode_jis_le TSRMLS_CC)
+ TSRMLS_CC) < 0) {
len = exif_process_string_raw(pszInfoPtr, szValuePtr, ByteCount);
}
return len;
@@ -2723,13 +2723,13 @@
/* Copy the comment */
if (zend_multibyte_encoding_converter(
- &xp_field->value,
+ (unsigned char**)&xp_field->value,
&xp_field->size,
- szValuePtr,
+ (unsigned char*)szValuePtr,
ByteCount,
- ImageInfo->encode_unicode,
- ImageInfo->motorola_intel ? ImageInfo->decode_unicode_be : ImageInfo->decode_unicode_le
- TSRMLS_CC) != 0) {
+ zend_multibyte_fetch_encoding(ImageInfo->encode_unicode TSRMLS_CC),
+ zend_multibyte_fetch_encoding(ImageInfo->motorola_intel ? ImageInfo->decode_unicode_be : ImageInfo->decode_unicode_le TSRMLS_CC)
+ TSRMLS_CC) < 0) {
xp_field->size = exif_process_string_raw(&xp_field->value, szValuePtr, ByteCount);
}
return xp_field->size;
Index: ext/mbstring/tests/mb_encoding_aliases.phpt
===================================================================
--- ext/mbstring/tests/mb_encoding_aliases.phpt (revision 308813)
+++ ext/mbstring/tests/mb_encoding_aliases.phpt (working copy)
@@ -13,26 +13,28 @@
?>
--EXPECTF--
Warning: mb_encoding_aliases() expects exactly 1 parameter, 0 given in %s on line 2
-array(10) {
+array(11) {
[0]=>
string(14) "ANSI_X3.4-1968"
[1]=>
string(14) "ANSI_X3.4-1986"
[2]=>
+ string(7) "IBM-367"
+ [3]=>
string(6) "IBM367"
- [3]=>
+ [4]=>
string(9) "ISO646-US"
- [4]=>
+ [5]=>
string(16) "ISO_646.irv:1991"
- [5]=>
+ [6]=>
string(8) "US-ASCII"
- [6]=>
+ [7]=>
string(5) "cp367"
- [7]=>
+ [8]=>
string(7) "csASCII"
- [8]=>
+ [9]=>
string(8) "iso-ir-6"
- [9]=>
+ [10]=>
string(2) "us"
}
array(0) {
Index: ext/mbstring/mbstring.c
===================================================================
--- ext/mbstring/mbstring.c (revision 308813)
+++ ext/mbstring/mbstring.c (working copy)
@@ -2910,7 +2910,7 @@
string.no_encoding = from_encoding->no_encoding;
} else {
php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to detect character encoding");
- from_encoding = mbfl_no_encoding_pass;
+ from_encoding = &mbfl_encoding_pass;
to_encoding = from_encoding;
string.no_encoding = from_encoding->no_encoding;
}
@@ -3448,7 +3448,7 @@
break;
}
if (elistsz <= 0) {
- from_encoding = mbfl_no_encoding_pass;
+ from_encoding = &mbfl_encoding_pass;
} else if (elistsz == 1) {
from_encoding = *elist;
} else {
@@ -3517,7 +3517,7 @@
if (!from_encoding) {
php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to detect encoding");
- from_encoding = mbfl_no_encoding_pass;
+ from_encoding = &mbfl_encoding_pass;
}
}
if (elist != NULL) {
@@ -3525,7 +3525,7 @@
}
/* create converter */
convd = NULL;
- if (from_encoding != mbfl_no_encoding_pass) {
+ if (from_encoding != &mbfl_encoding_pass) {
convd = mbfl_buffer_converter_new2(from_encoding, to_encoding, 0);
if (convd == NULL) {
php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to create converter");
@@ -4605,7 +4605,7 @@
from_encoding = MBSTRG(http_input_identify);
}
- if (from_encoding == mbfl_no_encoding_pass) {
+ if (from_encoding == &mbfl_encoding_pass) {
return 0;
}
Index: ext/mbstring/libmbfl/mbfl/mbfl_memory_device.c
===================================================================
--- ext/mbstring/libmbfl/mbfl/mbfl_memory_device.c (revision 308813)
+++ ext/mbstring/libmbfl/mbfl/mbfl_memory_device.c (working copy)
@@ -218,7 +218,7 @@
const unsigned char *p;
len = 0;
- p = psrc;
+ p = (const unsigned char*)psrc;
while (*p) {
p++;
len++;
@@ -235,7 +235,7 @@
device->buffer = tmp;
}
- p = psrc;
+ p = (const unsigned char*)psrc;
w = &device->buffer[device->pos];
device->pos += len;
while (len > 0) {
Index: ext/mbstring/libmbfl/filters/mbfilter_ascii.c
===================================================================
--- ext/mbstring/libmbfl/filters/mbfilter_ascii.c (revision 308813)
+++ ext/mbstring/libmbfl/filters/mbfilter_ascii.c (working copy)
@@ -37,7 +37,7 @@
static int mbfl_filt_ident_ascii(int c, mbfl_identify_filter *filter);
-static const char *mbfl_encoding_ascii_aliases[] = {"ANSI_X3.4-1968", "iso-ir-6", "ANSI_X3.4-1986", "ISO_646.irv:1991", "US-ASCII", "ISO646-US", "us", "IBM367", "cp367", "csASCII", NULL};
+static const char *mbfl_encoding_ascii_aliases[] = {"ANSI_X3.4-1968", "iso-ir-6", "ANSI_X3.4-1986", "ISO_646.irv:1991", "US-ASCII", "ISO646-US", "us", "IBM367", "IBM-367", "cp367", "csASCII", NULL};
const mbfl_encoding mbfl_encoding_ascii = {
mbfl_no_encoding_ascii,
Index: ext/mbstring/libmbfl/filters/mbfilter_cp866.c
===================================================================
--- ext/mbstring/libmbfl/filters/mbfilter_cp866.c (revision 308813)
+++ ext/mbstring/libmbfl/filters/mbfilter_cp866.c (working copy)
@@ -37,7 +37,7 @@
static int mbfl_filt_ident_cp866(int c, mbfl_identify_filter *filter);
-static const char *mbfl_encoding_cp866_aliases[] = {"CP866", "CP-866", "IBM-866", NULL};
+static const char *mbfl_encoding_cp866_aliases[] = {"CP866", "CP-866", "IBM-866", "IBM866", NULL};
const mbfl_encoding mbfl_encoding_cp866 = {
mbfl_no_encoding_cp866,
Index: ext/mbstring/libmbfl/filters/mbfilter_cp850.c
===================================================================
--- ext/mbstring/libmbfl/filters/mbfilter_cp850.c (revision 308813)
+++ ext/mbstring/libmbfl/filters/mbfilter_cp850.c (working copy)
@@ -33,7 +33,7 @@
static int mbfl_filt_ident_cp850(int c, mbfl_identify_filter *filter);
-static const char *mbfl_encoding_cp850_aliases[] = {"CP850", "CP-850", "IBM-850", NULL};
+static const char *mbfl_encoding_cp850_aliases[] = {"CP850", "CP-850", "IBM-850", "IBM850", NULL};
const mbfl_encoding mbfl_encoding_cp850 = {
mbfl_no_encoding_cp850,
Index: ext/mbstring/libmbfl/filters/mbfilter_cp5022x.c
===================================================================
--- ext/mbstring/libmbfl/filters/mbfilter_cp5022x.c (revision 308813)
+++ ext/mbstring/libmbfl/filters/mbfilter_cp5022x.c (working copy)
@@ -462,7 +462,7 @@
s = 0x224c;
}
}
- if (s <= 0 || s >= 0x8080 && s < 0x10000) {
+ if (s <= 0 || (s >= 0x8080 && s < 0x10000)) {
int i;
s = -1;
@@ -693,7 +693,7 @@
s = 0x224c;
}
}
- if (s <= 0 || s >= 0x8080 && s < 0x10000) {
+ if (s <= 0 || (s >= 0x8080 && s < 0x10000)) {
int i;
s = -1;
@@ -841,7 +841,7 @@
s = 0x224c;
}
}
- if (s <= 0 || s >= 0x8080 && s < 0x10000) {
+ if (s <= 0 || (s >= 0x8080 && s < 0x10000)) {
int i;
s = -1;
Index: ext/mbstring/mb_gpc.c
===================================================================
--- ext/mbstring/mb_gpc.c (revision 308813)
+++ ext/mbstring/mb_gpc.c (working copy)
@@ -282,7 +282,7 @@
if (info->report_errors) {
php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to detect encoding");
}
- from_encoding = mbfl_no_encoding_pass;
+ from_encoding = &mbfl_encoding_pass;
}
}
Index: Zend/zend_language_scanner_defs.h
===================================================================
--- Zend/zend_language_scanner_defs.h (revision 308813)
+++ Zend/zend_language_scanner_defs.h (working copy)
@@ -1,4 +1,4 @@
-/* Generated by re2c 0.13.5 on Mon Jan 3 06:07:39 2011 */
+/* Generated by re2c 0.13.5 on Mon Feb 28 17:11:13 2011 */
#line 3 "Zend/zend_language_scanner_defs.h"
enum YYCONDTYPE {
Index: Zend/zend_language_scanner.c
===================================================================
--- Zend/zend_language_scanner.c (revision 308813)
+++ Zend/zend_language_scanner.c (working copy)
@@ -1,4 +1,4 @@
-/* Generated by re2c 0.13.5 on Mon Jan 3 06:07:39 2011 */
+/* Generated by re2c 0.13.5 on Mon Feb 28 17:11:13 2011 */
#line 1 "Zend/zend_language_scanner.l"
/*
+----------------------------------------------------------------------+