Hi,

The obvious problem appears to be the handling of the internal encoding. When a script is written in an encoding that the lexer cannot handle directly, the script is converted into the internal encoding (input_filter) for parsing, and then every string literal is converted back to the original encoding (output_filter). Some of the test cases fail because the internal encoding is not set to an encoding that is bidirectionally convertible from/to the script encoding (ISO-8859-1 against Shift_JIS, for example).
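
To make the round-trip problem concrete, here is a rough sketch in Python (not the engine's C code). The helper name round_trip and the sample string are made up for illustration; the two steps only imitate what input_filter and output_filter do, and errors="replace" is an assumption about how unconvertible characters might be handled.

```python
def round_trip(script_bytes: bytes, script_encoding: str, pivot_encoding: str) -> bytes:
    """Convert script bytes to a pivot encoding and back again.

    Hypothetical helper: the first step stands in for input_filter,
    the second for output_filter.
    """
    text = script_bytes.decode(script_encoding)
    # input_filter analogue: script encoding -> pivot (parsing) encoding
    pivot_bytes = text.encode(pivot_encoding, errors="replace")
    # output_filter analogue: pivot encoding -> original script encoding
    return pivot_bytes.decode(pivot_encoding).encode(script_encoding, errors="replace")


# A string literal as it would appear in a Shift_JIS encoded script.
original = "日本語".encode("shift_jis")

# ISO-8859-1 cannot represent these characters, so the round trip is lossy.
via_latin1 = round_trip(original, "shift_jis", "iso-8859-1")

# UTF-8 can represent every character involved, so the bytes survive intact.
via_utf8 = round_trip(original, "shift_jis", "utf-8")

print(via_latin1 == original)  # False
print(via_utf8 == original)    # True
```

With ISO-8859-1 as the pivot, the literal comes back as replacement characters; with UTF-8 as the pivot, the original bytes are restored.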

I'm gonna make a fix to change that behavior so that the input_filter always converts the script into UTF-8 instead of internal_encoding. I'm also gonna take a closer look at your patch. You basically don't have to adjust the style of the code under libmbfl, as it is a separate library. Bugfixes are always appreciated.

Regards,
Moriyoshi

On Thu, Mar 3, 2011 at 4:44 PM, Dmitry Stogov <dmi...@zend.com> wrote:
> Hi Moriyoshi,
>
> OK, I thought the email was lost, so ignore the email I just resent.
>
> In general I like your patch and I would glad to see it fixed.
>
> I already tried to make some fixes.
> See the attached patch.
>
> Thanks. Dmitry.
>
> On 03/02/2011 11:51 PM, Moriyoshi Koizumi wrote:
>>
>> Hey,
>>
>> I think I can fix it somehow. Please don't be haste with it. I am
>> going to look into it.
>>
>> Moriyoshi
>>
>> On Tue, Mar 1, 2011 at 11:35 PM, Dmitry Stogov<dmi...@zend.com> wrote:
>>>
>>> Hi,
>>>
>>> I'm going to revert Moriyoshi patch from December and some following
>>> fixes.
>>>
>>> I like the idea of the patch, but it just doesn't work as expected.
>>> It breaks 10 tests, but in general it breaks most things related to
>>> Unicode (declare statement, multibyte scripts, exif support for
>>> Unicode, multibyte POST requests).
>>>
>>> I tried to fix it myself, but I just can't understand how it should work
>>> (it's too big). It also has several places where integers messed with
>>> pointers, old API messed with new one and so on.
>>>
>>> I'm going to revert (apply the attached patch) on Thursday.
>>>
>>> Following is the list of failed tests:
>>>
>>> Shift_JIS request [tests/basic/029.phpt]
>>> Testing declare statement with several type values [Zend/tests/declare_001.phpt]
>>> Zend Multibyte and ShiftJIS [Zend/tests/multibyte/multibyte_encoding_001.phpt]
>>> Zend Multibyte and UTF-8 BOM [Zend/tests/multibyte/multibyte_encoding_002.phpt]
>>> Zend Multibyte and UTF-16 BOM [Zend/tests/multibyte/multibyte_encoding_003.phpt]
>>> encoding conversion from script encoding into internal encoding [Zend/tests/multibyte/multibyte_encoding_005.phpt]
>>> 086: bracketed namespace with encoding [Zend/tests/ns_086.phpt]
>>> Check for exif_read_data, Unicode user comment [ext/exif/tests/exif003.phpt]
>>> Check for exif_read_data, Unicode WinXP tags [ext/exif/tests/exif004.phpt]
>>> Test mb_get_info() function [ext/mbstring/tests/mb_get_info.phpt]
>>>
>>> Thanks. Dmitry.
>>>
>
>

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php