Re: [PHP-DEV] htmlspecialchars iso-2022-jp patch
Hi,

Sorry for the late reply. I reviewed the patch and adjusted its style to follow the coding standards. Attached is the revised version, diff'ed against HEAD. Please verify it. And from now on, please be sure to check the CODING_STANDARDS file included in the source package before submitting a patch.

BTW, your code doesn't seem to handle input whose result might be longer than 256 bytes. IMO an erealloc() call is missing somewhere. As for the rest, I see no obvious problems.

Moriyoshi

Adrian Gartland [EMAIL PROTECTED] wrote:
> New patch applied against the current php4-latest.tar.gz, same location:
> http://support.oregan.net/php/php_htmlspecialchars_iso_2022-jp.patch
>
> On 11 Nov 02, Moriyoshi Koizumi [EMAIL PROTECTED] wrote:
> > Could you make a patch diff'ed against the latest version of html.c in
> > HEAD branch? determine_charset() issue which you pointed out seems to
> > have been fixed already.
> >
> > Moriyoshi
> >
> > Adrian Gartland [EMAIL PROTECTED] wrote:
> > > http://support.oregan.net/php/php_htmlspecialchars_iso_2022-jp.patch
> > >
> > > On 11 Nov 02, Jan Schneider [EMAIL PROTECTED] wrote:
> > > > Quoting Adrian Gartland [EMAIL PROTECTED]:
> > > > > Attached is a patch which allows iso-2022-jp (jis) encoded text
> > > > > to be passed through htmlspecialchars when the character set is
> > > > > set to ISO-2022-JP. It should also fix a tiny bug I found in
> > > > > determine_charset code where len hadn't been set and then doing
> > > > > its charset map walk.
> > > >
> > > > Your attachment didn't go through the mailing list filters. Please
> > > > post a link where the patch can be downloaded.
> > > >
> > > > Jan.
> > > > --
> > > > http://www.horde.org - The Horde Project
> > > > http://www.ammma.de - discover your knowledge
> > > > http://www.tip4all.de - Deine private Tippgemeinschaft
> > > >
> > > > --
> > > > PHP Development Mailing List http://www.php.net/
> > > > To unsubscribe, visit: http://www.php.net/unsub.php
>
> --
> Adrian Gartland - Senior Systems Engineer - TV Portal Team
> Oregan Networks UK Ltd                    Tel: +44 (0) 20 8846 0990
> The White Building, 52-54 Glentham Road   Fax: +44 (0) 20 8646 0999
> Barnes, London. SW13 9JJ, United Kingdom  WWW: http://www.oregan.net/

--- html.c	Mon Nov 18 04:11:27 2002
+++ html.c.next	Tue Nov 19 05:51:43 2002
@@ -18,7 +18,7 @@
    +--+
  */
 
-/* $Id: html.c,v 1.65 2002/11/16 08:30:31 sebastian Exp $ */
+/* $Id: html.c,v 1.63 2002/11/11 13:31:08 moriyoshi Exp $ */
 
 #include "php.h"
 #if PHP_WIN32
@@ -43,7 +43,7 @@
 #endif
 
 enum entity_charset { cs_terminator, cs_8859_1, cs_cp1252,
-					  cs_8859_15, cs_utf_8, cs_big5, cs_gb2312,
+					  cs_8859_15, cs_2022_jp, cs_utf_8, cs_big5, cs_gb2312,
 					  cs_big5hkscs, cs_sjis, cs_eucjp};
 typedef const char *entity_table_t;
 
@@ -288,6 +288,7 @@
 } charset_map[] = {
 	{ "ISO-8859-1", cs_8859_1 },
 	{ "ISO8859-1", cs_8859_1 },
+	{ "ISO-2022-JP",cs_2022_jp },
 	{ "ISO-8859-15",cs_8859_15 },
 	{ "ISO8859-15", cs_8859_15 },
 	{ "utf-8", cs_utf_8 },
@@ -728,8 +729,138 @@
 }
 /* }}} */
 
+/* {{{ next_iso2022_segment
+ * updates whatever psIn is pointing to the end of the multi-byte run
+ * esc$bxesc(by ; psIn =
+ */
+static const char *next_iso2022_segment(const unsigned char **psIn, int iInLen, const char *pcEscapeSafeEnd)
+{
+	const char *sIn = *psIn;
+	const char *pcNextEsc;
+	static const char cEsc = 033;
+	int iSegmentLength;
+	int iRemaining = iInLen;
+
+	pcNextEsc = sIn;
+	if (sIn > pcEscapeSafeEnd) {
+		/* Buffer overrun if we try and spot the escape chars */
+		*psIn = sIn + iInLen;
+		return sIn;
+	} else {
+		while(1) {
+			pcNextEsc++; /* step past the current escape */
+
+			/* search for the closing escape sequence */
+			while (cEsc != *pcNextEsc && iRemaining) {
+				iRemaining--;
+				pcNextEsc++;
+			}
+
+			if (cEsc != *pcNextEsc) {
+				pcNextEsc = NULL;
+			}
+
+
+			if (NULL == pcNextEsc || pcNextEsc > pcEscapeSafeEnd) {
+				*psIn = sIn + iInLen;
+				return sIn;
+			} else {
+				if ('(' == pcNextEsc[1]) {
+					/*End of multi-byte run. */
+
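A note on the 256-byte remark in the review above: escaping can expand the input (one '<' becomes four bytes), so a fixed-size output buffer is not enough. The following is only a minimal sketch of a growable buffer, not code from html.c. The names outbuf, outbuf_init, outbuf_append and escape_html are hypothetical, and plain realloc() stands in for erealloc(), which is only available inside PHP.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer; not taken from html.c. */
typedef struct {
    char  *data;
    size_t len;   /* bytes used, excluding the terminating NUL */
    size_t cap;   /* bytes allocated */
} outbuf;

static int outbuf_init(outbuf *b, size_t initial_cap)
{
    assert(initial_cap > 0);
    b->data = malloc(initial_cap);
    if (b->data == NULL) {
        return -1;
    }
    b->data[0] = '\0';
    b->len = 0;
    b->cap = initial_cap;
    return 0;
}

/* Append n bytes, doubling the allocation until they fit. */
static int outbuf_append(outbuf *b, const char *src, size_t n)
{
    if (b->len + n + 1 > b->cap) {
        size_t new_cap = b->cap;
        char *p;

        while (new_cap < b->len + n + 1) {
            new_cap *= 2;
        }
        p = realloc(b->data, new_cap); /* assumption: erealloc() inside PHP */
        if (p == NULL) {
            return -1;
        }
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}

/* Escape '&', '<' and '>' in a NUL-terminated string.
 * Returns a malloc'ed string the caller must free, or NULL on failure. */
static char *escape_html(const char *in)
{
    outbuf b;

    if (outbuf_init(&b, 256) != 0) {
        return NULL;
    }
    for (; *in != '\0'; in++) {
        const char *src = in;
        size_t n = 1;

        switch (*in) {
        case '&': src = "&amp;"; n = 5; break;
        case '<': src = "&lt;";  n = 4; break;
        case '>': src = "&gt;";  n = 4; break;
        }
        if (outbuf_append(&b, src, n) != 0) {
            free(b.data);
            return NULL;
        }
    }
    return b.data;
}
```

The buffer starts at 256 bytes on purpose, so any input that expands beyond that size goes through the growth path instead of overrunning.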
[PHP-DEV] htmlspecialchars iso-2022-jp patch
Attached is a patch which allows iso-2022-jp (jis) encoded text to be passed through htmlspecialchars when the character set is set to ISO-2022-JP.

It should also fix a tiny bug I found in the determine_charset code, where len had not been set before the charset map walk.
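For readers unfamiliar with the encoding: ISO-2022-JP switches between one-byte and two-byte character sets with escape sequences (ESC $ @ or ESC $ B for two-byte JIS X 0208, ESC ( B or ESC ( J back to a one-byte set). Inside a two-byte run every byte lies in the range 0x21-0x7E, so a byte can coincide with '&', '<' or '>' and must not be escaped. The sketch below illustrates the idea only. It is not the approach taken in the patch, and escape_iso2022jp is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ESC 0x1B

/* Hypothetical helper, not the patch's code: escape '&', '<' and '>' in
 * ISO-2022-JP text, but only while a one-byte character set is selected.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or (size_t)-1 if the output buffer is too small. */
static size_t escape_iso2022jp(const unsigned char *in, size_t in_len,
                               char *out, size_t out_cap)
{
    size_t i = 0, o = 0;
    int double_byte = 0;

    assert(out_cap > 0);
    while (i < in_len) {
        const char *src = (const char *)in + i;
        size_t n = 1;
        int is_escape = 0;

        if (in[i] == ESC && i + 2 < in_len) {
            if (in[i + 1] == '$' && (in[i + 2] == '@' || in[i + 2] == 'B')) {
                double_byte = 1;   /* two-byte JIS X 0208 run starts */
                is_escape = 1;
            } else if (in[i + 1] == '(' && (in[i + 2] == 'B' || in[i + 2] == 'J')) {
                double_byte = 0;   /* back to a one-byte set */
                is_escape = 1;
            }
        }

        if (is_escape) {
            n = 3;                 /* copy the escape sequence verbatim */
        } else if (!double_byte) {
            switch (in[i]) {
            case '&': src = "&amp;"; n = 5; break;
            case '<': src = "&lt;";  n = 4; break;
            case '>': src = "&gt;";  n = 4; break;
            }
        }

        if (o + n >= out_cap) {
            return (size_t)-1;     /* no room for the bytes plus the NUL */
        }
        memcpy(out + o, src, n);
        o += n;
        i += is_escape ? 3 : 1;    /* input bytes consumed */
    }
    out[o] = '\0';
    return o;
}
```

As an example, the kanji at JIS X 0208 code 0x3026 is encoded as the bytes 0x30 0x26, and 0x26 is also ASCII '&'. A converter that ignores the escape sequences would corrupt that character.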
Re: [PHP-DEV] htmlspecialchars iso-2022-jp patch
Quoting Adrian Gartland [EMAIL PROTECTED]:
> Attached is a patch which allows iso-2022-jp (jis) encoded text to be
> passed through htmlspecialchars when the character set is set to
> ISO-2022-JP. It should also fix a tiny bug I found in determine_charset
> code where len hadn't been set and then doing its charset map walk.

Your attachment didn't go through the mailing list filters. Please post a link where the patch can be downloaded.

Jan.

--
http://www.horde.org - The Horde Project
http://www.ammma.de - discover your knowledge
http://www.tip4all.de - Deine private Tippgemeinschaft
Re: [PHP-DEV] htmlspecialchars iso-2022-jp patch
It would be better to try inlining your patch also. I'm very interested in the patch.

Moriyoshi

Adrian Gartland [EMAIL PROTECTED] wrote:
> Attached is a patch which allows iso-2022-jp (jis) encoded text to be
> passed through htmlspecialchars when the character set is set to
> ISO-2022-JP. It should also fix a tiny bug I found in determine_charset
> code where len hadn't been set and then doing its charset map walk.
Re: [PHP-DEV] htmlspecialchars iso-2022-jp patch
http://support.oregan.net/php/php_htmlspecialchars_iso_2022-jp.patch

On 11 Nov 02, Jan Schneider [EMAIL PROTECTED] wrote:
> Quoting Adrian Gartland [EMAIL PROTECTED]:
> > Attached is a patch which allows iso-2022-jp (jis) encoded text to be
> > passed through htmlspecialchars when the character set is set to
> > ISO-2022-JP. It should also fix a tiny bug I found in
> > determine_charset code where len hadn't been set and then doing its
> > charset map walk.
>
> Your attachment didn't go through the mailing list filters. Please post
> a link where the patch can be downloaded.
>
> Jan.
Re: [PHP-DEV] htmlspecialchars iso-2022-jp patch
Could you make a patch diff'ed against the latest version of html.c in the HEAD branch? The determine_charset() issue which you pointed out seems to have been fixed already.

Moriyoshi

Adrian Gartland [EMAIL PROTECTED] wrote:
> http://support.oregan.net/php/php_htmlspecialchars_iso_2022-jp.patch
>
> On 11 Nov 02, Jan Schneider [EMAIL PROTECTED] wrote:
> > Quoting Adrian Gartland [EMAIL PROTECTED]:
> > > Attached is a patch which allows iso-2022-jp (jis) encoded text to
> > > be passed through htmlspecialchars when the character set is set to
> > > ISO-2022-JP. It should also fix a tiny bug I found in
> > > determine_charset code where len hadn't been set and then doing its
> > > charset map walk.
> >
> > Your attachment didn't go through the mailing list filters. Please
> > post a link where the patch can be downloaded.
> >
> > Jan.
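The determine_charset() issue discussed in this thread is about a length variable that was not set before the charset map walk. The sketch below shows the general shape of such a lookup with the length computed first. It is an assumption-laden illustration, not the code in html.c: the enum and table are a small subset modelled on the names that appear in the patch, and lookup_charset and equal_nocase are hypothetical helpers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Subset of the charsets named in the patch; the real enum is larger. */
enum entity_charset { cs_terminator, cs_8859_1, cs_8859_15, cs_2022_jp, cs_utf_8 };

static const struct {
    const char *codeset;
    enum entity_charset charset;
} charset_map[] = {
    { "ISO-8859-1",  cs_8859_1 },
    { "ISO-2022-JP", cs_2022_jp },
    { "ISO-8859-15", cs_8859_15 },
    { "utf-8",       cs_utf_8 },
    { NULL,          cs_terminator }
};

/* Case-insensitive comparison of exactly n bytes. */
static int equal_nocase(const char *a, const char *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i])) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical lookup: map a charset name to its enum value, or return
 * the fallback if the name is not in the table. */
static enum entity_charset lookup_charset(const char *hint,
                                          enum entity_charset fallback)
{
    size_t len, i;

    assert(hint != NULL);
    len = strlen(hint); /* set the length before the walk uses it */

    for (i = 0; charset_map[i].codeset != NULL; i++) {
        /* Compare lengths first so "ISO-8859-1" cannot match "ISO-8859-15". */
        if (len == strlen(charset_map[i].codeset) &&
            equal_nocase(hint, charset_map[i].codeset, len)) {
            return charset_map[i].charset;
        }
    }
    return fallback;
}
```

Comparing the lengths before the bytes is what makes an uninitialised len dangerous: with a garbage value the walk either matches nothing or reads past the end of the name.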
Re: [PHP-DEV] htmlspecialchars iso-2022-jp patch
New patch applied against the current php4-latest.tar.gz, same location:
http://support.oregan.net/php/php_htmlspecialchars_iso_2022-jp.patch

On 11 Nov 02, Moriyoshi Koizumi [EMAIL PROTECTED] wrote:
> Could you make a patch diff'ed against the latest version of html.c in
> HEAD branch? determine_charset() issue which you pointed out seems to
> have been fixed already.
>
> Moriyoshi
>
> Adrian Gartland [EMAIL PROTECTED] wrote:
> > http://support.oregan.net/php/php_htmlspecialchars_iso_2022-jp.patch
> >
> > On 11 Nov 02, Jan Schneider [EMAIL PROTECTED] wrote:
> > > Quoting Adrian Gartland [EMAIL PROTECTED]:
> > > > Attached is a patch which allows iso-2022-jp (jis) encoded text to
> > > > be passed through htmlspecialchars when the character set is set
> > > > to ISO-2022-JP. It should also fix a tiny bug I found in
> > > > determine_charset code where len hadn't been set and then doing
> > > > its charset map walk.
> > >
> > > Your attachment didn't go through the mailing list filters. Please
> > > post a link where the patch can be downloaded.
> > >
> > > Jan.

--
Adrian Gartland - Senior Systems Engineer - TV Portal Team
Oregan Networks UK Ltd                    Tel: +44 (0) 20 8846 0990
The White Building, 52-54 Glentham Road   Fax: +44 (0) 20 8646 0999
Barnes, London. SW13 9JJ, United Kingdom  WWW: http://www.oregan.net/
Re: [PHP-DEV] htmlspecialchars iso-2022-jp patch
Thanks, I'll take a look at it.

Moriyoshi

Adrian Gartland [EMAIL PROTECTED] wrote:
> New patch applied against the current php4-latest.tar.gz, same location:
> http://support.oregan.net/php/php_htmlspecialchars_iso_2022-jp.patch
>
> On 11 Nov 02, Moriyoshi Koizumi [EMAIL PROTECTED] wrote:
> > Could you make a patch diff'ed against the latest version of html.c in
> > HEAD branch? determine_charset() issue which you pointed out seems to
> > have been fixed already.
> >
> > Moriyoshi
> >
> > Adrian Gartland [EMAIL PROTECTED] wrote:
> > > http://support.oregan.net/php/php_htmlspecialchars_iso_2022-jp.patch
> > >
> > > On 11 Nov 02, Jan Schneider [EMAIL PROTECTED] wrote:
> > > > Quoting Adrian Gartland [EMAIL PROTECTED]:
> > > > > Attached is a patch which allows iso-2022-jp (jis) encoded text
> > > > > to be passed through htmlspecialchars when the character set is
> > > > > set to ISO-2022-JP. It should also fix a tiny bug I found in
> > > > > determine_charset code where len hadn't been set and then doing
> > > > > its charset map walk.
> > > >
> > > > Your attachment didn't go through the mailing list filters. Please
> > > > post a link where the patch can be downloaded.
> > > >
> > > > Jan.