Re: [PHP-DEV] mbstring and 4.3.0
I completely agree with Wez.

mbstring provides very fundamental functionality for multibyte users. Multibyte users cannot build any useful application without it. We must understand that a great many users work with multibyte character encodings. Multibyte string functions mean nearly the same thing to multibyte users as string functions do to single-byte users. If PHP lacked string functions, who could use PHP?

I think enabling mbstring by default in PHP 4.3 is a very good decision. The 'function overloading' feature in mbstring is there to make multibyte-aware applications easier to write, but it is disabled by default.

I also agree with Wez that an official Zend API for function overloading is needed. I will change the implementation if an official API becomes available.

Rui

On Fri, 8 Nov 2002 10:13:29 +, Wez Furlong wrote:
> I see the known-good codeset conversion implementation as a *very* good reason to have mbstring enabled by default. (Just look at all the problems with iconv and recode on different systems out there).
>
> I agree that the magic features for lazy programmers (function overloading and transparent encoding) are slightly worrying, but they are disabled by default, and as I have said - I don't use those, but I do use the conversion functions and *that* configuration works just fine.
>
> The conversion functions are something that really should be there by default, as it allows people to write portable globalized scripts. Remember that a large majority of users are vhosted and have no control over the build of PHP. By not providing a reliable and portable codeset conversion API, we are holding back the masses from writing (and distributing) killer apps in PHP.
>
> Yes, I can enable mbstring at configure time, and yes, the CJK people can do likewise, but what about the rest of the world running from vhosts when they want to use unicode, quoted-printable, uu-encoding, <name of your favourite encoding here> encodings which are also supported by mbstring?
>
> We took the decision to enable it by default; let's not be short-sighted and disable it primarily out of ignorance (no offence intended).
>
> I've yet to see someone comment on my suggestions for a practical solution that would shut up both myself and the people advocating disabling it.
>
> --Wez.

--
Rui Hirokawa [EMAIL PROTECTED] [EMAIL PROTECTED]

--
PHP Development Mailing List http://www.php.net/
To unsubscribe, visit: http://www.php.net/unsub.php
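The codeset conversion Wez refers to can be illustrated with a short script. This is only a minimal sketch, assuming a PHP build with mbstring loaded and modern PHP syntax (type declarations); the helper name to_utf8 is hypothetical and not part of mbstring.

```php
<?php
// Hypothetical helper (not part of mbstring): normalise text from a
// known legacy encoding to UTF-8 using mb_convert_encoding().
function to_utf8(string $text, string $fromEncoding): string
{
    return mb_convert_encoding($text, 'UTF-8', $fromEncoding);
}

$latin1 = "caf\xE9";                     // "café" encoded as ISO-8859-1
$utf8   = to_utf8($latin1, 'ISO-8859-1');

echo bin2hex($utf8), "\n";               // prints 636166c3a9
```

Because the conversion is done by the extension itself rather than by a system library, the same script should behave the same on any host where mbstring is available, which is the portability argument made above.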
Re: [PHP-DEV] Re: mbstring
On Tue, 3 Sep 2002 07:13:41 +0100, James Cox wrote:
>> I think that the problem is caused by the --enable-mbstr-enc-trans option and not by mbstring itself.
>>
>> If --enable-mbstr-enc-trans is enabled, php_treat_data, the original handler of user input (POST/GET/Cookie), is overridden by mbstr_treat_data in ext/mbstring, the multibyte-enabled version. mbstr_treat_data had a GET handling bug in PHP 4.2.x and PHP 4.1.x. The bug has already been fixed in CVS, but non-multibyte users have no use for mbstr_treat_data anyway.
>>
>> So I removed the --enable-mbstr-enc-trans option in CVS and added a new php.ini option, 'mbstring.encoding_translation', which is the equivalent of --enable-mbstr-enc-trans and is 'Off' by default.
>
> so this is now always on, but controlled by an ini variable? that doesn't sound very good. maybe i'm being paranoid, but it seems to me that you've just enabled it by default (at build time).

No, this option is 'disabled' by default and can be enabled by an ini variable. mbstring.encoding_translation = Off is the default. If mbstring.encoding_translation = On is set in php.ini, the transparent conversion is enabled.

>> I think the user input (POST/GET/Cookie) problems for non-multibyte users will disappear completely with this change. And mbstring should be 'enabled' by default. I18N of PHP is very important, especially for multibyte PHP users.
>>
>> [snip]
>>
>> I agree with Zeev that simplicity of development work is important. So the duplicated code of the user input handler should be merged with the original handler in PHP 5.0.
>
> I agree -- it's very useful.. but i don't think it should exist as an extension, but built in.. the extension just now seems a really messy way of doing it.

Yes, I think mbstring is a very fundamental feature and it should be built in (in PHP 5.0). But the string functions and some other core functions live in ext/standard, which is also part of the extension tree.

Rui
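A script can check at run time whether the transparent input conversion discussed above is switched on. The following is a rough sketch, assuming modern PHP; the helper name ini_flag_enabled is hypothetical, while ini_get() and filter_var() are standard functions.

```php
<?php
// Hypothetical helper: interpret a raw ini_get() result as an on/off flag.
// ini_get() returns a string, or false when the directive is unknown
// (for example when mbstring is not loaded).
function ini_flag_enabled($raw): bool
{
    return filter_var($raw, FILTER_VALIDATE_BOOLEAN);
}

$raw = ini_get('mbstring.encoding_translation');

echo 'mbstring.encoding_translation is ',
    ini_flag_enabled($raw) ? 'On' : 'Off', "\n";
```

With the default described in the message, this should report Off unless the administrator has set the directive to On in php.ini.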
[PHP-DEV] Re: mbstring
I think that the problem is caused by the --enable-mbstr-enc-trans option and not by mbstring itself.

If --enable-mbstr-enc-trans is enabled, php_treat_data, the original handler of user input (POST/GET/Cookie), is overridden by mbstr_treat_data in ext/mbstring, the multibyte-enabled version. mbstr_treat_data had a GET handling bug in PHP 4.2.x and PHP 4.1.x. The bug has already been fixed in CVS, but non-multibyte users have no use for mbstr_treat_data anyway.

So I removed the --enable-mbstr-enc-trans option in CVS and added a new php.ini option, 'mbstring.encoding_translation', which is the equivalent of --enable-mbstr-enc-trans and is 'Off' by default.

I think the user input (POST/GET/Cookie) problems for non-multibyte users will disappear completely with this change. And mbstring should be 'enabled' by default. I18N of PHP is very important, especially for multibyte PHP users.

The core part of mbstring is stable and widely used. In Japan, PHP already has a large user community: more than 20 PHP-related books have been published in Japanese, and the Japanese php-user mailing list has more than 5,000 subscribers. PHP 4.3.0 will also introduce Chinese, Korean (and Russian) support in mbstring, so mbstring will be used more widely by non-single-byte PHP users in the near future.

I agree with Zeev that simplicity of development work is important. So the duplicated code of the user input handler should be merged with the original handler in PHP 5.0.

Rui

On Sun, 1 Sep 2002 10:49:59 +0100, James Cox wrote:
> Phil Copeland @ redhat pointed me at this bug:
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=72752
>
> Seems that there are a number of issues (i'm going to verify and patch his fixes right now). The other he mentions is mbstring seems to cause problems. I have experienced this too.
>
> Guys, i don't want to be mean or sound racist or anything else you throw at me. But mbstring really isn't a core module, and very few people will require kr/zh/ru style encoding.
>
> I vote to remove mbstring as a default module. And for those who say that i could just disable it -- well, the converse is true. Let us STOP burdening default builds with crap that is unlikely to be used.
>
> -- james
[PHP-DEV] Re: 4.2.3
I agree with you: 4.2.3-dev includes some bug fixes for mbstring, and it will take a couple of weeks until PHP 4.3 is released.

On Sat, 17 Aug 2002 13:47:19 +0300, Zeev Suraski wrote:
> I'd like to raise the option of releasing 4.2.3 again. I believe that it would be quite a while before 4.3.0 is out, and there are quite a few fixes in the 4.2 branch that should make it to the userbase as soon as possible, especially the Windows userbase.
>
> I think that releasing 4.2.3 can be done within approximately one week, with one RC, barring unexpected surprises.
>
> Opinions?
>
> Zeev

--
Rui Hirokawa
[PHP-DEV] Re: [PHP-CVS] cvs: php4 /ext/mbstring mbfilter.c mbfilter.h mbregex.c mbstring.c mbstring.h /main rfc1867.c
Thank you for the TSRM fix. I think HTML-ENTITIES is a better name than HTML.

On Fri, 02 Aug 2002 12:37:05 +0200, Marcus Börger wrote:
> At 12:22 02.08.2002, you wrote:
>> helly Fri Aug 2 06:22:31 2002 EDT
>>
>> Modified files:
>> /php4/ext/mbstring mbfilter.c mbfilter.h mbregex.c mbstring.c mbstring.h
>> /php4/main rfc1867.c
>> Log:
>> -use const to clarify code
>> -fix tsrmls build (therefore rfc1867.c)
>
> Rui, you should use TSRM builds by adding one of the --enable-tsrm-XXX configure options.
>
> TSRMLS_C is used in calls where normally no parameter would occur: f(TSRMLS_C). TSRMLS_CC is the same preceded by ',', therefore it is f(param1, param2 TSRMLS_CC). TSRMLS_D and TSRMLS_DC are for function definitions, again the second with an additional ','.
>
> I have just added all the consts but not the HTML encoding stuff, even though it works fine now, because you suggested we give it another name. What about HTML-ENTITIES, with HTML being an alias?
>
> marcus

--
Rui Hirokawa
Re: [PHP-DEV] Re: mbstring and html encode/const structs
Thanks, it's cool!

On Thu, 01 Aug 2002 20:29:12 +0200, Marcus Börger wrote:
> I have spent some more work and now i can decode HTML upon input, too. If you set (arg_separator.input = "|") and (mbstring.internal_encoding = ISO-8859-15) in your ini file you can do something like this: testpage.php?var=&#65;&auml;&euro; and receive $_GET['var'] = 'A' and both &auml; and &euro; decoded. This will work for post, too. So you can upload html pages, decode them and do something with it.

But I could not find any code related to arg_separator.input = "|" in your patch. Does arg_separator.input = "|" have no side effects?

> If you use HTML as output encoding and do foreach($_GET as $idx=>$val) echo "$idx=$val"; you will see the original input again (as expected minus | if any).

I think using HTML as the name of an output encoding is a little confusing, because the output of almost all PHP scripts is HTML. It is also not compatible with multibyte encodings, which is necessary for output encoding conversion.

> But again i have problems if the internal encoding is a multibyte encoding. I have to do some research on the internal handling.
>
> Anyone interested may download the patch:
> http://marcus-boerger.de/php/ext/mbstring/mbstring-entities-const.patch
> And the additional file holding the translation table:
> http://marcus-boerger.de/php/ext/mbstring/html_entities.c
>
> marcus
>
> At 04:11 01.08.2002, Yasuo Ohgaki wrote:
>> Interesting.
>>
>> Marcus Boerger wrote:
>>> Anyone interested may download the patch:
>>> http://marcus.boerger.de/php/ext/mbstring/mbstring-entities-const.patch
>>> And the additional file holding the translation table:
>>> http://marcus.boerger.de/php/ext/mbstring/html_entities.c
>>
>> but I cannot access your web site
>>
>> -- Yasuo Ohgaki

--
Rui Hirokawa
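The kind of entity decoding described in this thread can be tried from a script with the standard html_entity_decode() function. Note that this is a swapped-in technique for illustration: it decodes at script level and is not the input-time mbstring filter from Marcus's patch, and it assumes UTF-8 as the target encoding rather than the ISO-8859-15 used in the mail.

```php
<?php
// Decode one numeric entity and two named entities into UTF-8 text.
// The sample value mirrors the query string used in the mail above.
$raw     = '&#65;&auml;&euro;';
$decoded = html_entity_decode($raw, ENT_QUOTES | ENT_HTML401, 'UTF-8');

echo $decoded, "\n";            // prints Aä€
echo bin2hex($decoded), "\n";   // prints 41c3a4e282ac
```

The hex dump shows why the choice of internal encoding matters: the euro sign becomes three bytes in UTF-8, so any code that handles the decoded value has to be multibyte-aware.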
Re: [PHP-DEV] Re: mbstring and html encode/const structs
I think adding 'const' is a good idea to clarify the code. We should check the new code before the release process of PHP 4.3.0 starts.

Rui

On Fri, 02 Aug 2002 03:08:12 +0200, Marcus Börger wrote:
> Spent some more work and now it works if the internal encoding is UTF-8. So maybe the work is worth a commit in the next days after some further testing. And the question is: with or without const modifiers?
[PHP-DEV] Re: Warnings in ext/mbstring
I fixed these warnings, except for the last one. I could not understand why the last warning in mbstring.c occurs: mbre_free_pattern() is defined in mbregex.h, and mbregex.h is included in mbstring.c.

Rui

On Tue, 30 Apr 2002 06:25:00 +0200, Sebastian Bergmann wrote:
> c:\home\php\php4\ext\mbstring\mbfilter.c(6382): warning C4018: '=': Conflict between signed and unsigned
> c:\home\php\php4\ext\mbstring\mbfilter.c(6751): warning C4018: '=': Conflict between signed and unsigned
> c:\home\php\php4\ext\mbstring\mbregex.c(2207): warning C4018: '=': Conflict between signed and unsigned
> c:\home\php\php4\ext\mbstring\mbregex.c(3619): warning C4018: '=': Conflict between signed and unsigned
> c:\home\php\php4\ext\mbstring\mbregex.c(3624): warning C4018: '=': Conflict between signed and unsigned
> c:\home\php\php4\ext\mbstring\mbregex.c(3740): warning C4018: '=': Conflict between signed and unsigned
> c:\home\php\php4\ext\mbstring\mbregex.c(3794): warning C4018: '=': Conflict between signed and unsigned
> c:\home\php\php4\ext\mbstring\mbregex.c(3807): warning C4018: '=': Conflict between signed and unsigned
> c:\home\php\php4\ext\mbstring\mbregex.c(3819): warning C4018: '=': Conflict between signed and unsigned
> c:\home\php\php4\ext\mbstring\mbregex.c(4432): warning C4244: '=': Conversion from 'long' in 'short', possible loss of data
> c:\home\php\php4\ext\mbstring\mbstring.c(409): warning C4013: 'mbre_free_pattern' undefined
>
> --
> Sebastian Bergmann
> http://sebastian-bergmann.de/ http://phpOpenTracker.de/
> Did I help you? Consider a gift: http://wishlist.sebastian-bergmann.de/
[PHP-DEV] Re: PHP and UTF
Hi,

ext/mbstring currently provides such features. Its primary focus is on multibyte character encodings, but it also supports conversion between UTF-8 and single-byte encodings.

I think support for Unicode (or some multibyte encoding) as the internal character encoding is a desirable feature for PHP 5/ZE2.

Rui

On Mon, 11 Mar 2002 14:22:44 -0800 (PST), Brad Lafountain wrote:
> Has it ever been discussed to make php support different char encodings internally?
>
> - Brad
[PHP-DEV] Re: FW: [PHP-QA] New Windows Binaries
Has this patch for Windows already been applied to the PHP 4_0_7 branch in CVS?

Rui

> Shane and I worked last night to build Windows versions of 4.1.2, and also fix a further vulnerability which exists when you call the cgi directly, for example in cgi with apache, it was possible to call http://example.com/php/php.exe?c:\winnt\repair\sam to get the equivalent of the /etc/passwd file.
>
> We have patched it so it is no longer possible to call it directly, so this vulnerability is at least worked around. Due to the fact that some webservers fix this by default anyway, we have 2 new ini options. (see them in the php.ini in the source). The particular one you'll need to set is cgi.force-redirect (0|1) so that for servers that are not exploitable (eg, IIS) you override the setting.
>
> I hope that made sense, check out the attached binaries... let us know if there are any problems. if not, i'll put them up on the website with detailed (thought-out) install instructions for all those windows users, and add comments to the docs.
>
> Thanks,
> James
Re: [PHP-DEV] Built-in SOAP based Web Services support (wasRe: PHP 5)
ext/xmlrpc in PHP 4.2.0-dev already supports SOAP 1.1. It is still experimental, but it is a good starting point for adding native SOAP support to PHP.

I think a lightweight and fast SOAP implementation is preferable, because SOAP is a really basic layer of infrastructure for Web Services.

--
Rui Hirokawa [EMAIL PROTECTED] [EMAIL PROTECTED]

--
PHP Development Mailing List http://www.php.net/
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]
Re: [PHP-DEV] Proporsal for cascadable general HTTP input handler
The input could be a pointer to an array holding the split and URL-decoded input from POST/GET/Cookie. The output could be an array containing the result, or the handler's return code. These handlers should be invoked in php_treat_data, before php_register_variable_safe().

An example is php_mbstr_encoding_handler() in ext/mbstring.c. Its signature is

    static void php_mbstr_encoding_handler(zval *arg, char *res, char *separator TSRMLS_DC)

but an array pointer would be better, to simplify the handler.

On Sun, 09 Dec 2001 20:21:02 +0200, Zeev Suraski wrote:
> What would be the input/output of these input handlers?
>
> Zeev
>
> At 07:19 09/12/2001, Rui Hirokawa wrote:
>> Hi,
>>
>> I propose a new idea for an HTTP input handler to improve security and multibyte encoding support.
>>
>> Currently, user input from POST/GET/Cookie is handled by the internal function php_treat_variables(). Security-related work to prevent attacks is performed in the PHP script, using htmlspecialchars() and regular expressions. Multibyte encoding detection and translation, which a multibyte-enabled Web application needs, is implemented by overriding php_treat_variables().
>>
>> My idea is to introduce a general input filter/handler for php_treat_variables(). The concept is similar to the output buffering handler. For example, if a user sets input_handler = http_input_check,mb_filter in php.ini, a user-defined security check handler and multibyte encoding translation are performed.
>>
>> In general, checking HTTP input for secure transactions is really hard work, and programmers can make critical mistakes. PHP scripts that contain HTTP input checks are also usually hard to read. With HTTP input handlers, we can implement the HTTP input checks separately from the Web application.

--
Rui Hirokawa
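The cascading idea (array in, array out, handlers applied in order) can be sketched in userland PHP. This is only an illustration under stated assumptions: the proposal concerns a C-level hook inside the engine, whereas run_input_handlers and both closures below are hypothetical names, the input is assumed to be a flat array of strings in EUC-JP, and the syntax requires PHP 7.4 or later.

```php
<?php
// Hypothetical sketch: run each handler in order over the decoded input.
// Every handler receives an array and must return an array.
function run_input_handlers(array $input, array $handlers): array
{
    foreach ($handlers as $handler) {
        $input = $handler($input);
    }
    return $input;
}

// Hypothetical security check: drop NUL bytes and trim surrounding whitespace.
$http_input_check = function (array $input): array {
    return array_map(
        fn ($v) => trim(str_replace("\0", '', (string) $v)),
        $input
    );
};

// Hypothetical encoding filter: assume EUC-JP input, convert to UTF-8.
$mb_filter = function (array $input): array {
    return array_map(
        fn ($v) => mb_convert_encoding((string) $v, 'UTF-8', 'EUC-JP'),
        $input
    );
};

$raw   = ['name' => " \xC6\xFC\xCB\xDC\0 "];   // padded EUC-JP bytes for "日本"
$clean = run_input_handlers($raw, [$http_input_check, $mb_filter]);

echo bin2hex($clean['name']), "\n";            // prints e697a5e69cac
```

The handler order mirrors the php.ini example in the proposal (http_input_check first, then mb_filter). Keeping each handler as a pure array-to-array function is what makes them cascadable.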
[PHP-DEV] Proporsal for cascadable general HTTP input handler
Hi,

I propose a new idea for an HTTP input handler to improve security and multibyte encoding support.

Currently, user input from POST/GET/Cookie is handled by the internal function php_treat_variables(). Security-related work to prevent attacks is performed in the PHP script, using htmlspecialchars() and regular expressions. Multibyte encoding detection and translation, which a multibyte-enabled Web application needs, is implemented by overriding php_treat_variables().

My idea is to introduce a general input filter/handler for php_treat_variables(). The concept is similar to the output buffering handler. For example, if a user sets input_handler = http_input_check,mb_filter in php.ini, a user-defined security check handler and multibyte encoding translation are performed.

In general, checking HTTP input for secure transactions is really hard work, and programmers can make critical mistakes. PHP scripts that contain HTTP input checks are also usually hard to read. With HTTP input handlers, we can implement the HTTP input checks separately from the Web application.

--
Rui Hirokawa
[PHP-DEV] Re: 4.1.0RC1 out
Some tests failed for me as well. For example, 'make test' failed on ext/session/tests/00[1|3|6].phpt:

    php_smart_str.h(74) : Freeing 0x081ABEA4 (284 bytes), script=/home/rui/work/php/php-4.1.0RC1//ext/session/tests/phpt.YnpRbJ

I suspect that there is a memory leak.

Rui

Yasuo Ohgaki wrote:
> Stig S. Bakken wrote:
>> Hi,
>>
>> 4.1.0RC1 is out, download it from http://www.php.net/~ssb/php-4.1.0RC1.tar.gz
>>
>> - Stig
>
> Built and tested under linux 2.4.4/glibc 2.2.2
Re: [PHP-DEV] Branching 4.0.6...
Hi,

I recently fixed a compilation problem in ext/mbstring. I think this problem is critical for the module, so the fix should be applied to the php-4.0.6 branch.

hirokawa Sat May 5 19:44:12 2001 EDT

Modified files:
  /php4/ext/mbstring mbstring.c
Log:
  fixed a compilation problem without --enable-mbstr-enc-trans.

Index: php4/ext/mbstring/mbstring.c
diff -u php4/ext/mbstring/mbstring.c:1.7 php4/ext/mbstring/mbstring.c:1.8
--- php4/ext/mbstring/mbstring.c:1.7 Fri May 4 03:42:54 2001
+++ php4/ext/mbstring/mbstring.c Sat May 5 19:44:12 2001
@@ -16,7 +16,7 @@
 +--+
 */

-/* $Id: mbstring.c,v 1.7 2001/05/04 10:42:54 hirokawa Exp $ */
+/* $Id: mbstring.c,v 1.8 2001/05/06 02:44:12 hirokawa Exp $ */

 /*
  * PHP4 Multibyte String module mbstring (currently only for Japanese)
@@ -73,6 +73,7 @@
 static unsigned char third_and_rest_force_ref[] = { 3, BYREF_NONE, BYREF_NONE, BYREF_FORCE_REST };

+#if defined(MBSTR_ENC_TRANS)
 SAPI_POST_HANDLER_FUNC(php_mbstr_post_handler);

 static sapi_post_entry mbstr_post_entries[] = {
@@ -80,6 +81,7 @@
 { MULTIPART_CONTENT_TYPE, sizeof(MULTIPART_CONTENT_TYPE)-1, sapi_read_standard_form_data, rfc1867_post_handler },
 { NULL, 0, NULL }
 };
+#endif

 function_entry mbstring_functions[] = {
 PHP_FE(mb_internal_encoding, NULL)

--
Rui Hirokawa [EMAIL PROTECTED]
maintainer of japanese PHP manual [EMAIL PROTECTED]
Re: [PHP-DEV] 4.0.6
Andi,

We plan to add ext/jstring, a Japanese string extension module, to php-4.0.6. Is there any problem with adding this module to the CVS tree now?

On Sun, 29 Apr 2001 22:35:43 +0200, Andi Gutmans wrote:
> Guys,
>
> I think that despite the release of 4.0.5 tomorrow we are pretty close to having an RC1 for 4.0.6. Lots of things have been fixed/added since 4.0.5 (check the NEWS file). Can we make a list of things which still need to make it into 4.0.6 before we branch?
>
> Andi

--
Rui Hirokawa
Re: [PHP-DEV] 4.0.6
On Mon, 30 Apr 2001 15:37:14 +0300, Andi Gutmans wrote:
> At 09:35 PM 4/30/2001 +0900, Rui Hirokawa wrote:
>> Andi,
>>
>> We plan to add ext/jstring, a Japanese string extension module, to php-4.0.6. Is there any problem with adding this module to the CVS tree now?
>
> No I don't see a problem with this but please do it quickly. 4.0.6 has already gone a long way since we started RC'ing 4.0.5 and I would like to start RC'ing it pretty soon. You should probably also copy dotnet/EXPERIMENTAL to your directory until it becomes completely stable.

I did.

> By the way, do you think jstring is the right name? Is it only for Japanese?

It includes some functions that are specific to Japanese, but others are not. For example, the module supports encoding conversion between Unicode and other encodings such as ISO-8859-X. Currently it includes encoding conversion filters for Japanese and ISO-8859-X, but it is a relatively easy task to add conversion filters for other languages. I believe this module is a small step toward PHP internationalization.

I think 'jstring' is the right name for now, because at present it exists to provide Japanese string functions. But 'i18n', 'wchar', and 'i18n-ja' are also candidates for the name.

--
Rui Hirokawa
Re: [PHP-DEV] 4.0.6
On Mon, 30 Apr 2001 16:51:15 +0300 (IDT), Stanislav Malyshev wrote:
> RH>> For example, this module supports encoding conversion
> RH>> functionality between Unicode and some other encodings like
> RH>> ISO-8859-X. Currently, it includes encoding conversion filter
>
> Doesn't this duplicate the GNU recode functionality?

ext/jstring adds some unique functionality:

- automatic encoding recognition for Japanese and Unicode encodings
- output encoding translation using output buffering
- encoding translation for HTTP input (POST/GET/Cookie)
- multibyte-compatible string functions such as mbstrlen (a multibyte-enabled strlen())

This functionality is not fully supported by ext/recode or ext/iconv.

> --
> Stanislav Malyshev, Zend Products Engineer
> http://www.zend.com/ +972-3-6139665 ext.115

--
Rui Hirokawa
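The difference between byte-oriented and multibyte-aware string functions, the last item in the list above, can be seen in a few lines. This sketch uses the function names mbstring exposes in current PHP (mb_strlen, mb_substr), which are an assumption relative to the jstring-era names in the mail; the sample string is written with byte escapes so the result does not depend on the file's own encoding.

```php
<?php
// "日本語" in UTF-8: three characters, nine bytes.
$text = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

echo strlen($text), "\n";                             // prints 9 (bytes)
echo mb_strlen($text, 'UTF-8'), "\n";                 // prints 3 (characters)

// mb_substr() cuts on character boundaries, so no character is split.
echo bin2hex(mb_substr($text, 0, 2, 'UTF-8')), "\n";  // prints e697a5e69cac
```

A plain substr($text, 0, 2) would instead return the first two bytes of the first character, which is not valid UTF-8 on its own.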
Re: [PHP-DEV] 4.0.6
On Mon, 30 Apr 2001 17:26:58 +0300, Andi Gutmans wrote:
> At 10:01 PM 4/29/2001 -0400, Sterling Hughes wrote:
>> ext/wchar (wide character support?)
>> ext/mstring (multibyte string functions)
>> ext/jpstring (japanese string functions)
>
> I'd make mstring -> mbstring. The question is if it's worth splitting this up into more than one extension. Probably not. So we should probably be picking out of wchar, mbstring, jpstring. Rui, what do you think?

I think a unified approach is better for PHP i18n than splitting into several modules.

I think mbstring is the better name, although the module also supports single-byte encodings such as ISO-8859-X. Some people might say 'wchar' is the better choice, because the module converts strings to wide characters internally.

If someone wants to add support for another encoding, they should add mbfilter_xx.c and mbfilter_xx.h, where xx stands for a specific language such as ja (Japanese).

Anyway, because I am not the original author of this module, I must discuss renaming it with Mr. Tsukada, the original author of jstring.

--
Rui Hirokawa
Re: [PHP-DEV] 4.0.6
I am also one of the authors of the ext/iconv module. The iconv module is only for encoding translation, whereas jstring (to be renamed mbstring) provides general multibyte string handling functions. Encoding translation is just one piece of mbstring's functionality. I think, rather, that the iconv module should be merged into mbstring in the future.

On Mon, 30 Apr 2001 19:53:11 +0300, Alexander Bokovoy wrote:
> On Mon, Apr 30, 2001 at 10:47:21PM +0900, Rui Hirokawa wrote:
>> On Mon, 30 Apr 2001 15:37:14 +0300, Andi Gutmans wrote:
>>> At 09:35 PM 4/30/2001 +0900, Rui Hirokawa wrote:
>>>> Andi,
>>>>
>>>> We plan to add ext/jstring, a Japanese string extension module, to php-4.0.6. Is there any problem with adding this module to the CVS tree now?
>>>
>>> No I don't see a problem with this but please do it quickly. 4.0.6 has already gone a long way since we started RC'ing 4.0.5 and I would like to start RC'ing it pretty soon. You should probably also copy dotnet/EXPERIMENTAL to your directory until it becomes completely stable.
>>
>> I did.
>>
>>> By the way, do you think jstring is the right name? Is it only for Japanese?
>>
>> It includes some functions that are specific to Japanese, but others are not. For example, the module supports encoding conversion between Unicode and other encodings such as ISO-8859-X. Currently it includes encoding conversion filters for Japanese and ISO-8859-X, but it is a relatively easy task to add conversion filters for other languages. I believe this module is a small step toward PHP internationalization.
>>
>> I think 'jstring' is the right name for now, because at present it exists to provide Japanese string functions. But 'i18n', 'wchar', and 'i18n-ja' are also candidates for the name.
>
> It would be very good to join these features with the existing iconv extension rather than generate new extensions.

--
Rui Hirokawa
Re: [PHP-DEV] Re: [PHP-CVS] cvs: php4 / TODO-4.1.txt
> iconv P
> The iconv library in itself is useful enough that we should try to keep this one within PHP, maybe even integrate it tighter.

I hope so too. iconv plays an important role, especially for handling multibyte encodings.

I am also preparing an extension module called jstring for handling Japanese multibyte strings. This module includes encoding translation between Unicode and some other encodings. It will also be a fundamental tool for Japanese PHP users.

I want to keep tight integration between PHP and the encoding translation functions, for the sake of performance and usability.

--
Rui Hirokawa
Re: [PHP-DEV] iconv patch...
On 09 Mar 2001 18:31:07 +0100, Ondrej Sury wrote:
> I think that UTF-8 is better choice as default encoding ;-)
>
> And because I couldn't compile this module, I have added php_iconv_init_globals and put it in ZEND_INIT_MODULE_GLOBALS. (I was inspired by the mysql module, so this isn't necessarily correct.)

I personally agree with your choice, because I use a multibyte encoding (Japanese) for my native language. But some other modules, like ext/xml, currently use ISO-8859-1 as the default encoding.

You can easily change these encoding settings in php.ini as follows:

    [iconv]
    iconv.internal_encoding = "UTF-8"
    iconv.output_encoding = "UTF-8"

The iconv.input_encoding setting has no effect for now. I plan to add encoding translation between the input encoding and the internal encoding to the HTTP input (POST/GET/Cookie) parser.

--
Rui Hirokawa
[PHP-DEV] about users of japanese localized php3
Hi,

> Hi,
>
> I remember there was a patch around for PHP 3.0.x which added some Japanese support functions. Anyone know how many Japanese actually used these functions and how many are using Vanilla PHP?

As far as I know, most Japanese Linux distributions, such as RedHat-6.2j/7.0j Linux or TurboLinux, ship with Japanese-localized versions of PHP 3.0.x.

As Mr. Sato said, in Japan it is necessary to handle three different encodings: 'shift_jis', 'euc-jp', and 'iso-2022-jp' (or 'utf-8'). Because the exact encoding is unknown when the PHP parser starts, automatic encoding recognition and encoding translation are necessary in the HTTP input (POST/GET/Cookie) parsing process. I think 'Vanilla' PHP 3/4 is not very useful for Japanese users.

I know PHP 3 is obsolete now, but a great many Japanese PHP users continue to use the Japanese-localized PHP 3 (e.g. PHP-3.0.18-i18n-ja).

Currently, I am working on a patch for php4/main/php_variables.c that adds encoding translation using ext/jstring, Mr. Tsukada's module of Japanese character handling functions.

> Does Japanese actually work in a decent way with PHP? From the zillions of Japanese sites I've seen running it I'd guess it works :)
>
> Andi

--
Rui Hirokawa
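The "recognise, then translate" step described above can be approximated in a script. This is a deliberately simple sketch, not the algorithm jstring or mbstring uses internally: the hypothetical helper guess_encoding returns the first candidate in which the bytes are valid, using the standard mb_check_encoding(). Byte sequences that are valid in several encodings remain ambiguous, so the order of the candidate list matters.

```php
<?php
// Hypothetical helper: return the first candidate encoding in which
// the given bytes form a valid string, or null if none matches.
function guess_encoding(string $bytes, array $candidates): ?string
{
    foreach ($candidates as $encoding) {
        if (mb_check_encoding($bytes, $encoding)) {
            return $encoding;
        }
    }
    return null;
}

$candidates = ['UTF-8', 'EUC-JP', 'SJIS'];
$input      = "\xC6\xFC\xCB\xDC\xB8\xEC";   // "日本語" encoded as EUC-JP

$from = guess_encoding($input, $candidates);
if ($from !== null) {
    // Translate the input to a single internal encoding (UTF-8 here).
    $internal = mb_convert_encoding($input, 'UTF-8', $from);
    echo $from, ' -> ', bin2hex($internal), "\n";
}
```

mbstring also ships mb_detect_encoding() for this purpose; the explicit loop is used here only to keep the selection rule visible.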