Re: [PHP-DEV] mbstring and 4.3.0

2002-11-08 Thread Rui Hirokawa

I completely agree with Wez.
mbstring provides very fundamental functionality for multibyte users.
Multibyte users cannot build any useful application without mbstring.
We must understand that there are a great many users who use multibyte
character encodings.
Multibyte string functions mean nearly the same to multibyte users
as string functions do to single-byte users.
If PHP lacked string functions, who could use PHP?

I think enabling mbstring by default in PHP 4.3 is a very good decision.

'Function overloading' in mbstring exists to make writing multibyte-aware
applications easier, but it is disabled by default.
I also agree with Wez that an official Zend API for function overloading
is needed. I will change the implementation once an official API
is available.

Rui

On Fri,  8 Nov 2002 10:13:29 +
[EMAIL PROTECTED] (Wez Furlong) wrote:

 I see the known-good codeset conversion implementation as a *very* good
 reason to have mbstring enabled by default.
 (Just look at all the problems with iconv and recode on different systems
 out there).
 
 I agree that the magic features for lazy programmers (function overloading
 and transparent encoding) are slightly worrying, but they are disabled
 by default, and as I have said - I don't use those, but I do use the
 conversion functions and *that* configuration works just fine.
 
 The conversion functions are something that really should be there by
 default, as it allows people to write portable globalized scripts.
 Remember that a large majority of users are vhosted and have no control
 over the build of PHP.  By not providing a reliable and portable
 codeset conversion API, we are holding back the masses from writing
 (and distributing) killer apps in PHP.
 
 Yes, I can enable mbstring at configure time, and yes, the CJK people
 can do likewise, but what about the rest of the world running from vhosts
 when they want to use unicode, quoted-printable, uu-encoding, <name of your
 favourite encoding here> encodings which are also supported by mbstring?
 
 We took the decision to enable it by default; let's not be short-sighted
 and disable it primarily out of ignorance (no offence intended).
 
 I've yet to see someone comment on my suggestions for a practical solution
 that would shut up both myself and the people advocating disabling it.
 
 --Wez.

-- 
-
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]

-- 
PHP Development Mailing List http://www.php.net/
To unsubscribe, visit: http://www.php.net/unsub.php




Re: [PHP-DEV] Re: mbstring

2002-09-03 Thread Rui Hirokawa



On Tue, 3 Sep 2002 07:13:41 +0100
James Cox [EMAIL PROTECTED] wrote:

 
  I think that the problem is caused by the
  --enable-mbstr-enc-trans option and not by mbstring itself.
 
  If --enable-mbstr-enc-trans is enabled,
  php_treat_data, the original handler of user input (POST/GET/Cookie),
  is overridden by mbstr_treat_data in ext/mbstring,
  the multibyte-enabled version.
 
  mbstr_treat_data had a GET handling bug in PHP 4.2.x/PHP 4.1.x.
  The bug has already been fixed in CVS,
  but mbstr_treat_data is not used by non-multibyte users.
 
  So I removed the --enable-mbstr-enc-trans option in CVS
  and added a new php.ini option, 'mbstring.encoding_translation',
  which is the equivalent of --enable-mbstr-enc-trans
  and which is 'Off' by default.
 
 so this is now always on, but controlled by an ini variable? that doesn't
 sound very good. maybe i'm being paranoid, but it seems to me that you've
 just enabled it by default (at build time).

No, this option is 'disabled' by default and can be enabled by an ini variable.

mbstring.encoding_translation = Off is the default.

If mbstring.encoding_translation = On is set in php.ini,
the transparent conversion will be enabled.

 
  I think the user input (POST/GET/Cookie) related problems
  for non-multibyte users will completely disappear with this change.
 
  And
  mbstring should be 'enabled' by default.
  I18N of PHP is very important, especially for multibyte PHP users.
  [snip]
  I agree with Zeev that simplicity of development work is important.
  So the duplicated code of the user input handler should be merged
  with the original handler in PHP 5.0.
 
 
 I agree -- it's very useful.. but i don't think it should exist as an
 extension, but built in.. the extension just now seems a really messy way of
 doing it.

Yes, I think mbstring is a very fundamental feature and should be a built-in
feature (in PHP 5.0).
But string-related functions and some other core functions
live in ext/standard, which is also part of the extensions tree.

Rui

-- 
-
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]





[PHP-DEV] Re: mbstring

2002-09-02 Thread Rui Hirokawa


I think that the problem is caused by the
--enable-mbstr-enc-trans option and not by mbstring itself.

If --enable-mbstr-enc-trans is enabled,
php_treat_data, the original handler of user input (POST/GET/Cookie),
is overridden by mbstr_treat_data in ext/mbstring,
the multibyte-enabled version.

mbstr_treat_data had a GET handling bug in PHP 4.2.x/PHP 4.1.x.
The bug has already been fixed in CVS,
but mbstr_treat_data is not used by non-multibyte users.

So I removed the --enable-mbstr-enc-trans option in CVS
and added a new php.ini option, 'mbstring.encoding_translation',
which is the equivalent of --enable-mbstr-enc-trans
and which is 'Off' by default.

I think the user input (POST/GET/Cookie) related problems
for non-multibyte users will completely disappear with this change.

And
mbstring should be 'enabled' by default.
I18N of PHP is very important, especially for multibyte PHP users.

The core part of mbstring is stable and widely used.

In Japan, PHP already has a large user community.
More than 20 PHP-related books have been published in Japanese,
and there are more than 5,000 subscribers to the Japanese php-user
mailing list.

And PHP 4.3.0 will introduce Chinese, Korean (and Russian) support
in mbstring.
mbstring will be used more widely by non-single-byte PHP users
in the near future.

I agree with Zeev that simplicity of development work is important.
So the duplicated code of the user input handler should be merged
with the original handler in PHP 5.0.

Rui

On Sun, 1 Sep 2002 10:49:59 +0100
[EMAIL PROTECTED] (James Cox) wrote:

 Phil Copeland @ redhat pointed me at this bug:
 
 https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=72752
 
 Seems that there are a number of issues (i'm going to verify and patch his
 fixes right now).
 
 The other he mentions is mbstring seems to cause problems. I have
 experienced this too.
 
 Guys, i don't want to be mean or sound racist or anything else you throw at
 me.
 
 But mbstring really isn't a core module, and very few people will require
 kr/zh/ru style encoding.
 
 I vote to remove mbstring as a default module.
 
 And for those who say that i could just disable it -- well, the converse is
 true. Let us STOP burdening default builds with crap that is unlikely to be
 used.
 
  -- james

-- 
-
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]





[PHP-DEV] Re: 4.2.3

2002-08-17 Thread Rui Hirokawa


I agree with you,
because 4.2.3dev includes some bug fixes for mbstring, and
it will take a couple of weeks until php-4.3 is released.


On Sat, 17 Aug 2002 13:47:19 +0300
[EMAIL PROTECTED] (Zeev Suraski) wrote:

 I'd like to raise the option of releasing 4.2.3 again.  I believe that it 
 would be quite a while before 4.3.0 is out, and there are quite a few fixes 
 in the 4.2 branch that should make the userbase as soon as possible, 
 especially the Windows userbase.
 I think that releasing 4.2.3 can be done within approximately one week, 
 with one RC, barring unexpected surprises.
 Opinions?
 
 Zeev
 


-- 
-
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]





[PHP-DEV] Re: [PHP-CVS] cvs: php4 /ext/mbstring mbfilter.c mbfilter.h mbregex.c mbstring.c mbstring.h /main rfc1867.c

2002-08-02 Thread Rui Hirokawa

Thank you for the TSRM fix.
I think html-entities is better than html.


On Fri, 02 Aug 2002 12:37:05 +0200
[EMAIL PROTECTED] (Marcus Börger) wrote:

 At 12:22 02.08.2002, you wrote:
 helly   Fri Aug  2 06:22:31 2002 EDT
 
Modified files:
  /php4/ext/mbstring  mbfilter.c mbfilter.h mbregex.c mbstring.c
  mbstring.h
  /php4/main  rfc1867.c
Log:
-use const to clarify code
-fix tsrmls build (therefore rfc1867.c)
 
 
 Rui,
 
 you should use TSRM builds by adding one of the --enable-tsrm-XXX configure
 options.
 TSRMLS_C is used in calls where normally no parameter would occur:
 f(TSRMLS_C)
 TSRMLS_CC is the same preceded by ',', therefore it is f(param1, param2
 TSRMLS_CC)
 TSRMLS_D and TSRMLS_DC are for function definitions, again the second with
 an additional ','.
 
 I have just added all the consts but not the HTML encoding stuff, even
 though it works fine
 now, because you suggested we give it another name. What about
 HTML-ENTITIES with
 HTML being an alias?
 
 marcus
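
The macro pattern Marcus describes above can be sketched in plain C. This is
only an illustration under stated assumptions: request_ctx and the CTX_*
macros are hypothetical stand-ins chosen for this example and are not the
real definitions from TSRM.h, which pass a different context type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread context; the real TSRM passes a different type. */
typedef struct request_ctx { int request_id; } request_ctx;

/* Stand-ins mirroring the TSRMLS_D / TSRMLS_DC / TSRMLS_C / TSRMLS_CC pattern. */
#define CTX_D   request_ctx *ctx   /* definition, no other parameters      */
#define CTX_DC  , CTX_D            /* definition, after other parameters   */
#define CTX_C   ctx                /* call, no other arguments             */
#define CTX_CC  , CTX_C            /* call, after other arguments          */

/* Defined with CTX_D: the context is the only parameter. */
static int current_request(CTX_D)
{
    assert(ctx != NULL);
    return ctx->request_id;
}

/* Defined with CTX_DC: the context follows the ordinary parameters. */
static int add_to_request(int a, int b CTX_DC)
{
    /* ctx is in scope here, so the nested call passes it with CTX_C. */
    return a + b + current_request(CTX_C);
}

/* A caller that forwards its own context after other arguments. */
static int demo(CTX_D)
{
    return add_to_request(1, 2 CTX_CC);
}
```

The point of the comma-carrying variants is that a non-thread-safe build can
define all four macros as empty and the same source still compiles.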


-- 
-
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]



Re: [PHP-DEV] Re: mbstring and html encode/const structs

2002-08-01 Thread Rui Hirokawa

Thanks,
It's cool!

On Thu, 01 Aug 2002 20:29:12 +0200
[EMAIL PROTECTED] (Marcus Börger) wrote:

 I have spent some more work and now i can decode HTML upon input, too.
 
 If you set (arg_separator.input = "|") and (mbstring.internal_encoding = 
 ISO-8859-15) in your ini file
 you can do something like this:
 testpage.php?var=&#65;&auml;&euro;
 and receive $_GET['VAR'] = 'A' and both &auml; and &euro; decoded.
 This will work for post, too. So you can upload html pages decode them and 
 do something with it.

But I couldn't find any code related to arg_separator.input = "|" in
your patch.
Doesn't arg_separator.input = "|" have any side effects?

 If you use HTML as output encoding and do foreach($_GET as $idx=>$val) echo 
 $idx=$val; you will
 see the original input again (as expected minus | if any).

I think using HTML as the name of an output encoding is a little
confusing because almost all PHP script output is HTML.
And it is not compatible with multibyte encodings,
which are the ones that need output encoding conversion.

 
 But again i have problems if the internal encoding is a multibyte encoding. I 
 have to spend some
 research on internal handling
 
 Anyone interested may download the patch: 
 http://marcus-boerger.de/php/ext/mbstring/mbstring-entities-const.patch
 And the additional file holding the translation table: 
 http://marcus-boerger.de/php/ext/mbstring/html_entities.c
 
 marcus
 
 At 04:11 01.08.2002, Yasuo Ohgaki wrote:
 Interesting.
 
 Marcus Boerger wrote:
 Anyone interested may download the patch: 
 http://marcus.boerger.de/php/ext/mbstring/mbstring-entities-const.patch
 And the additional file holding the translation table: 
 http://marcus.boerger.de/php/ext/mbstring/html_entities.c
 
 but I cannot access to your web site
 
 
 --
 Yasuo Ohgaki
 
 


-- 
-----
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]



Re: [PHP-DEV] Re: mbstring and html encode/const structs

2002-08-01 Thread Rui Hirokawa


I think adding 'const' is a good idea to clarify the code.
We should check the new code before the PHP 4.3.0 release process.

Rui

On Fri, 02 Aug 2002 03:08:12 +0200
[EMAIL PROTECTED] (Marcus Börger) wrote:

 Spent some more work and now it works if the internal encoding is
 UTF-8. So maybe the work is worth a commit in the next days after some
 further testing. And the question is with or without const modifiers?
 


-- 
-
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]



[PHP-DEV] Re: Warnings in ext/mbstring

2002-04-30 Thread Rui Hirokawa


I fixed these warnings except the last one.
I couldn't understand why the last warning in mbstring.c occurs.
mbre_free_pattern() is defined in mbregex.h, and mbregex.h is included in mbstring.c.
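
For context on the C4018 warnings quoted below: MSVC emits C4018 when a signed
and an unsigned value are compared, because the signed operand is converted to
unsigned first. A minimal sketch of why that matters, assuming int and
unsigned int have the same width; the function names are made up for this
example and do not come from mbfilter.c or mbregex.c.

```c
#include <assert.h>

/* What the compiler effectively does for `a < b` when a is int and b is
 * unsigned int: a is converted to unsigned, so -1 becomes UINT_MAX. */
static int less_than_implicit(int a, unsigned int b)
{
    return (unsigned int)a < b;
}

/* Sign-safe version: a negative value is always less than an unsigned one. */
static int less_than_safe(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}

/* Small self-check showing where the two versions disagree. */
static void check_comparisons(void)
{
    assert(less_than_implicit(-1, 1u) == 0); /* surprising: -1 < 1 is false */
    assert(less_than_safe(-1, 1u) == 1);
    assert(less_than_safe(0, 1u) == 1);
    assert(less_than_safe(2, 1u) == 0);
}
```

Whether a given warning site needs the guarded form or just a cast depends on
whether the signed operand can ever be negative at that point.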

Rui

On Tue, 30 Apr 2002 06:25:00 +0200
[EMAIL PROTECTED] (Sebastian Bergmann) wrote:

   c:\home\php\php4\ext\mbstring\mbfilter.c(6382): warning C4018: '=':
   Conflict between signed and unsigned
   c:\home\php\php4\ext\mbstring\mbfilter.c(6751): warning C4018: '=':
   Conflict between signed and unsigned
 
   c:\home\php\php4\ext\mbstring\mbregex.c(2207): warning C4018: '=':
   Conflict between signed and unsigned
   c:\home\php\php4\ext\mbstring\mbregex.c(3619): warning C4018: '=':
   Conflict between signed and unsigned
   c:\home\php\php4\ext\mbstring\mbregex.c(3624): warning C4018: '=':
   Conflict between signed and unsigned
   c:\home\php\php4\ext\mbstring\mbregex.c(3740): warning C4018: '=':
   Conflict between signed and unsigned
   c:\home\php\php4\ext\mbstring\mbregex.c(3794): warning C4018: '=':
   Conflict between signed and unsigned
   c:\home\php\php4\ext\mbstring\mbregex.c(3807): warning C4018: '=':
   Conflict between signed and unsigned
   c:\home\php\php4\ext\mbstring\mbregex.c(3819): warning C4018: '=':
   Conflict between signed and unsigned
   c:\home\php\php4\ext\mbstring\mbregex.c(4432): warning C4244: '=':
   Conversion from 'long' in 'short', possible loss of data
 
   c:\home\php\php4\ext\mbstring\mbstring.c(409): warning C4013:
   'mbre_free_pattern' undefined
 
 -- 
   Sebastian Bergmann
   http://sebastian-bergmann.de/ http://phpOpenTracker.de/
 
   Did I help you? Consider a gift: http://wishlist.sebastian-bergmann.de/


-- 
-
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]





[PHP-DEV] Re: PHP and UTF

2002-03-21 Thread Rui Hirokawa


Hi,

Currently ext/mbstring has such features.
Its primary focus is on multibyte character encodings,
but it also supports conversion between UTF-8 and single-byte encodings.

I think support for Unicode (or some multibyte encoding) as the internal
character encoding is a desirable feature for PHP 5/ZE2.

Rui

On Mon, 11 Mar 2002 14:22:44 -0800 (PST)
[EMAIL PROTECTED] (Brad Lafountain) wrote:

 Has it ever been discussed to make php
 support different char encodings internally? 
 
 
  - Brad
 


-- 
-
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]





[PHP-DEV] Re: FW: [PHP-QA] New Windows Binaries

2002-03-01 Thread Rui Hirokawa


Has this patch for Windows already been applied
to CVS's PHP 4_0_7 branch?

Rui

 Shane and I worked last night to build Windows versions of 4.1.2, and also
 fix a further vulnerability which exists when you call the cgi directly, for
 example in cgi with apache, it was possible to call
 http://example.com/php/php.exe?c:\winnt\repair\sam to get the equivalent of
 the /etc/passwd file.
 
 We have patched it so it is no longer possible to call it directly, so this
 vulnerability is at least worked around.
 
 Due to the fact that some webservers fix this by default anyway, we have 2
 new ini options. (see them in the php.ini in the source).
 
 The particular one you'll need to set is cgi.force_redirect (0|1) so that
 for servers that are not exploitable (eg, IIS) you override the setting.
 
 I hope that made sense, check out the attached binaries... let us know if
 there are any problems. if not, i'll put them up on the website with
 detailed (thought out) install instructions for all those windows users,
 and add comments to the docs.
 
 Thanks,
 
 James


-- 
-
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]





Re: [PHP-DEV] Built-in SOAP based Web Services support (wasRe: PHP 5)

2002-01-02 Thread Rui Hirokawa


ext/xmlrpc in PHP 4.2.0dev already supports SOAP 1.1.

It is still experimental, but it is a
good starting point for adding native SOAP support to PHP.

I think a lightweight and fast SOAP implementation is preferable
because SOAP is a really basic layer/infrastructure for Web Services.


-- 
-
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]






Re: [PHP-DEV] Proposal for cascadable general HTTP input handler

2001-12-09 Thread Rui Hirokawa


The input could be a pointer to an array holding the split and URL-decoded
POST/GET/Cookie input.

The output could be an array containing the result or return code of
the handler.

These handlers should be invoked in php_treat_data before
php_register_variable_safe().

An example is php_mbstr_encoding_handler() in ext/mbstring.c.
Its signature is:

static void
php_mbstr_encoding_handler(zval *arg, char *res, char *separator TSRMLS_DC)

But passing an array pointer would be better, to simplify the handler.


On Sun, 09 Dec 2001 20:21:02 +0200
Zeev Suraski [EMAIL PROTECTED] wrote:

 What would be the input/output of these input handlers?
 
 Zeev
 
 At 07:19 09/12/2001, Rui Hirokawa wrote:
 
 Hi,
 
 I propose a new idea for an HTTP input handler to improve security and
 multibyte encoding support.
 
 Currently, user input from POST/GET/Cookie is processed by the
 internal function php_treat_variables().
 
 Some security-related work to prevent attacks
 is performed in PHP scripts with htmlspecialchars() and regex().
 
 And the multibyte encoding detection and translation that is necessary
 for multibyte-enabled Web applications is implemented by
 overriding php_treat_variables().
 
 My idea is to introduce a general input filter/handler mechanism
 for php_treat_variables().
 
 It is a similar concept to the output buffering handler.
 
 For example, if a user defines
 
 input_handler = http_input_check,mb_filter
 
 in php.ini, the user-defined security check handler and
 multibyte encoding translation are performed.
 
 Generally, checking HTTP input for secure transactions is really
 hard work, and some programmers might make critical mistakes.
 And PHP scripts with HTTP input checks are usually hard to read.
 
 If we can use an HTTP input handler, we can implement the
 HTTP input check and the Web application separately.
 
 --
 -
 Rui Hirokawa [EMAIL PROTECTED]
   [EMAIL PROTECTED]
 
 


-- 
-
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]






[PHP-DEV] Proposal for cascadable general HTTP input handler

2001-12-08 Thread Rui Hirokawa


Hi,

I propose a new idea for an HTTP input handler to improve security and
multibyte encoding support.

Currently, user input from POST/GET/Cookie is processed by the
internal function php_treat_variables().

Some security-related work to prevent attacks
is performed in PHP scripts with htmlspecialchars() and regex().

And the multibyte encoding detection and translation that is necessary
for multibyte-enabled Web applications is implemented by
overriding php_treat_variables().

My idea is to introduce a general input filter/handler mechanism
for php_treat_variables().

It is a similar concept to the output buffering handler.

For example, if a user defines

input_handler = http_input_check,mb_filter

in php.ini, the user-defined security check handler and
multibyte encoding translation are performed.

Generally, checking HTTP input for secure transactions is really
hard work, and some programmers might make critical mistakes.
And PHP scripts with HTTP input checks are usually hard to read.

If we can use an HTTP input handler, we can implement the
HTTP input check and the Web application separately.
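
The cascading idea can be sketched as an ordered array of function pointers
that each see the value in turn. This is a rough illustration only: the
handler type, run_input_handlers(), and toy_filter() are assumptions made up
for this sketch, and nothing here is an existing PHP or Zend API.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler type: edits the value in place and returns 0 to
 * accept the input or non-zero to reject it. */
typedef int (*input_handler_fn)(char *value, size_t cap);

/* Illustrative security check: reject values containing '<' or '>'. */
static int http_input_check(char *value, size_t cap)
{
    (void)cap;
    return strpbrk(value, "<>") != NULL;
}

/* Illustrative stand-in for an encoding filter: upper-cases ASCII letters.
 * A real mb_filter would convert between character encodings instead. */
static int toy_filter(char *value, size_t cap)
{
    for (size_t i = 0; i < cap && value[i] != '\0'; i++)
        value[i] = (char)toupper((unsigned char)value[i]);
    return 0;
}

/* Run every handler in order; stop at the first one that rejects. */
static int run_input_handlers(const input_handler_fn *chain, size_t n,
                              char *value, size_t cap)
{
    assert(chain != NULL && value != NULL);
    for (size_t i = 0; i < n; i++) {
        int rc = chain[i](value, cap);
        if (rc != 0)
            return rc;
    }
    return 0;
}
```

With a chain of { http_input_check, toy_filter }, a rejected value never
reaches the later handlers, which is the property that lets the security
check live apart from the application code.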

-- 
-
Rui Hirokawa [EMAIL PROTECTED]
 [EMAIL PROTECTED]






[PHP-DEV] Re: 4.1.0RC1 out

2001-10-19 Thread Rui Hirokawa

Some tests also failed for me.

For example, 'make test' failed on ext/session/tests/00[1|3|6].phpt:

php_smart_str.h(74) :  Freeing 0x081ABEA4 (284 bytes), 
script=/home/rui/work/php/php-4.1.0RC1//ext/session/tests/phpt.YnpRbJ

I suspect that there is a memory leak.

Rui


Yasuo Ohgaki wrote:

 Stig S. Bakken wrote:
 
 Hi,

 4.1.0RC1 is out, download it from
 http://www.php.net/~ssb/php-4.1.0RC1.tar.gz

  - Stig

 
 
 Built and tested under linux 2.4.4/glibc 2.2.2








Re: [PHP-DEV] Branching 4.0.6...

2001-05-05 Thread Rui Hirokawa


Hi,

I recently fixed a compilation problem in ext/mbstring.
I think this problem is critical for this module,
so the fix should be applied to the php-4.0.6 branch.

hirokawa        Sat May  5 19:44:12 2001 EDT

   Modified files:
 /php4/ext/mbstring  mbstring.c
   Log:
   fixed a compilation problem without --enable-mbstr-enc-trans.

Index: php4/ext/mbstring/mbstring.c
diff -u php4/ext/mbstring/mbstring.c:1.7 php4/ext/mbstring/mbstring.c:1.8
--- php4/ext/mbstring/mbstring.c:1.7    Fri May  4 03:42:54 2001
+++ php4/ext/mbstring/mbstring.c    Sat May  5 19:44:12 2001
@@ -16,7 +16,7 @@
 +--+
   */

-/* $Id: mbstring.c,v 1.7 2001/05/04 10:42:54 hirokawa Exp $ */
+/* $Id: mbstring.c,v 1.8 2001/05/06 02:44:12 hirokawa Exp $ */

  /*
   * PHP4 Multibyte String module mbstring (currently only for Japanese)
@@ -73,6 +73,7 @@

  static unsigned char third_and_rest_force_ref[] = { 3, BYREF_NONE, 
 BYREF_NONE, BYREF_FORCE_REST };

+#if defined(MBSTR_ENC_TRANS)
  SAPI_POST_HANDLER_FUNC(php_mbstr_post_handler);

  static sapi_post_entry mbstr_post_entries[] = {
@@ -80,6 +81,7 @@
 { 
 MULTIPART_CONTENT_TYPE,   sizeof(MULTIPART_CONTENT_TYPE)-1, 
  sapi_read_standard_form_data,   rfc1867_post_handler },
 { NULL, 0, NULL }
  };
+#endif

  function_entry mbstring_functions[] = {
 PHP_FE(mb_internal_encoding,NULL)





-- 
--
Rui Hirokawa [EMAIL PROTECTED]
maintainer of japanese PHP manual [EMAIL PROTECTED]





Re: [PHP-DEV] 4.0.6

2001-04-30 Thread Rui Hirokawa


Andi,

We plan to add ext/jstring, a Japanese string extension module,
to php-4.0.6.
Is there any problem with adding this module to the CVS tree now?

On Sun, 29 Apr 2001 22:35:43 +0200
Andi Gutmans [EMAIL PROTECTED] wrote:

 Guys,
 
 I think that despite the release of 4.0.5 tomorrow we are pretty close to 
 having an RC1 for 4.0.6. Lots of things have been fixed/added since 4.0.5 
 (check the NEWS file).
 Can we make a list of things which still need to make it into 4.0.6 before 
 we branch?
 
 Andi

-- 
--
Rui Hirokawa [EMAIL PROTECTED]
maintainer of japanese PHP manual [EMAIL PROTECTED]





Re: [PHP-DEV] 4.0.6

2001-04-30 Thread Rui Hirokawa


On Mon, 30 Apr 2001 15:37:14 +0300
Andi Gutmans [EMAIL PROTECTED] wrote:

 At 09:35 PM 4/30/2001 +0900, Rui Hirokawa wrote:
 
 Andi,
 
 We plan to add ext/jstring, a Japanese string extension module,
 to php-4.0.6.
 Is there any problem with adding this module to the CVS tree now?
 
 No I don't see a problem with this but please do it quickly. 4.0.6 has 
 already gone a long way since we started RC'ing 4.0.5 and I would like to 
 start RC'ing it pretty soon. You should probably also copy 
 dotnet/EXPERIMENTAL to your directory until it becomes completely stable.

I did.

 
 By the way, do you think jstring is the right name? Is it only for Japanese?

It includes some functions for Japanese, but some functions are not
only for Japanese.
For example, this module supports encoding conversion between
Unicode and some other encodings like ISO-8859-X.
Currently, it includes encoding conversion filters for Japanese and ISO-8859-X,
but it is a relatively easy task to add another conversion filter for some other
language.
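
As a rough illustration of what a single-byte conversion filter has to do,
here is a minimal sketch that converts ISO-8859-1 to UTF-8. It is not the
module's filter code; the function name and interface are assumptions for
this example. ISO-8859-1 is the easy case because its byte values map
directly to U+0000..U+00FF; other ISO-8859-X parts need a lookup table.

```c
#include <assert.h>
#include <stddef.h>

/* Convert ISO-8859-1 bytes to UTF-8. Returns the number of bytes written,
 * or (size_t)-1 if the output buffer is too small. No terminator is added. */
static size_t latin1_to_utf8(const unsigned char *in, size_t in_len,
                             unsigned char *out, size_t out_cap)
{
    size_t o = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < in_len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {                 /* ASCII: one byte, unchanged */
            if (o + 1 > out_cap) return (size_t)-1;
            out[o++] = c;
        } else {                        /* U+0080..U+00FF: two bytes */
            if (o + 2 > out_cap) return (size_t)-1;
            out[o++] = (unsigned char)(0xC0 | (c >> 6));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return o;
}
```

For instance, the ISO-8859-1 byte 0xE9 (e with acute accent) becomes the two
UTF-8 bytes 0xC3 0xA9.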

I believe this module is a small step toward PHP internationalization.
I think the name 'jstring' is the right name for now because it currently exists
as a set of Japanese string functions.
But 'i18n', 'wchar', or 'i18n-ja' are also candidates for the name.

-- 
--
Rui Hirokawa [EMAIL PROTECTED]
maintainer of japanese PHP manual [EMAIL PROTECTED]





Re: [PHP-DEV] 4.0.6

2001-04-30 Thread Rui Hirokawa



On Mon, 30 Apr 2001 16:51:15 +0300 (IDT)
Stanislav Malyshev [EMAIL PROTECTED] wrote:

 RH For example, this module supports encoding conversion
 RH functionality between Unicode and some other encodings like
 RH ISO-8859-X. Currently, it includes encoding conversion filter
 
 Doesn't this duplicate the GNU recode functionality?


ext/jstring adds some unique functionality, as follows:

- automatic encoding recognition for Japanese and Unicode encodings.
- output encoding translation using the output buffering functionality.
- encoding translation for HTTP input (POST/GET/Cookie).
- multibyte-compatible string functions like mbstrlen (a multibyte-enabled
strlen()).

This functionality is not fully supported by ext/recode or ext/iconv.
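
To illustrate why a multibyte-aware strlen() is needed, here is a minimal
sketch that counts characters rather than bytes in a UTF-8 string. It is not
the module's implementation: utf8_strlen() is a name chosen for this example,
it handles UTF-8 only, and it assumes the input is already valid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count code points in a NUL-terminated UTF-8 string by skipping
 * continuation bytes (10xxxxxx). Assumes the input is valid UTF-8. */
static size_t utf8_strlen(const char *s)
{
    size_t count = 0;
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For a three-character Japanese string encoded in UTF-8, strlen() reports 9
bytes while this function reports 3 characters.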


 
 
 -- 
 Stanislav Malyshev, Zend Products Engineer
 [EMAIL PROTECTED]  http://www.zend.com/ +972-3-6139665 ext.115
 
 
 
 


-- 
--
Rui Hirokawa [EMAIL PROTECTED]
maintainer of japanese PHP manual [EMAIL PROTECTED]





Re: [PHP-DEV] 4.0.6

2001-04-30 Thread Rui Hirokawa


On Mon, 30 Apr 2001 17:26:58 +0300
Andi Gutmans [EMAIL PROTECTED] wrote:

 At 10:01 PM 4/29/2001 -0400, Sterling Hughes wrote:
 ext/wchar (wide character support?)
 ext/mstring (multibyte string functions)
 ext/jpstring (japanese string functions)
 
 I'd make mstring - mbstring.
 The question is if it's worth splitting this up into more than one 
 extension. Probably not.
 So we should probably be picking out of wchar, mbstring, jpstring.
 Rui, what do you think?

I think a unified approach is better for php-i18n than splitting it into
several modules.
I think mbstring is the better name, although this module also
supports single-byte encodings like ISO-8859-X.
Some people might say 'wchar' is a better choice,
because this module converts strings to wide characters internally.

If someone wants to add support for another encoding,
they should add mbfilter_xx.c and mbfilter_xx.h, where xx stands for a
specific language like ja (Japanese).

Anyway, because I am not the original author of this module,
I must discuss renaming the module with Mr. Tsukada, the original
author of jstring.

-- 
--
Rui Hirokawa [EMAIL PROTECTED]
maintainer of japanese PHP manual [EMAIL PROTECTED]





Re: [PHP-DEV] 4.0.6

2001-04-30 Thread Rui Hirokawa


I am also one of the authors of the ext/iconv module.
The iconv module is only for encoding translation, but
jstring (to be renamed mbstring) provides general multibyte string
handling functions.
Encoding translation is one of mbstring's functionalities.
I think the iconv module should rather be merged into mbstring in the future.

On Mon, 30 Apr 2001 19:53:11 +0300
Alexander Bokovoy [EMAIL PROTECTED] wrote:

 On Mon, Apr 30, 2001 at 10:47:21PM +0900, Rui Hirokawa wrote:
  
  On Mon, 30 Apr 2001 15:37:14 +0300
  Andi Gutmans [EMAIL PROTECTED] wrote:
  
   At 09:35 PM 4/30/2001 +0900, Rui Hirokawa wrote:
   
   Andi,
   
   We plan to add ext/jstring, a Japanese string extension module,
   to php-4.0.6.
   Is there any problem with adding this module to the CVS tree now?
   
   No I don't see a problem with this but please do it quickly. 4.0.6 has 
   already gone a long way since we started RC'ing 4.0.5 and I would like to 
   start RC'ing it pretty soon. You should probably also copy 
   dotnet/EXPERIMENTAL to your directory until it becomes completely stable.
  
  I did.
  
   
   By the way, do you think jstring is the right name? Is it only for Japanese?
  
  It includes some functions for Japanese, but some functions are not
  only for Japanese.
  For example, this module supports encoding conversion between
  Unicode and some other encodings like ISO-8859-X.
  Currently, it includes encoding conversion filters for Japanese and ISO-8859-X,
  but it is a relatively easy task to add another conversion filter for some other
  language.
  
  I believe this module is a small step toward PHP internationalization.
  I think the name 'jstring' is the right name for now because it currently exists
  as a set of Japanese string functions.
  But 'i18n', 'wchar', or 'i18n-ja' are also candidates for the name.
 It would be very good to join these features with the existing iconv extension
 rather than create new extensions.


-- 
--
Rui Hirokawa [EMAIL PROTECTED]
maintainer of japanese PHP manual [EMAIL PROTECTED]





Re: [PHP-DEV] Re: [PHP-CVS] cvs: php4 / TODO-4.1.txt

2001-04-20 Thread Rui Hirokawa




  iconv P
 
 The iconv library in itself is useful enough that we should try to
 keep this one within PHP, maybe even integrate it tighter.

I hope so too.
iconv plays an important role, especially in handling multibyte
encodings.
I am also preparing an extension module called jstring for handling
Japanese multibyte strings.
This module includes some encoding translation functionality
between Unicode and some other encodings.
It will also be an elementary tool for Japanese PHP users.
I want to keep tight integration between PHP and the encoding
translation functions for the sake of performance and usability.


-- 
--
Rui Hirokawa [EMAIL PROTECTED]
maintainer of japanese PHP manual [EMAIL PROTECTED]





Re: [PHP-DEV] iconv patch...

2001-03-09 Thread Rui Hirokawa



On 09 Mar 2001 18:31:07 +0100
Ondrej Sury [EMAIL PROTECTED] wrote:

 
 I think that UTF-8 is a better choice as default encoding ;-) And because I
 couldn't compile this module, I have added php_iconv_init_globals and put
 it in ZEND_INIT_MODULE_GLOBALS.  (I was inspired by the mysql module, so this
 is not necessarily correct.)

I personally agree with your choice because my native language (Japanese)
uses a multibyte encoding.
But some other modules like ext/xml use ISO-8859-1 as the default encoding now.
You can easily change these encoding settings in php.ini as follows:

[iconv]
iconv.internal_encoding = "UTF-8"
iconv.output_encoding = "UTF-8"

The iconv.input_encoding setting has no effect at present.
I plan to add encoding translation between the input encoding and
the internal encoding to the HTTP input (POST/GET/Cookie) parser.

-- 
----------
Rui Hirokawa [EMAIL PROTECTED]
maintainer of japanese PHP manual [EMAIL PROTECTED]





[PHP-DEV] about users of japanese localized php3

2001-03-02 Thread Rui Hirokawa


Hi,

Hi,

I remember there was a patch around for PHP 3.0.x which added some Japanese 
support functions.
Anyone know how many Japanese actually used these functions and how many 
are using Vanilla PHP?

As far as I know,
most Japanese Linux distributions, like RedHat-6.2j/7.0j Linux or
TurboLinux, are shipped with Japanese-localized versions of PHP 3.0.x.

As Mr. Sato said,
it is necessary to handle three different encodings,
'shift_jis', 'euc-jp', and 'iso2022-jp' (or 'utf-8'), in Japan.

Because the exact encoding is unknown when the PHP parser starts,
automatic encoding recognition and encoding translation are necessary
in the HTTP input (POST/GET/Cookie) parsing process.
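
One building block of such recognition can be sketched as a structural check
for a single candidate encoding. The sketch below tests only whether a buffer
could be UTF-8; it is not how jstring detects encodings, and looks_like_utf8()
is a name invented for this example. A real detector would run a check like
this for every candidate encoding and pick among the survivors; note that
7-bit input such as ISO-2022-JP also passes this particular check.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if buf has the byte structure of UTF-8, else 0.
 * Structural check only: overlong forms and surrogates are not rejected. */
static int looks_like_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    assert(buf != NULL);
    while (i < len) {
        unsigned char c = buf[i];
        size_t follow;
        if (c < 0x80)                follow = 0;  /* 0xxxxxxx */
        else if ((c & 0xE0) == 0xC0) follow = 1;  /* 110xxxxx */
        else if ((c & 0xF0) == 0xE0) follow = 2;  /* 1110xxxx */
        else if ((c & 0xF8) == 0xF0) follow = 3;  /* 11110xxx */
        else return 0;               /* stray continuation or invalid lead */
        if (follow > len - i - 1) return 0;       /* truncated sequence */
        for (size_t k = 1; k <= follow; k++)
            if ((buf[i + k] & 0xC0) != 0x80) return 0;
        i += follow + 1;
    }
    return 1;
}
```

Shift_JIS text usually fails this check quickly, because many of its lead
bytes fall in the range that UTF-8 reserves for continuation bytes.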

I think 'Vanilla' PHP 3/4 is not very useful for Japanese users.
I know PHP 3 is obsolete now, but a great many Japanese PHP users
continue to use Japanese-localized PHP 3 (e.g. PHP-3.0.18-i18n-ja).

Currently, I am working on a patch for php4/main/php_variables.c
that adds the encoding translation capability using ext/jstring, written by
Mr. Tsukada, which provides functions for handling Japanese characters.

Does Japanese actually work in a decent way with PHP? From the zillions of 
Japanese sites I've seen running it I'd guess it works :)

Andi

-- 
--
Rui Hirokawa [EMAIL PROTECTED]
maintainer of japanese PHP manual [EMAIL PROTECTED]
