[PHP-DEV] RFC: Removing PHP tags

2012-03-31 Thread Moriyoshi Koizumi
Hi,

I wrote an RFC that proposes the removal of PHP tags.  There is actually
strong public demand for it, and I also think it is necessary to
elevate PHP to a genuine, modern scripting language.

http://wiki.php.net/rfc/nophptags

Regards,
Moriyoshi

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] RFC: Removing PHP tags

2012-03-31 Thread Moriyoshi Koizumi
Ok, I'll try to fix that part. Thanks for the correction.

Moriyoshi

On Sun, Apr 1, 2012 at 11:29 AM, Rasmus Lerdorf ras...@lerdorf.com wrote:
 On Mar 31, 2012, at 6:59 PM, Moriyoshi Koizumi m...@mozo.jp wrote:

 Hi,

 I wrote a RFC that proposes removal of PHP tags.  There is actually
 strong public demand for it, and I also think it is necessary to
 leverage PHP to a genuine, modern scripting language.

 http://wiki.php.net/rfc/nophptags

 I really doubt this will find much support, and note that PHP came well 
 before ASP so that part of the RFC is factually inaccurate.

 -Rasmus




Re: [PHP-DEV] [RFC] 5.4 features for vote (long)

2011-06-20 Thread Moriyoshi Koizumi
On Mon, Jun 20, 2011 at 8:39 PM, Derick Rethans der...@php.net wrote:
 8. Cli web server. Built-in mini-HTTP server run directly from PHP binary.
 Assigned: Moriyoshi Koizumi

 I'd really like to see that one. I thought the patch was already
 committed?

Not yet.  I'm gonna commit it in six hours or so if no one objects ;-)

Moriyoshi


 Derick

 --
 http://derickrethans.nl | http://xdebug.org
 Like Xdebug? Consider a donation: http://xdebug.org/donate.php
 twitter: @derickr and @xdebug







Re: [PHP-DEV] streams problem in 5.3

2011-03-04 Thread Moriyoshi Koizumi
It looks like the only solution is to define a new stream type for zend_stream
that delegates stream operations to user-defined callbacks.

Moriyoshi

On 2011/03/04, at 10:26, Stas Malyshev wrote:

 Hi!
 
 I try to do some complex code with custom streams and I have discovered the 
 following problem:
 
 The code in main/streams/cast.c, specifically _php_stream_cast, creates 
 fopencookie() synthetic stream for streams that are not actual file streams. 
 Which works fine until such stream is used in include(), in which case it 
 ultimately arrives at zend_stream_fixup(). Which would in turn call 
 zend_stream_fsize() - which would do fstat(fileno(file_handle->handle.fp),
 &buf) - and that would fail since you can't get fileno for FILE* created by
 fopencookie.
 Which ultimately means I can't use my custom streams for include(), which is 
 bad. Now, looking at the code, it doesn't actually need the exact size - http 
 streams can be included just fine - but insists on having it if it has fp 
 (which it can have for basically any kind of stream due to the cookie trick). 
 Does anyone have any idea why and if it can be fixed?
 -- 
 Stanislav Malyshev, Software Architect
 SugarCRM: http://www.sugarcrm.com/
 (408)454-6900 ext. 227
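
The failure described above can be reproduced outside PHP. The following is a minimal sketch, assuming a libc that provides the GNU fopencookie() extension (glibc or musl); the demo_* names are hypothetical and are not part of PHP or Zend. It opens a cookie-backed FILE*, asks for its descriptor and then tries fstat() on it, which is the same sequence zend_stream_fsize() performs.

```c
#define _GNU_SOURCE /* fopencookie() is a GNU extension */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical in-memory source standing in for a custom PHP stream. */
struct demo_cookie {
    const char *data;
    size_t len;
    size_t pos;
};

/* Read callback: hand out whatever bytes remain in the buffer. */
ssize_t demo_cookie_read(void *cookie, char *buf, size_t size)
{
    struct demo_cookie *ck = cookie;
    size_t left = ck->len - ck->pos;
    if (size > left)
        size = left;
    memcpy(buf, ck->data + ck->pos, size);
    ck->pos += size;
    return (ssize_t)size;
}

/* Opens a cookie stream, stores what fileno() reports in *fd_out and
 * returns the result of fstat() on that descriptor. */
int demo_cookie_fstat(int *fd_out)
{
    static const char script[] = "<?php echo 1;";
    struct demo_cookie ck = { script, sizeof(script) - 1, 0 };
    cookie_io_functions_t io = { .read = demo_cookie_read };
    struct stat st;
    FILE *fp = fopencookie(&ck, "r", io);
    int rc;

    assert(fp != NULL && fd_out != NULL);
    *fd_out = fileno(fp);        /* no descriptor backs this stream */
    rc = fstat(*fd_out, &st);    /* so the size lookup cannot succeed */
    fclose(fp);
    return rc;
}
```

Because no file descriptor sits behind the stream, fileno() reports an error and the fstat() call fails with it, so any caller that insists on a size for every FILE* will reject such streams.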
 
 





Re: [PHP-DEV] streams problem in 5.3

2011-03-04 Thread Moriyoshi Koizumi
Hmm, that is already supported through ZEND_HANDLE_STREAM.  So then, changing
the interface of fopen_function to return a zend_stream instead of a FILE* should
be fine.

Moriyoshi

On 2011/03/05, at 3:05, Moriyoshi Koizumi wrote:

 It looks like the only solution is define a new stream type for zend_stream 
 that delegates stream operations to user-defined callbacks.
 
 Moriyoshi
 
 On 2011/03/04, at 10:26, Stas Malyshev wrote:
 
 Hi!
 
 I try to do some complex code with custom streams and I have discovered the 
 following problem:
 
 The code in main/streams/cast.c, specifically _php_stream_cast, creates 
 fopencookie() synthetic stream for streams that are not actual file streams. 
 Which works fine until such stream is used in include(), in which case it 
 ultimately arrives at zend_stream_fixup(). Which would in turn call 
 zend_stream_fsize() - which would do fstat(fileno(file_handle->handle.fp),
 &buf) - and that would fail since you can't get fileno for FILE* created by
 fopencookie.
 Which ultimately means I can't use my custom streams for include(), which is 
 bad. Now, looking at the code, it doesn't actually need the exact size - 
 http streams can be included just fine - but insists on having it if it has 
 fp (which it can have for basically any kind of stream due to the cookie 
 trick). Does anyone have any idea why and if it can be fixed?
 -- 
 Stanislav Malyshev, Software Architect
 SugarCRM: http://www.sugarcrm.com/
 (408)454-6900 ext. 227
 
 
 





[PHP-DEV] Re: Reverting ext/mbstring patch

2011-03-03 Thread Moriyoshi Koizumi
Hi,

The obvious problem looked like handling of internal encoding.  When
the script is written in an encoding that is incompatible with the
lexer, the script is converted into internal encoding (input_filter)
for parsing, and then gets every string literal converted back to the
original encoding (output_filter).  Some of the test cases fail
because the internal encoding is not set to an encoding that is
bidirectionally convertible from/to the script encoding (ISO-8859-1
against Shift_JIS for example.)

I'm gonna make a fix to change that behavior so that the input_filter
always converts the script into UTF-8 instead of internal_encoding.
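
The round-trip problem and the proposed UTF-8 pivot can be sketched in C. This is an illustration only, not code from the patch: the demo_* helpers are hypothetical, ISO-8859-1 stands in for a narrow internal encoding, and the Japanese character is supplied as UTF-8 bytes rather than Shift_JIS to keep conversion tables out of the example.

```c
#include <assert.h>
#include <stddef.h>

/* ISO-8859-1 -> UTF-8. Always succeeds, because every Latin-1 byte is a
 * code point in U+0000..U+00FF. Returns the number of bytes written;
 * out must have room for 2 * n bytes. */
size_t demo_latin1_to_utf8(const unsigned char *in, size_t n,
                           unsigned char *out)
{
    size_t i, o = 0;
    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        if (in[i] < 0x80) {
            out[o++] = in[i];
        } else {
            out[o++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[o++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    return o;
}

/* UTF-8 -> ISO-8859-1. Returns the number of bytes written, or -1 when
 * the input holds a code point above U+00FF (or is malformed), i.e. when
 * the target encoding cannot represent the script text. */
long demo_utf8_to_latin1(const unsigned char *in, size_t n,
                         unsigned char *out)
{
    size_t i = 0, o = 0;
    assert(in != NULL && out != NULL);
    while (i < n) {
        unsigned char c = in[i];
        if (c < 0x80) {
            out[o++] = c;
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < n &&
                   (in[i + 1] & 0xC0) == 0x80) {
            out[o++] = (unsigned char)(((c & 0x03) << 6) | (in[i + 1] & 0x3F));
            i += 2;
        } else {
            return -1; /* not representable in ISO-8859-1 */
        }
    }
    return (long)o;
}
```

Converting into UTF-8 never fails here, while converting into the narrow encoding fails as soon as a character falls outside it. That asymmetry is why a fixed UTF-8 intermediate avoids the dependency on whatever internal encoding happens to be configured.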

Also gonna take a closer look into your patch.  You basically don't
have to adjust the style of the code under libmbfl as it is a separate
library.  Bugfixes are always appreciated.

Regards,
Moriyoshi

On Thu, Mar 3, 2011 at 4:44 PM, Dmitry Stogov dmi...@zend.com wrote:
 Hi Moriyoshi,

 OK, I thought the email was lost, so ignore the email I just resent.

 In general I like your patch and I would be glad to see it fixed.

 I already tried to make some fixes.
 See the attached patch.

 Thanks. Dmitry.

 On 03/02/2011 11:51 PM, Moriyoshi Koizumi wrote:

 Hey,

 I think I can fix it somehow.  Please don't be haste with it.  I am
 going to look into it.

 Moriyoshi

 On Tue, Mar 1, 2011 at 11:35 PM, Dmitry Stogov dmi...@zend.com wrote:

 Hi,

 I'm going to revert Moriyoshi patch from December and some following
 fixes.

 I like the idea of the patch, but it just doesn't work as expected.
 It breaks 10 tests, but in general it breaks most things related to
 Unicode
 (declare statement, multibyte scripts, exif support for Unicode,
 multibyte
 POST requests).

 I tried to fix it myself, but I just can't understand how it should work
 (it's too big). It also has several places where integers messed with
 pointers, old API messed with new one and so on.

 I'm going to revert (apply the attached patch) on Thursday.

 Following is the list of failed tests:

 Shift_JIS request [tests/basic/029.phpt]
 Testing declare statement with several type values
 [Zend/tests/declare_001.phpt]
 Zend Multibyte and ShiftJIS
 [Zend/tests/multibyte/multibyte_encoding_001.phpt]
 Zend Multibyte and UTF-8 BOM
 [Zend/tests/multibyte/multibyte_encoding_002.phpt]
 Zend Multibyte and UTF-16 BOM
 [Zend/tests/multibyte/multibyte_encoding_003.phpt]
 encoding conversion from script encoding into internal encoding
 [Zend/tests/multibyte/multibyte_encoding_005.phpt]
 086: bracketed namespace with encoding [Zend/tests/ns_086.phpt]
 Check for exif_read_data, Unicode user comment
 [ext/exif/tests/exif003.phpt]
 Check for exif_read_data, Unicode WinXP tags
 [ext/exif/tests/exif004.phpt]
 Test mb_get_info() function [ext/mbstring/tests/mb_get_info.phpt]

 Thanks. Dmitry.







Re: [PHP-DEV] RFC: built-in web server in CLI.

2011-03-03 Thread Moriyoshi Koizumi
Hi,

On Thu, Mar 3, 2011 at 3:35 PM, Alexey Zakhlestin indey...@gmail.com wrote:
 On Wed, Mar 2, 2011 at 11:55 PM, Moriyoshi Koizumi m...@mozo.jp wrote:
 Hi,

 Just to let you know that I wrote a RFC about built-in web server
 feature with which PHP can serve contents without a help of web
 servers.  That would be handy for development purpose.

 If interested, have a look at http://wiki.php.net/rfc/builtinwebserver .

 Interesting, indeed.

 I noticed, that you hardcode mimetypes and index_files.
 Mimetypes can probably be obtained from the system — we even had some
 extension doing that.
 And index_files should be configurable, because there are some
 situations when people don't want any mime-types at all.

 Also, it would be good to be able to configure which files are
 actually parsed by php, not just served. Currently, these are only
 .php files


We can't always count on the existence of mime.types, which is
likely installed with Apache, and requiring Apache is uncalled for.  Neither do I see
any good reason to make index files configurable because I have hardly
seen such a peculiar setting for several years that uses file names
other than index.html or index.php for index files.  I used to use
index.htm for technical reasons though.
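
As an illustration of what "hardcoded" means here, the following is a minimal sketch of a fixed extension table and a fixed index file list. It is not the code from the patch: the table contents, the demo_* names and the choice to return NULL for unknown extensions are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_mime_entry {
    const char *ext;
    const char *type;
};

/* Hypothetical built-in table; no mime.types file is consulted. */
static const struct demo_mime_entry demo_mime_table[] = {
    { "html", "text/html" },
    { "htm",  "text/html" },
    { "css",  "text/css" },
    { "js",   "text/javascript" },
    { "png",  "image/png" },
    { "jpg",  "image/jpeg" },
};

/* The two index file names mentioned in the message above. */
static const char *const demo_index_files[] = { "index.php", "index.html" };

/* Returns the MIME type for a path, or NULL if the extension is unknown. */
const char *demo_mime_type(const char *path)
{
    const char *dot;
    size_t i;

    assert(path != NULL);
    dot = strrchr(path, '.');
    if (dot == NULL || strchr(dot, '/') != NULL)
        return NULL; /* the last path component has no extension */
    for (i = 0; i < sizeof demo_mime_table / sizeof demo_mime_table[0]; i++) {
        if (strcmp(dot + 1, demo_mime_table[i].ext) == 0)
            return demo_mime_table[i].type;
    }
    return NULL;
}

/* Returns 1 if name is one of the built-in index file names, else 0. */
int demo_is_index_file(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof demo_index_files / sizeof demo_index_files[0]; i++) {
        if (strcmp(name, demo_index_files[i]) == 0)
            return 1;
    }
    return 0;
}
```

The trade-off is the one argued in this thread: a compiled-in table works on any machine without extra files, at the cost of needing a rebuild to change it.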

In short, if you need to configure it more, it'd be better off
installing Apache to do the right job.  I would like to cover just a
marginal part of the developer needs with this.

Regards,
Moriyoshi

 --
 Alexey Zakhlestin, http://twitter.com/jimi_dini
 http://www.milkfarmsoft.com/





Re: [PHP-DEV] RFC: built-in web server in CLI.

2011-03-03 Thread Moriyoshi Koizumi
Hi,

2011/3/3 Ángel González keis...@gmail.com:
 Moriyoshi Koizumi wrote:
 Regarding the patch (https://gist.github.com/835698):
 I don't see a switch to disable the internal parse on configure.
 I don't see any obvious reason it should be able to be turned off
 through the build option.  The only problem is binary size increase,
 which I guess is quite subtle.
 Seems sufficiently different from normal cli.

I've seen a number of people argue the same, but I'd rather have
them both in for the sake of simplicity.  As discussed when the CLI
version of PHP was born, multiple PHP binaries actually have confused
the users to a certain degree.

 The patch looks messy as it splits main in two functions, so it gets
 hard to follow,
 but is probably good overall.
 Assuming you are mentioning about the option parsing portion of the
 code, yes, it's a bit messy, but I had to do so because runtime
 initialization procedure is very different from the ordinary CLI.
 Wasn't criticizing you. It's a limitation of unified diffs, which can't say
 "move this bunch of code 25 lines down". A bit hard to follow, but
 apparently good.

I would have put a whitespace-ignoring diff as well ;-)

 Any special reason to disable it on PHP_CLI_WIN32_NO_CONSOLE ?
 cli-win32 version of PHP doesn't have an associated console and is
 supposed to use to create applications without console interactions
 (i.e. GUI).  So, It doesn't make sense to enable this feature for it.
 With the embedded web server, the interaction would be done via the browser.

It is not intended to be daemonized at all since it's just a
development web server.  To do more, Apache should work great then.

Regards,
Moriyoshi










Re: [PHP-DEV] RFC: built-in web server in CLI.

2011-03-03 Thread Moriyoshi Koizumi
On Fri, Mar 4, 2011 at 11:17 AM, Christopher Jones
christopher.jo...@oracle.com wrote:


 On 03/02/2011 12:55 PM, Moriyoshi Koizumi wrote:

 Hi,

 Just to let you know that I wrote a RFC about built-in web server
 feature with which PHP can serve contents without a help of web
 servers.  That would be handy for development purpose.

 If interested, have a look at http://wiki.php.net/rfc/builtinwebserver .

 Regards,
 Moriyoshi


 To allow for future changes, all options should require flags.
 For example, php -S localhost:8000 -d docroot instead of the
 currently proposed php -S localhost:8000 docroot

That might make sense.  I am thinking that being unable to specify a
document root with a router script is a bit inconsistent.

 Have you thought about integration with run-tests.php?

I haven't considered it yet, but I would if it works better than CGI.

Moriyoshi

 --
 Email: christopher.jo...@oracle.com
 Tel:  +1 650 506 8630
 Blog:  http://blogs.oracle.com/opal/





[PHP-DEV] Re: Reverting ext/mbstring patch

2011-03-02 Thread Moriyoshi Koizumi
Hey,

I think I can fix it somehow.  Please don't be hasty with it.  I am
going to look into it.

Moriyoshi

On Tue, Mar 1, 2011 at 11:35 PM, Dmitry Stogov dmi...@zend.com wrote:
 Hi,

 I'm going to revert Moriyoshi patch from December and some following fixes.

 I like the idea of the patch, but it just doesn't work as expected.
 It breaks 10 tests, but in general it breaks most things related to Unicode
 (declare statement, multibyte scripts, exif support for Unicode, multibyte
 POST requests).

 I tried to fix it myself, but I just can't understand how it should work
 (it's too big). It also has several places where integers messed with
 pointers, old API messed with new one and so on.

 I'm going to revert (apply the attached patch) on Thursday.

 Following is the list of failed tests:

 Shift_JIS request [tests/basic/029.phpt]
 Testing declare statement with several type values
 [Zend/tests/declare_001.phpt]
 Zend Multibyte and ShiftJIS
 [Zend/tests/multibyte/multibyte_encoding_001.phpt]
 Zend Multibyte and UTF-8 BOM
 [Zend/tests/multibyte/multibyte_encoding_002.phpt]
 Zend Multibyte and UTF-16 BOM
 [Zend/tests/multibyte/multibyte_encoding_003.phpt]
 encoding conversion from script encoding into internal encoding
 [Zend/tests/multibyte/multibyte_encoding_005.phpt]
 086: bracketed namespace with encoding [Zend/tests/ns_086.phpt]
 Check for exif_read_data, Unicode user comment [ext/exif/tests/exif003.phpt]
 Check for exif_read_data, Unicode WinXP tags [ext/exif/tests/exif004.phpt]
 Test mb_get_info() function [ext/mbstring/tests/mb_get_info.phpt]

 Thanks. Dmitry.





[PHP-DEV] RFC: built-in web server in CLI.

2011-03-02 Thread Moriyoshi Koizumi
Hi,

Just to let you know that I wrote an RFC about a built-in web server
feature with which PHP can serve content without the help of a web
server.  That would be handy for development purposes.

If interested, have a look at http://wiki.php.net/rfc/builtinwebserver .

Regards,
Moriyoshi




Re: [PHP-DEV] RFC: built-in web server in CLI.

2011-03-02 Thread Moriyoshi Koizumi
2011/3/3 Ángel González keis...@gmail.com:
 Moriyoshi Koizumi wrote:
 Hi,

 Just to let you know that I wrote a RFC about built-in web server
 feature with which PHP can serve contents without a help of web
 servers.  That would be handy for development purpose.

 If interested, have a look at http://wiki.php.net/rfc/builtinwebserver .

 Regards,
 Moriyoshi
 I like the idea.

 Regarding the patch (https://gist.github.com/835698):
 I don't see a switch to disable the internal parse on configure.

I don't see any obvious reason it should be able to be turned off
through the build option.  The only problem is binary size increase,
which I guess is quite subtle.

 I'd expect the files to be on its own folder inside sapi, even being
 able to
 bundle them in a single binary.

 Why is this needed on Windows?

 + ADD_FLAG("LIBS_CLI", "ws2_32.lib");

 Surely php will already link with the sockets library for its own functions.

Of course, the objects directly involved in building php.exe
depend on WinSock functions.  The other socket-related code is inside
php5.dll (php5ts.dll), whose imported symbols cannot be referenced from
php.exe, unlike with ELF shared objects.

 The http parser code seems copied from https://github.com/ry/http-parser and
 it may not be a good idea to modify it downstream, but it  seems to do more
 things than strictly needed by php (eg. there are more methods than those a
 php server would take use).
 It also seems to be a hand-coded lexer, so that's much more verbose than a
 set of rules.

Do we really have to look into the parser right now?  I don't think we
have to limit the methods that the server can accept, since there is no
reason to restrict what it can already deal with.  I don't find it a
problem for it to be hand-coded either.

 The patch looks messy as it splits main in two functions, so it gets
 hard to follow,
 but is probably good overall.

Assuming you are referring to the option parsing portion of the
code, yes, it's a bit messy, but I had to do so because the runtime
initialization procedure is very different from that of the ordinary CLI.

 The change from php_printf to printf in line 3988 looks wrong.

php_printf() eventually redirects the output to
sapi_module.ub_write(), which should only be available after proper
SAPI initialization.  The changed part can be reached before the
initialization and it absolutely makes no sense to use php_printf()
when you simply want to print a message text before the script starts
in the console.
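
The ordering issue can be modelled with a writer callback that is only installed during startup. This is a simplified sketch under stated assumptions, not the PHP source: demo_sapi_module, demo_sapi_startup() and the other demo_* names are hypothetical stand-ins, and returning -1 before startup is a choice made for the example rather than what php_printf() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a SAPI module with an output callback. */
struct demo_sapi_module {
    size_t (*ub_write)(const char *str, size_t len);
};

struct demo_sapi_module demo_module = { NULL };

/* Output callback used once the module has been started. */
size_t demo_stdout_write(const char *str, size_t len)
{
    return fwrite(str, 1, len, stdout);
}

/* Startup installs the callback; before this, ub_write is unset. */
void demo_sapi_startup(void)
{
    demo_module.ub_write = demo_stdout_write;
}

/* Routes output through the module. Returns bytes written, or -1 when
 * called before startup, when there is nothing to route through. */
long demo_module_print(const char *msg)
{
    assert(msg != NULL);
    if (demo_module.ub_write == NULL)
        return -1;
    return (long)demo_module.ub_write(msg, strlen(msg));
}

/* Early messages bypass the module and use plain stdio. */
long demo_early_print(const char *msg)
{
    assert(msg != NULL);
    return (long)printf("%s", msg);
}
```

A message printed while options are still being parsed has to take the second path, because the first one depends on state that does not exist yet.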

 Any special reason to disable it on PHP_CLI_WIN32_NO_CONSOLE ?

The cli-win32 version of PHP doesn't have an associated console and is
supposed to be used to create applications without console interaction
(i.e. GUI).  So it doesn't make sense to enable this feature for it.

Regards,
Moriyoshi




[PHP-DEV] Re: --enable-zend-multibyte

2010-12-06 Thread Moriyoshi Koizumi
Hi,

How about using the value of mbstring.script_encoding to determine
whether to enable the encoding conversion feature?  If the value is
the same as that of mbstring.internal_encoding, then no conversion
should be needed in the first place.  Besides we can define some
singular value like none that completely disables the conversion.

Regarding the dependency on mbstring extension, I think it's time to
enable mbstring by default.

Regards,
Moriyoshi

On Thu, Nov 18, 2010 at 11:26 PM, Dmitry Stogov dmi...@zend.com wrote:
 Hi,

 The proposed patch allows compiling PHP with --enable-zend-multibyte and
 then enable or disable multibyte support at run-time using
 zend.multibyte=0/1 in php.ini. As result the single binary will be able to
 support multibyte encodings and run without zend-multibyte overhead
 dependent on configuration.

 The patch doesn't affect PHP compiled without --enable-zend-multibyte.

 I'm going to commit it into trunk before alpha.
 Any objections?

 Thanks. Dmitry.

 Index: ext/standard/info.c
 ===
 --- ext/standard/info.c (revision 305494)
 +++ ext/standard/info.c (working copy)
 @@ -760,7 +760,7 @@
                php_info_print_table_row(2, "Zend Memory Manager",
 is_zend_mm(TSRMLS_C) ? "enabled" : "disabled" );

  #ifdef ZEND_MULTIBYTE
 -               php_info_print_table_row(2, "Zend Multibyte Support",
 "enabled");
 +               php_info_print_table_row(2, "Zend Multibyte Support",
 CG(multibyte) ? "enabled" : "disabled");
  #else
                php_info_print_table_row(2, "Zend Multibyte Support",
 "disabled");
  #endif
 Index: ext/mbstring/mbstring.c
 ===
 --- ext/mbstring/mbstring.c     (revision 305494)
 +++ ext/mbstring/mbstring.c     (working copy)
 @@ -1132,6 +1132,9 @@
  {
        int *list, size;

 +       if (!CG(multibyte)) {
 +               return FAILURE;
 +       }
        if (php_mb_parse_encoding_list(new_value, new_value_length, &list,
 &size, 1 TSRMLS_CC)) {
                if (MBSTRG(script_encoding_list) != NULL) {
                        free(MBSTRG(script_encoding_list));
 @@ -1442,8 +1445,10 @@
        PHP_RINIT(mb_regex) (INIT_FUNC_ARGS_PASSTHRU);
  #endif
  #ifdef ZEND_MULTIBYTE
 -
 zend_multibyte_set_internal_encoding(mbfl_no_encoding2name(MBSTRG(internal_encoding))
 TSRMLS_CC);
 -       php_mb_set_zend_encoding(TSRMLS_C);
 +       if (CG(multibyte)) {
 +
 zend_multibyte_set_internal_encoding(mbfl_no_encoding2name(MBSTRG(internal_encoding))
 TSRMLS_CC);
 +               php_mb_set_zend_encoding(TSRMLS_C);
 +       }
  #endif /* ZEND_MULTIBYTE */

        return SUCCESS;
 @@ -1570,7 +1575,7 @@
                        MBSTRG(current_internal_encoding) = no_encoding;
  #ifdef ZEND_MULTIBYTE
                        /* TODO: make independent from
 mbstring.encoding_translation? */
 -                       if (MBSTRG(encoding_translation)) {
 +                       if (CG(multibyte) && MBSTRG(encoding_translation)) {
                                zend_multibyte_set_internal_encoding(name
 TSRMLS_CC);
                        }
  #endif /* ZEND_MULTIBYTE */
 Index: Zend/zend.c
 ===
 --- Zend/zend.c (revision 305494)
 +++ Zend/zend.c (working copy)
 @@ -93,6 +93,7 @@
        ZEND_INI_ENTRY("error_reporting",                               NULL,
            ZEND_INI_ALL,           OnUpdateErrorReporting)
        STD_ZEND_INI_BOOLEAN("zend.enable_gc",                          "1",
     ZEND_INI_ALL,           OnUpdateGCEnabled,      gc_enabled,
  zend_gc_globals,        gc_globals)
  #ifdef ZEND_MULTIBYTE
 +       STD_ZEND_INI_BOOLEAN("zend.multibyte", "0", ZEND_INI_PERDIR,
  OnUpdateBool, multibyte,      zend_compiler_globals, compiler_globals)
        STD_ZEND_INI_BOOLEAN("detect_unicode", "1", ZEND_INI_ALL,
  OnUpdateBool, detect_unicode, zend_compiler_globals, compiler_globals)
  #endif
  ZEND_INI_END()
 Index: Zend/zend_language_scanner.l
 ===
 --- Zend/zend_language_scanner.l        (revision 305494)
 +++ Zend/zend_language_scanner.l        (working copy)
 @@ -181,7 +181,7 @@
        lex_state->filename = zend_get_compiled_filename(TSRMLS_C);
        lex_state->lineno = CG(zend_lineno);

 -#ifdef ZEND_MULTIBYTE
 +#ifdef ZEND_MULTIBYTE
        lex_state->script_org = SCNG(script_org);
        lex_state->script_org_size = SCNG(script_org_size);
        lex_state->script_filtered = SCNG(script_filtered);
 @@ -270,27 +270,32 @@

        if (size != -1) {
  #ifdef ZEND_MULTIBYTE
 -               if (zend_multibyte_read_script((unsigned char *)buf, size
 TSRMLS_CC) != 0) {
 -                       return FAILURE;
 -               }
 +               if (CG(multibyte)) {
 +                       if (zend_multibyte_read_script((unsigned char *)buf,
 size TSRMLS_CC) != 0) {
 +                               return FAILURE;
 +                       }

 -      

[PHP-DEV] Re: --enable-zend-multibyte

2010-12-06 Thread Moriyoshi Koizumi
On Mon, Dec 6, 2010 at 7:49 PM, Dmitry Stogov dmi...@zend.com wrote:
 Hi Moriyoshi,

 On 12/06/2010 01:31 PM, Moriyoshi Koizumi wrote:

 Hi,

 How about using the value of mbstring.script_encoding to determine
 whether to enable the encoding conversion feature?  If the value is
 the same as that of mbstring.internal_encoding, then no conversion
 should be needed in the first place.  Besides we can define some
 singular value like none that completely disables the conversion.

 Right now I introduced zend.multibyte directive which enables to look into
 mbstring.script_encoding and mbstring.internal_encoding as it did before
 with --enable-zend-multibyte. Note that we can't check for them in ZE
 directly because ZE knows nothing about ext/mbstring. It may be not compiled
 into PHP or compiled as DSO. Probably it's possible to do through additional
 callbacks.

Indeed mbstring.script_encoding is only used by zend_multibyte, so it
would make sense to alter the name of the ini setting to something
like zend.script_encoding, and move the relevant part of mbstring.c
into zend_multibyte.c


 Regarding the dependency on mbstring extension, I think it's time to
 enable mbstring by default.

 The idea I'm working on is to provide an ability to enable/disable all
 multibyte features without PHP recompilation. So the same binaries will be
 able to support Asian languages and work with European without performance
 degradation. My second patch adds the missing parts (POST request parsing,
 htmlentities, EXIF). I sent it to internals@ today.

I am totally fine with the idea.  I just mentioned it in terms of
users' convenience.

 Thanks. Dmitry.


 Regards,
 Moriyoshi


Moriyoshi




Re: [PHP-DEV] ext/mbstring dependencies

2010-12-06 Thread Moriyoshi Koizumi
Hi,

The patch almost looks good to me, but we should be more careful about
introducing a set of hook points into the API.  I think it'd be great
if the multipart parser portion was rewritten so that it would only
call the Zend multibyte APIs despite a slight performance drawback.

Regards,
Moriyoshi

On Mon, Dec 6, 2010 at 5:31 PM, Dmitry Stogov dmi...@zend.com wrote:
 Hi,

 The proposed patch completely removes ext/mbstring compile-time
 dependencies. As result the same php binaries may be used for Asian and
 European languages without performance degradation. ext/mbstring now may be
 compiled as a DSO. I'm going to commit the patch on Wednesday.
 Any comments are welcome.

 Thanks. Dmitry.






[PHP-DEV] Re: --enable-zend-multibyte

2010-12-06 Thread Moriyoshi Koizumi
On Tue, Dec 7, 2010 at 5:57 AM, Dmitry Stogov dmi...@zend.com wrote:
 On 12/06/2010 08:25 PM, Moriyoshi Koizumi wrote:

 On Mon, Dec 6, 2010 at 7:49 PM, Dmitry Stogov dmi...@zend.com wrote:

 Hi Moriyoshi,

 On 12/06/2010 01:31 PM, Moriyoshi Koizumi wrote:

 Hi,

 How about using the value of mbstring.script_encoding to determine
 whether to enable the encoding conversion feature?  If the value is
 the same as that of mbstring.internal_encoding, then no conversion
 should be needed in the first place.  Besides we can define some
 singular value like none that completely disables the conversion.

 Right now I introduced zend.multibyte directive which enables to look
 into
 mbstring.script_encoding and mbstring.internal_encoding as it did before
 with --enable-zend-multibyte. Note that we can't check for them in ZE
 directly because ZE knows nothing about ext/mbstring. It may be not
 compiled
 into PHP or compiled as DSO. Probably it's possible to do through
 additional
 callbacks.

 Indeed mbstring.script_encoding is only used by zend_multibyte, so it
 would make sense to alter the name of the ini setting to something
 like zend.script_encoding, and move the relevant part of mbstring.c
 into zend_multibyte.c


 Usage of zend.script_encoding instead of zend.multibyte might make sense.
 I'll take a look into it. However I don't see a way to move the relevant
 parts of ext/mbstring into zend_multibyte.c. zend_multibyte will have to
 call for external detector and converter anyway. Maybe I misunderstood you.
 Could you explain what you mean?

I meant the part only related to the ini setting.  Related part in
mb_get_info() can be removed as it would no longer belong to mbstring
settings.  While I am not sure of what you meant by external detector
and converter, encoding detector would be supplied through
zend_multibyte_set_functions() in mbstring's MINIT and there should be
no need to bring any extra facility into zend_multibyte.c .

Regards,
Moriyoshi

 Thanks. Dmitry.

 Thanks. Dmitry.







[PHP-DEV] Autoboxing in PHP

2010-05-03 Thread Moriyoshi Koizumi
Hey,

Just to let you know about a new RFC for adding an autoboxing feature to PHP.
Look at http://wiki.php.net/rfc/autoboxing .

Regards,
Moriyoshi




Re: [PHP-DEV] Using default_charset for htmlspecialchars() and others

2010-05-03 Thread Moriyoshi Koizumi
Hi,

I am under the impression that we have to provide an alternative to
htmlspecialchars() that incorporates the following ideas:

- Shorter function name
  html_escape() for example. _h() would be much more preferable in
terms of preventing XSS ;-p
- Using default_charset as the default encoding for it.
- ENT_QUOTES as default.
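
To make the third point concrete, here is a minimal sketch of an escaper that always handles both quote characters, which is what ENT_QUOTES requests from htmlspecialchars(). It is written in C for illustration only: demo_html_escape() is a hypothetical name, the sketch ignores default_charset, and it assumes input in an ASCII-compatible encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes an escaped copy of src into dst (capacity cap, including the
 * terminating NUL). Returns the escaped length, or -1 if dst is too small. */
long demo_html_escape(const char *src, char *dst, size_t cap)
{
    size_t o = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        char one[2] = { *src, '\0' };
        const char *rep;
        size_t n;

        switch (*src) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break; /* double quote */
        case '\'': rep = "&#039;"; break; /* single quote, as with ENT_QUOTES */
        default:   rep = one;      break; /* copied through unchanged */
        }
        n = strlen(rep);
        if (o + n + 1 > cap)
            return -1; /* keep room for the terminating NUL */
        memcpy(dst + o, rep, n);
        o += n;
    }
    if (cap == 0)
        return -1;
    dst[o] = '\0';
    return (long)o;
}
```

Escaping both quote styles by default means the result is safe inside either kind of quoted HTML attribute, which is the usual argument for making ENT_QUOTES the default.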

Regards,
Moriyoshi

On Mon, May 3, 2010 at 7:53 AM, Brian Moon br...@moonspot.net wrote:
 I am not sure if this has been discussed or not. I will gladly make an RFC
 if not. I think it would be very intuitive if htmlspecialchars used the ini
 value default_charset as its default. And any function that takes an
 optional character set.

 A) Has this been discussed?
 B) If not, do others think it is worth of a proper RFC?

 There would be some BC breakage for sure as the default behavior would be
 changing.

 --

 Brian.
 
 brianlm...@php.net







Re: [PHP-DEV] php and multithreading (additional arguments)

2010-04-05 Thread Moriyoshi Koizumi
I played with TSRM a while ago and successfully implemented
userland threading support using GNU Pth.  It's just a proof of
concept and I did it for fun.

If interested, check out
http://github.com/moriyoshi/php-src/tree/PHP_5_3-threading/ and
read 
http://github.com/moriyoshi/php-src/blob/PHP_5_3-threading/ext/threading/README
for detail (not much information though).

Also note that the language syntax was extended there so it would
support golang-like message passing.

<?php
function sub($i, $ch) {
    for (;;) {
        // receive the message from $ch
        $a = <- [$ch];
        printf("%d: %s\n", $i, $a);
    }
}

$ch = thread_message_queue_create();
for ($i = 0; $i < 10; $i++) {
    thread_create('sub', $i, $ch);
}

$i = 0;
for (;;) {
    // send $i to $ch
    [$ch] <- $i++;
    usleep(5);
}
?>

Moriyoshi

On Thu, Apr 1, 2010 at 11:32 PM, speedy speedy.s...@gmail.com wrote:
 Hello PHP folks,

      I've seen this discussed previously, and would like to add a few
      arguments for the multi-threading side vs. async processing,
      which I've seen mentioned as a viable alternative:

       1. Imagine that from time to time, some background processing takes 1
       second of CPU time - w/o multithreading, all your async operations,
       like accepting a connection to a socket, aio or others are basically
       stalled. So, async is a good approach, but does not work as a magic
       wand for all problem spaces. Alternatively, you could fork and then do
       the processing, but then the state syncing of the forked background
       processing results with the main thread requires a whole new protocol /
       switching to interprocess communication, which makes such developments
       unnecessarily hard. Threads exist for a _reason_ not only because they
       sound cool.

       2. Without thread support, it is not possible to use multi-core
       processing paradigms directly from PHP - which means PHP relies on
       external frameworks for that feature, which, in a sense, makes it a
       non-general-purpose language. It _could become_ a general purpose tool,
       if it had proper multi-threading support built-in.

       I, personally, considered developing websockets / nanoserv server stack
       with PHP and bumped into the multithreading limitation - AFAIK it is the
       only big feature separating PHP from the general purpose languages.
       Everything else is well integrated with lots of external
       libraries/modules giving rise to potential rapid application development
       while using it.

       Cheers and let me know about your thoughts, and potential core
       implementation issues regarding developing this language feature.

 --
 Best regards,
  speedy                          mailto:speedy.s...@gmail.com





Re: [PHP-DEV] php and multithreading (additional arguments)

2010-04-05 Thread Moriyoshi Koizumi
On Mon, Apr 5, 2010 at 7:17 PM, Alexey Zakhlestin indey...@gmail.com wrote:

 On 05.04.2010, at 13:46, Moriyoshi Koizumi wrote:

 I used to play with TSRM days ago and successfully implemented
 userland threading support using GNU Pth.  It's just a proof of
 concept and I did it for fun.

 So these are share-nothing worker threads, which can send results to the
 master thread using messages, right?
 I am perfectly fine with such an approach.

A new thread can be created within a sub-thread as well.  In addition,
messages can be exchanged between any pair of threads.

While it is based on a shared-nothing approach, some kinds of resources
are shared across threads, in addition to the classes and functions that
were already defined before the thread was created.


 some stylistic moments:
 * I would use closures instead of callback-functions

I tried hard to make closures work with the extension, but without
success so far.  I guess I can fix it somehow.

 * Is an extra language construct really needed? A function call would work
 just fine

I don't quite think so.  It was just an experiment, and each piece of
extra syntactic sugar gets converted to a single corresponding function
call (either thread_message_queue_post() or
thread_message_queue_poll()).

 Is overhead of starting new thread large?

The cost is almost the same as spawning a new runtime instance on
a threaded web server with TSRM enabled.  If you pass a large amount of
data to the subthread, the overhead grows because of the
deep copy.
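
A minimal sketch of why the message size drives the cost: in a
shared-nothing design, posting a message means copying the payload into
the queue, so the receiver never touches the sender's memory.  The mq_*
names below are hypothetical and the queue is single-threaded for
brevity; this is not the extension's actual C API, and a real queue
would need locking.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical queue node: owns a private copy of the payload. */
typedef struct mq_node {
    struct mq_node *next;
    size_t len;
    char *data;
} mq_node;

typedef struct {
    mq_node *head, *tail;
} mq;

void mq_init(mq *q)
{
    assert(q != NULL);
    q->head = q->tail = NULL;
}

/* Post: copy the payload into the queue, so the cost grows with len.
   Returns 0 on success, -1 if memory runs out. */
int mq_post(mq *q, const char *data, size_t len)
{
    assert(q != NULL && data != NULL);
    mq_node *n = malloc(sizeof *n);
    if (!n) return -1;
    n->data = malloc(len ? len : 1);
    if (!n->data) { free(n); return -1; }
    memcpy(n->data, data, len);
    n->len = len;
    n->next = NULL;
    if (q->tail) q->tail->next = n; else q->head = n;
    q->tail = n;
    return 0;
}

/* Poll: hand the oldest copy to the receiver, who must free() it.
   Returns NULL when the queue is empty. */
char *mq_poll(mq *q, size_t *len)
{
    assert(q != NULL && len != NULL);
    mq_node *n = q->head;
    if (!n) return NULL;
    q->head = n->next;
    if (!q->head) q->tail = NULL;
    char *data = n->data;
    *len = n->len;
    free(n);
    return data;
}
```

Because the receiver gets its own copy, the sender may modify or free
its buffer right after mq_post() returns.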

Moriyoshi




Re: [PHP-DEV] Performance improvements

2010-03-25 Thread Moriyoshi Koizumi
Hi,

On Thu, Mar 25, 2010 at 8:41 AM, Stanislav Malyshev s...@zend.com wrote:
 Hi!

 Wouldn't it suffice to add a field for the hash value and a flag that
 indicates its validity to zval instead of appending zend_literal
 everywhere?

 Enlarging zval would be costly (the engine uses tons of zvals) and may also
 be more complicated to track (all zval operations now would also have to
 take care to set the flag right - what if we forget in some extension to set
 it right?). I think it's better not to mess with zval.

If all the constants were interned, then we would not need
zend_literal in the first place, because we could store the hash values
separately in an array whose indices correspond to those of the
interned string vector.  Plus, I think the hash value could be stored in
extra bytes following the string buffer pointed to by str.val.
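
A minimal sketch of the first idea, assuming a fixed-size interned
string vector with a parallel array of hash values.  All names here are
hypothetical and the linear scan is only for brevity; none of this is
taken from the patch under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_INTERNED 64

/* Interned string vector and a parallel array of precomputed hashes:
   hashes[i] always belongs to strings[i]. */
static const char *strings[MAX_INTERNED];
static unsigned long hashes[MAX_INTERNED];
static size_t n_interned;

/* "Times 33" string hash (DJB style), used here only as an example. */
unsigned long str_hash(const char *s)
{
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Return the index of s in the interned vector, adding it if needed.
   Returns -1 when the table is full.  The caller keeps s alive. */
int intern(const char *s)
{
    assert(s != NULL);
    unsigned long h = str_hash(s);
    for (size_t i = 0; i < n_interned; i++) {
        if (hashes[i] == h && strcmp(strings[i], s) == 0) return (int)i;
    }
    if (n_interned == MAX_INTERNED) return -1;
    strings[n_interned] = s;
    hashes[n_interned] = h;
    return (int)n_interned++;
}

/* Fetch the cached hash by index; nothing is recomputed at run time. */
unsigned long interned_hash(int idx)
{
    assert(idx >= 0 && (size_t)idx < n_interned);
    return hashes[idx];
}
```

An opcode would then only need to carry the index, and both the string
and its hash can be reached from it.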

Moriyoshi




Re: [PHP-DEV] Performance improvements

2010-03-24 Thread Moriyoshi Koizumi
Hi,

Wouldn't it suffice to add a field for the hash value and a flag that
indicates its validity to zval instead of appending zend_literal
everywhere?

Moriyoshi

On Wed, Mar 24, 2010 at 11:12 PM, Zeev Suraski z...@zend.com wrote:
 Hi,

 Over the last few weeks we've been working on several ideas we had for
 performance enhancements. We've managed to make some good progress.  Our
 initial tests show roughly 10% speed improvement on real world apps.  On
 pure OO code we're seeing as much as 25% improvement (!)

 While this still is a work in progress (and not production quality code yet)
 we want to get feedback sooner rather than later. The diff (available at
 http://bit.ly/aDPTmv) applies cleanly to trunk.  We'd be happy for people to
 try it out and send comments.

 What does it contain?

 1) Constant operands have been moved from being embedded within the opcodes
 into a separate literal table. In addition to the zval it contains
 pre-calculated hash values for string literals. As a result PHP uses less
 memory and doesn't have to recalculate hash values for constants at
 run-time.

 2) Lazy HashTable buckets allocation – we now only allocate the buckets
 array when we actually insert data into the hash for the first time.  This
 saves both memory and time as many hash tables do not have any data in them.

 3) Interned strings (see
 http://en.wikipedia.org/wiki/String_interning).
 Most strings known at compile-time are allocated in a single copy with some
 additional information (pre-calculated hash value, etc.).  We try to make
 most incarnations of a given string point to that same single version,
 allowing us to save memory, but more importantly - run comparisons by
 comparing pointers instead of comparing strings and avoid redundant hash
 value calculations.
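
Item 2 above (lazy bucket allocation) can be sketched as follows.  This
is a simplified, hypothetical table with integer keys and linear
probing, not the engine's HashTable; the only point is that
initialization allocates nothing and the first insert pays for the
bucket arrays.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    size_t n_buckets;  /* capacity requested at init time */
    size_t n_used;     /* number of stored keys */
    int *keys;         /* all three arrays stay NULL until the first insert */
    int *vals;
    char *used;
} lazy_table;

/* Init only records the size; no bucket memory is allocated yet. */
void lt_init(lazy_table *t, size_t n_buckets)
{
    assert(t != NULL && n_buckets > 0);
    t->n_buckets = n_buckets;
    t->n_used = 0;
    t->keys = NULL;
    t->vals = NULL;
    t->used = NULL;
}

void lt_free(lazy_table *t)
{
    free(t->keys);
    free(t->vals);
    free(t->used);
    t->keys = NULL;
    t->vals = NULL;
    t->used = NULL;
    t->n_used = 0;
}

/* Insert or update.  Buckets are allocated on the first call.
   Returns 0 on success, -1 if the table is full or memory runs out. */
int lt_put(lazy_table *t, int key, int val)
{
    if (t->used == NULL) {
        t->keys = malloc(t->n_buckets * sizeof *t->keys);
        t->vals = malloc(t->n_buckets * sizeof *t->vals);
        t->used = calloc(t->n_buckets, 1);
        if (!t->keys || !t->vals || !t->used) {
            lt_free(t);
            return -1;
        }
    }
    size_t i = (unsigned)key % t->n_buckets;
    for (size_t probes = 0; probes < t->n_buckets; probes++) {
        if (!t->used[i]) {
            t->used[i] = 1;
            t->keys[i] = key;
            t->vals[i] = val;
            t->n_used++;
            return 0;
        }
        if (t->keys[i] == key) {
            t->vals[i] = val;
            return 0;
        }
        i = (i + 1) % t->n_buckets;
    }
    return -1;
}

/* Lookup: a table that was never written to is simply empty.
   Returns 1 and stores the value in *val if the key is found. */
int lt_get(const lazy_table *t, int key, int *val)
{
    if (t->used == NULL) return 0;
    size_t i = (unsigned)key % t->n_buckets;
    for (size_t probes = 0; probes < t->n_buckets; probes++) {
        if (!t->used[i]) return 0;
        if (t->keys[i] == key) {
            *val = t->vals[i];
            return 1;
        }
        i = (i + 1) % t->n_buckets;
    }
    return 0;
}
```

A table that is created and destroyed without ever holding data costs
no allocation at all in this scheme.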

 A couple of notes:
 a.  Not all of the strings are interned - which means that if a pointer
 comparison fails, we still go through a string comparison;  But if it
 succeeds - it's good enough.
 b.  We'd need to add support for this in the bytecode caches. We'd be happy
 to work with the various bytecode cache teams to guide how to implement
 support so that you do not have to intern on each request.

 To get a better feel for what interning actually does, consider the
 following examples:

 // Lookup for $arr will not calculate a hash value, and will only require a
 // pointer comparison in most cases
 // Lookup for "foo" in $arr will not calculate a hash value, and will only
 // require a pointer comparison
 // The string "foo" will not have to be allocated as a key in the Bucket
 // "blah" when assigned doesn't have to be duplicated
 $arr["foo"] = "blah";

 $a = "b";
 if ($a == "b") { // pointer comparison only
  ...
 }
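
The comparison rule from note (a) can be sketched like this: equal
pointers mean equal strings, and a pointer mismatch falls back to a
byte comparison because not every string is interned.  str_equal() is a
hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <string.h>

/* Compare two length-counted strings.  Returns 1 if equal, 0 otherwise. */
int str_equal(const char *a, size_t a_len, const char *b, size_t b_len)
{
    assert(a != NULL && b != NULL);
    if (a_len != b_len) return 0;     /* different lengths can never match */
    if (a == b) return 1;             /* same interned copy: no byte compare */
    return memcmp(a, b, a_len) == 0;  /* not interned: compare contents */
}
```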

 Comments welcome!

 Zeev

 Patch available at: http://bit.ly/aDPTmv





Re: [PHP-DEV] Where are we ACTUALLY on Unicode?

2010-03-14 Thread Moriyoshi Koizumi
On Sun, Mar 14, 2010 at 11:23 PM, Jordi Boggiano j.boggi...@seld.be wrote:
 On Sun, Mar 14, 2010 at 12:03 PM, Stan Vassilev sv_for...@fmethod.com wrote:
 UTF8 also takes 4 bytes for representing characters in the higher bit
 planes, as quite a lot of bits are lost for every char in order to describe
 how long the code point is, and when it ends and so on. This means
 memory-wise it may not be of big benefit to asian countries.

 I remember Brian Aker saying that they chose to work internally with
 UTF-8 for Drizzle. His explanation of it was that asian countries have
 so much english content mixed in that on average even for them UTF-8
 still had a lower footprint than UTF-16/32. I do not know where the
 stats came from, but if it holds any truth it is worth considering.

This is true, as most of the text data exchanged over the Internet is
represented in HTML, in which such characters and alphabetic (ASCII)
tags keep alternating.
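
A small sketch that makes the footprint argument concrete by counting
the bytes each encoding needs per code point.  The function names are
illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to encode one code point in UTF-8. */
size_t utf8_len(unsigned long cp)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* Bytes needed in UTF-16: 2 inside the BMP, 4 (a surrogate pair) above it. */
size_t utf16_len(unsigned long cp)
{
    assert(cp <= 0x10FFFF);
    return cp < 0x10000 ? 2 : 4;
}

/* Total encoded size of a sequence of code points in UTF-8. */
size_t utf8_total(const unsigned long *cps, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++) total += utf8_len(cps[i]);
    return total;
}

/* Total encoded size of a sequence of code points in UTF-16. */
size_t utf16_total(const unsigned long *cps, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++) total += utf16_len(cps[i]);
    return total;
}
```

For the ten code points of `<p>日本語</p>` this gives 16 bytes in UTF-8
against 20 in UTF-16, while the three ideographs alone take 9 bytes in
UTF-8 against 6 in UTF-16.  The markup is what tips the balance.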

Moriyoshi




Re: [PHP-DEV] Re: moving forward

2010-03-14 Thread Moriyoshi Koizumi
On Mon, Mar 15, 2010 at 6:25 AM, Herman Radtke hermanrad...@gmail.com wrote:
 Oh no .. another dangerous topic. Again we have been there even before the
 switch. The idea is to keep the centralized repo on svn, because the masses
 know how it works, the tools are widely available and we have plenty of
 experience among us in how to keep svn running. I see little incentive to
 move the _central_ repo to a DVCS. Are the bridges to git, mercurial, bazaar
 etc really so bad that this topic is worth discussing (no sarcasm, honest
 question)?

 I only have experience with git.  The problem with something like
 git-svn is that your git branch becomes an island.  I can't share that
 branch with anyone else.  So all I really get is git syntax within an
 svn environment.

There are a number of ways to share your branches with others.  At
the very least, you can push your local changesets to some remote
repository.  I've actually been experimenting with a modified PHP core
with some added language features, by forking the mirror on github.com
[1].  I've never felt any inconvenience there.  I really appreciate
those who set up the mirror.

 I have no problem working with svn and actually prefer it for projects
 that use a compiler.  For PHP apps, git is great because nothing has
 to be built.  Bouncing between git branches means I have to recompile
 PHP every time (or set up some system of symlinks).

I guess you can have several local repositories that have different
branches checked out at the same time...

[1] http://github.com/moriyoshi/php-src/

Moriyoshi




Re: [PHP-DEV] PHP 6

2010-03-13 Thread Moriyoshi Koizumi
On Sat, Mar 13, 2010 at 8:09 PM, Lester Caine les...@lsces.co.uk wrote:
 Handling unicode CONTENT is not the problem here. People nowadays expect to
 be able to use their own language to write code, and create functions using
 words that they recognize. In databases, table and field names are now
 expected to support unicode, rather than just handling unicode data pumped
 into ascii titled fields.

 Personally I'm quite happy with just using ascii names for things, but more
 and more overseas customers provide contact details in 'strange' character
 sets that only unicode can handle, and handling THAT in PHP5 is not a
 problem. It's when people start building databases with unicode metadata and
 expect the tools interfacing with that to understand unicode as well.

 It was my understanding that PHP6 was intended to provide international
 users with something that they could use in their own native language?
 Unicode titled files with unicode titled classes and functions.

I doubt that many people are going to start using non-Latin characters
for identifiers, as long as most of them generally use an alphabetic
keyboard, on which I cannot type Japanese words as fast as
alphanumerics. In addition, I think RTL languages such as Arabic aren't
meant to be used with Latin punctuation marks in a programming language
in the first place.

Moriyoshi




Re: [PHP-DEV] PHP 6

2010-03-13 Thread Moriyoshi Koizumi
On Sat, Mar 13, 2010 at 6:07 PM, Chen Ze surfc...@gmail.com wrote:
 I think unicode should only care for string handling. Formatting
 numbers should not be the thing that unicode cares. Unicode is a
 standard for text, not for text or number formatting.

 Back in the days before Unicode, number formatting already existed.
 It existed even before the computer was invented.

 The same goes for sorting.

 When we think about Unicode, we should think about the things that are
 really related to Unicode, like the file system. Number formatting and
 sorting are other things, which intl takes care of.

 For the unicode, I think we should implement something like:

 $chars=new mchar($bytes,$bytes_encoding);
 echo $chars;//output encoding
 foreach ($chars as $char) {
      echo $char;//output single utf-16/utf-8 char (depends on default output encoding)
 }
 echo $chars->bytes('gbk');

 $chars->outputEncoding('gbk');
 echo $chars;

 ini_set('mchar_output_encoding','gbk');
 echo $chars;

 ini_set('mchar_filesystem_encoding','gbk');
 echo $chars->filepath();


I don't totally agree with what is being said here, but I guess we
don't have to make Unicode a first-class value.  Once operator
overloading is supported, Unicode strings can be represented as
objects, as Python does, although I haven't looked at past
discussion on this topic.

Moriyoshi




Re: [PHP-DEV] PHP 6

2010-03-13 Thread Moriyoshi Koizumi
Surprisingly, it can be done quite easily with the current object
handler infrastructure.

Moriyoshi

On Sun, Mar 14, 2010 at 12:08 AM, Pierre Joye pierre@gmail.com wrote:
 On Sat, Mar 13, 2010 at 3:13 PM, Moriyoshi Koizumi m...@mozo.jp wrote:

 I don't totally agree with what is being said here, but I guess we
 don't have to make Unicode a first-class value.  Once operator
 overloading is supported, Unicode strings can be represented as
 objects, like Python does although  I didn't have a look at past
 discussion on this topic.

 Operator overloading, while being a cool feature, should not be
 associated with Unicode-related features. Or we are going to make the
 exact same mistakes as before: way too many changes, features, and work
 to even get a visible deadline for the next major release.

 Cheers,
 --
 Pierre

 @pierrejoye | http://blog.thepimp.net | http://www.libgd.org

diff --git a/Zend/zend.h b/Zend/zend.h
index 38f461c..0ffcb1a 100644
--- a/Zend/zend.h
+++ b/Zend/zend.h
@@ -442,6 +442,7 @@ struct _zend_class_entry {
 	union _zend_function *__call;
 	union _zend_function *__callstatic;
 	union _zend_function *__tostring;
+	union _zend_function *__concat;
 	union _zend_function *serialize_func;
 	union _zend_function *unserialize_func;
 
diff --git a/Zend/zend_API.c b/Zend/zend_API.c
index 5433dc1..e0dcd73 100644
--- a/Zend/zend_API.c
+++ b/Zend/zend_API.c
@@ -1799,7 +1799,7 @@ ZEND_API int zend_register_functions(zend_class_entry *scope, const zend_functio
 	int count=0, unload=0;
 	HashTable *target_function_table = function_table;
 	int error_type;
-	zend_function *ctor = NULL, *dtor = NULL, *clone = NULL, *__get = NULL, *__set = NULL, *__unset = NULL, *__isset = NULL, *__call = NULL, *__callstatic = NULL, *__tostring = NULL;
+	zend_function *ctor = NULL, *dtor = NULL, *clone = NULL, *__get = NULL, *__set = NULL, *__unset = NULL, *__isset = NULL, *__call = NULL, *__callstatic = NULL, *__tostring = NULL, *__concat = NULL;
 	char *lowercase_name;
 	int fname_len;
 	char *lc_class_name = NULL;
@@ -1929,6 +1929,8 @@ ZEND_API int zend_register_functions(zend_class_entry *scope, const zend_functio
 				__unset = reg_function;
 			} else if ((fname_len == sizeof(ZEND_ISSET_FUNC_NAME)-1) && !memcmp(lowercase_name, ZEND_ISSET_FUNC_NAME, sizeof(ZEND_ISSET_FUNC_NAME))) {
 				__isset = reg_function;
+			} else if ((fname_len == sizeof(ZEND_CONCAT_FUNC_NAME)-1) && !memcmp(lowercase_name, ZEND_CONCAT_FUNC_NAME, sizeof(ZEND_CONCAT_FUNC_NAME))) {
+				__concat = reg_function;
 			} else {
 				reg_function = NULL;
 			}
@@ -1967,6 +1969,7 @@ ZEND_API int zend_register_functions(zend_class_entry *scope, const zend_functio
 		scope->__set = __set;
 		scope->__unset = __unset;
 		scope->__isset = __isset;
+		scope->__concat = __concat;
 		if (ctor) {
 			ctor->common.fn_flags |= ZEND_ACC_CTOR;
 			if (ctor->common.fn_flags & ZEND_ACC_STATIC) {
@@ -2030,6 +2033,12 @@ ZEND_API int zend_register_functions(zend_class_entry *scope, const zend_functio
 			}
 			__isset->common.fn_flags &= ~ZEND_ACC_ALLOW_STATIC;
 		}
+		if (__concat) {
+			if (__concat->common.fn_flags & ZEND_ACC_STATIC) {
+				zend_error(error_type, "Method %s::%s() cannot be static", scope->name, __concat->common.function_name);
+			}
+			__concat->common.fn_flags &= ~ZEND_ACC_ALLOW_STATIC;
+		}
 		efree(lc_class_name);
 	}
 	return SUCCESS;
diff --git a/Zend/zend_compile.c b/Zend/zend_compile.c
index 13b6c55..91cd34a 100644
--- a/Zend/zend_compile.c
+++ b/Zend/zend_compile.c
@@ -1267,6 +1267,10 @@ void zend_do_begin_function_declaration(znode *function_token, znode *function_n
 			} else if ((name_len == sizeof(ZEND_TOSTRING_FUNC_NAME)-1) && (!memcmp(lcname, ZEND_TOSTRING_FUNC_NAME, sizeof(ZEND_TOSTRING_FUNC_NAME)-1))) {
 				if (fn_flags & ((ZEND_ACC_PPP_MASK | ZEND_ACC_STATIC) ^ ZEND_ACC_PUBLIC)) {
 					zend_error(E_WARNING, "The magic method __toString() must have public visibility and cannot be static");
+				}
+			} else if ((name_len == sizeof(ZEND_CONCAT_FUNC_NAME)-1) && (!memcmp(lcname, ZEND_CONCAT_FUNC_NAME, sizeof(ZEND_CONCAT_FUNC_NAME)-1))) {
+				if (fn_flags & ((ZEND_ACC_PPP_MASK | ZEND_ACC_STATIC) ^ ZEND_ACC_PUBLIC)) {
+					zend_error(E_WARNING, "The magic method " ZEND_CONCAT_FUNC_NAME " must have public visibility

Re: [PHP-DEV] PHP 6

2010-03-13 Thread Moriyoshi Koizumi
It looks like I stripped off too much. Attached is the right one.

Moriyoshi

On Sun, Mar 14, 2010 at 12:41 AM, Moriyoshi Koizumi m...@mozo.jp wrote:
 Surprisingly, It can be done quite easily with the current object
 handler infrastructure.

 Moriyoshi

 On Sun, Mar 14, 2010 at 12:08 AM, Pierre Joye pierre@gmail.com wrote:
 On Sat, Mar 13, 2010 at 3:13 PM, Moriyoshi Koizumi m...@mozo.jp wrote:

 I don't totally agree with what is being said here, but I guess we
 don't have to make Unicode a first-class value.  Once operator
 overloading is supported, Unicode strings can be represented as
 objects, like Python does although  I didn't have a look at past
 discussion on this topic.

 Operator overloading, while being a cool feature, should not be
 associated with Unicode-related features. Or we are going to make the
 exact same mistakes as before: way too many changes, features, and work
 to even get a visible deadline for the next major release.

 Cheers,
 --
 Pierre

 @pierrejoye | http://blog.thepimp.net | http://www.libgd.org


diff --git a/Zend/zend.h b/Zend/zend.h
index 38f461c..0ffcb1a 100644
--- a/Zend/zend.h
+++ b/Zend/zend.h
@@ -442,6 +442,7 @@ struct _zend_class_entry {
 	union _zend_function *__call;
 	union _zend_function *__callstatic;
 	union _zend_function *__tostring;
+	union _zend_function *__concat;
 	union _zend_function *serialize_func;
 	union _zend_function *unserialize_func;
 
diff --git a/Zend/zend_API.c b/Zend/zend_API.c
index 5433dc1..e0dcd73 100644
--- a/Zend/zend_API.c
+++ b/Zend/zend_API.c
@@ -1799,7 +1799,7 @@ ZEND_API int zend_register_functions(zend_class_entry *scope, const zend_functio
 	int count=0, unload=0;
 	HashTable *target_function_table = function_table;
 	int error_type;
-	zend_function *ctor = NULL, *dtor = NULL, *clone = NULL, *__get = NULL, *__set = NULL, *__unset = NULL, *__isset = NULL, *__call = NULL, *__callstatic = NULL, *__tostring = NULL;
+	zend_function *ctor = NULL, *dtor = NULL, *clone = NULL, *__get = NULL, *__set = NULL, *__unset = NULL, *__isset = NULL, *__call = NULL, *__callstatic = NULL, *__tostring = NULL, *__concat = NULL;
 	char *lowercase_name;
 	int fname_len;
 	char *lc_class_name = NULL;
@@ -1929,6 +1929,8 @@ ZEND_API int zend_register_functions(zend_class_entry *scope, const zend_functio
 				__unset = reg_function;
 			} else if ((fname_len == sizeof(ZEND_ISSET_FUNC_NAME)-1) && !memcmp(lowercase_name, ZEND_ISSET_FUNC_NAME, sizeof(ZEND_ISSET_FUNC_NAME))) {
 				__isset = reg_function;
+			} else if ((fname_len == sizeof(ZEND_CONCAT_FUNC_NAME)-1) && !memcmp(lowercase_name, ZEND_CONCAT_FUNC_NAME, sizeof(ZEND_CONCAT_FUNC_NAME))) {
+				__concat = reg_function;
 			} else {
 				reg_function = NULL;
 			}
@@ -1967,6 +1969,7 @@ ZEND_API int zend_register_functions(zend_class_entry *scope, const zend_functio
 		scope->__set = __set;
 		scope->__unset = __unset;
 		scope->__isset = __isset;
+		scope->__concat = __concat;
 		if (ctor) {
 			ctor->common.fn_flags |= ZEND_ACC_CTOR;
 			if (ctor->common.fn_flags & ZEND_ACC_STATIC) {
@@ -2030,6 +2033,12 @@ ZEND_API int zend_register_functions(zend_class_entry *scope, const zend_functio
 			}
 			__isset->common.fn_flags &= ~ZEND_ACC_ALLOW_STATIC;
 		}
+		if (__concat) {
+			if (__concat->common.fn_flags & ZEND_ACC_STATIC) {
+				zend_error(error_type, "Method %s::%s() cannot be static", scope->name, __concat->common.function_name);
+			}
+			__concat->common.fn_flags &= ~ZEND_ACC_ALLOW_STATIC;
+		}
 		efree(lc_class_name);
 	}
 	return SUCCESS;
diff --git a/Zend/zend_compile.c b/Zend/zend_compile.c
index 13b6c55..91cd34a 100644
--- a/Zend/zend_compile.c
+++ b/Zend/zend_compile.c
@@ -1267,6 +1267,10 @@ void zend_do_begin_function_declaration(znode *function_token, znode *function_n
 			} else if ((name_len == sizeof(ZEND_TOSTRING_FUNC_NAME)-1) && (!memcmp(lcname, ZEND_TOSTRING_FUNC_NAME, sizeof(ZEND_TOSTRING_FUNC_NAME)-1))) {
 				if (fn_flags & ((ZEND_ACC_PPP_MASK | ZEND_ACC_STATIC) ^ ZEND_ACC_PUBLIC)) {
 					zend_error(E_WARNING, "The magic method __toString() must have public visibility and cannot be static");
+				}
+			} else if ((name_len == sizeof(ZEND_CONCAT_FUNC_NAME)-1) && (!memcmp(lcname, ZEND_CONCAT_FUNC_NAME, sizeof(ZEND_CONCAT_FUNC_NAME)-1))) {
+				if (fn_flags & ((ZEND_ACC_PPP_MASK

Re: [PHP-DEV] PHP 6

2010-03-13 Thread Moriyoshi Koizumi
On Sun, Mar 14, 2010 at 1:57 AM, Derick Rethans der...@php.net wrote:

 - in the meanwhile, start working on patching in back Unicode support,
  but in small steps. Exactly which things, and how we'd have to find
  out. But I do think it needs to be a *core* language feature, and not
  simply solved by extensions. We also need to make sure everybody
  understands that Unicode isn't just about encodings, or charsets and
  that there are differences between that. Education is going to be
  important (and adding Unicode back in small bits would certainly help
  there).

Unicode isn't just a coded character set, but a whole set of
specifications that are essential for handling various kinds of text
within an application.  But I don't really think that strengthens the
case for making Unicode functionality part of the core language.

Moriyoshi




Re: [PHP-DEV] PHP 6

2010-03-12 Thread Moriyoshi Koizumi
I'd love to see my brand-new mbstring implementation in the release.
Dropping mbstring completely won't be any good because lots of
applications rely on it, but I don't really want to maintain the funky
library bundled with it.

Moriyoshi

On Fri, Mar 12, 2010 at 2:22 AM, Rasmus Lerdorf ras...@lerdorf.com wrote:
 Ah, Jani went a little crazy today in his typical style to force a
 decision.  The real decision is not whether to have a version 5.4 or
 not, it is all about solving the Unicode problem.  The current effort
 has obviously stalled.  We need to figure out how to get development
 back on track in a way that people can get on board.  We knew the
 Unicode effort was hugely ambitious the way we approached it.  There are
 other ways.

 So I think Lukas and others are right, let's move the PHP 6 trunk to a
 branch since we are still going to need a bunch of code from it and move
 development to trunk and start exploring lighter and more approachable
 ways to attack Unicode.  We have a few already.  Enhanced mbstring and
 ext/intl.  Let's see some good ideas around that and work on those in
 trunk.  Other features necessarily need to play along with these in the
 same branch.  I refuse to go down the path of a 5.4 branch and a
 separate Unicode branch again.

 The main focus here needs to be to get everyone working in the same branch.

 -Rasmus




Re: [PHP-DEV] PHP 6

2010-03-12 Thread Moriyoshi Koizumi
Huh? mbstring has been capable of handling lots of encodings other
than UTF-8 since it was introduced.

We might often find it annoying that Unicode is handled transparently
through I/O functions when the internal encoding is different from the
outside encoding.  It just seems you have never made a serious
internationalized application.

Moriyoshi

On Sat, Mar 13, 2010 at 3:34 AM, Derick Rethans der...@php.net wrote:
 On Fri, 12 Mar 2010, Hannes Magnusson wrote:

 On Fri, Mar 12, 2010 at 17:38, Moriyoshi Koizumi m...@mozo.jp wrote:
  I'd love to see my brand-new mbstring implementation in the release.
  Dropping mbstring completely won't be any good because lots of
  applications rely on it, but I don't really want to maintain the funky
  library bundled with it.

 That's actually one of the ideas we had on IRC.
 That mbstring patch and more ext/intl features should be enough to
 solve the unicode problem.

 Sorry, but that is not true. intl and mbstring can provide functionality
 to deal with UTF-8 string manipulation functions, they can not provide
 proper Unicode support. Proper Unicode support is *not* only just
 dealing with UTF-8 strings. Proper Unicode support includes dealing with
 file streams, with different encodings, with localization, with sorting,
 with locales, with formatting numbers. Offloading this to extensions
 makes Unicode support an add-on hack, and not a language feature. I am
 not saying that intl and mbstring aren't *useful*, but they definitely
 do not solve the unicode problem.

 regards,
 Derick





Re: [PHP-DEV] adding GB18030 support for mbstring

2010-02-01 Thread Moriyoshi Koizumi
2010/2/1 KITAZAKI Shigeru shigeru_kitaz...@cybozu.co.jp:
 * php_syslog.patch
  syslog() function cannot properly send UTF-8 strings to event log on
  Windows. This patch changes the internal API. We, however, must set
  UTF-8 on 'mbstring.internal_encoding'.
  In addition, this changes the severity of 'LOG_ERR' from eventlog's
  warning to eventlog's error.

It seems this doesn't rely on any mbstring settings, but just
changes syslog() to take strings encoded in UTF-8 instead of the
system's default encoding.  It'd look good to me if it had a new flag
causing syslog() to switch to the new behaviour.

Regards,
Moriyoshi




[PHP-DEV] Re: [PHP-I18N] adding GB18030 support for mbstring

2010-01-31 Thread Moriyoshi Koizumi
Kitazaki-san,

First, thank you for your effort. But I am under the impression that
the conversion table is too large (30MB) to include in a
distribution.  Is there any way to compress it further?

BTW, I created an extension that is near-compatible with mbstring and
based on ICU, which of course supports GB18030. See
http://github.com/moriyoshi/mbstring-ng for details.

Regards,
Moriyoshi

2010/1/28 KITAZAKI Shigeru shigeru_kitaz...@cybozu.co.jp:
 We made a patch to add a mbfilter for GB18030 encoding for PHP-5.3.1.
 Please take a look at our blog:
  http://developer.cybozu.co.jp/oss/2010/01/php-mbstring-pa.html

 We would appreciate if you take this patch into the mainline.

 BTW, our blog has various other patches for PHP in addition to this one.
 Feel free to mail me if you are interested in some of them.

 Regards,
 KITAZAKI Shigeru shigeru_kitaz...@cybozu.co.jp

 --
 PHP Unicode & I18N Mailing List (http://www.php.net/)
 To unsubscribe, visit: http://www.php.net/unsub.php






Re: [PHP-DEV] Re: Alternative mbstring implementation using ICU

2009-08-03 Thread Moriyoshi Koizumi
On Sat, Aug 1, 2009 at 9:02 AM, Stanislav Malyshev s...@zend.com wrote:
 Hi!

 They calculate the total width of a string based on east asian width
 property, which is still valid to give a rough measurement of the
 rendered string.
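
A rough sketch of such a width calculation, assuming that "wide" code
points occupy two columns and everything else one.  The ranges below
are a coarse, hand-picked approximation for illustration; a real
implementation would consult the Unicode East Asian Width data, and the
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Coarse approximation of the code points rendered two columns wide. */
int is_wide(unsigned long cp)
{
    return (cp >= 0x1100 && cp <= 0x115F)   /* Hangul Jamo initial consonants */
        || (cp >= 0x2E80 && cp <= 0xA4CF)   /* CJK radicals through Yi */
        || (cp >= 0xAC00 && cp <= 0xD7A3)   /* Hangul syllables */
        || (cp >= 0xF900 && cp <= 0xFAFF)   /* CJK compatibility ideographs */
        || (cp >= 0xFE30 && cp <= 0xFE4F)   /* CJK compatibility forms */
        || (cp >= 0xFF00 && cp <= 0xFF60)   /* fullwidth forms */
        || (cp >= 0xFFE0 && cp <= 0xFFE6);  /* fullwidth signs */
}

/* Total width in columns of a sequence of code points. */
size_t str_width(const unsigned long *cps, size_t n)
{
    assert(cps != NULL || n == 0);
    size_t width = 0;
    for (size_t i = 0; i < n; i++) width += is_wide(cps[i]) ? 2 : 1;
    return width;
}
```

The result is only a rough measurement, as noted above: combining marks
and ambiguous-width characters are not handled.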

 OK, I guess if it's some kind of special calculation that doesn't follow
 from others it should be preserved, there are tons of such special functions
 in PHP.

 That's a common problem, IIRC PHP 6 converters have configurable error
 modes for that. Don't unicode_set_error_handler() and
 unicode_set_error_mode() do what you want?

 I guess it isn't what I want. If my understanding is correct, a
 handler set by unicode_set_error_handler() merely deals with the
 aftermath and cannot interact with the converter.  There are good

 That depends. For some error modes, it says to converter to replace invalid
 chars with some other char or skip it. You can't however now specify custom
 mappings (I'm not sure ICU allows that, but maybe it can be simulated...).
 Here the question is - is it really worth to keep whole separate conversion
 system for just this, or can it be done with standard conversion, possibly
 somewhat tweaked?

It can be done through conversion error handlers. You can append an
encoded form of a codepoint for such unassigned characters to the
buffer within the handler.
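
A minimal sketch of that handler idea, using a toy converter to
ISO-8859-1 rather than ICU.  The callback type and every function name
are hypothetical; the point is only that the handler itself appends an
encoded form of the unmappable code point (here an HTML numeric
character reference) to the output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical handler type: called for each code point the target
   encoding cannot represent.  Writes a replacement into out (cap bytes
   available) and returns the number of bytes appended. */
typedef size_t (*unmapped_fn)(unsigned long cp, char *out, size_t cap);

/* Example handler: append a numeric character reference such as &#x65E5;
   Appends nothing if the reference does not fit. */
size_t ncr_handler(unsigned long cp, char *out, size_t cap)
{
    int n = snprintf(out, cap, "&#x%lX;", cp);
    return (n < 0 || (size_t)n >= cap) ? 0 : (size_t)n;
}

/* Convert code points to ISO-8859-1, delegating unmappable ones to the
   handler.  Returns the number of bytes written, excluding the final NUL. */
size_t to_latin1(const unsigned long *cps, size_t n,
                 char *out, size_t cap, unmapped_fn on_unmapped)
{
    assert(out != NULL && cap > 0 && on_unmapped != NULL);
    size_t w = 0;
    for (size_t i = 0; i < n && w + 1 < cap; i++) {
        if (cps[i] <= 0xFF) {
            out[w++] = (char)(unsigned char)cps[i];
        } else {
            /* keep one byte in reserve for the terminating NUL */
            w += on_unmapped(cps[i], out + w, cap - w - 1);
        }
    }
    out[w] = '\0';
    return w;
}
```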

And yes, it's worth providing a separate conversion system.  You might
not be aware of it, but there are several families of character sets
that differ from each other, each of which is often represented with a
specific encoding scheme.  Shift_JIS is one of those.

 In addition to these, shouldn't there be any case where one have to
 manipulate Unicode strings on per-coded-character-basis rather than
 per-grapheme-basis just like substr() in PHP6?

 In PHP 6 right now it's actually the only case, grapheme functions not even
 ported to PHP 6 yet (I know, not good) - but that's what regular str*
 functions should be doing, right?

What I am mainly interested in is 5.4, or whatever comes
before 6.  BTW, it would have been much better if there had been some
coordination between the developers of the mbstring and intl extensions.

Moriyoshi

 --
 Stanislav Malyshev, Zend Software Architect
 s...@zend.com   http://www.zend.com/
 (408)253-8829   MSN: s...@zend.com





Re: [PHP-DEV] Re: Alternative mbstring implementation using ICU

2009-08-03 Thread Moriyoshi Koizumi
On Tue, Aug 4, 2009 at 2:47 AM, Stanislav Malyshev s...@zend.com wrote:

 And yes, it's worth providing separate conversion system.  You might
 not be aware of it, but there are several sets of different character
 sets, each of which is often represented with a specific encoding
 scheme.  Shift_JIS is one of those.

 I'm not sure I understand. There are tons of character sets, etc. but as I
 understand ICU conversion routines handle them, including Shift_JIS - isn't
 it true?

Coded character sets and character encoding schemes are different
concepts. As for the specific case I mentioned, there are a number of
variants of the character set that is commonly represented as
Shift_JIS, and ICU doesn't support all of those.
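
A toy illustration of the distinction: the same Shift_JIS byte pair can
decode to different Unicode code points depending on which variant of
the character set is assumed.  The enum and function are hypothetical,
and the table holds only three entries: two well-known differences
between the JIS X 0208 based mapping and Microsoft's CP932, plus one
pair on which both agree.

```c
#include <assert.h>

typedef enum { SJIS_JISX0208, SJIS_CP932 } sjis_variant;

/* Decode one double-byte Shift_JIS sequence under the given variant.
   Returns 0 for byte pairs that are not in this toy table. */
unsigned long sjis_to_unicode(unsigned lead, unsigned trail, sjis_variant v)
{
    assert(lead <= 0xFF && trail <= 0xFF);
    unsigned code = (lead << 8) | trail;
    switch (code) {
    case 0x8160:  /* U+FF5E FULLWIDTH TILDE vs U+301C WAVE DASH */
        return v == SJIS_CP932 ? 0xFF5E : 0x301C;
    case 0x817C:  /* U+FF0D FULLWIDTH HYPHEN-MINUS vs U+2212 MINUS SIGN */
        return v == SJIS_CP932 ? 0xFF0D : 0x2212;
    case 0x82A0:  /* U+3042 HIRAGANA LETTER A in both variants */
        return 0x3042;
    default:
        return 0;
    }
}
```

The encoding scheme (how bytes are laid out) is identical in both
cases; only the coded character set behind it differs.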


 What I am mainly interested in is 5.4, or something that will come
 before 6.  BTW, it would be much better if there had been a sort of
 coordination between the developers of mbstring and intl extension.

 I'm not sure what will happen about 5.4 etc. but sure I'd be glad to help as
 much as I could with anything regarding intl extension. Do you have some
 specific things that need to be done?

This is just one of my ideas, but if the intl extension eventually gains
enough functionality to allow one to write emulated mbstring
functions in userland, that would sound very attractive to me.

Moriyoshi




Re: [PHP-DEV] Alternative mbstring implementation using ICU

2009-07-31 Thread Moriyoshi Koizumi
On Thu, Jul 30, 2009 at 7:21 PM, Alexey Zakhlestin indey...@gmail.com wrote:
 2009/7/26 Moriyoshi Koizumi m...@mozo.jp:

 - mb_ereg_search(), mb_ereg_search_getpos(), mb_ereg_search_getregs(),
  mb_ereg_search_init(), mb_ereg_search_pos(), mb_ereg_search_regs() and
  mb_ereg_search_setpos()
  I have rarely heard of a script that actively uses these functions. They
  involve internal state that is not visible to users, which most
  likely causes confusion when used across function calls.
  They need to be reimplemented as a class.

 I actually do use these. ;)
 Probably, it will make sense to implement a completely new oniguruma
 extension instead of keeping it as a part of mb_?

I'm planning to reimplement them as a single SPL iterator.  As I noted,
I also created a separate oniguruma extension that you can browse at
http://github.com/moriyoshi/php-oniguruma/

Regards,
Moriyoshi

 --
 Alexey Zakhlestin
 http://www.milkfarmsoft.com/





Re: [PHP-DEV] Alternative mbstring implementation using ICU

2009-07-31 Thread Moriyoshi Koizumi
On Thu, Jul 30, 2009 at 8:05 PM, Niel Archer spam-f...@blueyonder.co.uk wrote:
 Implemented functions:

 - mb_ereg()
 - mb_ereg_replace()

 as ereg functions are deprecated in 5.3, are these still needed?

mb_ereg_XXX() have nothing to do with the plain ereg functions. They
are named so purely for historical reasons.

Moriyoshi

 --
 Niel Archer
 niel.archer (at) blueyonder.co.uk





Re: [PHP-DEV] Re: Alternative mbstring implementation using ICU

2009-07-31 Thread Moriyoshi Koizumi
On Fri, Jul 31, 2009 at 2:37 AM, Stanislav Malyshev s...@zend.com wrote:
 Hi!

 Aren't there any interests on this? If you think PHP 6 is gonna cover
 all of the functionality that allegedly-cruft mbstring currently
 provides, that is almost wrong :-p

 Could you please explain why PHP6 doesn't provide what mbstring is doing?
 I.e, let's go over the functions:

 mb_parse_str - since detecting encoding doesn't work per RFC, what is the
 usefulness of this function? Wouldn't PHP 6 do the same with correct
 charset?

You have a point on this one.

 mb_str* - shouldn't you in 6 just convert them to unicode and do all string
 operations with Unicode strings? Also, in 5 isn't there some intersection
 with grapheme_* functions?

mb_strwidth() and mb_strimwidth() are not covered.

 mb_output_handler - shouldn't setting the proper encoding in 6 do the same 
 job?
 mb_convert_encoding - don't we already have a number of functions that do 
 encoding conversions?

I don't think it can gracefully handle characters that have no
corresponding entries in the target character set. I'm even thinking
of adding a class interface that is dedicated to encoding conversion
with which one can deal with such characters in a user-supplied
handler.
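
To illustrate the idea of a converter that hands unmappable characters to a
user-supplied handler, here is a minimal C sketch. It is only a toy: the
target charset is ISO-8859-1, and the names unmapped_fn, to_latin1 and
example_handler are hypothetical, not part of mbstring, intl or ICU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: given a code point that has no mapping in the
 * target charset, write replacement bytes into out (at most cap bytes) and
 * return how many were written; returning 0 drops the character. */
typedef size_t (*unmapped_fn)(uint32_t cp, char *out, size_t cap);

/* Toy converter from Unicode code points to ISO-8859-1. Code points above
 * U+00FF cannot be represented, so they are passed to the callback instead
 * of being silently replaced. Returns the number of bytes written. */
static size_t to_latin1(const uint32_t *src, size_t n,
                        char *dst, size_t cap, unmapped_fn on_unmapped)
{
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        if (src[i] <= 0xFF) {
            if (used < cap)
                dst[used++] = (char)src[i];
        } else if (on_unmapped != NULL) {
            used += on_unmapped(src[i], dst + used, cap - used);
        }
    }
    return used;
}

/* Example handler: map one private-use code point to a chosen byte and fall
 * back to '?' for everything else. The mapping itself is made up. */
static size_t example_handler(uint32_t cp, char *out, size_t cap)
{
    if (cap == 0)
        return 0;
    out[0] = (cp == 0xE000) ? '#' : '?';
    return 1;
}
```

The point of the callback is that the caller, not the converter, decides
what an unmappable character becomes, which a fixed substitution character
cannot express.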

Regards,
Moriyoshi




Re: [PHP-DEV] Re: Alternative mbstring implementation using ICU

2009-07-31 Thread Moriyoshi Koizumi
On Fri, Jul 31, 2009 at 5:24 PM, Moriyoshi Koizumi m...@mozo.jp wrote:
 mb_str* - shouldn't you in 6 just convert them to unicode and do all string
 operations with Unicode strings? Also, in 5 isn't there some intersection
 with grapheme_* functions?

 mb_strwidth() and mb_strimwidth() are not covered.

I should have also mentioned the grapheme_* functions. Yes, there might
be some overlap, and I even think grapheme_* provide better support for
Unicode string manipulation, but it would be a bit better if they
accepted an arbitrary encoding as an argument.

Regards,
Moriyoshi




Re: [PHP-DEV] Re: Alternative mbstring implementation using ICU

2009-07-31 Thread Moriyoshi Koizumi
Hi,

On Sat, Aug 1, 2009 at 1:37 AM, Stanislav Malyshev s...@zend.com wrote:
 Hi!

 mb_str* - shouldn't you in 6 just convert them to unicode and do all
 string
 operations with Unicode strings? Also, in 5 isn't there some intersection
 with grapheme_* functions?

 mb_strwidth() and mb_strimwidth() are not covered.

 True. I wonder what this function is useful for?

They calculate the total width of a string based on the East Asian Width
property, which is still a valid way to get a rough measurement of the
rendered string.
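
As a rough illustration of such a width calculation, here is a small C
sketch. The range table follows the one documented for mb_strwidth() as far
as I recall it, so treat the boundaries as an assumption; cp_width and
str_width are hypothetical names, and the input is taken as already-decoded
code points to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Width of one code point, following the range table documented for
 * mb_strwidth() (reproduced from memory, so treat it as an assumption). */
static int cp_width(uint32_t cp)
{
    if (cp < 0x20)   return 0;  /* U+0000 - U+0019 */
    if (cp < 0x2000) return 1;  /* U+0020 - U+1FFF: Latin, Greek, ... */
    if (cp < 0xFF60) return 2;  /* U+2000 - U+FF5F: CJK, kana, fullwidth */
    if (cp < 0xFFA0) return 1;  /* U+FF60 - U+FF9F: halfwidth katakana */
    return 2;                   /* U+FFA0 and above */
}

/* Total width of a sequence of code points, in display columns. */
static size_t str_width(const uint32_t *s, size_t n)
{
    size_t w = 0;
    for (size_t i = 0; i < n; i++)
        w += (size_t)cp_width(s[i]);
    return w;
}
```

A hiragana letter counts as two columns and an ASCII letter as one, which is
what makes the result usable as a rough estimate in fixed-width rendering.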


 mb_output_handler - shouldn't setting the proper encoding in 6 do the
 same job?
 mb_convert_encoding - don't we already have a number of functions that do
 encoding conversions?

 I don't think It can gracefully handle characters that have no
 corresponding entries in the target character set. I'm even thinking

 That's a common problem, IIRC PHP 6 converters have configurable error modes
 for that. Don't unicode_set_error_handler() and unicode_set_error_mode() do
 what you want?

I guess it isn't what I want. If my understanding is correct, a
handler set by unicode_set_error_handler() merely deals with the
aftermath and cannot interact with the converter.  There are good
reasons to support user-supplied mappings of characters in the PUA to a
legacy encoding such as Shift_JIS, rather than just replacing such
characters with placeholders.

In addition to these, aren't there cases where one has to manipulate
Unicode strings on a per-coded-character basis rather than a
per-grapheme basis, just like substr() in PHP 6?

Regards,
Moriyoshi




[PHP-DEV] Re: Alternative mbstring implementation using ICU

2009-07-29 Thread Moriyoshi Koizumi
Is there any interest in this? If you think PHP 6 is gonna cover
all of the functionality that the allegedly crufty mbstring currently
provides, that is mostly wrong :-p

Moriyoshi

On Tue, Jul 28, 2009 at 5:41 PM, Moriyoshi Koizumi m...@mozo.jp wrote:
 I set up a RFC page for this in wiki.php.net.  Here it goes:
 http://wiki.php.net/rfc/altmbstring

 Moriyoshi




[PHP-DEV] Re: ext/iconv/tests/bug16069.phpt

2009-07-28 Thread Moriyoshi Koizumi
That test exercises iconv's transliteration feature, the behavior of
which may vary by platform.  I guess we don't actually need to test it
then.

Moriyoshi

On Tue, Jul 28, 2009 at 12:27 PM, Rasmus Lerdorf ras...@lerdorf.com wrote:
 Moriyoshi, or someone who knows CP932 and EUC-JP, could you please have
 a look at ext/iconv/tests/bug16069.phpt

 It is failing in all the branches, so I am assuming the expected output
 listed in the test is wrong, but I am a bit lost in figuring out how to
 tell what is going wrong at byte 113 into the output there.

 -Rasmus





Re: [PHP-DEV] Re: svn: /php/php-src/branches/PHP_5_3/ext/gd/ libgd/gdft.c tests/bug48555.phpt tests/bug48732.phpt tests/bug48801.phpt

2009-07-28 Thread Moriyoshi Koizumi
Putting the changes and merges for several branches into one commit from
a sparse checkout doesn't do the book-keeping against svn:mergeinfo.
That's why I don't think it is a good idea.

Moriyoshi

On Tue, Jul 28, 2009 at 7:44 AM, Gwynne Raskind gwy...@darkrainfall.org wrote:
 On Jul 27, 2009, at 6:31 PM, Takeshi Abe wrote:

 Just to be sure, is there any consensus on this? I thought I should
 use svn merge.

 README.SVN-RULES says

   1. All changes should first go to trunk and then get merged from trunk
      (aka MFH'ed) to all other relevant branches.

 which I've been following so far.

 That document is outdated. It's now (strongly) preferred that you use one of
 the various methods for multi-branch commits available in SVN, using merge
 or a sparse checkout.

 -- Gwynne






[PHP-DEV] Re: Alternative mbstring implementation using ICU

2009-07-28 Thread Moriyoshi Koizumi
I set up an RFC page for this on wiki.php.net.  Here it goes:
http://wiki.php.net/rfc/altmbstring

Moriyoshi

2009/7/26 Moriyoshi Koizumi m...@mozo.jp:
 Hi there,

 I almost finished an alternative implementation of mbstring that uses
 ICU instead of the exotic libmbfl in hope of replacing the current one
 for 5.4 (and possibly, 6.0.)

 Although there are admittedly some known incompatibilities that need
 extra libraries to resolve them besides a number of missing functions
 that are intentionally removed for simplicity's sake, frequently used
 functions are fully usable, and more compliant with the standard (e.g.
 case insensitive matches).

 Any comments are appreciated.

 The source is ready in the following location:

 http://github.com/moriyoshi/mbstring-ng/


 Implemented functions:

 - mb_convert_encoding()
 - mb_detect_encoding()
 - mb_ereg()
 - mb_ereg_replace()
 - mb_internal_encoding()
 - mb_list_encodings()
 - mb_output_handler()
 - mb_parse_str()
 - mb_preferred_mime_name()
 - mb_regex_set_options()
 - mb_split()
 - mb_strcut()
 - mb_strimwidth()
 - mb_stripos()
 - mb_stristr()
 - mb_strlen()
 - mb_strpos()
 - mb_strripos()
 - mb_strrpos()
 - mb_strstr()
 - mb_strtolower()
 - mb_strtotitle()
 - mb_strtoupper()
 - mb_strwidth()
 - mb_substr()
 - mb_substr_count()

 Removed functions and the reasons behind their removal:

 - mb_check_encoding()
  Not as usable as advertised, period.  First of all, validation
  in terms of encoding is just the same as filtering through the
  converter with the same value supplied for the input and output
  encoding.  Thus just use mb_convert_encoding().

 - mb_convert_case()
  Use mb_strtoupper(), mb_strtolower() and mb_strtotitle()

 - mb_convert_kana()
  This can't be standard-compliant. In addition, part of the
  functionality is already covered by Normalizer of intl extension, so
  we need to carefully consider what is actually needed here again.

 - mb_convert_variables()
  This can be implemented as a script.

 - mb_decode_mimeheader(), mb_encode_mimeheader()
  Not standard-compliant.

 - mb_decode_numericentity()
  Removed in favor of html_entity_decode().

 - mb_encode_numericentity()
  Removed in favor of htmlentities() and htmlspecialchars().

 - mb_encoding_aliases()
  Just unnecessary.

 - mb_ereg_match()
  Use mb_ereg().

 - mb_ereg_search(), mb_ereg_search_getpos(), mb_ereg_search_getregs(),
  mb_ereg_search_init(), mb_ereg_search_pos(), mb_ereg_search_regs() and
  mb_ereg_search_setpos()
  I have rarely heard of a script that actively uses these functions. They
  involve internal state that is not visible to users, which most
  likely causes confusion when used across function calls.
  They need to be reimplemented as a class.

 - mb_eregi()
  Use mb_regex_set_options() and mb_ereg()

 - mb_eregi_replace()
  I wonder why this function was added in the first place because giving
  'i' option to mb_ereg_replace() works in the same way.

 - mb_detect_order(), mb_get_info(), mb_http_input(), mb_http_output(),
  mb_language() and mb_substitute_character()
  ini_set() and ini_get() are your friend, I guess...

 - mb_regex_encoding()
  It is really confusing that the current mbstring allows two different
  encoding defaults that are applied to regex functions and the rest.
  Those settings are unified in the alternative version and so this is
  no longer necessary.

 - mb_send_mail()
  The behavior of this function relies on the pseudo-locale setting
  called mbstring.language that supports just a limited set of
  possible locales. As not everyone can benefit from the function and
  most significant applications implement their own mail functions, I
  suppose this is no longer wanted.

 - mb_strrchr()
  Use mb_strrpos().

 - mb_strrichr()
  Use mb_strripos().


 Known limitations and incompatibilities:

 - mb_detect_encoding() doesn't work well anymore due to the
  inaccuracy of ICU's encoding detection facility.

 - Request encoding translator now takes advantage of SAPI filter,
  therefore the name parts of the query components are not to be
  converted anymore.

 - The group reference placeholders for mb_ereg_replace() are now
  $0, $1, $2... instead of \0, \1, \2.  This can be avoided if we
  don't use uregex_replaceAll() and implement our own.

 - ILP64 :-p


 Regards,
 Moriyoshi







Re: [PHP-DEV] Re: svn: /php/php-src/branches/PHP_5_3/ext/gd/ libgd/gdft.c tests/bug48555.phpt tests/bug48732.phpt tests/bug48801.phpt

2009-07-27 Thread Moriyoshi Koizumi
Just to be sure, is there any consensus on this? I thought I should
use svn merge.

Moriyoshi

On Tue, Jul 28, 2009 at 12:30 AM, David Soria Parra s...@gmx.net wrote:
 On 2009-07-27, Takeshi Abe t...@php.net wrote:
 Log:
 MFH: fixed #48732 (TTF Bounding box wrong for letters below baseline) and 
 #48801 (Problem with imagettfbbox)

 please commit the changes to 5.3/5.2 and trunk at once in one changeset
 instead of MFH'ing. Thanks

 --
 PHP Internals - PHP Runtime Development Mailing List
 To unsubscribe, visit: http://www.php.net/unsub.php






[PHP-DEV] Alternative mbstring implementation using ICU

2009-07-26 Thread Moriyoshi Koizumi
Hi there,

I almost finished an alternative implementation of mbstring that uses
ICU instead of the exotic libmbfl in hope of replacing the current one
for 5.4 (and possibly, 6.0.)

Although there are admittedly some known incompatibilities that need
extra libraries to resolve them besides a number of missing functions
that are intentionally removed for simplicity's sake, frequently used
functions are fully usable, and more compliant with the standard (e.g.
case insensitive matches).

Any comments are appreciated.

The source is ready in the following location:

http://github.com/moriyoshi/mbstring-ng/


Implemented functions:

- mb_convert_encoding()
- mb_detect_encoding()
- mb_ereg()
- mb_ereg_replace()
- mb_internal_encoding()
- mb_list_encodings()
- mb_output_handler()
- mb_parse_str()
- mb_preferred_mime_name()
- mb_regex_set_options()
- mb_split()
- mb_strcut()
- mb_strimwidth()
- mb_stripos()
- mb_stristr()
- mb_strlen()
- mb_strpos()
- mb_strripos()
- mb_strrpos()
- mb_strstr()
- mb_strtolower()
- mb_strtotitle()
- mb_strtoupper()
- mb_strwidth()
- mb_substr()
- mb_substr_count()

Removed functions and the reasons behind their removal:

- mb_check_encoding()
  Not as usable as advertised, period.  First of all, validation
  in terms of encoding is just the same as filtering through the
  converter with the same value supplied for the input and output
  encoding.  Thus just use mb_convert_encoding().

- mb_convert_case()
  Use mb_strtoupper(), mb_strtolower() and mb_strtotitle()

- mb_convert_kana()
  This can't be standard-compliant. In addition, part of the
  functionality is already covered by Normalizer of intl extension, so
  we need to carefully consider what is actually needed here again.

- mb_convert_variables()
  This can be implemented as a script.

- mb_decode_mimeheader(), mb_encode_mimeheader()
  Not standard-compliant.

- mb_decode_numericentity()
  Removed in favor of html_entity_decode().

- mb_encode_numericentity()
  Removed in favor of htmlentities() and htmlspecialchars().

- mb_encoding_aliases()
  Just unnecessary.

- mb_ereg_match()
  Use mb_ereg().

- mb_ereg_search(), mb_ereg_search_getpos(), mb_ereg_search_getregs(),
  mb_ereg_search_init(), mb_ereg_search_pos(), mb_ereg_search_regs() and
  mb_ereg_search_setpos()
  I have rarely heard of a script that actively uses these functions. They
  involve internal state that is not visible to users, which most
  likely causes confusion when used across function calls.
  They need to be reimplemented as a class.

- mb_eregi()
  Use mb_regex_set_options() and mb_ereg()

- mb_eregi_replace()
  I wonder why this function was added in the first place because giving
  'i' option to mb_ereg_replace() works in the same way.

- mb_detect_order(), mb_get_info(), mb_http_input(), mb_http_output(),
  mb_language() and mb_substitute_character()
  ini_set() and ini_get() are your friend, I guess...

- mb_regex_encoding()
  It is really confusing that the current mbstring allows two different
  encoding defaults that are applied to regex functions and the rest.
  Those settings are unified in the alternative version and so this is
  no longer necessary.

- mb_send_mail()
  The behavior of this function relies on the pseudo-locale setting
  called mbstring.language that supports just a limited set of
  possible locales. As not everyone can benefit from the function and
  most significant applications implement their own mail functions, I
  suppose this is no longer wanted.

- mb_strrchr()
  Use mb_strrpos().

- mb_strrichr()
  Use mb_strripos().


Known limitations and incompatibilities:

- mb_detect_encoding() doesn't work well anymore due to the
  inaccuracy of ICU's encoding detection facility.

- Request encoding translator now takes advantage of SAPI filter,
  therefore the name parts of the query components are not to be
  converted anymore.

- The group reference placeholders for mb_ereg_replace() are now
  $0, $1, $2... instead of \0, \1, \2.  This can be avoided if we
  don't use uregex_replaceAll() and implement our own.

- ILP64 :-p


Regards,
Moriyoshi





Re: [PHP-DEV] Re: [PHP-CVS] cvs: php-src(PHP_5_2) / NEWS configure.in /main php_version.h

2009-05-14 Thread Moriyoshi Koizumi
Is there any chance of revisiting this issue? I mean, to change the
default back to SORT_STRING. We now have a couple more reports
regarding this:

http://bugs.php.net/47370#c147594
http://bugs.php.net/48115

Regards,
Moriyoshi

On Fri, Feb 27, 2009 at 8:30 AM, Ilia Alshanetsky i...@prohost.org wrote:
 Moriyoshi,

 First off, thank you for taking the time to provide examples regarding the
 issues you are demonstrating. I've looked at a number of different
 applications and have not found a functionality breakage due to this change.
 While your example does show a change, I am not convinced (sorry) that it is
 an adverse one, type-different comparison is always tricky and not entirely
 reliable and I think demonstrates more of a corner-case situation.

 I think our best option at this time is to release 5.2.9 as quickly as
 possible as it introduces a number of crucial fixes and if your comments are
 validated via user feedback we can adjust the values with 5.2.10 that can be
 repackaged fairly rapidly. IMHO the current functionality is desired and is
 acceptable.

 Ilia Alshanetsky




 On 26-Feb-09, at 1:58 PM, Moriyoshi Koizumi wrote:

 Robin Burchell wrote:

 On Thu, Feb 26, 2009 at 5:21 PM, Moriyoshi Koizumi m...@mozo.jp wrote:

 So, in what point do you guys think of this change as valid?

 Moriyoshi

 Is there any known examples of code broken by this, or is it a more
 academic than practical problem?

 snip

 That's indeed a practical problem.

 1. array_unique() has never been supposed to handle values other than
 strings. That's how bug #10658 is handled.

 http://bugs.php.net/10658

 See also:

 http://cvs.php.net/viewvc.cgi/phpdoc/en/reference/array/functions/array-unique.xml?revision=1.16&view=markup

 2. the results are inconsistent between SORT_STRING and SORT_REGULAR
 when the items are a mixture of different types.

 <?php
 $objs = array(
   "0x10",
   16,
   true,
   "true",
 );

 var_dump(array_unique($objs, SORT_REGULAR));
 var_dump(array_unique($objs, SORT_STRING));

 $objs = array(
   "0x10",
   true,
   16,
   "true",
 );

 var_dump(array_unique($objs, SORT_REGULAR));
 var_dump(array_unique($objs, SORT_STRING));
 ?>

 I could hardly imagine what would show up. Do you?

 array(1) {
  [0]=>
  string(4) "0x10"
 }
 array(4) {
  [0]=>
  string(4) "0x10"
  [1]=>
  int(16)
  [2]=>
  bool(true)
  [3]=>
  string(4) "true"
 }
 array(2) {
  [0]=>
  string(4) "0x10"
  [3]=>
  string(4) "true"
 }
 array(4) {
  [0]=>
  string(4) "0x10"
  [1]=>
  bool(true)
  [2]=>
  int(16)
  [3]=>
  string(4) "true"
 }


 3. the result can be unreasonable even with SORT_REGULAR

 As the equality of the object is only determined by member-wise
 comparison, there must be cases where the behavior is not acceptable:

 <?php
 class Foo {
   public $a;
   function __construct($a) {
       $this->a = $a;
   }
 };


 $objs = array(
   new Foo(1), new Foo("2e0"), new Foo(2), new Foo(3)
 );

 var_dump(array_unique($objs, SORT_REGULAR));
 ?>

 This yields:

 array(3) {
  [0]=>
  object(Foo)#1 (1) {
   ["a"]=>
   int(1)
  }
  [1]=>
  object(Foo)#2 (1) {
   ["a"]=>
   string(3) "2e0"
  }
  [3]=>
  object(Foo)#4 (1) {
   ["a"]=>
   int(3)
  }
 }

 while the second item is semantically not expected to be equal to the
 third.


 Moriyoshi

 --
 PHP Internals - PHP Runtime Development Mailing List
 To unsubscribe, visit: http://www.php.net/unsub.php







Re: [PHP-DEV] Re: [PHP-CVS] cvs: php-src(PHP_5_3) /ext/iconv iconv.c

2009-05-14 Thread Moriyoshi Koizumi
Ilia, do you still see any problem merging this to 5.2?

Moriyoshi

On Sat, Apr 11, 2009 at 6:16 PM, Hannes Magnusson
hannes.magnus...@gmail.com wrote:
 Ilia?

 I guess the chances of getting this merged Moriyoshi will increase by
 100% if you have a testcase..

 -Hannes

 On Tue, Mar 17, 2009 at 07:31, Moriyoshi Koizumi moriyo...@php.net wrote:
 moriyoshi               Tue Mar 17 05:31:04 2009 UTC

  Modified files:              (Branch: PHP_5_3)
    /php-src/ext/iconv  iconv.c
  Log:
  - MFH: Make iconv filter accept '.' as the delimiter between encoding names
    as well as '/'. It's impossible to specify the filter in php://filter
    without this fix.

  # I hope this to be merged to 5.2 as well. This doesn't break BC as there is
  # no such encoding name that contains '.'. (And if there were such a name,
  # the filter would fail in the first place, since it also uses '.' as the
  # delimiter between the filter name and the from-encoding name.)



 http://cvs.php.net/viewvc.cgi/php-src/ext/iconv/iconv.c?r1=1.124.2.8.2.20.2.13&r2=1.124.2.8.2.20.2.14&diff_format=u
 Index: php-src/ext/iconv/iconv.c
 diff -u php-src/ext/iconv/iconv.c:1.124.2.8.2.20.2.13 php-src/ext/iconv/iconv.c:1.124.2.8.2.20.2.14
 --- php-src/ext/iconv/iconv.c:1.124.2.8.2.20.2.13       Wed Dec 31 11:15:37 2008
 +++ php-src/ext/iconv/iconv.c   Tue Mar 17 05:31:04 2009
 @@ -18,7 +18,7 @@
    +--+
  */

 -/* $Id: iconv.c,v 1.124.2.8.2.20.2.13 2008/12/31 11:15:37 sebastian Exp $ */
 +/* $Id: iconv.c,v 1.124.2.8.2.20.2.14 2009/03/17 05:31:04 moriyoshi Exp $ */

  #ifdef HAVE_CONFIG_H
  #include "config.h"
 @@ -2759,7 +2759,7 @@
                return NULL;
        }
        ++from_charset;
 -       if ((to_charset = strchr(from_charset, '/')) == NULL) {
 +       if ((to_charset = strpbrk(from_charset, "/.")) == NULL) {
                return NULL;
        }
        from_charset_len = to_charset - from_charset;
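
The effect of the change above can be shown in isolation with a small C
sketch. It assumes a specification of the form FROM/TO or FROM.TO (the part
that follows the filter name) and splits it at the first '/' or '.';
split_charsets is a hypothetical helper written for this illustration, not
code from ext/iconv.

```c
#include <stddef.h>
#include <string.h>

/* Split "FROM/TO" or "FROM.TO" at the first '/' or '.'.
 * On success returns 0, stores the length of the source charset name in
 * *from_len and points *to at the start of the target charset name.
 * Returns -1 if neither delimiter is present. */
static int split_charsets(const char *spec, size_t *from_len, const char **to)
{
    const char *delim = strpbrk(spec, "/.");
    if (delim == NULL)
        return -1;
    *from_len = (size_t)(delim - spec);
    *to = delim + 1;
    return 0;
}
```

With strchr(spec, '/') only the first form would be accepted; strpbrk()
accepts either delimiter, which matters when '/' cannot be used inside the
surrounding URL.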



 --
 PHP CVS Mailing List (http://www.php.net/)
 To unsubscribe, visit: http://www.php.net/unsub.php



 --
 PHP Internals - PHP Runtime Development Mailing List
 To unsubscribe, visit: http://www.php.net/unsub.php






Re: [PHP-DEV] Re: [PHP-CVS] cvs: php-src(PHP_5_2) / NEWS configure.in /main php_version.h

2009-05-14 Thread Moriyoshi Koizumi
While I am still wondering why we could not stop this kind of mess
before the release, we should also revert the default for 5.3. There's
no point in making the behavior differ between releases.

Moriyoshi

On Fri, May 15, 2009 at 5:10 AM, Andrei Zmievski and...@gravitonic.com wrote:
 Ilia Alshanetsky wrote:

 Andrei,

 I think Moriyoshi has a point here there are several reports by people who
 are affected by this, I think it makes sense to leave the introduced
 functionality as is in 5.3/6, but for PHP 5.2 it probably should be rolled
 back.

 Fine. Our sorting/comparison is pretty screwed up anyway, if 400.00 is the
 same as 400. Maybe it makes sense for array_unique() to accept an optional
 flag at the end to specify the type of sorting. And we should add one that
 uses is_identical_function() to prevent crap like that.

 -Andrei





Re: [PHP-DEV] Segfault while looping through hash table

2009-05-14 Thread Moriyoshi Koizumi
On Fri, May 15, 2009 at 12:31 PM, Farley Knight farleykni...@gmail.com wrote:

  zend_hash_internal_pointer_reset(Z_ARRVAL(zhash));

  printf("This hash table has %d entries\n",
         zend_hash_num_elements(Z_ARRVAL(zhash)));

  int current = 0;

  while (zend_hash_get_current_data(Z_ARRVAL(zhash), (void**)value)
         == SUCCESS) {
    current++;
    printf("Currently on entry %d\n", current);
    if (zend_hash_move_forward(Z_ARRVAL(zhash)) == SUCCESS)
      printf("Done moving hash forward. Result was successful\n");
    else
      printf("Done moving hash forward. Result was a failure\n");
  }


Does the problem persist if you replace the hashtable functions with
their _ex counterparts: zend_hash_internal_pointer_reset_ex(),
zend_hash_get_current_data_ex() and zend_hash_move_forward_ex()? These are
always recommended (I believe) because the internal HashPosition value
associated with a hashtable is also used by the user script.
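
The reasoning can be illustrated with a toy container in C. This is not the
Zend API: toy_table and the toy_* functions are hypothetical names that only
model the difference between a cursor stored inside the table and a position
owned by the caller.

```c
#include <stddef.h>

/* Toy container with a built-in cursor, loosely modelled on a hash table
 * that remembers its own "current" position. */
typedef struct {
    const int *items;
    size_t     count;
    size_t     cursor;   /* shared by every caller that uses it */
} toy_table;

/* Variant 1: iterate with the shared cursor stored in the table. */
static void toy_reset(toy_table *t)       { t->cursor = 0; }
static int  toy_valid(const toy_table *t) { return t->cursor < t->count; }
static void toy_next(toy_table *t)        { t->cursor++; }

/* Variant 2: iterate with a caller-owned position, in the spirit of the
 * _ex functions. The table itself is never modified. */
static void toy_reset_ex(const toy_table *t, size_t *pos)
{
    (void)t;
    *pos = 0;
}

static int toy_valid_ex(const toy_table *t, const size_t *pos)
{
    return *pos < t->count;
}

static void toy_next_ex(const toy_table *t, size_t *pos)
{
    (void)t;
    (*pos)++;
}

/* Counts the elements using a private position; t->cursor is untouched. */
static size_t toy_count_ex(const toy_table *t)
{
    size_t pos, n = 0;
    for (toy_reset_ex(t, &pos); toy_valid_ex(t, &pos); toy_next_ex(t, &pos))
        n++;
    return n;
}
```

If toy_count_ex() had used toy_reset() and toy_next() instead, it would have
clobbered whatever position another piece of code was relying on, which is
the kind of interference a caller-owned position avoids.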

Regards,
Moriyoshi




Re: [PHP-DEV] request for comments on threadsafe / multi-thread enabled Embed2 SAPI

2009-03-26 Thread Moriyoshi Koizumi
Isn't it better to avoid any behaviour-changing #define's in a header
file? I mean the following series of lines in php_embed2.h:

/* we control char* lifetime of smart_str as we allow it to cross
request boundaries */
#define SMART_STR_USE_REALLOC 1
/* we use bigger numbers than default as script output will most
likely be bigger than anticipated for smart_str usage */
#define SMART_STR_PREALLOC 2048
#define SMART_STR_START_SIZE 2000
#include "ext/standard/php_smart_str.h"

TSRM needs to be initialized just once on startup. It's not something
that has to be initialized per thread.

my 2c,
Moriyoshi

On Fri, Mar 27, 2009 at 4:35 AM, Bas van Beek b...@tobin.nl wrote:
 Hi Guys,

 The Embed2 SAPI for embedded applications needing multi-thread enabled PHP
 scripting support can be inspected from the following repository:

        http://svn.tobin.nl/public/php/embed2/trunk/

 The original Embedded SAPI does not allow concurrent script runs and as such
 is less suited for multi-threaded applications . Also single threaded
 applications
 can benefit from the Embed2 SAPI as for each PHP script run, it is not
 necessary to initialize and shutdown the entire PHP environment.

 Examples for embedding PHP using the Embed2 SAPI can be found in the
 examples directory.

 The C example uses pthreads to showcase a typical posix implementation.
 The C++ example uses the boost::thread library to showcase a cross platform
 implementation. Build instructions / make files are also added for Linux, OS
 X and Windows

 If good enough I would like to donate the Embed2 SAPI to the PHP community
 using the standard PHP License and request a CVS account for adding it to
 the source tree.

 The code has been tested to run on PHP-5.2.9 and PHP-5.3.0RC1 and was tested
 using Mac OS X, Debian Linux and Windows XP.

 kind regards,

 Bas van Beek







[PHP-DEV] max_execution_time and async signal handling in apache2handler

2009-03-15 Thread Moriyoshi Koizumi
Hi,

I got a bug report on the Japanese PHP users' list stating that free()
aborts inside the timer signal handler because of reentrancy: when
max_execution_time takes effect, the signal can arrive while the process
is already inside that same libc function. The reporter also says he uses
apache2handler, which, unlike the apache1 handler, provides neither the
block_interruptions nor the unblock_interruptions SAPI handler.

While I doubt this happens very often, since PHP has its own memory
pool and the libc allocators are called far less often than emalloc()
and efree(), it should be addressed somehow. A proposed fix is attached,
but note that it significantly hurts overall performance because of the
extra system calls.
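
The general technique, stripped of the SAPI details, can be sketched in
plain POSIX C. This is a single-threaded illustration only (a threaded build
would use pthread_sigmask()); block_timer_signals, restore_signals and demo
are hypothetical names, not functions from the attached patch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>

/* Block the timer signals around a non-reentrant region and remember the
 * previous mask so that it can be restored afterwards. */
static int block_timer_signals(sigset_t *saved)
{
    sigset_t ss;
    sigemptyset(&ss);
    sigaddset(&ss, SIGALRM);
    sigaddset(&ss, SIGPROF);
    return sigprocmask(SIG_BLOCK, &ss, saved);
}

/* Restore the mask saved by block_timer_signals(). */
static int restore_signals(const sigset_t *saved)
{
    return sigprocmask(SIG_SETMASK, saved, NULL);
}

static volatile sig_atomic_t alarm_seen = 0;

static void on_alarm(int signo)
{
    (void)signo;
    alarm_seen = 1;
}

/* Demonstration: a SIGALRM raised while blocked stays pending and is only
 * delivered once the mask is restored. Returns 1 if that is what happened. */
static int demo(void)
{
    sigset_t saved;
    int seen_while_blocked;

    signal(SIGALRM, on_alarm);
    block_timer_signals(&saved);
    raise(SIGALRM);                    /* stays pending while blocked */
    seen_while_blocked = alarm_seen;
    restore_signals(&saved);           /* pending signal is delivered here */
    return seen_while_blocked == 0 && alarm_seen == 1;
}
```

Each protected region costs two system calls, which is where the
performance concern mentioned above comes from.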

Regards,
Moriyoshi
Index: sapi/apache2handler/php_apache.h
===================================================================
RCS file: /repository/php-src/sapi/apache2handler/php_apache.h,v
retrieving revision 1.8.2.1.2.4
diff -u -r1.8.2.1.2.4 php_apache.h
--- sapi/apache2handler/php_apache.h	31 Dec 2008 11:17:48 -0000	1.8.2.1.2.4
+++ sapi/apache2handler/php_apache.h	15 Mar 2009 21:13:26 -0000
@@ -46,6 +46,8 @@
 	int request_processed;
 	/* final content type */
 	char *content_type;
+	volatile apr_uint32_t block_sigs_in_cs;
+	sigset_t old_sig_set;
 } php_struct;
 
 void *merge_php_config(apr_pool_t *p, void *base_conf, void *new_conf);
Index: sapi/apache2handler/sapi_apache2.c
===================================================================
RCS file: /repository/php-src/sapi/apache2handler/sapi_apache2.c,v
retrieving revision 1.57.2.10.2.19
diff -u -r1.57.2.10.2.19 sapi_apache2.c
--- sapi/apache2handler/sapi_apache2.c	31 Dec 2008 11:17:48 -0000	1.57.2.10.2.19
+++ sapi/apache2handler/sapi_apache2.c	15 Mar 2009 21:13:27 -0000
@@ -50,6 +50,7 @@
 #include "util_script.h"
 #include "http_core.h"
 #include "ap_mpm.h"
+#include "apr_atomic.h"
 
 #include "php_apache.h"
 
@@ -316,6 +317,42 @@
 	return SUCCESS;
 }
 
+static void php_apache2_block_signals()
+{
+#ifndef PHP_WIN32
+	sigset_t ss;
+	php_struct *ctx = SG(server_context);
+
+	if (!ctx || apr_atomic_cas32(&ctx->block_sigs_in_cs, 1, 0)) {
+		return;
+	}
+
+	sigemptyset(&ss);
+	sigaddset(&ss, SIGPROF);
+	sigaddset(&ss, SIGALRM);
+#if defined(ZTS) && defined(PTHREADS)
+	pthread_sigmask(SIG_BLOCK, &ss, &ctx->old_sig_set);
+#else
+	sigprocmask(SIG_BLOCK, &ss, &ctx->old_sig_set);
+#endif
+#endif
+}
+
+static void php_apache2_unblock_signals()
+{
+#ifndef PHP_WIN32
+	php_struct *ctx = SG(server_context);
+	if (!ctx || !apr_atomic_cas32(&ctx->block_sigs_in_cs, 0, 1)) {
+		return;
+	}
+#if defined(ZTS) && defined(PTHREADS)
+	pthread_sigmask(SIG_SETMASK, &ctx->old_sig_set, NULL);
+#else
+	sigprocmask(SIG_SETMASK, &ctx->old_sig_set, NULL);
+#endif
+#endif
+}
+
 static sapi_module_struct apache2_sapi_module = {
 	"apache2handler",
 	"Apache 2.0 Handler",
@@ -344,7 +381,10 @@
 	php_apache_sapi_log_message,			/* Log message */
 	php_apache_sapi_get_request_time,		/* Request Time */
 
-	STANDARD_SAPI_MODULE_PROPERTIES
+	NULL,   /* php.ini path override */
+
+	php_apache2_block_signals,
+	php_apache2_unblock_signals
 };
 
 static apr_status_t php_apache_server_shutdown(void *tmp)
@@ -523,6 +563,8 @@
 		 */
 		apr_pool_cleanup_register(r->pool, (void *)&SG(server_context), php_server_context_cleanup, apr_pool_cleanup_null);
 		ctx->r = r;
+		ctx->block_sigs_in_cs = 0;
+		sigemptyset(&ctx->old_sig_set);
 		ctx = NULL; /* May look weird to null it here, but it is to catch the right case in the first_try later on */
 	} else {
 		parent_req = ctx->r;

Re: [PHP-DEV] max_execution_time and async signal handling in apache2handler

2009-03-15 Thread Moriyoshi Koizumi
Hi,

I noticed it, but I took it for a completely different idea, something
like a performance improvement that doesn't cover this issue. It turns
out that it is exactly what I should have looked at.

Thanks for the pointer.

Moriyoshi

On Mon, Mar 16, 2009 at 8:52 AM, shire sh...@tekrat.com wrote:

 Hi Moriyoshi,

 Moriyoshi Koizumi wrote:

 Hi,

 I got a bug report on the Japanese PHP user's list that states free()
 aborts within the timer signal handler due to reentrance to the
 function when max_execution_time takes effect and the signal occurs
 within the same libc function. The reporter also states he uses
 apache2handler, which doesn't provide block_interruptions nor
 unblock_interruptions SAPI handlers in contrast to the apache1
 handler.

 Whilst I doubt this happens quite frequently because PHP has its own
 memory pool and it's far more rare that the libc's allocators get
 called than the emalloc() and efree(), this should be addressed
 somehow. A proposed fix is attached but note this greatly compromises
 overall performance due to excessive system calls.
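
For reference, the masking approach described above can be sketched in plain C. This is only an illustration under assumptions: the helper names are hypothetical and are not the SAPI block_interruptions/unblock_interruptions handlers themselves, and a threaded (ZTS) build would call pthread_sigmask() instead of sigprocmask().

```c
/* Request POSIX/XSI declarations when compiling in strict ISO C mode. */
#define _XOPEN_SOURCE 700

#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: blocks the timer signals used for execution
 * time limits and stores the previous mask in *old_mask. */
static int block_timer_signals(sigset_t *old_mask)
{
    sigset_t timer_signals;

    sigemptyset(&timer_signals);
    sigaddset(&timer_signals, SIGALRM);
    sigaddset(&timer_signals, SIGPROF);
    return sigprocmask(SIG_BLOCK, &timer_signals, old_mask);
}

/* Hypothetical helper: restores the mask saved by block_timer_signals(). */
static int restore_signal_mask(const sigset_t *old_mask)
{
    return sigprocmask(SIG_SETMASK, old_mask, NULL);
}
```

Each block/restore pair costs two system calls, which is the overhead mentioned above.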

 Isn't this  already covered in a proposal here:
 http://wiki.php.net/rfc/zendsignals?  I believe the last hang-up keeping
 this from getting committed is in the ability to handle this properly and
 efficiently under ZTS builds, I would really like to see this get applied in
 a future PHP release though.

 -shire





Re: [PHP-DEV] Re: [PHP-CVS] cvs: php-src(PHP_5_2) / NEWS configure.in /main php_version.h

2009-02-26 Thread Moriyoshi Koizumi
So, on what grounds do you guys consider this change valid?

Moriyoshi

On Thu, Feb 26, 2009 at 11:36 PM, Antony Dovgal t...@daylessday.org wrote:
 On 26.02.2009 17:19, Ilia Alshanetsky wrote:
 Let's reach a conclusion by end of day (EST time) so release can either be 
 made or
 delayed.

 +0

 Just go ahead and release it.

 --
 Wbr,
 Antony Dovgal




Re: [PHP-DEV] Re: [PHP-CVS] cvs: php-src(PHP_5_2) / NEWS configure.in /main php_version.h

2009-02-26 Thread Moriyoshi Koizumi
Robin Burchell wrote:
 On Thu, Feb 26, 2009 at 5:21 PM, Moriyoshi Koizumi m...@mozo.jp wrote:
 So, in what point do you guys think of this change as valid?

 Moriyoshi
 
 Is there any known examples of code broken by this, or is it a more
 academic than practical problem?
 
 snip

That's indeed a practical problem.

1. array_unique() was never meant to handle values other than
strings. That's how bug #10658 is handled.

http://bugs.php.net/10658

See also:
http://cvs.php.net/viewvc.cgi/phpdoc/en/reference/array/functions/array-unique.xml?revision=1.16&view=markup

2. the results are inconsistent between SORT_STRING and SORT_REGULAR
when the items are a mixture of different types.

<?php
$objs = array(
    "0x10",
    16,
    true,
    "true",
);

var_dump(array_unique($objs, SORT_REGULAR));
var_dump(array_unique($objs, SORT_STRING));

$objs = array(
    "0x10",
    true,
    16,
    "true",
);

var_dump(array_unique($objs, SORT_REGULAR));
var_dump(array_unique($objs, SORT_STRING));
?>

I could hardly imagine what would show up. Can you?

array(1) {
  [0]=>
  string(4) "0x10"
}
array(4) {
  [0]=>
  string(4) "0x10"
  [1]=>
  int(16)
  [2]=>
  bool(true)
  [3]=>
  string(4) "true"
}
array(2) {
  [0]=>
  string(4) "0x10"
  [3]=>
  string(4) "true"
}
array(4) {
  [0]=>
  string(4) "0x10"
  [1]=>
  bool(true)
  [2]=>
  int(16)
  [3]=>
  string(4) "true"
}


3. the result can be unreasonable even with SORT_REGULAR

As the equality of the object is only determined by member-wise
comparison, there must be cases where the behavior is not acceptable:

<?php
class Foo {
    public $a;
    function __construct($a) {
        $this->a = $a;
    }
};


$objs = array(
    new Foo(1), new Foo("2e0"), new Foo(2), new Foo(3)
);

var_dump(array_unique($objs, SORT_REGULAR));
?>

This yields:

array(3) {
  [0]=>
  object(Foo)#1 (1) {
    ["a"]=>
    int(1)
  }
  [1]=>
  object(Foo)#2 (1) {
    ["a"]=>
    string(3) "2e0"
  }
  [3]=>
  object(Foo)#4 (1) {
    ["a"]=>
    int(3)
  }
}

while the second item is semantically not expected to be equal to the third.


Moriyoshi




Re: [PHP-DEV] Re: [PHP-CVS] cvs: php-src(PHP_5_2) / NEWS configure.in /main php_version.h

2009-02-26 Thread Moriyoshi Koizumi
On Fri, Feb 27, 2009 at 3:58 AM, Moriyoshi Koizumi m...@mozo.jp wrote:
 1. array_unique() has never been supposed to handle values other than
 strings. That's how bug #10658 is handled.

That's not quite what I meant. I should have said "not
supposed to handle values other than scalars."

Moriyoshi




Re: [PHP-DEV] Re: [PHP-CVS] cvs: php-src(PHP_5_2) / NEWS configure.in /main php_version.h

2009-02-26 Thread Moriyoshi Koizumi
I'm just wondering why you favor the new behavior. Are there
still any positive reasons to incorporate the change? Could anyone
provide an example where the new behavior works better than the
original?

By the way, I noticed that SORT_REGULAR behaves differently from
SORT_STRING even if the items are all strings.

<?php
$objs = array(
    "0x1",
    "1",
);
var_dump(array_unique($objs, SORT_REGULAR));
var_dump(array_unique($objs, SORT_STRING));
?>

Furthermore, code that depends on __toString() won't work
properly anymore either. It is quite possible that a user uses the
function for semantic comparison.

<?php
class Foo {
    public $a;

    function __construct($a) {
        $this->a = $a;
    }

    function __toString() {
        return "1";
    }
}

$objs = array(new Foo(1), new Foo(2), new Foo(3));
var_dump(array_unique($objs, SORT_REGULAR));
var_dump(array_unique($objs, SORT_STRING));
?>

I think "BC break" is the appropriate term to describe these
situations.

Moriyoshi

On Fri, Feb 27, 2009 at 8:30 AM, Ilia Alshanetsky i...@prohost.org wrote:
 Moriyoshi,

 First of all, thank you for taking the time to provide examples regarding the
 issues you are demonstrating. I've looked at a number of different
 applications and have not found a functionality breakage due to this change.
 While your example does show a change, I am not convinced (sorry) that it is
 an adverse one, type-different comparison is always tricky and not entirely
 reliable and I think demonstrates more of a corner-case situation.

 I think our best option at this time is to release 5.2.9 as quickly as
 possible as it introduces a number of crucial fixes and if your comments are
 validated via user feedback we can adjust the values with 5.2.10 that can be
 repackaged fairly rapidly. IMHO the current functionality is desired and is
 acceptable.

 Ilia Alshanetsky







[PHP-DEV] Re: [PHP-CVS] cvs: php-src /ext/standard array.c

2009-02-25 Thread Moriyoshi Koizumi
Last call: any more objections?

Moriyoshi

On Thu, Feb 19, 2009 at 11:52 AM, Moriyoshi Koizumi m...@mozo.jp wrote:
 On Thu, Feb 19, 2009 at 4:51 AM, Andrei Zmievski and...@gravitonic.com 
 wrote:
 Moriyoshi Koizumi wrote:

 As I said earlier, the function is never supposed to be used with
 objects. Therefore, we cannot declare it to be broken, and any change
 to the behavior anyway leads to a huge BC break. I got a report that
 claims the reporter's real-world application behaves strangely with
 the latest release candidate.

 Should we have array_unique_for_non_strings() then or something?

 That'd be stupid. What is reasonable is to make it default to
 SORT_STRING... Or is there any strong reason to make the default
 SORT_REGULAR now that you can specify SORT_REGULAR to array_unique()
 through the second argument?


 That said, I'm not really against making SORT_REGULAR default for
 later versions than 5.2.x as long as *appropriate notices* are
 provided, while I strongly disagree for 5.2.x.

 What sort of notices do you propose? At runtime or in the docs?

 You didn't even mention in the documentation that the behavior would
 change starting with 5.2.9; instead you simply added the description
 for the second optional argument that defaults to SORT_REGULAR, as if
 it was the default long before. That's absolutely the thing we should
 not do.

 Either way, if we were to make such a change, I think we should at least
 make the second argument mandatory.

 Moriyoshi


 -Andrei






Re: [PHP-DEV] FD_SETSIZE limitation

2009-02-19 Thread Moriyoshi Koizumi
Robin Burchell wrote:
 On Thu, Feb 19, 2009 at 10:24 PM, Andrei Zmievski and...@gravitonic.com 
 wrote:
 Can someone explain why ext/sockets and also stream socket functions care
 about FD_SETSIZE?
 
 They care, because they use the select(2) syscall, which cares about 
 FD_SETSIZE.
 

select(2) itself can handle more file descriptors than FD_SETSIZE, on Linux at least.
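
For context, the limit comes from the fixed-size fd_set type rather than from the system call: an fd_set only has room for FD_SETSIZE descriptors, and POSIX leaves FD_SET() undefined for descriptors outside that range. A minimal sketch of the range check this implies, with hypothetical helper names that are not taken from ext/sockets:

```c
/* Request POSIX declarations when compiling in strict ISO C mode. */
#define _XOPEN_SOURCE 700

#include <stdbool.h>
#include <sys/select.h>

/* Hypothetical helper: true when fd can be stored in a standard fd_set. */
static bool fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Hypothetical helper: adds fd to set only when it is in range,
 * and reports whether it was added. */
static bool checked_fd_set(int fd, fd_set *set)
{
    if (!fd_fits_in_fd_set(fd)) {
        return false;
    }
    FD_SET(fd, set);
    return true;
}
```

Code that has to watch descriptors beyond FD_SETSIZE would typically switch to poll(2), which takes an array of descriptors instead of a bitmap.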

Moriyoshi





Re: [PHP-DEV] Re: Bug #46701

2009-02-18 Thread Moriyoshi Koizumi
To summarize the problems:

1. casting a float value that is unrepresentable in the target type is
undefined according to the C spec.
2. any constant values that are unrepresentable in the standard
integer type are automagically represented as double values in PHP.
i.e. 0xc000 will result in a double value in the 32bit
architecture.
3. PHP's cast behavior used to rely on the C cast, which was addressed
in #42868 and fixed for 5.3.x and later.
4. PHP's associative array allows strings and integers for its keys.
5. when a double value is fed to an array as the key, it is converted
to integer using the standard C cast.
6. bug #46701 addresses the problem that the result of 5. varies by
the architecture.
7. the patch for bug #46701 uses DVAL_TO_LVAL to get the consistent
cast behavior.
8. therefore, it's natural to consider that the fix for bug #46701
depends on the behavior introduced by the patch for bug #42868.
9. the patch for bug #42868 may affect existing applications that rely
on the old behavior.
 (http://marc.info/?l=php-internals&m=120799720922202&w=2)
10. a change like 9. is not tolerable in a minor release.
11. if the patch for bug #42868 is not going to be merged, then the
patch for bug #46701 should be reverted from the 5.2 branch.
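
To make points 1 and 5 concrete, here is a minimal sketch in C of a conversion that checks the range before casting, so the result no longer depends on what the platform does with an out-of-range cast. The function name is hypothetical and clamping is just one possible policy; this is not the actual DVAL_TO_LVAL macro.

```c
#include <limits.h>

/* Hypothetical helper: converts d to long without ever performing
 * the out-of-range cast that the C standard leaves undefined. */
static long clamp_double_to_long(double d)
{
    /* NaN is the only value that compares unequal to itself. */
    if (d != d) {
        return 0;
    }
    /* (double)LONG_MAX may round up to 2^63 where long is 64 bits wide,
     * so >= is required to keep the cast below in range. */
    if (d >= (double)LONG_MAX) {
        return LONG_MAX;
    }
    if (d <= (double)LONG_MIN) {
        return LONG_MIN;
    }
    return (long)d; /* in range: truncates toward zero */
}
```

With a 32-bit long, 1e100 maps to 2147483647 under this policy, which happens to agree with the 5.3 result for $a[1e100] reported in this thread. Whether the real fix clamps in general is not something this sketch claims.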

Moriyoshi

On Wed, Feb 18, 2009 at 8:28 PM, zoe zoe.slatt...@googlemail.com wrote:
 Moriyoshi Koizumi wrote:
 I guess the patch relies on the 5.3's DVAL_TO_LVAL behavior that was
 changed by the fix for bug #42868, right?
 If so, this patch shouldn't be MFH'ed as the #42868 patch was not
 merged although I didn't remember any discussion on this.

 See also: http://marc.info/?l=php-internals&m=120799720922202&w=2



 Hey all

 I'm sorry - I should have replied to this before since I was responsible for
 raising #42868.  I didn't do a good job at explaining what the issue was in
 that bug, mainly because I didn't know what it was when I started. The
 central problem is that PHP's behaviour on casting double to int defaults to
 whatever the underlying C environment does. On Windows and Linux (all of the
 versions that I looked at) this turns out to be a simple truncation of the
 last 32 bits. Unfortunately the C behaviour is 'undefined' (Kernighan and
 Ritchie, page 197, A6.3). The issue that I found in #42868 was that on the
 Mac the casting behaviour is completely different so many of the PHP tests
 failed. I believe that PHP should behave in a platform independent way -
 that is what the fix to #42868 achieves. It is also fair to say that any
 applications that depend on the overflow behaviour in PHP 5.2 cannot be
 guaranteed to run on any platform.

 Zoe

 Regards,
 Moriyoshi






 Ilia Alshanetsky














[PHP-DEV] Re: [PHP-CVS] cvs: php-src /ext/standard array.c

2009-02-18 Thread Moriyoshi Koizumi
On Thu, Feb 19, 2009 at 4:51 AM, Andrei Zmievski and...@gravitonic.com wrote:
 Moriyoshi Koizumi wrote:

 As I said earlier, the function is never supposed to be used with
 objects. Therefore, we cannot declare it to be broken, and any change
 to the behavior anyway leads to a huge BC break. I got a report that
 claims the reporter's real-world application behaves strangely with
 the latest release candidate.

 Should we have array_unique_for_non_strings() then or something?

That'd be stupid. What is reasonable is to make it default to
SORT_STRING... Or is there any strong reason to make the default
SORT_REGULAR now that you can specify SORT_REGULAR to array_unique()
through the second argument?


 That said, I'm not really against making SORT_REGULAR default for
 later versions than 5.2.x as long as *appropriate notices* are
 provided, while I strongly disagree for 5.2.x.

 What sort of notices do you propose? At runtime or in the docs?

You didn't even mention in the documentation that the behavior would
change starting with 5.2.9; instead you simply added the description
for the second optional argument that defaults to SORT_REGULAR, as if
it was the default long before. That's absolutely the thing we should
not do.

Either way, if we were to make such a change, I think we should at least
make the second argument mandatory.

Moriyoshi


 -Andrei





Re: [PHP-DEV] Re: [PHP-CVS] cvs: php-src /ext/standard array.c

2009-02-18 Thread Moriyoshi Koizumi
On Thu, Feb 19, 2009 at 3:14 PM, Ian Eure i...@digg.com wrote:
 On Feb 18, 2009, at 6:52 PM, Moriyoshi Koizumi wrote:

 On Thu, Feb 19, 2009 at 4:51 AM, Andrei Zmievski and...@gravitonic.com
 wrote:

 Moriyoshi Koizumi wrote:

 As I said earlier, the function is never supposed to be used with
 objects. Therefore, we cannot declare it to be broken, and any change
 to the behavior anyway leads to a huge BC break. I got a report that
 claims the reporter's real-world application behaves strangely with
 the latest release candidate.

 Should we have array_unique_for_non_strings() then or something?

 That'd be stupid. What is reasonable is to make it default to
 SORT_STRING... Or is there any strong reason to make the default
 SORT_REGULAR now that you can specify SORT_REGULAR to array_unique()
 through the second argument?

 Because it's bad to require an argument to make a function do the right
 thing; it should do the right thing by default.

As I said, whether it is right or not is pretty much a context-dependent matter.

 That said, I'm not really against making SORT_REGULAR default for
 later versions than 5.2.x as long as *appropriate notices* are
 provided, while I strongly disagree for 5.2.x.

 What sort of notices do you propose? At runtime or in the docs?

 You didn't even mention in the documentation that the behavior would
 change starting with 5.2.9; instead you simply added the description
 for the second optional argument that defaults to SORT_REGULAR, as if
 it was the default long before. That's absolutely the thing we should
 not do.

 Why would it be necessary? SORT_REGULAR is a superset of SORT_STRING, yes?
 If the function should only be used with strings (prior to this patch), as
 you claim, that's not a BC break. Unless you're arguing that we should
 preserve BC for code calling functions in unsupported ways.

If SORT_REGULAR were a superset of SORT_STRING, there would have been
no problem in the first place. See the original bug report for what
gets broken.

 Either way, if we were to make such a change, I think we should at least
 make the second argument mandatory.

 Wouldn't this be an even bigger BC break than fixing the existing (broken)
 behavior?

That was a little sarcasm. The point is that it would be better to make
*incompatible* applications stop working altogether than to let them
silently malfunction, although, as you pointed out, such a decision is
only slightly less absurd than changing the default.

Anyway that's basically not a fix IMO.

Moriyoshi


  - Ian




[PHP-DEV] Re: [PHP-CVS] cvs: php-src /ext/standard array.c

2009-02-17 Thread Moriyoshi Koizumi
In addition, we should look at similar comparison-based array
functions such as array_intersect, array_diff and so on; otherwise
it's going to be a mess.

Moriyoshi

On Wed, Feb 18, 2009 at 11:43 AM, Moriyoshi Koizumi m...@mozo.jp wrote:
 On Wed, Feb 18, 2009 at 3:11 AM, Andrei Zmievski and...@gravitonic.com 
 wrote:

 SORT_STRING can only reliably deal with strings - its behavior on non-string
 type is basically broken. Unless we agree that PHP is Tcl (strings are the
 only type), then SORT_REGULAR makes much more sense to me, and evidently
 others.

 If you really have a huge problem with BC, perhaps we could leave the
 default behavior as SORT_STRING for 5.2.x, but it definitely needs to be
 SORT_REGULAR for 5.3/6.

 As I said earlier, the function is never supposed to be used with
 objects. Therefore, we cannot declare it to be broken, and any change
 to the behavior anyway leads to a huge BC break. I got a report that
 claims the reporter's real-world application behaves strangely with
 the latest release candidate.

 That said, I'm not really against making SORT_REGULAR default for
 later versions than 5.2.x as long as *appropriate notices* are
 provided, while I strongly disagree for 5.2.x.

 Moriyoshi


 -Andrei






[PHP-DEV] Re: [PHP-CVS] cvs: php-src /ext/standard array.c

2009-02-15 Thread Moriyoshi Koizumi
Ilia Alshanetsky wrote:
 I've discussed this issue with Andrei at least a month ago (if not
 longer) when the patch was originally added, and I believe that the
 introduced behavior is the correct one.

IMO whether it is correct depends on the context in which the function is used.

At least, as array_unique() was not capable of dealing with objects
before Andrei's patch, all existing code should be using it with
strings, not objects.

If SORT_REGULAR could handle objects as well as strings in the same
manner as SORT_STRING, I wouldn't see any problem, but it cannot.

Moriyoshi


 
 On 14-Feb-09, at 9:12 PM, Moriyoshi Koizumi wrote:
 
 So, what are RM's thoughts on this? My points are:

 1. Making SORT_REGULAR default *actually* broke existing code.
 2. Adding the second argument addressed the problem enough that the
 elements are treated indifferently when used with objects.

 Regards,
 Moriyoshi

 Moriyoshi Koizumi wrote:
 Whatever reasoning, I don't think it's a good idea to revert someone
 else's patch before discussing anything.

 Aside from this, I agree with you the old behavior is that stupid, but
 BC should always be honored.

 Moriyoshi

 Andrei Zmievski wrote:
 Don't do this please. Why did you feel the need to go back and
 change my
 patch including the NEWS entry? I knew what I was doing when I set the
 default behavior to SORT_REGULAR and this was discussed with both 5.3
 and 5.2 RMs. With your change it's back to the stupid old behavior of:

 $array = array(new stdClass(), new stdClass(), new Foo());
 $array = array_unique($array);

 And now $array has only 1 element. I really hate having to tell PHP not to
 be stupid, rather than having it default to being smart.

 I'm going to revert this.

 -Andrei

 Moriyoshi Koizumi wrote:
 moriyoshi		Thu Feb 12 18:29:15 2009 UTC

   Modified files:
     /php-src/ext/standard	array.c
   Log:
   * Fix bug #47370 (BC breakage of array_unique())

 http://cvs.php.net/viewvc.cgi/php-src/ext/standard/array.c?r1=1.471&r2=1.472&diff_format=u

 Index: php-src/ext/standard/array.c
 diff -u php-src/ext/standard/array.c:1.471 php-src/ext/standard/array.c:1.472
 --- php-src/ext/standard/array.c:1.471	Mon Feb  9 10:47:19 2009
 +++ php-src/ext/standard/array.c	Thu Feb 12 18:29:15 2009
 @@ -21,7 +21,7 @@
 +--+
 */

 -/* $Id: array.c,v 1.471 2009/02/09 10:47:19 dmitry Exp $ */
 +/* $Id: array.c,v 1.472 2009/02/12 18:29:15 moriyoshi Exp $ */

 #include "php.h"
 #include "php_ini.h"
 @@ -2924,7 +2924,7 @@
 };
 struct bucketindex *arTmp, *cmpdata, *lastkept;
 unsigned int i;
 -long sort_type = PHP_SORT_REGULAR;
 +long sort_type = PHP_SORT_STRING;

 if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "a|l", &array, &sort_type) == FAILURE) {
 return;






 
 Ilia Alshanetsky
 
 
 
 





Re: [PHP-DEV] Re: Bug #46701

2009-02-14 Thread Moriyoshi Koizumi
Please don't even think of backporting. This will definitely break a
lot of things, and this kind of thing must not be done in a minor
release.

Moriyoshi

On Fri, Feb 13, 2009 at 10:57 PM, Ilia Alshanetsky i...@prohost.org wrote:
 Dmitry,

 Does it make sense to backport 42868 fix to address this issue?


 On 12-Feb-09, at 3:56 PM, Moriyoshi Koizumi wrote:

 See the results of the following on 5.2.6, 5.2.9rc2 and 5.3:

 php -r '$a[1e100] = 1; var_dump($a);'

 5.2.6:
 array(1) {
   [-2147483648]=>
   int(1)
 }

 5.2.9rc2:
 array(1) {
   [-1]=>
   int(1)
 }

 5.3:
 array(1) {
   [2147483647]=>
   int(1)
 }

 I doubt the result of 5.2.9rc2 is quite what we expect, and this
 problem should be addressed in 5.3 with the 5.2's behavior unchanged.

 Moriyoshi

 On Fri, Feb 13, 2009 at 1:48 AM, Moriyoshi Koizumi m...@mozo.jp wrote:

 Hey,

 I guess the patch relies on the 5.3's DVAL_TO_LVAL behavior that was
 changed by the fix for bug #42868, right?
 If so, this patch shouldn't be MFH'ed as the #42868 patch was not
 merged although I didn't remember any discussion on this.

 See also: http://marc.info/?l=php-internals&m=120799720922202&w=2

 Regards,
 Moriyoshi




 Ilia Alshanetsky









[PHP-DEV] Bug #46701

2009-02-12 Thread Moriyoshi Koizumi
Hey,

I guess the patch relies on the 5.3's DVAL_TO_LVAL behavior that was
changed by the fix for bug #42868, right?
If so, this patch shouldn't be MFH'ed as the #42868 patch was not
merged although I didn't remember any discussion on this.

See also: http://marc.info/?l=php-internals&m=120799720922202&w=2

Regards,
Moriyoshi




[PHP-DEV] Re: Bug #46701

2009-02-12 Thread Moriyoshi Koizumi
See the results of the following on 5.2.6, 5.2.9rc2 and 5.3:

php -r '$a[1e100] = 1; var_dump($a);'

5.2.6:
array(1) {
  [-2147483648]=>
  int(1)
}

5.2.9rc2:
array(1) {
  [-1]=>
  int(1)
}

5.3:
array(1) {
  [2147483647]=>
  int(1)
}

I doubt the result of 5.2.9rc2 is quite what we expect, and this
problem should be addressed in 5.3 with the 5.2's behavior unchanged.

Moriyoshi

On Fri, Feb 13, 2009 at 1:48 AM, Moriyoshi Koizumi m...@mozo.jp wrote:
 Hey,

 I guess the patch relies on the 5.3's DVAL_TO_LVAL behavior that was
 changed by the fix for bug #42868, right?
 If so, this patch shouldn't be MFH'ed as the #42868 patch was not
 merged although I didn't remember any discussion on this.

 See also: http://marc.info/?l=php-internals&m=120799720922202&w=2

 Regards,
 Moriyoshi





Re: [PHP-DEV] [PATCH] Bug #46806 - mb_strimwidth

2009-01-04 Thread Moriyoshi Koizumi

Hi,

This behavior seems strange, but the rationale of this function is that
the (East Asian) width of the resulting string, trim marker included,
must not exceed the specified value, so that it fits in a fixed-size box
when rendered in a browser, assuming a monospace font. So it is the
documentation that is wrong.
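
To illustrate the rule, here is a simplified, ASCII-only sketch in C. It assumes every byte is one column wide, which is only for brevity (the real function works with East Asian widths), and trim_to_width is a hypothetical name, not part of mbstring.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical ASCII-only illustration: writes at most `width` columns
 * into out, counting the trim marker as part of that budget. */
static void trim_to_width(const char *s, size_t width, const char *marker,
                          char *out, size_t out_size)
{
    size_t len = strlen(s);
    size_t mklen = strlen(marker);

    if (len <= width) {
        /* The string already fits, so no marker is needed. */
        snprintf(out, out_size, "%s", s);
        return;
    }

    /* Reserve room for the marker first, then fill the rest with text. */
    size_t mk = mklen < width ? mklen : width;
    size_t keep = width - mk;
    snprintf(out, out_size, "%.*s%.*s", (int)keep, s, (int)mk, marker);
}
```

Under these assumptions, trimming 'helloworld' to width 5 with '...' as the marker keeps two characters and yields he..., which is the result described in the report.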


Regards,
Moriyoshi

(2008/12/31 1:24), Henrique M. Decaria wrote:

Hello guys,

Looking in the subject mentioned bug report, it seems that the attached
patch might do the trick and make the mb_strimwidth() function work as
explained in the php manual (http://br2.php.net/mb_strimwidth).
According to it, the function should trim up to the defined width and append
the trimmarker, that being said, the following code should return hello...
not he...

echo mb_strimwidth('helloworld', 0, 5, '...', 'UTF-8');

The change in the file is:

-pc.width = width - mkwidth;
+pc.width = width;

I am not sure why there is this - mkwidth there, but it seems to be used
to remove the last mkwidth characters from the trimmed string so it would
be replaced with the trimmarker.

It could be that the test file for this function is incorrect (i can't
guarantee as I am having issues with Japanese characters here). So please,
if you could have a look and see if it works as expected with this patch it
would be great.

Thanks,

Henrique





--
-- moriyoshi




Re: [PHP-DEV] Need some help about using Boost.PHP to make extensions

2008-11-19 Thread Moriyoshi Koizumi
Please redirect your question to me. Boost.PHP is not an official part
of the PHP project.

Moriyoshi

On Wed, Nov 19, 2008 at 2:13 AM, Chris Jiang [EMAIL PROTECTED] wrote:
 Hi all, it's been a while since my last thread. I've been playing around
 with Boost.PHP these days, and find it pretty much satisfactory. Now, I need
 some help on how Boost.PHP treat with PHP arrays. Not complaining, but there
 isn't much document I can learn from the official website.

 And if this thread is NOT totally about PHP, I'm sorry about this. This
 mailing list is the only place possibly I can find answers about it. I can't
 find a suitable C/C++ forum to ask this question since they don't work with
 PHP though. :(

 I'm trying to get an array('key'=>'val') (associative array) as argument,
 and return a std::string of 'key:val' back to PHP. Now, I can get the keys
 but not the arrays. I thought there might be something wrong with how I was
 using php::array::const_reference.

 Can anyone be so kind to explain how it works? Or perhaps a piece of working
 code that I can learn about?

 Thank you!




Re: [PHP-DEV] Extending PHP with C++ (Newbie Question)

2008-11-16 Thread Moriyoshi Koizumi
Hey,

Don't forget to have a look at Boost.PHP :)

http://github.com/moriyoshi/boost.php/wikis

Moriyoshi

On Sun, Nov 16, 2008 at 12:11 AM, Chris Jiang [EMAIL PROTECTED] wrote:
 Hi all, it's my first time posting in this mailing list.

 I've been trying to make a PHP extension for my project, and would really
 like to use C++ instead of C to write the code. I've been searching for some
 tutorial or manual for some time already, but have had no luck finding anything
 useful for newbies like myself.

 Yes, I've searched quite carefully on the search engine and the archive of
 this mailing list. I ended up finding a really old (posted in 2004) thread by
 someone signed as 'J', directing to a link sort of like
 http://tutorbuddy.com/software/phpcpp. I guess this might be what I need,
 but the article seems already out of date. I've found a translated version
 of this article, and made a testing extension following the instructions,
 however it didn't work.

 The article was written for PHP 4.2.X, and the sample code is missing (the
 link is broken). I'm currently working with PHP 5.2.5. Is it because PHP
 5.2.5 is really different with 4.X? Or the document was not properly
 translated?

 Can someone be so kind to point me an URL of this original article?

 Thank you all!




Re: [PHP-DEV] quick polls for 5.3

2008-11-15 Thread Moriyoshi Koizumi
On Thu, Nov 13, 2008 at 4:14 AM, Lukas Kahwe Smith [EMAIL PROTECTED] wrote:
 1) ext/mhash in 5.3.
 I) enable ext/hash by default
+1
 II) remove ext/mhash
+1

 2) deprecate ereg*.
+1

 3) resource constants (choose one)
 a) Should we deprecate constant resources (mostly used to emulate STDIN and
 friends)
 b) Should we instead just throw an E_STRICT
 c) Document as is
a

 4) keep ext/phar enabled by default in 5.3?
+1

 5) keep ext/sqlite3 enabled by default in 5.3?
+1

 6) enable mysqlnd by default in 5.3? (answer both)
 I) enable mysqlnd by default
-1
 II) also enable ext/mysql, mysqli and pdo_mysql by default since there will
 be no external dependencies in this case
-1

 7) should Output buffering rewrite MFH? this one comes with some baggage, we
 need enough people to actually have a look at how things are in HEAD and
 make it clear that they will be available for bug fixing and BC issues
 resolving. the risk here is obviously that any BC issues will be hard to
 isolate for end users.
+1

 8) MFH mcrypt cleanups in HEAD. either they make sense or they don't, so
 either (choose one)
 a) revert in HEAD
 b) MFH to 5.3
b

-- moriyoshi




Re: [PHP-DEV] Inconsistencies in 5.3

2008-08-06 Thread Moriyoshi Koizumi

Stanislav Malyshev wrote:

Hi!


function ($arg) { use $a, $b;


Note that neither static nor global allow & inside definitions, so from a
consistency point of view it doesn't work.


What nitpicking :) By that logic I would say that the global statement is
inconsistent with static because it doesn't allow assignments within the
statement :p


Moriyoshi




Re: [PHP-DEV] Inconsistencies in 5.3

2008-08-06 Thread Moriyoshi Koizumi

Stanislav Malyshev wrote:

Hi!

What nitpicking :) By that logic I would say that the global statement is
inconsistent with static because it doesn't allow assignments within
the statement :p


Sure it is. That's just another thing to show all this consistency 
talk is blown way out of proportion long ago. Now let's make global 
accept assignments and ignore them for consistency, should we?


I'd prefer the opposite approach, in which no initializers are allowed for
static. The null type was introduced to mark uninitialized variables in the
first place, so if we had strictly kept to the original intention, there
would be no problem with this idea. (I'd say NULL values from a
database or other software should have been made effectively
distinguishable from language-defined null values, but that is another issue.)


Apart from this, the constructs in question share the same semantics:
their role is to define variables that refer to values in
non-local data storage. So making the lexical variable declaration
statement look like the others is not inconsistent with the current
language syntax, whereas the use construct after the argument list is
inconsistent with the ordinary function definition unless my proposal
[1] is accepted.


[1] http://news.php.net/php.internals/39071

Moriyoshi




Re: [PHP-DEV] Inconsistencies in 5.3

2008-08-06 Thread Moriyoshi Koizumi

Stanislav Malyshev wrote:

Hi!

language syntax, whereas the use construct after the argument list 
is inconsistent with the ordinary function definition unless my proposal 


Because it is _not_ an ordinary function definition. It's like saying 
'+' is inconsistent with '-' because $a+$b=$b+$a but $a-$b!=$b-$a.


Your argument sounds far too extreme.


P.S. btw, in your proposal functions would be inconsistent with methods.


Should I take that to mean that methods are not functions, or that PHP has
inconsistent language features? That seems odd enough.


Moriyoshi




Re: [PHP-DEV] Inconsistencies in 5.3

2008-08-05 Thread Moriyoshi Koizumi

Larry Garfield wrote:

On Tuesday 05 August 2008 12:48:37 am Moriyoshi Koizumi wrote:

I don't think there are many differences in ambiguity between

$closure = function ($arg) { use $a;
   ...
};

and

$closure = function ($arg) use ($a) {
};

Moriyoshi

--
Moriyoshi Koizumi [EMAIL PROTECTED]


The former has no good way to differentiate between by-ref and by-value 
importing.  The latter has a very intuitive way.  That's why (IIRC) it was 
used.


The only difference I can see between the two is the presence of
parentheses. I doubt they contribute that much to the intuitiveness.


function ($arg) { use $a, $b;

vs.

function ($arg) use ($a, $b) {

Moriyoshi




Re: [PHP-DEV] Inconsistencies in 5.3

2008-08-04 Thread Moriyoshi Koizumi

Marcus Boerger wrote:

Hello Lukas,

Monday, August 4, 2008, 10:49:43 AM, you wrote:



On 04.08.2008, at 10:41, Marcus Boerger wrote:



Hello Lukas,

Monday, August 4, 2008, 10:32:26 AM, you wrote:



On 04.08.2008, at 10:28, Stefan Priebsch wrote:

Hannes Magnusson schrieb:
I don't think anyone but him likes multiple namespaces per file. I do
remember a PhD thesis sized mail from him explaining why multiple
namespaces per file was needed though (can hardly believe anyone read
the whole thing..).

In some deployment processes, multiple PHP files are merged together
into one file. Symfony, for example, does this, at least optionally.



Right, this is common practice to reduce disk I/O without having to
make development too hard. Also that way people can pick and choose
what they want to include (like not all drivers of a DBAL).

If an edge-case optimization is the only reason then I am against this
even more.



It's not an edge-case optimization. Like I said, it's common practice in
many PHP frameworks. This way they can develop the code more easily
without suffering the drawbacks of a lot of disk I/O from
files that need to be loaded on every request anyway.


And those frameworks are the main users of namespaces, because they
pull in libs from all sorts of libraries, add plugins, etc.


In that case let's have curly braces at least, to be consistent with the rest
of the language, as every other grouping statement has curly braces.


I think the point is that PHP doesn't have the concept of compilation
units, unlike major namespaced programming languages such as C++ and
Java. PHP has so far chosen to delimit grammatical contexts with braces
or block statements, so namespaces should do the same.


Moriyoshi
--
Moriyoshi Koizumi [EMAIL PROTECTED]




Re: [PHP-DEV] Inconsistencies in 5.3

2008-08-04 Thread Moriyoshi Koizumi

Dmitry Stogov wrote:



Marcus Boerger wrote:

Hello Dmitry,

Monday, August 4, 2008, 8:55:00 AM, you wrote:


Hi Marcus,



see below



Marcus Boerger wrote:

Hello Internals,

  please let's not introduce new inconsistencies. Rather, let's make new
stuff consistent with old stuff during the alpha phase of 5.3.

1) new keyword 'use'. Semantically it is the same as 'static' or 'global'
so it should be used in the same location.


For me 'use' is the best keyword as it says that the closure uses
variables from the current context. (the same keyword is used for import
from namespaces)


To be clear, I wasn't complaining about the keyword per se. I just prefer
it to be inside the curly braces of a closure next to global rather than in
front of it.



No. The list of lexical variables is a part of the closure definition.

The earlier implementation had a lexical keyword which worked as you are
suggesting, but it was much less clear.


I don't think there are many differences in ambiguity between

$closure = function ($arg) { use $a;
  ...
};

and

$closure = function ($arg) use ($a) {
};

Moriyoshi

--
Moriyoshi Koizumi [EMAIL PROTECTED]




Re: [PHP-DEV] Inconsistencies in 5.3

2008-08-03 Thread Moriyoshi Koizumi

Fully agreed with all of those. (especially for 1)

Moriyoshi

Marcus Boerger wrote:

Hello Internals,

  please let's not introduce new inconsistencies. Rather, let's make new
stuff consistent with old stuff during the alpha phase of 5.3.

1) new keyword 'use'. Semantically it is the same as 'static' or 'global'
so it should be used in the same location.

2) namespaces, either use 'package' and only one per file, or use
'namespace' with curly braces. Read this as: be consistent with other
languages. Even if one or two people do not like it, the two main
languages out there which have it are Java, which goes with the former, and
C++, which does the latter. Please choose one and do not mix them. Also our
mix is a nightmare when developing code.

If we feel we have to keep the keyword 'namespace' but cannot have curly
braces, then I suggest we disallow multiple namespaces per file.

And there is no technical reason and surely no other reason whatsoever to
not have curly braces. If there is then we either fix that or went with the
wrong approach.

3) __invokable, see Etiene's mail

Best regards,
 Marcus





--
Moriyoshi Koizumi [EMAIL PROTECTED]




Re: [PHP-DEV] Const-correctness

2008-08-03 Thread Moriyoshi Koizumi

Marcus Boerger wrote:

Overall you do a ton of 'struct * const var' which only means that you are
going to copy the pointer explicitly. Now functions that use a pointer in a
loop and increment it cannot optimize the code anymore and are forced to use
an additional real variable on the stack. So unless you have a very smart
compiler, the result is an increased stack size. Generally speaking I prefer
const on the right and mid (between *'s) only. But others prefer it to denote
that not even the passed in pointer gets modified and this sometimes even makes
debugging easier.


That's not necessarily true. In contrast to the volatile qualifier, the
const qualifier in this context is purely a semantic thing. So if your
function has an immutable argument variable, it probably gets compiled into a
procedure that receives the argument in a register and modifies it
in place as long as no side effects are possible. (Of course, how it
compiles depends on the target platform.)


That said, I don't think it makes any sense to make pointer
variables immutable, while marking the referent immutable greatly
reduces small errors. Compilers are often smarter in this case.
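
To make the two kinds of const in this discussion concrete, here is a minimal
C sketch. It is not taken from the PHP source; the function names
count_nonzero and count_nonzero_fixed are made up for illustration. The first
takes a pointer to const (the referent is read-only, the pointer may move),
the second takes a const pointer (the pointer itself may not move). Whether
the second form really costs an extra stack slot depends on the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer to const: writing through p is rejected at compile time,
 * but the pointer itself may be incremented while walking the array. */
size_t count_nonzero(const int *p, size_t n)
{
    size_t hits = 0;
    assert(p != NULL);
    for (const int *end = p + n; p != end; p++) {   /* p++ is allowed */
        if (*p != 0) {
            hits++;
        }
        /* "*p = 0;" here would fail to compile: *p is read-only */
    }
    return hits;
}

/* Const pointer (to const data): p cannot be reassigned, so the loop
 * needs a separate index variable instead of p++. */
size_t count_nonzero_fixed(const int *const p, size_t n)
{
    size_t hits = 0;
    assert(p != NULL);
    for (size_t i = 0; i < n; i++) {                /* "p++" would not compile */
        if (p[i] != 0) {
            hits++;
        }
    }
    return hits;
}
```

Both functions return the same count for the same input; the only difference
is which of the two things (referent or pointer) the compiler protects.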


Moriyoshi
--
Moriyoshi Koizumi [EMAIL PROTECTED]




Re: [PHP-DEV] enabling everything by default

2008-08-02 Thread Moriyoshi Koizumi

Hi,

Antony Dovgal wrote:


Extensions enabled by default in 5.3:
ctype
date
dom
ereg
fileinfo - new, untested.
filter
hash
iconv
json
libxml
pcre
PDO
pdo_sqlite
Phar - new, untested
posix
Reflection
session
SimpleXML
SPL
SQLite
sqlite3 - new, untested
standard
tokenizer
xml
xmlreader
xmlwriter
--
Total: 26 extensions


I don't see any reason to not include mbstring in this list.

Moriyoshi




Re: [PHP-DEV] Introducing Boost.PHP - PHP Extensions in C++, in a minute

2008-07-30 Thread Moriyoshi Koizumi

Hi Marcus,

Really? I haven't heard anything like that.

Moriyoshi

On 2008/07/30, at 16:19, Marcus Boerger wrote:


Hello Stefan,

which doesn't belong there either.

marcus

Wednesday, July 30, 2008, 9:14:07 AM, you wrote:


Marcus Boerger schrieb:

Hello Moriyoshi,

  actually you should place it as a PHP module. Boost provides core level
  stuff and algorithms and such. This is a highly specialized bridge.


marcus



Just like Boost.Python
(http://www.boost.org/doc/libs/1_35_0/libs/python/doc/index.html) ;)



Regards,
Stefan





Best regards,
 Marcus








[PHP-DEV] Introducing Boost.PHP - PHP Extensions in C++, in a minute

2008-07-29 Thread Moriyoshi Koizumi

Hi folks,

I created a library that may draw some attention. Boost.PHP is a set  
of macros and C++ classes that wrap around common Zend Engine structs  
that allow you to create a PHP extension in C++, in a very efficient  
way. Most notably, you no longer need most of the ZE macros and APIs  
like ZEND_FE and zend_parse_parameters() since the library  
automagically handles the signatures of your C++ functions and  
enables them to be exposed just as they are.


For further information, please look at the dedicated wiki page on  
github:

http://github.com/moriyoshi/boost.php/wikis

Regards,
Moriyoshi




[PHP-DEV] RFC question (Re: [PHP-DEV] [PATCH] Allow use($var..) statement ubiquitously)

2008-07-23 Thread Moriyoshi Koizumi
I would like to keep this as an RFC page on wiki.php.net. Are there
any conventions or rules that I should keep in mind? (or
just-not-supposed-to-do-that-because-your-proposal-is-stupid-and-will-never-be-accepted?)


Moriyoshi

On 2008/07/18, at 8:23, Moriyoshi Koizumi wrote:


Hi,

Attached are the patches that allow the use statement that was  
introduced with closures to appear within every function statement  
except method definitions. I think this feature is a good addition  
because it resolves inconsistency between closures and unticked  
functions. In a nutshell,


<?php
function foo() use ($some_globals) {
    echo $some_globals;
}
?>

is equivalent to

<?php
function foo() {
    global $some_globals;
    echo $some_globals;
}
?>

While,

<?php
function bar() {
    $some_local_var = '';
    function fubar() use ($some_local_var) {
        echo $some_local_var;
    }
}
?>

and

<?php
function bar() {
    function fubar() {
        global $some_local_var;
        echo $some_local_var;
    }
}
?>

are of course not the same.

[Attachments: php-ubiquitous-use-statement-HEAD-20080718.patch.diff.txt,
php-ubiquitous-use-statement-PHP_5.3-20080718.patch.diff.txt]

Moriyoshi





Re: [PHP-DEV] [PATCH] Allow use($var..) statement ubiquitously

2008-07-18 Thread Moriyoshi Koizumi
That's one of the motivations for the patch. I never liked the new
syntax, but if it was given a go, it should also be made consistent
with the other parts of the syntax. Oh, one important thing just came
to mind that I should mention:


test1.php:
<?php
function a() {
    $a = "bar";
    include("test2.php");
}

$a = "foo";
a();
b();
?>

test2.php:
<?php
function b() use ($a) {
    echo $a, "\n";
}
b();
?>

Running test1.php ends up with two lines of "bar", surprisingly. This
is somewhat confusing, but it is surely something that could never be
done before. This might be a great help when you use a PHP script
file as a mark-up template.


Moriyoshi


On 2008/07/18, at 15:10, Larry Garfield wrote:


Which is why I am not a fan of this syntax as proposed.  You're using the same
keyword to mean two different but very very close things.  That's very
confusing.  It also means that you cannot take a global by value in a
closure.

Earlier the following was proposed:

function foo($a, $b) global ($c, $d) {

}

$bar = new function($a, $b) use ($c, $d) global ($e, $f) {

};

Which I think is much more self-explanatory, accomplishes the same goal, and
still does not introduce any new keywords.  (I still think lexical is
better than use, and safe, but whatev. <g>)

--
Larry Garfield
[EMAIL PROTECTED]




Re: [PHP-DEV] [PATCH] Allow use($var..) statement ubiquitously

2008-07-18 Thread Moriyoshi Koizumi


On 2008/07/18, at 19:06, Richard Quadling wrote:


2008/7/18 Moriyoshi Koizumi [EMAIL PROTECTED]:

Running test1.php ends up with two lines of "bar", surprisingly. This
is somewhat confusing, but it is surely something that could never be
done before. This might be a great help when you use a PHP script
file as a mark-up template.


Moriyoshi

It was my understanding that include-d functions were added to the  
global scope (or I suppose the active namespace).


So, in that context, function b() use($a) {} should be getting the
$a from the global scope where $a == "foo".


I would say that getting 2 bars is wrong.


Lexical scopes are completely irrelevant to which namespace the  
function belongs to.


Moriyoshi




[PHP-DEV] [PATCH] Allow use($var..) statement ubiquitously

2008-07-17 Thread Moriyoshi Koizumi

Hi,

Attached are the patches that allow the use statement that was  
introduced with closures to appear within every function statement  
except method definitions. I think this feature is a good addition  
because it resolves inconsistency between closures and unticked  
functions. In a nutshell,


<?php
function foo() use ($some_globals) {
    echo $some_globals;
}
?>

is equivalent to

<?php
function foo() {
    global $some_globals;
    echo $some_globals;
}
?>

While,

<?php
function bar() {
    $some_local_var = '';
    function fubar() use ($some_local_var) {
        echo $some_local_var;
    }
}
?>

and

<?php
function bar() {
    function fubar() {
        global $some_local_var;
        echo $some_local_var;
    }
}
?>

are of course not the same.
? .gdb_history
Index: Zend/zend_closures.c
===
RCS file: /repository/ZendEngine2/zend_closures.c,v
retrieving revision 1.4
diff -u -r1.4 zend_closures.c
--- Zend/zend_closures.c14 Jul 2008 12:17:16 -  1.4
+++ Zend/zend_closures.c17 Jul 2008 23:16:29 -
@@ -213,44 +213,6 @@
 }
 /* }}} */
 
-static int zval_copy_static_var(zval **p, int num_args, va_list args, zend_hash_key *key) /* {{{ */
-{
-    HashTable *target = va_arg(args, HashTable*);
-    zend_bool is_ref;
-    TSRMLS_FETCH();
-
-    if (Z_TYPE_PP(p) & (IS_LEXICAL_VAR|IS_LEXICAL_REF)) {
-        is_ref = Z_TYPE_PP(p) & IS_LEXICAL_REF;
-
-        if (!EG(active_symbol_table)) {
-            zend_rebuild_symbol_table(TSRMLS_C);
-        }
-        if (zend_u_hash_quick_find(EG(active_symbol_table), key->type, key->arKey, key->nKeyLength, key->h, (void **) &p) == FAILURE) {
-            if (is_ref) {
-                zval *tmp;
-
-                ALLOC_INIT_ZVAL(tmp);
-                Z_SET_ISREF_P(tmp);
-                zend_u_hash_quick_add(EG(active_symbol_table), key->type, key->arKey, key->nKeyLength, key->h, &tmp, sizeof(zval*), (void**)&p);
-            } else {
-                p = &EG(uninitialized_zval_ptr);
-                zend_error(E_NOTICE,"Undefined variable: %s", key->arKey);
-            }
-        } else {
-            if (is_ref) {
-                SEPARATE_ZVAL_TO_MAKE_IS_REF(p);
-            } else if (Z_ISREF_PP(p)) {
-                SEPARATE_ZVAL(p);
-            }
-        }
-    }
-    if (zend_u_hash_quick_add(target, key->type, key->arKey, key->nKeyLength, key->h, p, sizeof(zval*), NULL) == SUCCESS) {
-        Z_ADDREF_PP(p);
-    }
-    return ZEND_HASH_APPLY_KEEP;
-}
-/* }}} */
-
 ZEND_API void zend_create_closure(zval *res, zend_function *func, zend_class_entry *scope, zval *this_ptr TSRMLS_DC) /* {{{ */
 {
     zend_closure *closure;
@@ -263,11 +225,8 @@
 
     if (closure->func.type == ZEND_USER_FUNCTION) {
         if (closure->func.op_array.static_variables) {
-            HashTable *static_variables = closure->func.op_array.static_variables;
-
-            ALLOC_HASHTABLE(closure->func.op_array.static_variables);
-            zend_u_hash_init(closure->func.op_array.static_variables, zend_hash_num_elements(static_variables), NULL, ZVAL_PTR_DTOR, 0, UG(unicode));
-            zend_hash_apply_with_arguments(static_variables, (apply_func_args_t)zval_copy_static_var, 1, closure->func.op_array.static_variables);
+            zend_do_bind_static_variables(closure->func.op_array TSRMLS_CC);
+            func->op_array.static_variables = NULL;
         }
         (*closure->func.op_array.refcount)++;
     }
Index: Zend/zend_compile.c
===
RCS file: /repository/ZendEngine2/zend_compile.c,v
retrieving revision 1.829
diff -u -r1.829 zend_compile.c
--- Zend/zend_compile.c 14 Jul 2008 12:17:16 -  1.829
+++ Zend/zend_compile.c 17 Jul 2008 23:16:30 -
@@ -1511,7 +1511,7 @@
 }
 /* }}} */
 
-void zend_do_end_function_declaration(znode *function_token TSRMLS_DC) /* {{{ */
+void zend_do_end_function_declaration(znode *function_token, zend_bool is_closure TSRMLS_DC) /* {{{ */
 {
     unsigned int lcname_len;
     zstr lcname;
@@ -1554,6 +1554,9 @@
     }
 
     CG(active_op_array)->line_end = zend_get_compiled_lineno(TSRMLS_C);
+    if (is_closure) {
+        CG(active_op_array)->fn_flags |= ZEND_ACC_CLOSURE;
+    }
     CG(active_op_array) = function_token->u.op_array;
 
 
@@ -1679,7 +1682,8 @@
         lcname = zend_u_str_case_fold(Z_TYPE(function_name->u.constant), Z_UNIVAL(function_name->u.constant), Z_UNILEN(function_name->u.constant), 0, &lcname_len);
         if ((zend_u_hash_find(CG(function_table), Z_TYPE(function_name->u.constant), 

[PHP-DEV] --disable-mbregex in HEAD?

2008-07-15 Thread Moriyoshi Koizumi
Hi there,

(Yeah, it's been a long time.. Many of you might well have thought
I'm dead :)

This time I want to revisit the following issue:
http://article.gmane.org/gmane.comp.php.devel/50681

Is there any point in disabling mbregex alone instead of disabling the
whole mbstring stuff? Unless a decision has been made on whether to move the
module into PECL (or anywhere you want it to go), I suppose there is
none.

Regards,
Moriyoshi




Re: [PHP-DEV] --disable-mbregex in HEAD?

2008-07-15 Thread Moriyoshi Koizumi
Marcus Boerger wrote:
 Hello Moriyoshi,
 
 (just for Derick answered here)
 I never liked this. So imo we can simply get rid of this switch and enable
 mbregex as soon as there is mbstring to begin with and regex stuff.

Me neither. As far as I know, the switch exists just for historical
reasons. Since --enable-mbregex takes effect only when --enable-mbstring
is specified, it'd be the same as if there were a single option.

Aside from this, what about removing the bundled libmbfl and oniguruma from
HEAD and requiring users to use external ones? Now that the intl extension
has been merged, I don't think bundling them makes sense.

Regards,
Moriyoshi




Re: [PHP-DEV] implode() speedup in PHP4

2005-03-18 Thread Moriyoshi Koizumi
On 2005/03/19, at 3:41, Alexander Valyalkin wrote:
 I posted a patch last summer
 http://www.zend.com/zend/week/pat/pat5.txt
 which improves the performance of the implode() function in PHP4, but nobody
 has committed it yet. Can anybody explain the reason?

Because convert_to_string_ex() modifies the content of every element
that is not of string type.

<?php
$arrays = array(array(), array(), array());
var_dump(implode('', $arrays));
var_dump($arrays);
?>

We already have an optimisation in the 5.x branches.
I think it is applicable to the 4.3 branch.
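
The side effect described above can be illustrated with a small,
self-contained C model. This is only a sketch under stated assumptions: the
value type and both helper functions are hypothetical stand-ins, not the Zend
API, and convert_in_place() merely mimics what an in-place conversion does to
the caller's data. Converting a temporary copy instead leaves the original
element untouched.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a PHP value (not the real zval). */
typedef enum { VAL_LONG, VAL_STRING } val_type;

typedef struct {
    val_type type;
    long     lval;      /* meaningful when type == VAL_LONG   */
    char     sval[32];  /* meaningful when type == VAL_STRING */
} value;

/* In-place conversion: the caller's element itself changes type,
 * which is the side effect an implode()-like function must avoid. */
void convert_in_place(value *v)
{
    assert(v != NULL);
    if (v->type == VAL_LONG) {
        snprintf(v->sval, sizeof v->sval, "%ld", v->lval);
        v->type = VAL_STRING;
    }
}

/* Side-effect-free variant: convert a temporary copy and return it,
 * leaving the caller's element as it was. */
value converted_copy(const value *v)
{
    value tmp;
    assert(v != NULL);
    tmp = *v;
    convert_in_place(&tmp);
    return tmp;
}
```

The copy costs extra work per element, which is the trade-off between speed
and leaving the input array unmodified.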
Moriyoshi


Re: [PHP-DEV] implode() speedup in PHP4

2005-03-18 Thread Moriyoshi Koizumi
On 2005/03/19, at 6:14, Derick Rethans wrote:
 On Sat, 19 Mar 2005, Moriyoshi Koizumi wrote:
  <?php
  $arrays = array(array(), array(), array());
  var_dump(implode('', $arrays));
  var_dump($arrays);
  ?>
  We already have an optimisation in the 5.x branches.
  I think it is applicable to the 4.3 branch.
 4.3 is only for bugfixes!

I didn't mean that the fix should go into 4.3. What was it
that excited you so much? :)

Moriyoshi


Re: [PHP-DEV] reference issue

2005-03-17 Thread Moriyoshi Koizumi
That's expected behaviour. Have a look at http://bugs.php.net/20993 .
The problem can also be prevented by the following hack:

// modifying and assigning branch causes unexpected results to $this->tree
$metatree[$object_id] = (array)unserialize(serialize($branch));

Regards,
Moriyoshi
On 2005/03/17, at 21:04, Lukas Smith wrote:
Hi,
I have encountered yet another reference issue. I tested this on PHP 
4.3.10 and 4.3.11RC1 on both windows and linux.

Here is a script that reproduces the issue:
http://www.backendmedia.com/PHP/reference_bug.phps
In the comments at the top you can also see my hack which fixes the 
problem for me.

I have also tried Derick's patch on the 4.3.11RC1 install on linux, 
but this didnt change the situation at all.

regards,
Lukas


Re: [PHP-DEV] reference issue

2005-03-17 Thread Moriyoshi Koizumi
On 2005/03/17, at 21:52, Lukas Smith wrote:
 Derick Rethans wrote:
  On Thu, 17 Mar 2005, Moriyoshi Koizumi wrote:
   That's expected behaviour. Have a look at http://bugs.php.net/20993 .
  Sorry, but this is not expected - it's a bug.
 I would also like to disagree. I think this is a huge inconsistency
 that should be addressed even if that costs performance.

I agree that it's not intuitive at all, though when this issue was
discussed, the discussion ended with the following words:
http://marc.theaimsgroup.com/?l=php-dev&m=104018832405963&w=2

 If however the core php developers do think this behavior should
 stick, then there should be a function that allows people to clean
 these references without such hacks (even if internally these hacks
 are still going on).

I think it's possible to add such a function to the 5.x branches.

Moriyoshi


Re: [PHP-DEV] XML Bug #32001

2005-03-11 Thread Moriyoshi Koizumi
On 2005/03/11, at 20:38, Rob Richards wrote:
 The test works fine for me under Linux. The only difference is on Windows,
 and there it's due to how Windows performs php_strtoupper differently. If
 case folding is turned off under Windows, the test passes.

It looks like the problem is due to the following facts.
- php_strtoupper() depends on the locale settings.
- toupper() works differently between platforms. Besides it isn't
  designed to handle multibyte strings like UTF-8.

I was mostly testing on Linux so I couldn't replicate it
(now verified on Mac OS X).

I think the case folding option doesn't make any sense if it doesn't
work perfectly, and we had better deprecate it until some Unicode-aware
case folding function is available.
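
The two points above can be sketched in plain C. This is an illustration
only: fold_bytewise() and fold_ascii_only() are hypothetical helpers, not
PHP's actual php_strtoupper(). The first passes every byte through toupper(),
so its result for bytes above 0x7f depends on the current C locale and
platform; the second only touches ASCII a-z and therefore never alters the
bytes of a UTF-8 multibyte sequence.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Byte-wise upper-casing in the style of a toupper() loop. Bytes above
 * 0x7f are handed to toupper() as well, so what happens to them is
 * locale- and platform-dependent. */
void fold_bytewise(char *s, size_t len)
{
    assert(s != NULL);
    for (size_t i = 0; i < len; i++) {
        s[i] = (char)toupper((unsigned char)s[i]);
    }
}

/* Locale-independent variant: only ASCII a-z is changed; every other
 * byte, including UTF-8 lead and continuation bytes, is left alone. */
void fold_ascii_only(char *s, size_t len)
{
    assert(s != NULL);
    for (size_t i = 0; i < len; i++) {
        if (s[i] >= 'a' && s[i] <= 'z') {
            s[i] = (char)(s[i] - 'a' + 'A');
        }
    }
}
```

Neither helper is real case folding for non-ASCII text; the ASCII-only
variant is merely predictable, which is the property the test in question
relies on.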

Moriyoshi

 Rob

 Moriyoshi Koizumi wrote:
  On 2005/03/11, at 10:24, Marcus Boerger wrote:
   Hello moriyoshi or any other XMLer,
     please verify the --EXPECT-- data in test ext/xml/tests/bug32001.phpt
   i am quite sure there are several mistakes in it.
  Excuse me if I'm just missing something, what kind of mistake do
  you want to address?
  Moriyoshi



Re: [PHP-DEV] HALT Patch

2005-03-11 Thread Moriyoshi Koizumi
Hi,

I modified your patch so that it stores the position where the
trailing data begins in the constant __HALT_PHP_PARSER__.
There may be a problem with my patch if more than one require()'d /
include()'d script contains __HALT_PHP_PARSER__, but it'd be
quite handy if such an issue is resolved.

<?php
$fp = fopen(__FILE__, 'rb');
fseek($fp, __HALT_PHP_PARSER__, SEEK_SET);
fpassthru($fp);
?>
<?php __HALT_PHP_PARSER__; ?>
abc
def

Moriyoshi

Index: Zend/zend_language_scanner.l
===
RCS file: /repository/ZendEngine2/zend_language_scanner.l,v
retrieving revision 1.124
diff -u -r1.124 zend_language_scanner.l
--- Zend/zend_language_scanner.l7 Mar 2005 16:48:49 -   1.124
+++ Zend/zend_language_scanner.l11 Mar 2005 23:47:11 -
@@ -1342,6 +1342,13 @@
     return T_INLINE_HTML;
 }
 
+<INITIAL>"<?php __HALT_PHP_PARSER__; ?>"{NEWLINE}? {
+    long fpos = (yyin ? zend_stream_ftell(yyin TSRMLS_CC): 0) + (long)(yy_c_buf_p - (YY_CURRENT_BUFFER)->yy_ch_buf);
+    REGISTER_MAIN_LONG_CONSTANT("__HALT_PHP_PARSER__", fpos, CONST_CS);
+
+    yyterminate();
+}
+
 <INITIAL>"<?"|"<script"{WHITESPACE}+"language"{WHITESPACE}*"="{WHITESPACE}*("php"|"\"php\""|"\'php\'"){WHITESPACE}*">" {
     HANDLE_NEWLINES(yytext, yyleng);
     if (CG(short_tags) || yyleng>2) { /* yyleng>2 means it's not <? but <script> */
Index: Zend/zend_stream.c
===
RCS file: /repository/ZendEngine2/zend_stream.c,v
retrieving revision 1.9
diff -u -r1.9 zend_stream.c
--- Zend/zend_stream.c  27 Sep 2004 09:03:40 -  1.9
+++ Zend/zend_stream.c  11 Mar 2005 23:47:12 -
@@ -35,6 +35,11 @@
     fclose((FILE*)handle);
 }
 
+static long zend_stream_stdio_fteller(void *handle TSRMLS_DC)
+{
+    return ftell((FILE*)handle);
+}
+
 ZEND_API int zend_stream_open(const char *filename, zend_file_handle *handle TSRMLS_DC)
 {
     if (zend_stream_open_function) {
@@ -80,9 +85,10 @@
     }
 
     /* promote to stream */
-    file_handle->handle.stream.handle = file_handle->handle.fp;
-    file_handle->handle.stream.reader = zend_stream_stdio_reader;
-    file_handle->handle.stream.closer = zend_stream_stdio_closer;
+    file_handle->handle.stream.handle  = file_handle->handle.fp;
+    file_handle->handle.stream.reader  = zend_stream_stdio_reader;
+    file_handle->handle.stream.closer  = zend_stream_stdio_closer;
+    file_handle->handle.stream.fteller = zend_stream_stdio_fteller;
     file_handle->type = ZEND_HANDLE_STREAM;
 
     file_handle->handle.stream.interactive = isatty(fileno((FILE *)file_handle->handle.stream.handle));
@@ -121,4 +127,7 @@
     return 0;
 }
 
-
+ZEND_API long zend_stream_ftell(zend_file_handle *file_handle TSRMLS_DC)
+{
+    return file_handle->handle.stream.fteller(file_handle->handle.stream.handle TSRMLS_CC);
+}
Index: Zend/zend_stream.h
===
RCS file: /repository/ZendEngine2/zend_stream.h,v
retrieving revision 1.6
diff -u -r1.6 zend_stream.h
--- Zend/zend_stream.h  25 Jun 2004 12:55:11 -  1.6
+++ Zend/zend_stream.h  11 Mar 2005 23:47:12 -
@@ -27,11 +27,13 @@
 
 typedef size_t (*zend_stream_reader_t)(void *handle, char *buf, size_t len TSRMLS_DC);
 typedef void (*zend_stream_closer_t)(void *handle TSRMLS_DC);
+typedef long (*zend_stream_fteller_t)(void *handle TSRMLS_DC);
 
 typedef struct _zend_stream {
     void *handle;
     zend_stream_reader_t reader;
     zend_stream_closer_t closer;
+    zend_stream_fteller_t fteller;
     int interactive;
 } zend_stream;
 
@@ -52,6 +54,7 @@
 ZEND_API int zend_stream_ferror(zend_file_handle *file_handle TSRMLS_DC);
 ZEND_API int zend_stream_getc(zend_file_handle *file_handle TSRMLS_DC);
 ZEND_API size_t zend_stream_read(zend_file_handle *file_handle, char *buf, size_t len TSRMLS_DC);
+ZEND_API long zend_stream_ftell(zend_file_handle *file_handle TSRMLS_DC);
 ZEND_API int zend_stream_fixup(zend_file_handle *file_handle TSRMLS_DC);
 END_EXTERN_C()
 
Index: main/main.c
===
RCS file: /repository/php-src/main/main.c,v
retrieving revision 1.619
diff -u -r1.619 main.c
--- main/main.c 8 Mar 2005 21:42:10 -   1.619
+++ main/main.c 11 Mar 2005 23:47:20 -
@@ -838,6 +838,11 @@
     php_stream_close((php_stream*)handle);
 }
 
+static long stream_fteller_for_zend(void *handle TSRMLS_DC)
+{
+    return (long)php_stream_tell((php_stream*)handle);
+}
+
 static int php_stream_open_for_zend(const char *filename, zend_file_handle *handle TSRMLS_DC)
 {
     php_stream *stream;
@@ -851,6 +856,7 @@
         handle->handle.stream.handle = stream;
         handle->handle.stream.reader = (zend_stream_reader_t)_php_stream_read;
         handle->handle.stream.closer = 

Re: [PHP-DEV] XML Bug #32001

2005-03-10 Thread Moriyoshi Koizumi
On 2005/03/11, at 10:24, Marcus Boerger wrote:
> Hello moriyoshi or any other XMLer,
>   please verify the --EXPECT-- data in test ext/xml/tests/bug32001.phpt
> i am quite sure there are several mistakes in it.
Excuse me if I'm just missing something, what kind of mistake do
you want to address?
Moriyoshi


Re: [PHP-DEV] [PATCH] ext/xml/compat.c fix for #32001

2005-02-17 Thread Moriyoshi Koizumi
On 2005/02/17, at 22:28, Joe Orton wrote:
> So it is a bit of a tricky trade-off...
How about #ifdef'ifying it? It's lame though...
Moriyoshi


Re: [PHP-DEV] iconv doesn't built on OSX

2005-02-10 Thread Moriyoshi Koizumi
On 2005/02/10, at 22:13, Jani Taskinen wrote:
> Please blame the correct people, not always by default ME. :)
> I never touched this macro, it was andrei:
I didn't blame you, but I just wanted to say you did it right :)
Moriyoshi


Re: [PHP-DEV] iconv doesn't built on OSX

2005-02-08 Thread Moriyoshi Koizumi
I've experienced the same. The following improper change was left
forgotten when Jani applied his fix that is eventually correct.
http://cvs.php.net/diff.php/php-src/acinclude.m4?r1=1.275&r2=1.276&ty=u
Moriyoshi
On 2005/02/09, at 3:55, Christian Stocker wrote:
> Hi
> After a long look into the configure stuff for iconv, I found out, why
> it doesn't (even try to) compile on my OS X 10.3 box.
>
> in acinclude.m4 for PHP_SETUP_ICONV, there is somewhere
> test -f $ICONV_DIR/$PHP_LIBDIR/lib$iconv_lib_name.$SHLIB_SUFFIX_NAME
> $SHLIB_SUFFIX_NAME is 'so', but system wide libraries on OSX end on
> 'dylib'.
>
> I assume, this variable is also used for later building the shared
> libs, so that's basically ok (libphp5 and shared extensions are ending
> on '.so', even on OSX )
>
> I have no idea, what I have to change, or if there's another variable
> for that. Can please anyone with more insight than me have a look at
> that?
>
> thanks
> chregu
> --
> christian stocker | Bitflux GmbH | schoeneggstrasse 5 | ch-8004 zurich
> phone +41 1 240 56 70 | mobile +41 76 561 88 60  | fax +41 1 240 56 71
> http://www.bitflux.ch  |  [EMAIL PROTECTED]  |  gnupg-keyid 0x5CE1DECB
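
The suffix mismatch described in this thread can be illustrated with a small probe that tries more than one shared-library suffix. The real check lives in acinclude.m4 as a shell test; the C below is only a sketch of the idea, and build_lib_path and find_lib_suffix are hypothetical names, not part of the PHP build system.

```c
#include <stdio.h>
#include <unistd.h>

/* Build "<dir>/<libdir>/lib<name>.<suffix>" into out.
 * Returns 0 on success, -1 if the result does not fit in outlen bytes. */
static int build_lib_path(char *out, size_t outlen, const char *dir,
                          const char *libdir, const char *name,
                          const char *suffix)
{
    int n = snprintf(out, outlen, "%s/%s/lib%s.%s", dir, libdir, name, suffix);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Return the first suffix for which the library file exists, or NULL.
 * Passing both "so" and "dylib" covers the OS X case from the thread. */
static const char *find_lib_suffix(const char *dir, const char *libdir,
                                   const char *name,
                                   const char *const *suffixes, size_t count)
{
    char path[1024];
    size_t i;

    for (i = 0; i < count; i++) {
        if (build_lib_path(path, sizeof path, dir, libdir, name, suffixes[i]) == 0
            && access(path, F_OK) == 0) {
            return suffixes[i];
        }
    }
    return NULL;
}
```

Keeping the probed suffix list separate from the suffix used for building PHP's own modules matches the observation above that libphp5 and shared extensions end in '.so' even on OS X.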


[PHP-DEV] Reversion of patch for bug #31398

2005-01-23 Thread Moriyoshi Koizumi
Ilia,
While your opinion that we shouldn't count on the filename
attribute in multiparted queries is quite to the point,
your patch totally disregards the old behaviour.
We must neither use php_basename() for paths of a spec
different from that defined by the platform which is determined
in compile time. That most likely leads to inconsistent behaviour
between platforms.
In addition, you had removed the portion Rui added in the past
that allowed multibyte-encoded filenames to get parsed properly
without any notice. Actually the function in question doesn't
work at all with such filenames at this moment.
Due to those issues, I'm going to revert your patch and recommit
less radical one.
Regards,
Moriyoshi
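
The platform concern above can be made concrete: an uploaded filename may use the client's path separator rather than the server's. The following is a minimal sketch of one separator-agnostic approach, not the patch that was committed; client_basename is a hypothetical name.

```c
#include <string.h>

/* Return a pointer to the last path component of a client-supplied
 * filename, treating both '/' and '\\' as separators regardless of the
 * platform the server was compiled for.
 *
 * Caveat: this scans byte by byte, so it is not safe for multibyte
 * encodings such as Shift_JIS, where 0x5C ('\\') can occur as the
 * second byte of a character. */
static const char *client_basename(const char *name)
{
    const char *slash = strrchr(name, '/');
    const char *backslash = strrchr(name, '\\');
    const char *last = slash;

    /* Pick whichever separator occurs later in the string. */
    if (backslash && (!last || backslash > last)) {
        last = backslash;
    }
    return last ? last + 1 : name;
}
```

With this, "C:\\Documents\\photo.jpg" and "/home/user/photo.jpg" both reduce to "photo.jpg" on any host. The multibyte caveat in the comment is the same problem the message raises about multibyte-encoded filenames.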


[PHP-DEV] Re: #31098 [Csd]: isset false positive

2005-01-13 Thread Moriyoshi Koizumi
On 2005/01/13, at 15:21, [EMAIL PROTECTED] wrote:
> > I don't quite agree with you. Indeed it's semantically wrong,
> > yet I think we leave it to behave as in ZE1, in terms of
> > backwards compatibility.
>
> I don't think we must make compatibility for bugs.
Users don't consider a certain behaviour that lasts for years
the way we do to be a bug, even if it is inconsistent and
rationally wrong, since they often got used to it and supposedly
developed some workaround for it while we can rarely predict
how they code with it and cope with it.
Thus we should think of what the word backwards compatibility
does actually mean.
Moriyoshi

