Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-05-05 Thread C.Koy

On 5/5/2012 12:22 AM, Galen Wright-Watson wrote:

  That also ran without error for me. I'm not sure how to account for the
different behavior. Here are the details of the system that I'm using:

$ uname -a
Linux n10 3.2.6mtv10 #1 SMP Wed Mar 14 06:22:06 PDT 2012 x86_64 GNU/Linux
$ php -v
PHP 5.2.17 with Suhosin-Patch 0.9.7 (cli) (built: May  3 2012 12:16:32)
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2010 Zend Technologies
    with Zend Optimizer v3.3.9, Copyright (c) 1998-2009, by Zend Technologies
    with Suhosin v0.9.32.1, Copyright (c) 2007-2010, by SektionEins GmbH




So far I've been experimenting with a bare-bones PHP that I built from
pristine sources. Don't you think you should do the same when dealing with
such a bug?


Here's the top portion of my 'php -i' output:

~/proj$ php-5.2.17/sapi/cli/php -i|head -28
phpinfo()
PHP Version = 5.2.17

System = Linux trvuntu 2.6.32-41-generic #88-Ubuntu SMP Thu Mar 29 13:08:43 UTC 2012 i686

Build Date = May  4 2012 20:03:30
Configure Command =  './configure'  '--disable-all' '--enable-cli' '--enable-vld'

Server API = Command Line Interface
Virtual Directory Support = disabled
Configuration File (php.ini) Path = /usr/local/lib
Loaded Configuration File = (none)
Scan this dir for additional .ini files = (none)
additional .ini files parsed = (none)
PHP API = 20041225
PHP Extension = 20060613
Zend Extension = 220060519
Debug Build = no
Thread Safety = disabled
Zend Memory Manager = enabled
IPv6 Support = enabled
Registered PHP Streams = php, file, data, http, ftp
Registered Stream Socket Transports = tcp, udp, unix, udg
Registered Stream Filters = string.rot13, string.toupper, string.tolower, string.strip_tags, convert.*, consumed



This program makes use of the Zend Scripting Language Engine:
Zend Engine v2.2.0, Copyright (c) 1998-2010 Zend Technologies


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-05-05 Thread Wim Wisselink

On 05/04/2012 11:22 PM, Galen Wright-Watson wrote:

[...]

That also ran without error for me. I'm not sure how to account for the
different behavior. Here are the details of the system that I'm using:

$ uname -a
Linux n10 3.2.6mtv10 #1 SMP Wed Mar 14 06:22:06 PDT 2012 x86_64 GNU/Linux
$ php -v
PHP 5.2.17 with Suhosin-Patch 0.9.7 (cli) (built: May  3 2012 12:16:32)
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2010 Zend Technologies
    with Zend Optimizer v3.3.9, Copyright (c) 1998-2009, by Zend Technologies
    with Suhosin v0.9.32.1, Copyright (c) 2007-2010, by SektionEins GmbH

Try var_dump() on the return value of setlocale() to see whether it returns
the specified locale or just false. If it returns false, try the following:


setlocale(LC_ALL, 'tr_TR.UTF-8');

I had the same issue.




Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-05-05 Thread C.Koy

On 5/5/2012 7:01 PM, Wim Wisselink wrote:

Try to var_dump the setLocale and see if it return the specified locale
or just 'false'.


I thought he was well past that check. Anyway, a simple test should
suffice:


setlocale(LC_CTYPE, 'tr_TR') or exit("setlocale failed\n");










Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-05-05 Thread Galen Wright-Watson
On Sat, May 5, 2012 at 5:31 AM, C.Koy can5...@gmail.com wrote:


 I've been experimenting with bare-bones PHP I've built from pristine
 sources so far. Don't you think you should do the same, in dealing with
 such a  bug?


My personal system is a BSD derivative; the Turkish locales on these use
Latin rather than Turkish case conversion (and installing a proper Turkish
locale is a mess), so I've been testing on another system. For professional
reasons I've been hesitant to use its resources too heavily. Running a small
PHP script is one thing; although the time and space required for a PHP
build aren't large on modern systems, I can't justify one, since it's not
directly related to site operations.

On Sat, May 5, 2012 at 8:59 AM, Wim Wisselink w...@powerassist.nl wrote:

 Try to var_dump the setLocale and see if it return the specified locale or
 just 'false'. If false try the following:

 setlocale(LC_ALL, 'tr_TR.UTF-8');


I had previously tested the locale by calling strtolower('I'), which checks
both that the locale exists and that it uses Turkish-language case
conversion. The systems where I tested C.Koy's script passed the strtolower
test. It turned out to be the Zend Optimizer that prevented the error. With
it not loaded, the example script failed with a "Fatal error: Call to
undefined function IJK()" error message.
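
For anyone who wants to run the same kind of locale check from C rather
than from PHP, here is a minimal sketch. The function name
locale_lowers_I_to_i is made up for illustration; it relies only on the
standard setlocale() and tolower() calls. Whether a given Turkish locale
name is installed, and what it does to 'I', depends on the platform, so the
sketch reports an unavailable locale separately rather than guessing.

```c
#include <ctype.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper, not part of PHP or the Zend engine.
 * Returns -1 if the locale cannot be set, 1 if tolower('I') yields 'i'
 * under it (Latin-style casing), and 0 otherwise. */
int locale_lowers_I_to_i(const char *name)
{
    char saved[256];
    const char *current = setlocale(LC_CTYPE, NULL);
    int result;

    /* Copy the current locale name: later setlocale calls may overwrite it. */
    snprintf(saved, sizeof saved, "%s", current ? current : "C");

    if (setlocale(LC_CTYPE, name) == NULL) {
        return -1;              /* locale not installed under this name */
    }
    result = (tolower('I') == 'i');
    setlocale(LC_CTYPE, saved); /* restore the caller's locale */
    return result;
}
```

A result of 0 for a Turkish locale name would suggest the system applies
Turkish casing rules; a result of 1 would match what I see on my BSD
derivative, where the Turkish locales use Latin case conversion.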

Here's a breakdown:

In both PHP 5.2 and 5.3, calling a function before defining it results in a
dynamic call (INIT_FCALL_BY_NAME+DO_FCALL_BY_NAME). Here's the PHP 5.2 dump
of C.Koy's example:

  line  #  *  op                     fetch  ext  return  operands
  ---------------------------------------------------------------
     2  0     FETCH_CONSTANT                     ~0      'LC_CTYPE'
        1     SEND_VAL                                   ~0
        2     SEND_VAL                                   'tr_TR'
        3     DO_FCALL                      2            'setlocale'
     3  4     INIT_FCALL_BY_NAME                         'IJK'
        5     DO_FCALL_BY_NAME              0
     4  6     NOP
     5  7     RETURN                                     1
        8  *  ZEND_HANDLE_EXCEPTION

Here's the 5.3 dump:

  line  #  *  op                     fetch  ext  return  operands
  ---------------------------------------------------------------
     2  0     EXT_STMT
        1     EXT_FCALL_BEGIN
        2     SEND_VAL                                   2
        3     SEND_VAL                                   'tr_TR'
        4     DO_FCALL                      2            'setlocale'
        5     EXT_FCALL_END
     3  6     EXT_STMT
        7     INIT_FCALL_BY_NAME                         'ijk', 'IJK'
        8     EXT_FCALL_BEGIN
        9     DO_FCALL_BY_NAME              0
       10     EXT_FCALL_END
     4 11     EXT_STMT
       12     NOP
     5 13     RETURN                                     1

From op #7 in the 5.3 dump, we see that 5.3 converts the function name to
lowercase during compilation, but 5.2 doesn't. Examining the source
confirms this: you can see the lowercase conversion in 5.3's
zend_do_begin_dynamic_function_call on lines 1659 (for namespaced calls)
and 1683 (for non-namespaced calls) of zend_compile.c (
http://svn.php.net/viewvc/php/php-src/branches/PHP_5_3_10/Zend/zend_compile.c?revision=323023&view=markup#l1683),
while there's no such conversion in the same function in 5.2 (
http://svn.php.net/viewvc/php/php-src/branches/PHP_5_2/Zend/zend_compile.c?view=markup&pathrev=302150#l1450
).

5.3 only performs case conversion if the function name is a CONST
expression, which is why defining the function after calling it works but
calling a function with a variable name breaks. Correspondingly, the CONST
variant of the ZEND_INIT_FCALL_BY_NAME_SPEC_*_HANDLER family (in
zend_vm_execute.h) uses the first operand (which is already lowercased),
while the other INIT_FCALL_BY_NAME opcode handlers in that family use the
second, non-lowercased operand.

The 5.2 INIT_FCALL_BY_NAME opcode handlers only ever use the second,
un-lowercased operand.
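
To make the two-operand idea concrete, here is a small sketch of recording
a call site's name twice at compile time: once as written and once as a
pre-lowercased lookup key. The struct and function names are hypothetical
and are not the engine's znode or opcode structures. I've also assumed
ASCII-only folding for the key, which is not what 5.3 does (it calls
zend_str_tolower during compilation); the point is only that no case
conversion happens at lookup time.

```c
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 64

/* A call site's name as a hypothetical compiler might record it: the
 * spelling from the source plus a key lowercased once, before any script
 * code (and therefore any setlocale call in the script) has run. */
struct call_name {
    char display[NAME_MAX_LEN];    /* as written, e.g. for error messages */
    char lookup_key[NAME_MAX_LEN]; /* pre-lowercased, used for table lookup */
};

/* Fill in both spellings. Returns 0 on success, -1 if the name is too long.
 * Assumption: ASCII-only folding, so the key never depends on the locale. */
int call_name_init(struct call_name *out, const char *name)
{
    size_t len = strlen(name);
    if (len >= NAME_MAX_LEN) {
        return -1;
    }
    memcpy(out->display, name, len + 1);
    for (size_t i = 0; i <= len; i++) {   /* <= copies the terminating NUL */
        char c = name[i];
        out->lookup_key[i] = (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
    }
    return 0;
}

/* Run-time lookup compares against the stored key; nothing is converted
 * here, so the current locale cannot change the outcome. */
int call_name_matches(const struct call_name *n, const char *table_key)
{
    return strcmp(n->lookup_key, table_key) == 0;
}
```

The cost is the one raised further down for variables: both spellings have
to be stored, which is acceptable for literal names but not for every
string a script might later use as a function or class name.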

So, what does this mean for fixing the bug? Not so much when the function
or class is stored in a variable, since these can't be converted to
lowercase at compile time without converting all variables, which is too
wasteful of both time and space (as both the unconverted and converted
strings would need to be stored). For object instantiation,
zend_do_begin_new_object gets the class name ultimately from the
namespace_name rule. zend_do_begin_new_object could then take the resulting
znode and create a second, lowercased copy, storing it as the second
operand. ZEND_NEW_SPEC_HANDLER would then be altered to use the second
operand (if not UNUSED) to instantiate the object. This certainly seems a
valid alternative to a lowercasing version of the namespace_name rule; it's
not as far reaching, which may be good (in that it has less impact) and bad
(in that there may be 

Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-05-04 Thread C.Koy

On 5/2/2012 10:03 PM, Galen Wright-Watson wrote:

[...]

Do you mean something like the following?

 <?php
 setlocale(LC_CTYPE, 'tr_TR');
 IJK();
 setlocale(LC_CTYPE, 'en_US');
 function IJK() {echo __FUNCTION__, "\n";}

I couldn't get it to generate an error under PHP 5.2.17. What am I missing?



Try this with 5.2.17:

  <?php
  setlocale(LC_CTYPE, 'tr_TR');
  IJK();
  function IJK() {}







Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-05-04 Thread Galen Wright-Watson
On Fri, May 4, 2012 at 7:01 AM, C.Koy can5...@gmail.com wrote:

 [...]

 Try this with 5.2.17:


  <?php
  setlocale(LC_CTYPE, 'tr_TR');
  IJK();
  function IJK() {}


That also ran without error for me. I'm not sure how to account for the
different behavior. Here are the details of the system that I'm using:

$ uname -a
Linux n10 3.2.6mtv10 #1 SMP Wed Mar 14 06:22:06 PDT 2012 x86_64 GNU/Linux
$ php -v
PHP 5.2.17 with Suhosin-Patch 0.9.7 (cli) (built: May  3 2012 12:16:32)
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2010 Zend Technologies
    with Zend Optimizer v3.3.9, Copyright (c) 1998-2009, by Zend Technologies
    with Suhosin v0.9.32.1, Copyright (c) 2007-2010, by SektionEins GmbH


Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-05-02 Thread C.Koy

On 5/1/2012 9:11 PM, Galen Wright-Watson wrote:

On Thu, Apr 26, 2012 at 3:45 AM, C.Koycan5...@gmail.com  wrote:


As of 5.3.0 this bug does not exist for function names. Only classes and
interfaces.



Turns out, if you cause a function to be called dynamically by (e.g.) using
a variable function, the bug will surface.

 <?php
 setlocale(LC_CTYPE, 'tr_TR');
 function IJK() {}
 # succeeds
 IJK();


If a literal function call precedes the function definition, that would
fail too in 5.2.17, but not in 5.3.0.

What has changed in this regard between 5.2 and 5.3?



 $f = 'IJK';
 # causes Fatal error: Call to undefined function IJK()
 $f();

In contrast, if you set the locale for LC_CTYPE on the command line, the
bug doesn't arise at all because the compilation and execution phases both
use the same locale.



So, the bug also arises if a script started in 'tr_TR' env locale sets 
its locale to 'en_US' at runtime.


[...]



I like the idea of using the system default locale for name conversion
(making name resolution independent of the current locale), but am


As I stated above, the locale the script was started in may not always 
be 'en_US' or 'C'. (assuming that's what you mean by system default 
locale)


By the way, I noticed a setlocale(LC_CTYPE, "") call in
php_module_startup() in main.c, but can't figure out whether it has any
relevance to this bug.


regards,








Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-05-02 Thread Galen Wright-Watson
On Wed, May 2, 2012 at 5:23 AM, C.Koy can5...@gmail.com wrote:

 [...]

 If literal function call precedes the function definition, that would fail
 too in 5.2.17, but not in 5.3.0.
 What has changed in this regard 5.2-5.3 ?


Do you mean something like the following?

<?php
setlocale(LC_CTYPE, 'tr_TR');
IJK();
setlocale(LC_CTYPE, 'en_US');
function IJK() {echo __FUNCTION__, "\n";}

I couldn't get it to generate an error under PHP 5.2.17. What am I missing?



 In contrast, if you set the locale for LC_CTYPE on the command line, the
 bug doesn't arise at all because the compilation and execution phases both
 use the same locale.


 So, the bug also arises if a script started in 'tr_TR' env locale sets its
 locale to 'en_US' at runtime.


Yup.

$ LC_CTYPE=tr_TR php
<?php
setlocale(LC_CTYPE, 'en_US');
class I {}
$i = new I;
^D
Fatal error: Class 'I' not found in - on line 4

Call Stack:
0.3740 630760   1. {main}() -:0

I should say that the Vulcan Logic Disassembler has been very helpful to me
in exploring this bug. Thank you, Derick Rethans and the rest of the VLD
team. If you haven't tried it, check it out.


 [...]



 I like the idea of using the system default locale for name conversion
 (making name resolution independent of the current locale), but am


 As I stated above, the locale the script was started in may not always be
 'en_US' or 'C'. (assuming that's what you mean by system default locale)


That's indeed what I meant; basically, the locales specified in the
LC_CTYPE etc. environment variables.

It shouldn't matter that the default locale isn't en_US or C, as long
as PHP always uses the same locale for identifiers both during compilation
and at run-time. Of course, it also makes a certain amount of sense to
explicitly decide that PHP will use a specific locale for identifiers. I
avoided suggesting that route to avoid any issues about what locales will
be universally available.


 By the way, I noticed a setlocale(LC_CTYPE, "") call in
 php_module_startup()/main.c, but can't figure if it has any relevance to
 this bug.


That would set the locale to whatever the platform uses natively. Without
the call, the locale would be POSIX/C, according to the POSIX doc (
http://pubs.opengroup.org/onlinepubs/009604499/functions/setlocale.html).
It doesn't seem terribly relevant to bug 18556, since all that matters
regarding the initial locale is that its lowercase conversion is different
from the locale that's used at run-time. If I had to guess why the locale
is set to the platform native, it's so that numeric, currency and date
formatting will be consistent with the rest of the system.


Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-05-01 Thread Galen Wright-Watson
On Thu, Apr 26, 2012 at 3:45 AM, C.Koy can5...@gmail.com wrote:

 As of 5.3.0 this bug does not exist for function names. Only classes and
 interfaces.


Turns out, if you cause a function to be called dynamically by (e.g.) using
a variable function, the bug will surface.

<?php
setlocale(LC_CTYPE, 'tr_TR');
function IJK() {}
# succeeds
IJK();
$f = 'IJK';
# causes Fatal error: Call to undefined function IJK()
$f();

In contrast, if you set the locale for LC_CTYPE on the command line, the
bug doesn't arise at all because the compilation and execution phases both
use the same locale.



 Could this be a clue for how to fix it for those as well?


Function names are generally resolved at compile time (dynamic function
names are resolved at run time, which is why the bug surfaces for them),
before the call to setlocale in the script has been executed. Class name
resolution is put off until execution time for autoloading and possibly
other purposes. Converting class names to lowercase at compile time may
work. A quick glance at the source shows that class_name,
fully_qualified_class_name and class_name_reference all depend on
namespace_name, which is the rule that is responsible for the parsing of
the class name.

namespace_name:
        T_STRING { $$ = $1; }
    |   namespace_name T_NS_SEPARATOR T_STRING
            { zend_do_build_namespace_name(&$$, &$1, &$3 TSRMLS_CC); }
;

However, static_scalar is also dependent on namespace_name, and I don't
believe that symbol should be made case-insensitive. Creating an additional
symbol for case-independency would allow a more targeted approach. The
various class symbols would then rely on this new symbol, rather than
namespace_name.

lc_namespace_name:
        T_STRING { zend_str_tolower($1); $$ = $1; }
    |   lc_namespace_name T_NS_SEPARATOR T_STRING
            { zend_str_tolower($3);
              zend_do_build_namespace_name(&$$, &$1, &$3 TSRMLS_CC); }
;

Converting class names to lower case early may have additional
consequences. It may affect class names in error messages, for example (I
didn't dig deep enough to determine this). __CLASS__ should be unaffected
(when defining a class, the class name is parsed as a T_STRING; the value
for __CLASS__ comes from this symbol). It also won't resolve the bug for
dynamic names. I suspect that altering variable_class_name and
dynamic_class_name_reference in a manner described previously (use a custom
lowercase conversion or temporarily switch locale) to convert the name
would resolve the bug in the dynamic case for class names. Changing a
number of the production rules for function_call in a similar manner should
resolve the bug for dynamic function call. Again, there will likely be
unintended consequences. Alternatively, updating
zend_do_begin_dynamic_function_call() and zend_do_fetch_class() to use
custom conversion should resolve the bug in the dynamic case.
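
As a rough illustration of the "temporarily switch locale" option, here is
a sketch in plain C. The function name str_tolower_in_locale is
hypothetical and this is not engine code; it only uses the standard
setlocale() and tolower() calls. It assumes the fixed locale is passed in
by name and may be unavailable, in which case the string is left alone.

```c
#include <ctype.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: lowercase str in place using the casing rules of
 * fixed_locale, then put the caller's LC_CTYPE back. Returns 0 on success,
 * -1 if fixed_locale cannot be set (str is untouched in that case). */
int str_tolower_in_locale(char *str, const char *fixed_locale)
{
    char saved[256];
    const char *current = setlocale(LC_CTYPE, NULL);

    /* Keep our own copy: the returned pointer may be reused by setlocale. */
    snprintf(saved, sizeof saved, "%s", current ? current : "C");

    if (setlocale(LC_CTYPE, fixed_locale) == NULL) {
        return -1;
    }
    for (size_t i = 0, n = strlen(str); i < n; i++) {
        /* tolower expects an unsigned char value (or EOF) */
        str[i] = (char)tolower((unsigned char)str[i]);
    }
    setlocale(LC_CTYPE, saved);  /* restore whatever the script had set */
    return 0;
}
```

Two setlocale() calls per conversion is the kind of overhead I'm worried
about, and setlocale() changes the locale for the whole process, so this
would need care in threaded builds. A custom conversion avoids both issues.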

I like the idea of using the system default locale for name conversion
(making name resolution independent of the current locale), but am
concerned that it will make name lookup slow. Instead, a second set of
locale-independent, Unicode-aware conversion functions (basically, iliaa's
original solution, but Unicode compatible) to be used for identifiers would
make name resolution independent of the current locale. Any time an
identifier needs to be converted, it would use one of these functions. As
a run-time optimization, non-dynamic class names could use the system
locale conversion, but that would be a separate thing from resolving this
bug.


Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-05-01 Thread Galen Wright-Watson
On Tue, May 1, 2012 at 11:11 AM, Galen Wright-Watson ww.ga...@gmail.com wrote:


 [...] Instead, a second set of locale-independent, unicode-aware
 conversion functions (basically, iliaa's original solution, but Unicode
 compatible) to be used for identifiers would make name resolution
 independent of the current locale. [...]


I believe all these functions would need to do is use tolower, rather than
tolower_l. So, perhaps the new functions should get the old names, and the
old functions should get _l appended to their names.


Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-04-26 Thread C.Koy

Hi,
As of 5.3.0 this bug does not exist for function names. Only classes and 
interfaces.


Could this be a clue for how to fix it for those as well?








Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-04-24 Thread Ferenc Kovacs
On Tue, Apr 24, 2012 at 1:06 AM, Galen Wright-Watson ww.ga...@gmail.com wrote:

 [...]

Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-04-24 Thread Ferenc Kovacs


 ps: you had a few extra ">" characters at the end of the first lines of
 your sentences. I experienced similar problems with Gmail; the solution
 for me was to always put an extra new line after the quoted text.


What I meant is the beginning of the first line, not the end.

-- 
Ferenc Kovács
@Tyr43l - http://tyrael.hu


Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-04-24 Thread Hartmut Holzgraefe
On 04/24/2012 01:06 AM, Galen Wright-Watson wrote:

 http://svn.php.net/viewvc?view=revision&revision=128060, same author) then
 changes zend_str_tolower to use tolower instead of its custom ASCII-based
 conversion. The commit message is: "make this faster and sexier". Within
 these revisions, zend_lookup_class is case sensitive. This change, in
 combination with 99001, masks the reason for the custom conversion.

Argh  STERLING!!!111

ok, part of the story seems to be that i can't find the regression test
tests/lang/035.phpt that i mentioned in bug #18556 anywhere. In the 5.x
code base this is a test for some Exception related stuff, and in the
latest 4.x branch the highest test number in tests/lang is 034.phpt

So it seems as if i somehow never really committed my test case and
so Sterling, not being aware of the Turkish history, unfixed things
during micro optimization without anything in place to warn him about
the regression he introduced :(

(AFAIR it was me back then who first stumbled across 'i' != tolower('I')
in tr_TR after noticing that most of our "image functions don't work
even though the gd extension is active" reports came from Turkey ...)

-- 
hartmut




[PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)

2012-04-23 Thread Galen Wright-Watson
On Mon, Apr 23, 2012 at 3:22 AM, C.Koy can5...@gmail.com wrote:

 On 4/22/2012 11:32 PM, Galen Wright-Watson wrote:

 2012/4/22 C.Koycan5...@gmail.com

  On 4/21/2012 4:37 AM, Galen Wright-Watson wrote:


  But, I did not start this thread to discuss such bug fix, because:

 1. It does not take a genius to figure it out, and should take minutes to
 implement for someone experienced in the internals. Given the 10 year
 span
 and dozens of comments/complaints on the bug's entry, it's hard to say
 this
 issue went unnoticed. So I had to conclude that such fix has quietly been
 overruled for performance and/or other undisclosed reasons.


 Why does it matter if a solution is simple?


 It doesn't matter, you've misunderstood.


You've misunderstood me. While you may have set out with the goal of
discussing making PHP completely case-sensitive, that doesn't preclude
others from suggesting fixes for the specific bug you mention. Indeed, some
of the first e-mails were around the bug, and not just in the context of
case-sensitive PHP.

I didn't introduce the custom case conversion solution as a
counter-argument to case-sensitive PHP, and I wasn't asking for feedback on
that solution in the context of case-sensitive PHP; I was asking for
reasons why it wouldn't be a suitable solution for the bug. The only place
case-sensitive PHP enters into it was your statement that:

As the recent comments on that page indicate, there's not a deterministic
 way to resolve this issue, apart from eliminating tolower() calls for
 function/class names during lookup. Hence totally case-sensitive PHP.


My proposition shows this isn't entirely true, and branches off from the
original discussion at that point. I'm focusing on fixing the bug, which is
a smaller issue than case-sensitivity. Discussion of case-sensitivity can
continue without regard to the custom conversion solution. As such, I've
changed the subject of this e-mail.

Furthermore, going back to your original e-mail, you explicitly stated it
was about the bug, making case sensitivity subordinate to it.

This post is about bug #18556
(https://bugs.php.net/bug.php?id=18556)
 which is a decade old.


I hope you can see why others might take the bug to be the context for
case-sensitivity, rather than the other way around.

And that's what makes me curious and confused about why this bug still
 exists. See, I'm drawing a conclusion with what little information I have,
 and stating the reasonings it's based on (first two statements).
 Overall, that and the item following it were an explanation of why I'm
 suggesting a major feature change in solution to a specific bug, although
 noone directly asked me to.

 In other words, you jumped to a conclusion. I wasn't asking about possible
reasons why custom conversion hasn't been accepted as the solution to this
bug. Neither was I asking why you didn't suggest it. I was (and still am)
asking for explicit, justifiable reasons as to whether or not it's a
suitable solution to the bug.



 If it's already been rejected privately, it's time to bring the reasons
 into the open (which is why I asked). If not, it should be considered
 publicly.


 A comment dated 2002-09-26 on bug's page states the bug is fixed. The next
 comment dated 2006-02-17 states it reappeared.
 I don't know who did what 10, 6 years ago but it's been revoked. Why?
 That was the main reason I deemed this bug not fixable, hence suggest
 other ways to resolve.

 I don't know either, but I'm not about to disregard potential fixes if
they haven't been publicly discussed. The regression could just as easily
have been a mistake. From looking at the original fix (revision 97040,
http://svn.php.net/viewvc?view=revision&revision=97040, authored by iliaa)
and the bug comments, something along the lines of what I'm suggesting has
been suggested and even implemented before, but there's no real discussion
of it. The original fix (zend_str_tolower_nlc) assumed ASCII, which isn't
entirely suitable as there are uppercase characters that it doesn't
convert, which suggests yet another reason for the regression, namely that
using zend_str_tolower would convert the characters that
zend_str_tolower_nlc missed.
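
To show what an ASCII-only, locale-independent conversion of this kind
looks like, here is a minimal sketch. The name ascii_str_tolower is
hypothetical, and this is my own illustration rather than the code from
revision 97040. It also makes the limitation visible: bytes outside A-Z,
including every non-ASCII uppercase letter, pass through unchanged.

```c
#include <stddef.h>

/* Hypothetical helper: lowercase only the ASCII letters A-Z in place.
 * Every other byte is left as is, and the current locale is never
 * consulted, so the result is the same under tr_TR as under C. */
void ascii_str_tolower(char *str, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (str[i] >= 'A' && str[i] <= 'Z') {
            str[i] = (char)(str[i] - 'A' + 'a');
        }
    }
}
```

With this, "IJK" always becomes "ijk", whatever setlocale() call the script
has made, while a UTF-8 encoded capital dotted I is left untouched.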

As for the real reason why the bug reappeared, we can continue on in our
historical examination. Revision 99001 (
http://svn.php.net/viewvc?view=revision&revision=99001, also authored
by iliaa) replaced zend_str_tolower with zend_str_tolower_nlc, making all
internal Zend case conversion use ASCII. iliaa had this to say about the
change (http://news.php.net/php.zend-engine.cvs/478):

It appears that there no reason to keep both zend_str_tolower_nlc and
 zend_str_tolower.  zend_str_tolower_nlc can be safely renamed to
 zend_str_tolower. The places it is used in, do not appear to depend on
 locale.  For people who do need it there is an alternative php function
 php_strtolower, which they can use, which does respect the locale. So, if
 there are no