Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On 5/5/2012 12:22 AM, Galen Wright-Watson wrote:

That also ran without error for me. I'm not sure how to account for the different behavior. Here are the details of the system that I'm using:

$ uname -a
Linux n10 3.2.6mtv10 #1 SMP Wed Mar 14 06:22:06 PDT 2012 x86_64 GNU/Linux
$ php -v
PHP 5.2.17 with Suhosin-Patch 0.9.7 (cli) (built: May 3 2012 12:16:32)
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2010 Zend Technologies
    with Zend Optimizer v3.3.9, Copyright (c) 1998-2009, by Zend Technologies
    with Suhosin v0.9.32.1, Copyright (c) 2007-2010, by SektionEins GmbH

I've been experimenting with bare-bones PHP I've built from pristine sources so far. Don't you think you should do the same, in dealing with such a bug? Here's the top portion of my 'php -i' output:

~/proj$ php-5.2.17/sapi/cli/php -i|head -28
phpinfo()
PHP Version => 5.2.17

System => Linux trvuntu 2.6.32-41-generic #88-Ubuntu SMP Thu Mar 29 13:08:43 UTC 2012 i686
Build Date => May 4 2012 20:03:30
Configure Command => './configure' '--disable-all' '--enable-cli' '--enable-vld'
Server API => Command Line Interface
Virtual Directory Support => disabled
Configuration File (php.ini) Path => /usr/local/lib
Loaded Configuration File => (none)
Scan this dir for additional .ini files => (none)
additional .ini files parsed => (none)
PHP API => 20041225
PHP Extension => 20060613
Zend Extension => 220060519
Debug Build => no
Thread Safety => disabled
Zend Memory Manager => enabled
IPv6 Support => enabled
Registered PHP Streams => php, file, data, http, ftp
Registered Stream Socket Transports => tcp, udp, unix, udg
Registered Stream Filters => string.rot13, string.toupper, string.tolower, string.strip_tags, convert.*, consumed

This program makes use of the Zend Scripting Language Engine:
Zend Engine v2.2.0, Copyright (c) 1998-2010 Zend Technologies

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On 05/04/2012 11:22 PM, Galen Wright-Watson wrote: On Fri, May 4, 2012 at 7:01 AM, C.Koycan5...@gmail.com wrote: On 5/2/2012 10:03 PM, Galen Wright-Watson wrote: On Wed, May 2, 2012 at 5:23 AM, C.Koycan5...@gmail.com wrote: On 5/1/2012 9:11 PM, Galen Wright-Watson wrote: On Thu, Apr 26, 2012 at 3:45 AM, C.Koycan5...@gmail.comwrote: As of 5.3.0 this bug does not exist for function names. Only classes and interfaces. Turns out, if you cause a function to be called dynamically by (e.g.) using a variable function, the bug will surface. ?php setlocale(LC_CTYPE, 'tr_TR'); function IJK() {} # succeeds IJK(); If literal function call precedes the function definition, that would fail too in 5.2.17, but not in 5.3.0. What has changed in this regard 5.2-5.3 ? Do you mean something like the following? ?php setlocale(LC_CTYPE, 'tr_TR'); IJK(); setlocale(LC_CTYPE, 'en_US'); function IJK() {echo __FUNCTION__, \n;} I couldn't get it to generate an error under PHP 5.2.17. What am I missing? Try this with 5.2.17: ?php setlocale(LC_CTYPE, 'tr_TR'); IJK(); function IJK() {} That also ran without error for me. I'm not sure how to account for the different behavior. Here are the details of the system that I'm using: $ uname -a Linux n10 3.2.6mtv10 #1 SMP Wed Mar 14 06:22:06 PDT 2012 x86_64 GNU/Linux $ php -v PHP 5.2.17 with Suhosin-Patch 0.9.7 (cli) (built: May 3 2012 12:16:32) Copyright (c) 1997-2009 The PHP Group Zend Engine v2.2.0, Copyright (c) 1998-2010 Zend Technologies with Zend Optimizer v3.3.9, Copyright (c) 1998-2009, by Zend Technologies with Suhosin v0.9.32.1, Copyright (c) 2007-2010, by SektionEins GmbH Try to var_dump the setLocale and see if it return the specified locale or just 'false'. If false try the following: setlocale(LC_ALL, 'tr_TR.UTF-8'); I had the same issue. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On 5/5/2012 7:01 PM, Wim Wisselink wrote:

Try to var_dump the setlocale() result and see if it returns the specified locale or just 'false'.

I thought he was way past that check. Anyway, a simple test should suffice:

setlocale(LC_CTYPE, 'tr_TR') or exit("setlocale failed\n");
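PHP's setlocale() sits on top of the C library call of the same name, so the same sanity check can be sketched at the C level. This is only an illustration, not code from php-src; the helper name try_ctype_locale is made up for the example. The C setlocale() returns NULL when the requested locale cannot be installed.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the requested LC_CTYPE locale could be
 * installed, 0 otherwise. The C library's setlocale() returns NULL when
 * the locale is not available on the system. */
int try_ctype_locale(const char *name)
{
    return setlocale(LC_CTYPE, name) != NULL;
}
```

Calling try_ctype_locale("tr_TR") before running any of the test scripts would tell you whether a silent setlocale failure is hiding the bug. Whether an unknown name is rejected depends on the C library in use.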
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On Sat, May 5, 2012 at 5:31 AM, C.Koy can5...@gmail.com wrote:

I've been experimenting with bare-bones PHP I've built from pristine sources so far. Don't you think you should do the same, in dealing with such a bug?

My personal system is a BSD derivative; the Turkish locales on these use Latin rather than Turkish case conversion (and installing a proper Turkish locale is a mess), so I've been testing on another system. I've been hesitant to use its resources too heavily for professional reasons. Running a small PHP script is one thing; though the time and space required for a PHP build aren't large on modern systems, I can't justify doing so since it's not directly related to site operations.

On Sat, May 5, 2012 at 8:59 AM, Wim Wisselink w...@powerassist.nl wrote:

Try to var_dump the setLocale and see if it return the specified locale or just 'false'. If false try the following: setlocale(LC_ALL, 'tr_TR.UTF-8');

I had previously tested the locale by using strtolower('I'), as it tests both that the locale exists and that it uses Turkish-language case conversion. The systems where I tested C.Koy's script passed the strtolower test.

It turned out to be the Zend Optimizer that prevented the error. With it not loaded, the example script failed with a "Fatal error: Call to undefined function IJK()" error message. Here's a breakdown:

In both PHP 5.2 and 5.3, calling a function before defining it results in a dynamic call (INIT_FCALL_BY_NAME+DO_FCALL_BY_NAME).
Here's the PHP 5.2 dump of C.Koy's example:

    line     #  *  op                    fetch  ext  return  operands
    -----------------------------------------------------------------
       2     0     FETCH_CONSTANT                    ~0      'LC_CTYPE'
             1     SEND_VAL                                  ~0
             2     SEND_VAL                                  'tr_TR'
             3     DO_FCALL                     2            'setlocale'
       3     4     INIT_FCALL_BY_NAME                        'IJK'
             5     DO_FCALL_BY_NAME             0
       4     6     NOP
       5     7     RETURN                                    1
             8*    ZEND_HANDLE_EXCEPTION

Here's the 5.3 dump:

    line     #  *  op                    fetch  ext  return  operands
    -----------------------------------------------------------------
       2     0     EXT_STMT
             1     EXT_FCALL_BEGIN
             2     SEND_VAL                                  2
             3     SEND_VAL                                  'tr_TR'
             4     DO_FCALL                     2            'setlocale'
             5     EXT_FCALL_END
       3     6     EXT_STMT
             7     INIT_FCALL_BY_NAME                        'ijk', 'IJK'
             8     EXT_FCALL_BEGIN
             9     DO_FCALL_BY_NAME             0
            10     EXT_FCALL_END
       4    11     EXT_STMT
            12     NOP
       5    13     RETURN                                    1

From op #7 in the 5.3 dump, we see 5.3 converts the function name to lowercase during compilation, but 5.2 doesn't. Examining the source confirms this: you can see the lowercase conversion in 5.3's zend_do_begin_dynamic_function_call on lines 1659 (for namespaced calls) and 1683 (for non-namespaced calls) of zend_compile.c ( http://svn.php.net/viewvc/php/php-src/branches/PHP_5_3_10/Zend/zend_compile.c?revision=323023&view=markup#l1683 ), while there's no such conversion in the same function in 5.2 ( http://svn.php.net/viewvc/php/php-src/branches/PHP_5_2/Zend/zend_compile.c?view=markup&pathrev=302150#l1450 ).

5.3 only performs case conversion if the function name is a CONST expression, which is why defining the function after calling it works but calling a function with a variable name breaks. Correspondingly, the handler for CONST names, ZEND_INIT_FCALL_BY_NAME_SPEC_CONST_HANDLER (in zend_vm_execute.h), uses the first operand (which is already lowercased), while the other INIT_FCALL_BY_NAME opcode handlers (ZEND_INIT_FCALL_BY_NAME_SPEC_*_HANDLER) use the second, non-lowercased operand. The 5.2 INIT_FCALL_BY_NAME opcode handlers only ever use the second, un-lowercased operand.

So, what does this mean for fixing the bug?
Not so much when the function or class is stored in a variable, since these can't be converted to lowercase at compile time without converting all variables, which is too wasteful of both time and space (as both the unconverted and converted strings would need to be stored). For object instantiation, zend_do_begin_new_object gets the class name ultimately from the namespace_name rule. zend_do_begin_new_object could then take the resulting znode and create a second, lowercased copy, storing it as the second operand. ZEND_NEW_SPEC_HANDLER would then be altered to use the second operand (if not UNUSED) to instantiate the object. This certainly seems a valid alternative to a lowercasing version of the namespace_name rule; it's not as far reaching, which may be good (in that it has less impact) and bad (in that there may be
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On 5/2/2012 10:03 PM, Galen Wright-Watson wrote:
On Wed, May 2, 2012 at 5:23 AM, C.Koy can5...@gmail.com wrote:
On 5/1/2012 9:11 PM, Galen Wright-Watson wrote:
On Thu, Apr 26, 2012 at 3:45 AM, C.Koy can5...@gmail.com wrote:

As of 5.3.0 this bug does not exist for function names. Only classes and interfaces.

Turns out, if you cause a function to be called dynamically by (e.g.) using a variable function, the bug will surface.

<?php
setlocale(LC_CTYPE, 'tr_TR');
function IJK() {}
# succeeds
IJK();

If literal function call precedes the function definition, that would fail too in 5.2.17, but not in 5.3.0. What has changed in this regard 5.2->5.3?

Do you mean something like the following?

<?php
setlocale(LC_CTYPE, 'tr_TR');
IJK();
setlocale(LC_CTYPE, 'en_US');
function IJK() {echo __FUNCTION__, "\n";}

I couldn't get it to generate an error under PHP 5.2.17. What am I missing?

Try this with 5.2.17:

<?php
setlocale(LC_CTYPE, 'tr_TR');
IJK();
function IJK() {}
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On Fri, May 4, 2012 at 7:01 AM, C.Koy can5...@gmail.com wrote: On 5/2/2012 10:03 PM, Galen Wright-Watson wrote: On Wed, May 2, 2012 at 5:23 AM, C.Koycan5...@gmail.com wrote: On 5/1/2012 9:11 PM, Galen Wright-Watson wrote: On Thu, Apr 26, 2012 at 3:45 AM, C.Koycan5...@gmail.com wrote: As of 5.3.0 this bug does not exist for function names. Only classes and interfaces. Turns out, if you cause a function to be called dynamically by (e.g.) using a variable function, the bug will surface. ?php setlocale(LC_CTYPE, 'tr_TR'); function IJK() {} # succeeds IJK(); If literal function call precedes the function definition, that would fail too in 5.2.17, but not in 5.3.0. What has changed in this regard 5.2-5.3 ? Do you mean something like the following? ?php setlocale(LC_CTYPE, 'tr_TR'); IJK(); setlocale(LC_CTYPE, 'en_US'); function IJK() {echo __FUNCTION__, \n;} I couldn't get it to generate an error under PHP 5.2.17. What am I missing? Try this with 5.2.17: ?php setlocale(LC_CTYPE, 'tr_TR'); IJK(); function IJK() {} That also ran without error for me. I'm not sure how to account for the different behavior. Here are the details of the system that I'm using: $ uname -a Linux n10 3.2.6mtv10 #1 SMP Wed Mar 14 06:22:06 PDT 2012 x86_64 GNU/Linux $ php -v PHP 5.2.17 with Suhosin-Patch 0.9.7 (cli) (built: May 3 2012 12:16:32) Copyright (c) 1997-2009 The PHP Group Zend Engine v2.2.0, Copyright (c) 1998-2010 Zend Technologies with Zend Optimizer v3.3.9, Copyright (c) 1998-2009, by Zend Technologies with Suhosin v0.9.32.1, Copyright (c) 2007-2010, by SektionEins GmbH
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On 5/1/2012 9:11 PM, Galen Wright-Watson wrote:
On Thu, Apr 26, 2012 at 3:45 AM, C.Koy can5...@gmail.com wrote:

As of 5.3.0 this bug does not exist for function names. Only classes and interfaces.

Turns out, if you cause a function to be called dynamically by (e.g.) using a variable function, the bug will surface.

<?php
setlocale(LC_CTYPE, 'tr_TR');
function IJK() {}
# succeeds
IJK();

If literal function call precedes the function definition, that would fail too in 5.2.17, but not in 5.3.0. What has changed in this regard 5.2->5.3?

$f = 'IJK';
# causes Fatal error: Call to undefined function IJK()
$f();

In contrast, if you set the locale for LC_CTYPE on the command line, the bug doesn't arise at all because the compilation and execution phases both use the same locale.

So, the bug also arises if a script started in 'tr_TR' env locale sets its locale to 'en_US' at runtime.

[...] I like the idea of using the system default locale for name conversion (making name resolution independent of the current locale), but am

As I stated above, the locale the script was started in may not always be 'en_US' or 'C'. (assuming that's what you mean by system default locale)

By the way, I noticed a setlocale(LC_CTYPE, "") call in php_module_startup()/main.c, but can't figure if it has any relevance to this bug.

regards,
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On Wed, May 2, 2012 at 5:23 AM, C.Koy can5...@gmail.com wrote:
On 5/1/2012 9:11 PM, Galen Wright-Watson wrote:
On Thu, Apr 26, 2012 at 3:45 AM, C.Koy can5...@gmail.com wrote:

As of 5.3.0 this bug does not exist for function names. Only classes and interfaces.

Turns out, if you cause a function to be called dynamically by (e.g.) using a variable function, the bug will surface.

<?php
setlocale(LC_CTYPE, 'tr_TR');
function IJK() {}
# succeeds
IJK();

If literal function call precedes the function definition, that would fail too in 5.2.17, but not in 5.3.0. What has changed in this regard 5.2->5.3?

Do you mean something like the following?

<?php
setlocale(LC_CTYPE, 'tr_TR');
IJK();
setlocale(LC_CTYPE, 'en_US');
function IJK() {echo __FUNCTION__, "\n";}

I couldn't get it to generate an error under PHP 5.2.17. What am I missing?

In contrast, if you set the locale for LC_CTYPE on the command line, the bug doesn't arise at all because the compilation and execution phases both use the same locale.

So, the bug also arises if a script started in 'tr_TR' env locale sets its locale to 'en_US' at runtime.

Yup.

$ LC_CTYPE=tr_TR php
<?php
setlocale(LC_CTYPE, 'en_US');
class I {}
$i = new I;
^D
Fatal error: Class 'I' not found in - on line 4
Call Stack:
    0.3740 630760 1. {main}() -:0

I should say that the Vulcan Logic Disassembler has been very helpful to me in exploring this bug. Thank you, Derick Rethans and the rest of the VLD team. If you haven't tried it, check it out.

[...] I like the idea of using the system default locale for name conversion (making name resolution independent of the current locale), but am

As I stated above, the locale the script was started in may not always be 'en_US' or 'C'. (assuming that's what you mean by system default locale)

That's indeed what I meant; basically, the locales specified in the LC_CTYPE, etc. environment variables.
It shouldn't matter that the default locale isn't en_US or C, as long as PHP always uses the same locale for identifiers both during compilation and at run-time. Of course, it also makes a certain amount of sense to explicitly decide that PHP will use a specific locale for identifiers. I avoided suggesting that route to avoid any issues about what locales will be universally available.

By the way, I noticed a setlocale(LC_CTYPE, "") call in php_module_startup()/main.c, but can't figure if it has any relevance to this bug.

That would set the locale to whatever the platform uses natively. Without the call, the locale would be POSIX/C, according to the POSIX doc ( http://pubs.opengroup.org/onlinepubs/009604499/functions/setlocale.html ). It doesn't seem terribly relevant to bug 18556, since all that matters regarding the initial locale is that its lowercase conversion is different from that of the locale used at run-time. If I had to guess why the locale is set to the platform native, it's so that numeric, currency and date formatting will be consistent with the rest of the system.
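The compile-time versus run-time mismatch described in this message can be sketched as a small simulation in C. None of this is Zend Engine code: the function names are made up for the illustration, and the Turkish rule is simulated under the assumption of a single-byte ISO-8859-9 encoding (where dotless i is byte 0xFD), so that the example does not depend on tr_TR being installed.

```c
#include <stddef.h>
#include <string.h>

/* Per-character lowercasing rule, standing in for tolower() under one locale. */
typedef unsigned char (*lower_fn)(unsigned char);

/* ASCII-only rule: what the "C" and "en_US" locales do for A-Z. */
unsigned char lower_ascii(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Simulated Turkish rule (assumption: ISO-8859-9, dotless i = 0xFD).
 * Only 'I' is treated differently from the ASCII rule. */
unsigned char lower_turkish_sim(unsigned char c)
{
    return (c == 'I') ? (unsigned char)0xFD : lower_ascii(c);
}

/* Build the lookup key for an identifier under the given rule. */
void make_key(const char *name, lower_fn rule, char *out, size_t cap)
{
    size_t i = 0;
    if (cap == 0)
        return;
    for (; name[i] != '\0' && i + 1 < cap; i++)
        out[i] = (char)rule((unsigned char)name[i]);
    out[i] = '\0';
}

/* Returns 1 if a name registered under declare_rule is found again when
 * looked up under call_rule, 0 if the two keys differ (the "Call to
 * undefined function" / "Class not found" case). */
int lookup_succeeds(const char *name, lower_fn declare_rule, lower_fn call_rule)
{
    char declared[64], called[64];
    make_key(name, declare_rule, declared, sizeof declared);
    make_key(name, call_rule, called, sizeof called);
    return strcmp(declared, called) == 0;
}
```

With the same rule on both sides, lookup_succeeds("IJK", ...) returns 1 whichever rule is used; mixing lower_ascii and lower_turkish_sim returns 0 for any name containing 'I', which is the shape of the failure in the examples above.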
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On Thu, Apr 26, 2012 at 3:45 AM, C.Koy can5...@gmail.com wrote:

As of 5.3.0 this bug does not exist for function names. Only classes and interfaces.

Turns out, if you cause a function to be called dynamically by (e.g.) using a variable function, the bug will surface.

<?php
setlocale(LC_CTYPE, 'tr_TR');
function IJK() {}
# succeeds
IJK();
$f = 'IJK';
# causes Fatal error: Call to undefined function IJK()
$f();

In contrast, if you set the locale for LC_CTYPE on the command line, the bug doesn't arise at all because the compilation and execution phases both use the same locale.

Could this be a clue for how to fix it for those as well?

Function names are generally resolved at compile time (dynamic function names are resolved at run time, which is why the bug surfaces for them), before the call to setlocale in the script has been executed. Class name resolution is put off until execution time for autoloading and possibly other purposes. Converting class names to lowercase at compile time may work.

A quick glance at the source shows that class_name, fully_qualified_class_name and class_name_reference all depend on namespace_name, which is the rule that is responsible for the parsing of the class name.

namespace_name:
        T_STRING { $$ = $1; }
    |   namespace_name T_NS_SEPARATOR T_STRING { zend_do_build_namespace_name(&$$, &$1, &$3 TSRMLS_CC); }
;

However, static_scalar is also dependent on namespace_name, and I don't believe that symbol should be made case-insensitive. Creating an additional symbol for case-independency would allow a more targeted approach. The various class symbols would then rely on this new symbol, rather than namespace_name.

lc_namespace_name:
        T_STRING { zend_str_tolower($1); $$ = $1; }
    |   lc_namespace_name T_NS_SEPARATOR T_STRING { zend_str_tolower($3); zend_do_build_namespace_name(&$$, &$1, &$3 TSRMLS_CC); }
;

Converting class names to lower case early may have additional consequences.
It may affect class names in error messages, for example (I didn't dig deep enough to determine this). __CLASS__ should be unaffected (when defining a class, the class name is parsed as a T_STRING; the value for __CLASS__ comes from this symbol). It also won't resolve the bug for dynamic names.

I suspect that altering variable_class_name and dynamic_class_name_reference in a manner described previously (use a custom lowercase conversion or temporarily switch locale) to convert the name would resolve the bug in the dynamic case for class names. Changing a number of the production rules for function_call in a similar manner should resolve the bug for dynamic function calls. Again, there will likely be unintended consequences. Alternatively, updating zend_do_begin_dynamic_function_call() and zend_do_fetch_class() to use custom conversion should resolve the bug in the dynamic case.

I like the idea of using the system default locale for name conversion (making name resolution independent of the current locale), but am concerned that it will make name lookup slow. Instead, a second set of locale-independent, Unicode-aware conversion functions (basically, iliaa's original solution, but Unicode compatible) to be used for identifiers would make name resolution independent of the current locale. Any time an identifier needs to be converted, it would use one of these functions. As a run-time optimization, non-dynamic class names could use the system locale conversion, but that would be a separate thing from resolving this bug.
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On Tue, May 1, 2012 at 11:11 AM, Galen Wright-Watson ww.ga...@gmail.com wrote:

[...] Instead, a second set of locale-independent, unicode-aware conversion functions (basically, iliaa's original solution, but Unicode compatible) to be used for identifiers would make name resolution independent of the current locale. [...]

I believe all these functions would need to do is use tolower, rather than tolower_l. So, perhaps the new functions should get the old names, and the old functions should get _l appended to their names.
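For reference, here is a minimal sketch of what a locale-independent conversion for identifiers could look like if it were restricted to ASCII, in the spirit of the ASCII-based zend_str_tolower_nlc mentioned elsewhere in this thread. It is not the php-src implementation; the name ascii_str_tolower is made up for the illustration, and the sketch is deliberately not Unicode-aware (bytes outside A-Z are left untouched).

```c
#include <stddef.h>

/* Hypothetical helper: lowercases the ASCII letters A-Z in place and
 * leaves every other byte alone. It never consults the current locale,
 * so its result is the same before and after any setlocale() call. */
void ascii_str_tolower(char *str, size_t length)
{
    size_t i;
    for (i = 0; i < length; i++) {
        unsigned char c = (unsigned char)str[i];
        if (c >= 'A' && c <= 'Z')
            str[i] = (char)(c + ('a' - 'A'));
    }
}
```

Because the mapping is fixed, a name converted at compile time and the same name converted at run time always produce the same key, whatever LC_CTYPE is set to in between. The trade-off raised in the thread still applies: uppercase characters outside ASCII are not converted.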
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
Hi,

As of 5.3.0 this bug does not exist for function names. Only classes and interfaces.

Could this be a clue for how to fix it for those as well?
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On Tue, Apr 24, 2012 at 1:06 AM, Galen Wright-Watson ww.ga...@gmail.comwrote: On Mon, Apr 23, 2012 at 3:22 AM, C.Koy can5...@gmail.com wrote: On 4/22/2012 11:32 PM, Galen Wright-Watson wrote: 2012/4/22 C.Koycan5...@gmail.com On 4/21/2012 4:37 AM, Galen Wright-Watson wrote: But, I did not start this thread to discuss such bug fix, because: 1. It does not take a genius to figure it out, and should take minutes to implement for someone experienced in the internals. Given the 10 year span and dozens of comments/complaints on the bug's entry, it's hard to say this issue went unnoticed. So I had to conclude that such fix has quietly been overruled for performance and/or other undisclosed reasons. Why does it matter if a solution is simple? It doesn't matter, you've misunderstood. You've misunderstood me. While you may have set out with the goal of discussing making PHP completely case-sensitive, that doesn't preclude others from suggesting fixes for the specific bug you mention. Indeed, some of the first e-mails were around the bug, and not just in the context of case-sensitive PHP. I didn't introduce the custom case conversion solution as a counter-argument to case-sensitive PHP, and I wasn't asking for feedback on that solution in the context of case-sensitive PHP; I was asking for reasons why it wouldn't be a suitable solution for the bug. The only place case-sensitive PHP enters into it was your statement that: As the recent comments on that page indicate, there's not a deterministic way to resolve this issue, apart from eliminating tolower() calls for function/class names during lookup. Hence totally case-sensitive PHP. My proposition shows this is isn't entirely true, and branches off from the original discussion at that point. I'm focusing on fixing the bug, which is a smaller issue than case-sensitivity. Discussion of case-sensitivity can continue without regard to the custom conversion solution. As such, I've changed the subject of this e-mail. 
Furthermore, going back to your original e-mail, you explicitly stated it was about the bug, making case sensitivity subordinate to it. This post is about bug #18556 (https://bugs.php.net/bug.php?**id=18556 https://bugs.php.net/bug.php?id=18556) which is a decade old. I hope you can see why others might take the bug to be the context for case-sensitivity, rather than the other way around. And that's what makes me curious and confused about why this bug still exists. See, I'm drawing a conclusion with what little information I have, and stating the reasonings it's based on (first two statements). Overall, that and the item following it were an explanation of why I'm suggesting a major feature change in solution to a specific bug, although noone directly asked me to. In other words, you jumped to a conclusion. I wasn't asking about possible reasons why custom conversion hasn't been accepted as the solution to this bug. Neither was I asking why you didn't suggest it. I was (and still am) asking for explicit, justifiable reasons as to whether or not it's a suitable solution to the bug. If it's already been rejected privately, it's time to bring the reasons into the open (which is why I asked). If not, it should be considered publicly. A comment dated 2002-09-26 on bug's page states the bug is fixed. The next comment dated 2006-02-17 states it reappeared. I don't know who did what 10, 6 years ago but it's been revoked. Why? That was the main reason I deemed this bug not fixable, hence suggest other ways to resolve. I don't know either, but I'm not about to disregard potential fixes if they haven't been publicly discussed. The regression could just as easily have been a mistake. From looking at the original fix (revision 97040, http://svn.php.net/viewvc?view=revisionrevision=97040, authored by iliaa) and the bug comments, something along the lines of what I'm suggesting has been suggested and even implemented before, but there's no real discussion of it. 
The original fix (zend_str_tolower_nlc) assumed ASCII, which isn't entirely suitable as there are uppercase characters that it doesn't convert, which suggests yet another reason for the regression, namely that using zend_str_tolower would convert the characters that zend_str_tolower_nlc missed.

As for the real reason why the bug reappeared, we can continue on in our historical examination. Revision 99001 ( http://svn.php.net/viewvc?view=revision&revision=99001, also authored by iliaa) replaced zend_str_tolower with zend_str_tolower_nlc, making all internal Zend case conversion use ASCII. iliaa had this to say about the change (http://news.php.net/php.zend-engine.cvs/478):

It appears that there no reason to keep both zend_str_tolower_nlc and zend_str_tolower. zend_str_tolower_nlc can be safely renamed to zend_str_tolower. The places it is used in, do not
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
ps: you had a few extra at the end of the first lines of your sentences, I experienced similar problems with gmail, the solution for me was to always put an extra new line after the quoted text. what I meant is the beginning of the first line, not the end. -- Ferenc Kovács @Tyr43l - http://tyrael.hu
Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On 04/24/2012 01:06 AM, Galen Wright-Watson wrote:

http://svn.php.net/viewvc?view=revision&revision=128060, same author) then changes zend_str_tolower to use tolower instead of its custom ASCII-based conversion. The commit message is: make this faster and sexier. Within these revisions, zend_lookup_class is case sensitive. This change, in combination with 99001, mask the reason for the custom conversion.

Argh STERLING!!!111

ok, part of the story seems to be that I can't find the regression test tests/lang/035.phpt that I mentioned in bug #18556 anywhere. In the 5.x code base this is a test for some Exception related stuff, and in the latest 4.x branch the highest test number in tests/lang is 034.phpt.

So it seems as if I somehow never really committed my test case, and so Sterling, not being aware of the Turkish history, unfixed things during micro optimization without anything in place to warn him about the regression he introduced :(

(AFAIR it was me back then who first stumbled about 'i' != tolower('I') in tr_TR, after noticing that most of our "image functions don't work even though the gd extension is active" reports came from Turkey ...)

--
hartmut
[PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
On Mon, Apr 23, 2012 at 3:22 AM, C.Koy can5...@gmail.com wrote: On 4/22/2012 11:32 PM, Galen Wright-Watson wrote: 2012/4/22 C.Koycan5...@gmail.com On 4/21/2012 4:37 AM, Galen Wright-Watson wrote: But, I did not start this thread to discuss such bug fix, because: 1. It does not take a genius to figure it out, and should take minutes to implement for someone experienced in the internals. Given the 10 year span and dozens of comments/complaints on the bug's entry, it's hard to say this issue went unnoticed. So I had to conclude that such fix has quietly been overruled for performance and/or other undisclosed reasons. Why does it matter if a solution is simple? It doesn't matter, you've misunderstood. You've misunderstood me. While you may have set out with the goal of discussing making PHP completely case-sensitive, that doesn't preclude others from suggesting fixes for the specific bug you mention. Indeed, some of the first e-mails were around the bug, and not just in the context of case-sensitive PHP. I didn't introduce the custom case conversion solution as a counter-argument to case-sensitive PHP, and I wasn't asking for feedback on that solution in the context of case-sensitive PHP; I was asking for reasons why it wouldn't be a suitable solution for the bug. The only place case-sensitive PHP enters into it was your statement that: As the recent comments on that page indicate, there's not a deterministic way to resolve this issue, apart from eliminating tolower() calls for function/class names during lookup. Hence totally case-sensitive PHP. My proposition shows this is isn't entirely true, and branches off from the original discussion at that point. I'm focusing on fixing the bug, which is a smaller issue than case-sensitivity. Discussion of case-sensitivity can continue without regard to the custom conversion solution. As such, I've changed the subject of this e-mail. 
Furthermore, going back to your original e-mail, you explicitly stated it was about the bug, making case sensitivity subordinate to it. This post is about bug #18556 (https://bugs.php.net/bug.php?**id=18556https://bugs.php.net/bug.php?id=18556) which is a decade old. I hope you can see why others might take the bug to be the context for case-sensitivity, rather than the other way around. And that's what makes me curious and confused about why this bug still exists. See, I'm drawing a conclusion with what little information I have, and stating the reasonings it's based on (first two statements). Overall, that and the item following it were an explanation of why I'm suggesting a major feature change in solution to a specific bug, although noone directly asked me to. In other words, you jumped to a conclusion. I wasn't asking about possible reasons why custom conversion hasn't been accepted as the solution to this bug. Neither was I asking why you didn't suggest it. I was (and still am) asking for explicit, justifiable reasons as to whether or not it's a suitable solution to the bug. If it's already been rejected privately, it's time to bring the reasons into the open (which is why I asked). If not, it should be considered publicly. A comment dated 2002-09-26 on bug's page states the bug is fixed. The next comment dated 2006-02-17 states it reappeared. I don't know who did what 10, 6 years ago but it's been revoked. Why? That was the main reason I deemed this bug not fixable, hence suggest other ways to resolve. I don't know either, but I'm not about to disregard potential fixes if they haven't been publicly discussed. The regression could just as easily have been a mistake. From looking at the original fix (revision 97040, http://svn.php.net/viewvc?view=revisionrevision=97040, authored by iliaa) and the bug comments, something along the lines of what I'm suggesting has been suggested and even implemented before, but there's no real discussion of it. 
The original fix (zend_str_tolower_nlc) assumed ASCII, which isn't entirely suitable as there are uppercase characters that it doesn't convert, which suggests yet another reason for the regression, namely that using zend_str_tolower would convert the characters that zend_str_tolower_nlc missed.

As for the real reason why the bug reappeared, we can continue on in our historical examination. Revision 99001 ( http://svn.php.net/viewvc?view=revision&revision=99001, also authored by iliaa) replaced zend_str_tolower with zend_str_tolower_nlc, making all internal Zend case conversion use ASCII. iliaa had this to say about the change (http://news.php.net/php.zend-engine.cvs/478):

It appears that there no reason to keep both zend_str_tolower_nlc and zend_str_tolower. zend_str_tolower_nlc can be safely renamed to zend_str_tolower. The places it is used in, do not appear to depend on locale. For people who do need it there is an alternative php function php_strtolower, which they can use, which does respect the locale. So, if there are no