> On 19/06/07, Tomas Kuliavas <[EMAIL PROTECTED]> wrote:
>> I don't care about Unicode support, because it breaks things. I suspect
>> that PHP6 Unicode extension won't give me controls that I have in PHP5
>> and PHP4 strings. PHP6 Unicode support is not designed for international
>> environment. It is designed for nationalized environments and allows PHP
>> script developers to code in their native language. Code written in
>> French, Russian, Arabic, Japanese or Chinese is not international. Only
>> some people can read it. Only some people can see difference between ァ()
>> and ィ(). If I have to debug code written in Japanese or Arabic, language
>> is the main barrier in understanding the code.
>
> You're spreading FUD about PHP6s unicode support. Writing code in your
> own native language has nothing to do with the unicode support in
> PHP6. You can already do that in PHP4 if you use utf-8, since any
> sequence of codepoints > 127 translates to a byte-sequence that is a
> valid identifier for php.
> Have a look at http://www.icu-project.org/ to see what the unicode
> features in php6 are really about.
>
> Regards,
> Stefan (who also thinks that the switch is of no use,
> unicode_semantics should be on all the time. And at least it shouldn't
> be off in php.ini-dist and php.ini-recommended)
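
The quoted claim that any sequence of codepoints > 127 becomes a valid PHP identifier once encoded as UTF-8 can be checked at the byte level. The following is only a sketch in Python, not PHP's parser: it assumes the label pattern given in the PHP manual, [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*, applied to raw bytes. The helper names is_php_label and utf8_bytes_are_high are hypothetical and exist only for this illustration.

```python
import re

# Assumption: the byte-level label pattern from the PHP manual.
# This is a stand-in for illustration, not PHP's actual lexer.
PHP_LABEL = re.compile(rb"[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*")


def is_php_label(name: str) -> bool:
    """True if the UTF-8 bytes of `name` fully match the label pattern."""
    return PHP_LABEL.fullmatch(name.encode("utf-8")) is not None


def utf8_bytes_are_high(name: str) -> bool:
    """True if every UTF-8 byte of `name` is >= 0x80 (no ASCII bytes)."""
    return all(b >= 0x80 for b in name.encode("utf-8"))


if __name__ == "__main__":
    # The two katakana names from the thread, plus ASCII and invalid cases.
    for candidate in ["ァ", "ィ", "strlen", "1abc", "a-b"]:
        print(repr(candidate), is_php_label(candidate))
```

UTF-8 encodes every codepoint above 127 using only bytes in the range 0x80-0xFF, which is why such names fall inside the pattern's \x7f-\xff class. The same check also shows the other side of the argument: ァ and ィ are both accepted, and differ only in their final byte.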

And you are trying to make sure that the new features you like are turned on by default, even when they break things for others. Some people are proposing changes that will enforce your preferred options without leaving any options to others. If I don't complain, PHP developers might do what you are proposing, just as they switched unicode_semantics to PHP_INI_SYSTEM. I can't evaluate the performance reasons without numbers that prove them, but I can see when somebody is trying to break things in my code and leaves me without any good options for fixing it.

It is possible that I am not correct. Writing functions in a native language is one of the key points in Andrei's presentation: http://www.gravitonic.com/talks/, "Unicoding With PHP 6", php|tek 2007 Chicago, slides 49 and 50. The other key points are about evaluating string length (slide 25) and offsets (slide 47), collation (slides 60 and 61) and strtoupper/strtolower/strcasecmp (slide 66). String length and offsets can be implemented with the PHP5 mbstring extension. If I use PHP5 strtolower/strtoupper/strcasecmp, I must assume that they are locale aware. These functions don't follow LC_CTYPE=C rules when the locale is not C.

It is possible that I am not correct and that I will be able to update the code and make it work in PHP6, but in order to do that I will have to use language constructs that are not backwards compatible with older PHP versions (slides 26 and 30). I won't be able to mix PHP6 code with PHP5 code and will have to maintain two different library versions for lots of string functions and lots of stream operations. I will have to spend my time making sure that the code works, when I am not the one who broke it. I have other bugs to fix, and they have higher priority than fixing code broken by others.

I have already tested the code in PHP6 with unicode_semantics=on. Things broke on password encryption, and the fix was to do something with binary typecasting.
I need more than the information currently available in http://www.php.net/manual in order to fix it. Then fputs calls freak out with notices about downcoded buffers, and I can't leave those notices unfixed because of the error_reporting = 2047 + display_errors = on coding requirements. And I still haven't reached the point where the code does 8bit string decoding.

We are working on different code. You have code with some specific character set and you can control all strings. My code works with different character sets and different sources of 8bit data, and I don't control those 8bit strings. My experience with PHP4/5 shows that I can work with 8bit strings better than the PHP interpreter. The interpreter wins only when I have to work with large mapping tables, and even then it is not stable (iconv), not enabled by default (recode), limited (mbstring) or very limited (utf8_decode).

Some day I will take some standalone library and try to make it work in PHP6 with unicode_semantics=on. Maybe then I'll stop spreading the FUD. But for now, don't expect me to remain silent if you propose changes that break PHP5 - PHP6 backwards compatibility. I know that either unicode_semantics=on breaks things in drastic ways, or my experience is based only on unstable PHP6 development code and RC versions will be better.

-- 
Tomas

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php