> On 19/06/07, Tomas Kuliavas <[EMAIL PROTECTED]> wrote:
>> I don't care about Unicode support, because it breaks things. I suspect
>> that PHP6 Unicode extension won't give me controls that I have in PHP5
>> and PHP4 strings. PHP6 Unicode support is not designed for international
>> environment. It is designed for nationalized environments and allows PHP
>> script developers to code in their native language. Code written in
>> French, Russian, Arabic, Japanese or Chinese is not international. Only
>> some people can read it. Only some people can see difference between ァ()
>> and ィ(). If I have to debug code written in Japanese or Arabic, language
>> is the main barrier in understanding the code.
>
> You're spreading FUD about PHP6s unicode support. Writing code in your
> own native language has nothing to do with the unicode support in
> PHP6. You can already do that in PHP4 if you use utf-8, since any
> sequence of codepoints > 127 translates to a byte-sequence that is a
> valid identifier for php.
> Have a look at http://www.icu-project.org/ to see what the unicode
> features in php6 are really about.
>
> Regards,
> Stefan (who also thinks that the switch is of no use,
> unicode_semantics should be on all the time. And at least it shouldn't
> be off in php.ini-dist and php.ini-recommended)

And you are trying to make sure that new features you like are turned on
by default even when they break things for others. Some people are
proposing changes that would enforce your preferred options without
leaving any alternative to others. If I don't complain, PHP developers
might do what you are proposing, just as they switched unicode_semantics
to PHP_INI_SYSTEM. I can't accept the performance argument without numbers
that prove it, but I can see when somebody is trying to break things in
my code and leaves me without any good options for fixing it.

It is possible that I am not correct, but writing functions in a native
language is one of the key points in Andrei's presentation:
http://www.gravitonic.com/talks/, "Unicoding With PHP 6", php|tek 2007
Chicago, slides 49 and 50.

Other key points are about evaluating string length (slide 25) and offsets
(slide 47), collation (slides 60, 61) and strtoupper/strtolower/strcasecmp
(slide 66). String length and offsets can already be implemented with the
PHP5 mbstring extension. If I use PHP5 strtolower/strtoupper/strcasecmp, I
must assume that they are locale aware. These functions don't follow
LC_CTYPE=C rules when the locale is not C.
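
To illustrate the mbstring point, here is a minimal sketch. It assumes the
mbstring extension is loaded, that mbstring.func_overload is off, and that
the input is UTF-8; the sample string and variable names are only for
illustration.

```php
<?php
// Assumption: mbstring is loaded and the input is UTF-8.
// "naïve café" written as explicit UTF-8 bytes, so the file encoding
// does not matter.
$s = "na\xC3\xAFve caf\xC3\xA9";

// Byte-oriented functions count bytes.
$byteLen = strlen($s);                        // 12
$bytePos = strpos($s, 'caf');                 // 7

// mbstring functions count characters in the given encoding.
$charLen = mb_strlen($s, 'UTF-8');            // 10
$charPos = mb_strpos($s, 'caf', 0, 'UTF-8');  // 6

// Character-based offsets must be paired with mb_substr, not substr.
$word = mb_substr($s, $charPos, 4, 'UTF-8');  // "café"
```

The byte and character offsets differ as soon as a multibyte character
precedes the match, so the two families of functions should not be mixed
on the same string.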

It is possible that I am not correct and that I will be able to update the
code and make it work in PHP6, but in order to do that I will have to use
language constructs that are not backwards compatible with older PHP
versions (slides 26, 30). I won't be able to mix PHP6 code with PHP5 code
and will have to maintain two different library versions for lots of
string functions and lots of stream operations. I will have to spend my
time making sure that the code works, when I am not the one who broke it.
I have other bugs to fix, and they have higher priority than fixing code
broken by others.

I have already tested code in PHP6 with unicode_semantics=on. Things broke
on password encryption, and the fix was to do something with binary
typecasting. I need more information than is currently available at
http://www.php.net/manual in order to fix it. Then fputs calls freak out
with notices about downcoded buffers, and I can't leave those notices
unfixed because of the error_reporting = 2047 + display_errors = on coding
requirements. And I still haven't reached the point where the code does
8bit string decoding.

We are working on different code. You have code with some specific
character set and you can control all strings. My code works with
different character sets and different sources of 8bit data, and I don't
control those 8bit strings. My experience with PHP4/5 shows that I can
work with 8bit strings better than the PHP interpreter. The interpreter
wins only when I have to work with large mapping tables, and even then it
is not stable (iconv), not enabled by default (recode), limited (mbstring)
or very limited (utf8_decode).

Some day I will take some standalone library and try to make it work in
PHP6 with unicode_semantics=on. Maybe then I'll stop spreading the FUD.
But for now, don't expect me to remain silent if you propose changes that
break PHP5 - PHP6 backwards compatibility. Either unicode_semantics=on
breaks things in drastic ways, or my experience is based only on unstable
PHP6 development code and the RC versions will be better.

-- 
Tomas
