Re: [PHP-DEV] Re: PHP Unicode support design document-keeping existing functionality

2005-08-25 Thread Makoto Tozawa
> If we don't make the functions provide reasonable behavior for unicode, then every program needs to be rewritten to change function names. I agree. I asked it because the Backwards Compatibility section states the following: "... the upgrade to Unicode-enabled PHP has to be transparent. Thi

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-25 Thread Makoto Tozawa
Tex Texin wrote: I wrote a little test php program that has 2 identical forms. You enter text in either form and it posts and displays the hex codes for the bytes. The first form does not set accept-charset, so it defaults to utf-8. The second form overrides the page encoding and sets accept-cha
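The test described above displays the hex codes of the bytes each form posts. The original PHP program is not shown in this excerpt, so the following is only a minimal Python sketch of the same idea: one character produces different bytes depending on the charset used to submit it. The second form's charset is cut off in the snippet, so ISO-8859-1 here is an assumption, and `hex_bytes` is a hypothetical helper name.

```python
def hex_bytes(text: str, charset: str) -> str:
    """Return the bytes of `text` encoded in `charset` as space-separated hex pairs."""
    return " ".join(f"{byte:02x}" for byte in text.encode(charset))


sample = "é"  # U+00E9, LATIN SMALL LETTER E WITH ACUTE

utf8_hex = hex_bytes(sample, "utf-8")          # two bytes in UTF-8
latin1_hex = hex_bytes(sample, "iso-8859-1")   # one byte in ISO-8859-1 (assumed charset)

print(utf8_hex)    # c3 a9
print(latin1_hex)  # e9
```

A server that receives `e9` with no charset information cannot tell from the bytes alone which encoding the browser used, which is why the thread cares about how the encoding is negotiated.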

RE: [PHP-DEV] Re: PHP Unicode support design document- encoding negotiation

2005-08-25 Thread Tex Texin
5 8:06 AM > To: Andrei Zmievski > Cc: Makoto Tozawa; [EMAIL PROTECTED]; PHP > Developers Mailing List > Subject: Re: [PHP-DEV] Re: PHP Unicode support design document > > > On Wed, 24 Aug 2005, Andrei Zmievski wrote: > > > I took a closer look at this today and RFC

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-25 Thread Adam Maccabee Trachtenberg
On Wed, 24 Aug 2005, Andrei Zmievski wrote: > I took a closer look at this today and RFC 2616 does not specify > whether user agents are supposed to send a charset parameter in the > Content-Type header of the POST request. I did not see any of my > browsers doing so. I think we can safely disrega

RE: [PHP-DEV] Re: PHP Unicode support design document-keeping existing functionality

2005-08-25 Thread Tex Texin
> Subject: Re: [PHP-DEV] Re: PHP Unicode support design document > > > Andrei Zmievski wrote: > > >> Is there any way to keep the byte semantics (as opposed to unicode > >> semantics) > >> only for the existing functions? For example, the Oracle 8 > functi

RE: [PHP-DEV] Re: PHP Unicode support design document

2005-08-25 Thread Tex Texin
TECTED] > Sent: Wednesday, August 24, 2005 4:23 PM > To: Makoto Tozawa > Cc: [EMAIL PROTECTED]; PHP Developers Mailing List > Subject: Re: [PHP-DEV] Re: PHP Unicode support design document > > > Hi, > > On Aug 23, 2005, at 7:30 PM, Makoto Tozawa wrote: > > > "

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-24 Thread Makoto Tozawa
Andrei Zmievski wrote: Is there any way to keep the byte semantics (as opposed to unicode semantics) only for the existing functions? For example, the Oracle 8 functions can be configured to use utf-8 for the character encoding of strings. In order for them to work properly, fundamental functio

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-24 Thread Andrei Zmievski
Hi, On Aug 23, 2005, at 7:30 PM, Makoto Tozawa wrote: "HTTP Input Encoding ... If the HTTP request contains the encoding specification in the headers, then it will be used instead of this setting." To the best of my knowledge, there is no HTTP request header that specifies the encoding of the

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-16 Thread Andrei Zmievski
Where should we save the script encoding from which an oparray was built? In the oparray itself? -Andrei On Aug 15, 2005, at 3:13 PM, Andi Gutmans wrote: If you want to optimize then I guess "remembering" the script_encoding is the only way to do it. We could do it similar to the way we "cach

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-16 Thread Andrei Zmievski
We certainly could, but we lose some speed, especially when script_encoding == output_encoding (where we don't really need to transcode HTML blocks). Are we up for that? -Andrei On Aug 15, 2005, at 3:03 PM, Andi Gutmans wrote: Wouldn't it be easiest to have inline html become IS_UNICODE and t
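The shortcut mentioned above (no transcoding of HTML blocks when script_encoding equals output_encoding) can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the Zend engine code: `emit_inline_html` is a hypothetical name, and comparing codec names via `codecs.lookup` is just one way to treat aliases such as "UTF8" and "utf-8" as equal.

```python
import codecs


def emit_inline_html(raw: bytes, script_encoding: str, output_encoding: str) -> bytes:
    """Return the bytes to send for an inline HTML block (hypothetical helper).

    When both encodings name the same codec, the block is passed through
    untouched; otherwise it is decoded and re-encoded.
    """
    same_codec = codecs.lookup(script_encoding).name == codecs.lookup(output_encoding).name
    if same_codec:
        return raw  # fast path: no decode/encode work at all
    return raw.decode(script_encoding).encode(output_encoding)


block = "<p>café</p>".encode("utf-8")
print(emit_inline_html(block, "utf-8", "UTF8") == block)       # True (passed through)
print(emit_inline_html(block, "utf-8", "iso-8859-1"))          # b'<p>caf\xe9</p>'
```

The fast path only works if the engine still knows which encoding the block was written in, which is the "remembering the script encoding" question raised elsewhere in the thread.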

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-15 Thread Andi Gutmans
If you want to optimize then I guess "remembering" the script_encoding is the only way to do it. We could do it similar to the way we "cache" script file names. Another option is to just optimize for UTF-8 and use BOMs for UTF-8/UTF-16... Andi At 03:09 PM 8/15/2005 -0700, Rasmus Lerdorf wrote:
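The BOM option mentioned above is cut off in this excerpt, so the details of what was proposed are not known. As a general illustration only, byte-order-mark sniffing for UTF-8 and UTF-16 can be sketched in Python like this; `sniff_bom` is a hypothetical name, and the sketch deliberately ignores UTF-32, whose little-endian BOM begins with the same two bytes as UTF-16 LE.

```python
import codecs
from typing import Optional

# BOM byte sequences paired with the encoding each one signals.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding signalled by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


print(sniff_bom(b"\xef\xbb\xbf<?php"))  # utf-8
print(sniff_bom(b"\xff\xfe<\x00"))      # utf-16-le
print(sniff_bom(b"<?php"))              # None
```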

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-15 Thread Rasmus Lerdorf
I think the main issue here is that if your script encoding is set to UTF-8 and you do everything in UTF-8 then these large blocks of UTF-8 are going to make a UTF-8 -> UTF-16 -> UTF-8 conversion roundtrip on every request. It would be nice if we could somehow avoid that. -Rasmus Andi Gutmans wr
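The roundtrip described above can be made concrete with a small Python sketch (an illustration, not PHP engine code; `roundtrip` is a hypothetical name and little-endian UTF-16 is an arbitrary choice). For well-formed UTF-8 input the output bytes are identical to the input, so both conversions are pure overhead.

```python
def roundtrip(utf8_block: bytes) -> bytes:
    """Convert UTF-8 -> UTF-16 -> UTF-8, as an all-UTF-8 script's HTML block would be."""
    internal = utf8_block.decode("utf-8").encode("utf-16-le")  # engine-internal form
    return internal.decode("utf-16-le").encode("utf-8")        # back to the output encoding


block = "<p>Grüße, мир</p>".encode("utf-8")
result = roundtrip(block)

print(result == block)  # True: two conversions, no change in the bytes sent
```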

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-15 Thread Andi Gutmans
Wouldn't it be easiest to have inline html become IS_UNICODE and then not deal with the problem of remembering what the script encoding was? I thought that's what we already do today. Andi At 12:37 PM 8/10/2005 -0700, Andrei Zmievski wrote: I did not have time to write the full reply earlier so

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Andrei Zmievski
On Aug 10, 2005, at 3:54 AM, Antony Dovgal wrote: Do we really need such kind of magic? I think it may be pretty confusing when after echo'ing or print'ing a variable you can see one output, but after writing the very same variable into a file you can see something completely different. Abs

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Andrei Zmievski
On Aug 10, 2005, at 3:45 AM, Ron Korving wrote: This looks very promising, I'm impressed by the work you guys have done (big thumbs up). Thanks. What about the other functions that output to stdout directly, such as readfile() and passthru()? readfile() uses streams so it would rely on st

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Adam Maccabee Trachtenberg
On Wed, 10 Aug 2005, Marcus Boerger wrote: > i had a chat with Andi about __toString() and i hope that he finally > understood why a lot of ppl wanted it right from the beginning. To me the > current situation is simply the worst case because no one understands when it > works and when not (. vs ,)

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Andrei Zmievski
I did not have time to write the full reply earlier so here goes. Even if we modify the output layer to be aware of various types of strings coming down the pipe, it would still need to know the encoding of IS_STRING's in order to convert them to the output encoding. This presents a particular

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Ron Korving
Sounds absolutely great :) Ron "Marcus Boerger" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Hello Ron, > > i had a chat with Andi about __toString() and i hope that he finally > understood why a lot of ppl wanted it right from the beginning. To me the > current situation is si

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Marcus Boerger
Hello Ron, i had a chat with Andi about __toString() and i hope that he finally understood why a lot of ppl wanted it right from the beginning. To me the current situation is simply the worst case because no one understands when it works and when not (. vs ,). Since we are doing a drastic change in

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Ron Korving
I firmly believe though, that all outputting functions should act the same. It's the same problem otherwise as with __toString(). I would use __toString if it wasn't just restricted to echo and print, but right now it's pretty useless to me. I hope that behavior can change in a major version update

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Rasmus Lerdorf
Andrei Zmievski wrote: > We have not changed the underlying output mechanism. The transcoding is > done by zend_make_printable_zval(). Ok, but all the non-stream based output functions pass through that. Not that we have very many, but it is more than just echo/print. -Rasmus -- PHP Internals

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Andrei Zmievski
We have not changed the underlying output mechanism. The transcoding is done by zend_make_printable_zval(). -Andrei On Aug 10, 2005, at 7:30 AM, Rasmus Lerdorf wrote: Yeah, print/echo was just a way of describing the underlying output stuff. It wasn't meant to be taken literally. -Rasmus

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Andrei Zmievski
On Aug 10, 2005, at 7:26 AM, Andi Gutmans wrote: We need to automatically convert the output as internally we will be storing UTF-16 which is not what you want to send to the user. The SAPI output mechanism does the conversion, I don't think it's print & echo. It will actually save people a

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Antony Dovgal
Ah. Ok, then I'm happy =) On Wed, 10 Aug 2005 07:30:38 -0700 Rasmus Lerdorf <[EMAIL PROTECTED]> wrote: > Yeah, print/echo was just a way of describing the underlying output > stuff. It wasn't meant to be taken literally. -- Wbr, Antony Dovgal

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Ron Korving
Exactly. That's how I understood it too: "Ah, the __toString behavior". I'm very glad this is not the case. Ron "George Schlossnagle" <[EMAIL PROTECTED]> schreef in bericht news:[EMAIL PROTECTED] > > On Aug 10, 2005, at 10:30 AM, Rasmus Lerdorf wrote: > > > Yeah, print/echo was just a way of des

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread George Schlossnagle
On Aug 10, 2005, at 10:30 AM, Rasmus Lerdorf wrote: Yeah, print/echo was just a way of describing the underlying output stuff. It wasn't meant to be taken literally. Given the __toString fiasco, it's understandable that this would be confusing though. George

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Rasmus Lerdorf
Yeah, print/echo was just a way of describing the underlying output stuff. It wasn't meant to be taken literally. -Rasmus Andi Gutmans wrote: > We need to automatically convert the output as internally we will be > storing UTF-16 which is not what you want to send to the user. The SAPI > output

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Andi Gutmans
We need to automatically convert the output as internally we will be storing UTF-16 which is not what you want to send to the user. The SAPI output mechanism does the conversion, I don't think it's print & echo. It will actually save people a lot of headache that this is done automatically. As f

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Christian Schneider
Derick Rethans wrote: On Wed, 10 Aug 2005, Ron Korving wrote: "In order to create binary string literals, a new syntax is necessary: prefixing a string literal with letter 'b' creates a binary string." The b-prefix for binary strings is great, but how does that work with a function like file_ge

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Derick Rethans
On Wed, 10 Aug 2005, Ron Korving wrote: > "In order to create binary string literals, a new syntax is necessary: > prefixing a string literal with letter 'b' creates a binary string." > > The b-prefix for binary strings is great, but how does that work with a > function like file_get_contents() o

Re: [PHP-DEV] Re: PHP Unicode support design document

2005-08-10 Thread Antony Dovgal
On Wed, 10 Aug 2005 12:45:27 +0200 "Ron Korving" <[EMAIL PROTECTED]> wrote: > This looks very promising, I'm impressed by the work you guys have done (big > thumbs up). > > There are a few issues/questions I have after reading your document: > > > "Therefore, command such as 'print' and 'echo'