On Monday 09 July 2007, Stanislav Malyshev wrote:
> > Do _I_ like that horrible IS_STRING/IS_UNICODE mess we have atm? No.
>
> I don't think there's any way of having both unstructured character data
> and Unicode text represented without having two distinct types. Either
> that or you'd have to tell on each step which one it is, and that would
> suck much more.
>
> > I would love to have clean and easy PHP6 without all the
> > "compatibility", which creates gazillion problems to both users and
> > developers.
>
> Fixing unicode=on does not remove the IS_STRING/IS_UNICODE duality. We
> still have two kinds of data - unstructured bit stream and structured
> text. If we want strlen("превед") to return 6 - since that Russian word
> has 6 characters - then we have no way but recognize that it's not just
> a collection of bits but Unicode text, and that would require separate
> type, as I see it. And as I see it, this is the source of the problems
> when people try to operate on text as on bit stream and vice versa.
>
> Unless I totally missed what mess you are referring to...
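To make the quoted strlen("превед") point concrete, here is a minimal sketch. It is written in Python 3 rather than PHP, and it is not the PHP 6 implementation; Python is used only because it already keeps two distinct types, `bytes` for an unstructured byte stream and `str` for Unicode text. The variable names are illustrative and UTF-8 is an assumed encoding.

```python
# Illustrative only: Python 3, not PHP. UTF-8 is an assumed encoding.
word = "превед"              # Unicode text (str)
raw = word.encode("utf-8")   # the same word as an unstructured byte stream (bytes)

char_count = len(word)       # counts code points
byte_count = len(raw)        # counts bytes; each Cyrillic letter here takes 2 bytes in UTF-8

print(char_count, byte_count)

# Mixing the two kinds of data is rejected instead of being silently guessed at.
try:
    word + raw
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(mixed_ok)
```

The same word has a different "length" depending on which of the two kinds of data it is treated as, which is the duality described in the quoted message.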
I am coming into this discussion decidedly late, so please thwap me gently
if this is a FAQ.

Do we have any idea of what percentage of strings in the "wild" would break
if treated as Unicode vs. not? If 90% of the strings in use would work fine
if treated as Unicode, then it would make sense to just always assume
Unicode unless explicitly specified otherwise. If 90% of the strings in use
would die if treated as Unicode, then Unicode should probably be the
exception, used only when explicitly defined.

I'm not liking the ghosts of magic_quotes I'm seeing implied here, with
different modes for the server to be in. That sounds like it would make it
quite difficult to write code that works the same everywhere and is not
ugly to read (a crapload of markers or lots of conditionals).

As I said, feel free to assuage my fear if appropriate. :-)

-- 
Larry Garfield
AIM: LOLG42
ICQ: 6817012

"If nature has made any one thing less susceptible than all others of
exclusive property, it is the action of the thinking power called an idea,
which an individual may exclusively possess as long as he keeps it to
himself; but the moment it is divulged, it forces itself into the
possession of every one, and the receiver cannot dispossess himself of it."
-- Thomas Jefferson