On 01/03/2015 20:59, Yasuo Ohgaki wrote:
> However, I don't mind too much allowing any encoding stored in "Text"/ "UString" object. IIRC, Ruby does this and have not much problem.
As I understand it, Ruby's string type is effectively a family of overloaded types, one per encoding, each responsible for re-implementing the various string methods. This leads to a long tail of "partially supported" encodings/codepages, which is a big pile of "leaky abstraction" for the small benefit of avoiding re-encoding operations in a few scenarios.
Unicode is explicitly designed to supersede all previous encodings, so it makes perfect sense to me to use it to internally represent what the user just wants to think of as "text". That internal representation still needs some byte-level encoding, which leads to an obvious optimisation: use the byte-level encoding the user is most likely to use for input and output, i.e. UTF-8.
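To make that concrete, here is a minimal sketch of the design, written in Python rather than PHP purely for illustration. The Text class and its from_bytes, to_bytes and length names are hypothetical, not an existing or proposed API; the only point is that transcoding happens at the boundaries, and is skipped entirely when input or output is already UTF-8.

```python
import codecs


def _is_utf8(encoding: str) -> bool:
    # Normalise aliases such as "UTF8" or "utf_8" to the canonical codec name.
    return codecs.lookup(encoding).name == "utf-8"


class Text:
    """Hypothetical text type: always Unicode, stored internally as UTF-8."""

    def __init__(self, utf8_bytes: bytes):
        # Validate once on the way in, so every method can trust the buffer.
        # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
        utf8_bytes.decode("utf-8")
        self._utf8 = utf8_bytes

    @classmethod
    def from_bytes(cls, data: bytes, encoding: str = "utf-8") -> "Text":
        if _is_utf8(encoding):
            return cls(data)  # common case: no re-encoding needed
        # Any other encoding is converted exactly once, at the boundary.
        return cls(data.decode(encoding).encode("utf-8"))

    def to_bytes(self, encoding: str = "utf-8") -> bytes:
        if _is_utf8(encoding):
            return self._utf8  # common case: hand back the stored bytes
        return self._utf8.decode("utf-8").encode(encoding)

    def length(self) -> int:
        # Length in code points, independent of the byte-level storage.
        return len(self._utf8.decode("utf-8"))


# Latin-1 input is converted once; everything after that is plain Unicode.
cafe = Text.from_bytes(b"caf\xe9", "latin-1")
print(cafe.length(), len(cafe.to_bytes()))
```

The methods are implemented once, against a single internal representation, rather than once per supported encoding.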
Regards,

--
Rowan Collins
[IMSoP]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php