On 14/07/17 15:32, Michael Torrie wrote:
On 07/14/2017 08:05 AM, Rhodri James wrote:
On 14/07/17 14:31, Marko Rauhamaa wrote:
Of course, UTF-8 in a bytes object doesn't make the situation any
better, but does it make it any worse?

Speaking as someone who has been up to his elbows in this recently, I
would say emphatically that it does make things worse.  It adds an extra
layer of complexity to all of the questions you were asking, and more.
A single codepoint is a meaningful thing, even if its meaning may be
modified by combining.  A single byte may or may not be meaningful.
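
A minimal Python 3 sketch of that difference, for illustration only. The sample word and the helper decodes_alone are made up for this example and are not from the thread:

```python
# Index the same text as a str (code points) and as UTF-8 bytes.
text = "na\u00efve"              # "naive" with U+00EF, a single code point
encoded = text.encode("utf-8")   # U+00EF becomes the two bytes C3 AF

code_point = text[2]             # a one-character str: U+00EF
lone_byte = encoded[2]           # an int: 0xC3, only the lead byte of the pair


def decodes_alone(byte: int) -> bool:
    """Return True if this single byte is valid UTF-8 by itself."""
    try:
        bytes([byte]).decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(len(text), len(encoded))       # number of code points vs number of bytes
print(decodes_alone(encoded[0]))     # ASCII "n" stands alone
print(decodes_alone(lone_byte))      # half of a two-byte sequence does not
```

Indexing the str always yields a whole code point; indexing the bytes may yield a fragment that means nothing until it is joined with its neighbours.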

Are you saying that dealing with Unicode in Google Go, which uses UTF-8
in memory, is adding an extra layer of complexity and makes things worse
than they might be in Python?

I'm not familiar with Go. If the programmer has to be aware that she is using UTF-8 under the hood, then yes, it does add an extra layer of complexity. You have to remember the rules of UTF-8 as well as everything else.
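
As a rough illustration of what "remembering the rules" can mean in practice, here is a sketch in Python 3 of cutting a bytes object to a size limit without splitting a code point. The function names are hypothetical, and the code assumes the input is already valid UTF-8:

```python
def is_continuation(byte: int) -> bool:
    """UTF-8 continuation bytes have the bit pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def truncate_utf8(data: bytes, limit: int) -> bytes:
    """Cut data to at most `limit` bytes, backing up to a code point boundary.

    Assumes `data` is valid UTF-8; no validation is done here.
    """
    if limit >= len(data):
        return data
    end = max(limit, 0)
    # If the cut would land on a continuation byte, step back to the
    # lead byte of that sequence and cut before it.
    while end > 0 and is_continuation(data[end]):
        end -= 1
    return data[:end]


sample = "a\u20acb".encode("utf-8")   # the euro sign takes three bytes
safe = truncate_utf8(sample, 2)       # backs up instead of splitting the euro sign
print(safe.decode("utf-8"))

try:
    sample[:2].decode("utf-8")        # naive slice cuts the sequence in half
except UnicodeDecodeError as exc:
    print("naive slice failed:", exc.reason)
```

With a str, the equivalent operation is just a slice, and none of the byte-level rules need to be kept in mind.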

--
Rhodri James *-* Kynesim Ltd