On Thursday, 6 September 2018 at 07:23:57 UTC, Chris wrote:
On Wednesday, 5 September 2018 at 22:00:27 UTC, H. S. Teoh wrote:



Seriously, people need to get over the fantasy that they can just use Unicode without understanding how Unicode works. Most of the time, you can get the illusion that it's working, but 99% of the time the code is actually wrong and will do the wrong thing when given an unexpected (but still valid) Unicode string. You can't drive without a license, and even if you try anyway, the chances of ending up in a nasty accident are pretty high. People *need* to learn how to use Unicode properly before complaining about why this or that doesn't work the way they thought it should work.


T

Python 3 gives me this:

print(len("á"))
1

and so do other languages.
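
For what it's worth, the result above depends on how the character is encoded in the source. A minimal sketch, assuming the "á" is the precomposed code point U+00E1 (the variable names are made up for illustration): Python 3's `len` counts code points, so the same visible character written as a base letter plus a combining accent gives 2.

```python
import unicodedata

# Illustrative names; both strings render as the same visible character.
precomposed = "\u00e1"   # single code point U+00E1
decomposed = "a\u0301"   # 'a' followed by U+0301 COMBINING ACUTE ACCENT

print(len(precomposed))                  # 1 code point
print(len(decomposed))                   # 2 code points
print(len(precomposed.encode("utf-8")))  # 2 bytes in UTF-8
print(precomposed == decomposed)         # False without normalization

# NFC normalization composes the pair back into U+00E1.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```

So a length of 1 holds only for the precomposed form; the byte count and the decomposed form give different answers for what looks like the same string.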

The same Python 3 that people criticize for having unintuitive unicode string handling?

https://learnpythonthehardway.org/book/nopython3.html

Is it asking too much to ask for `string` (not `dstring` or `wstring`) to behave as most people would expect it to behave in 2018 - and not like Python 2 from days of yore? But of course, D users should have a "Unicode license" before they do anything with strings. (I wonder is there a different license for UTF8 and UTF16 and UTF32, Big / Little Endian, BOM? Just asking.)

Yes and no, unicode is a clusterf***, so every programming language is having problems with it.

So again, for the umpteenth time, it's the users' fault. I see. Ironically enough, it was the language developers' lack of understanding of Unicode that led to string handling being a nightmare in D in the first place. Oh lads, if you were politicians I'd say that with this attitude you're gonna lose the next election. I say this, because many times the posts by (core) developers remind me so much of politicians who are completely detached from the reality of the people. Right oh!

You have a point that it was D devs' ignorance of Unicode that led to the current auto-decoding problem. But let's have some nuance here: the problem ultimately is Unicode.
