Re: Dicebot on leaving D: It is anarchy driven development in all its glory.

Chris via Digitalmars-d Thu, 06 Sep 2018 03:40:43 -0700

On Thursday, 6 September 2018 at 10:22:22 UTC, ag0aep6g wrote:

On 09/06/2018 09:23 AM, Chris wrote:
Python 3 gives me this:
print(len("á"))
1
Python 3 also gives you this:

print(len("á"))
2
(The example might not survive transfer from me to you if Unicode normalization happens along the way.)
That's when you enter the 'á' as 'a' followed by U+0301 (combining acute accent). So Python's `len` counts in codepoints, like D's std.range does (auto-decoding).

To avoid this you have to normalize and recompose any decomposed characters. I remember that Mac OS X used (and still uses?) decomposed characters by default, so when you typed 'á' into your CLI, it would automatically decompose it to 'a' + acute. `string`, however, returns len=2 for composed characters too. If you do a lot of string handling it will come back to bite you sooner or later.
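
To make the difference concrete, here is a minimal Python 3 sketch. It uses escape sequences rather than literal characters so it survives any normalization in transit, and it relies only on the standard `unicodedata` module. The variable names are just for illustration, and the UTF-8 byte count at the end is meant as a rough analogue of what a length in UTF-8 code units reports, not as a statement about any particular D function.

```python
import unicodedata

composed = "\u00e1"     # 'á' as one precomposed code point
decomposed = "a\u0301"  # 'a' followed by U+0301 (combining acute accent)

# len() counts code points, so the two spellings differ.
print(len(composed))           # 1
print(len(decomposed))         # 2
print(composed == decomposed)  # False

# NFC recomposes, NFD decomposes.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)
print(len(nfc), nfc == composed)    # 1 True
print(len(nfd), nfd == decomposed)  # 2 True

# Counting UTF-8 code units instead gives 2 even for the composed form.
composed_bytes = len(composed.encode("utf-8"))
decomposed_bytes = len(decomposed.encode("utf-8"))
print(composed_bytes, decomposed_bytes)  # 2 3
```

Normalizing both sides to the same form (NFC here) before comparing or measuring is the usual way to keep the two spellings from diverging.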

