On Thursday, 13 October 2022 at 08:35:50 UTC, bauss wrote:
On Thursday, 13 October 2022 at 08:30:04 UTC, rikki cattermole wrote:
On 13/10/2022 9:27 PM, bauss wrote:
This doesn't actually work properly in all languages. It will
probably work in most, but it's not entirely correct.
For example, Turkish will not work properly with it.
Very interesting article:
http://www.moserware.com/2008/02/does-your-code-pass-turkey-test.html
Yes, Turkic languages require a state machine and quite a few
lookup tables (LUTs) to work correctly.
You also need to provide a language and it has to operate on
the whole string, not individual characters.
I didn't think it was relevant since ASCII was in the original
post ;)
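A minimal sketch of the point about needing a language and operating on the whole string, written in Python because its built-in `str.lower` applies only the language-independent Unicode mappings. The helper `to_lower` and the `TURKIC_LANGUAGES` set are hypothetical names made up for this illustration, and the table covers only the dotted/dotless i rules, not everything a real implementation would need.

```python
# Hypothetical helper for illustration only; not a standard-library API.
# Language codes assumed here: "tr" (Turkish) and "az" (Azerbaijani).
TURKIC_LANGUAGES = {"tr", "az"}


def to_lower(text: str, language: str = "") -> str:
    """Lowercase a whole string, with special handling for Turkic i/I."""
    if language in TURKIC_LANGUAGES:
        # "I" followed by COMBINING DOT ABOVE (U+0307) is a dotted capital I
        # spelled with two code points. This is why the function has to look
        # at the string, not at one character at a time.
        text = text.replace("I\u0307", "i")
        # LATIN CAPITAL LETTER I WITH DOT ABOVE (U+0130) -> plain "i"
        text = text.replace("\u0130", "i")
        # Plain capital "I" -> LATIN SMALL LETTER DOTLESS I (U+0131)
        text = text.replace("I", "\u0131")
    # Everything else uses the default, language-independent mapping.
    return text.lower()
```

Without a language, `to_lower("TITLE")` gives `"title"`. With `language="tr"` it gives `"t\u0131tle"` (dotless i), which is the case the Turkey test article is about.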
I think it's relevant when it comes to D, since D is arguably a
Unicode language, not an ASCII one.
D should strive to be correct, rather than fast.
Oh, and to add to this: if you have to do it the hacky way,
then converting to uppercase rather than lowercase should be
preferred. A small group of characters cannot make a round trip
when converted to lowercase, and using uppercase avoids that, so
it's a relatively easy fix. A round trip is basically converting
characters from one culture to another and then back. That's
impossible for some characters when converting to lowercase, but
should always be possible when converting to uppercase.
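A rough way to see why uppercase is the safer normalization, again sketched in Python with its language-independent mappings. The three character pairs are my own picks for illustration, not a complete list, and the function name `same_ignoring_case` is hypothetical. Each pair is two lowercase forms of the same letter: lowercasing leaves them distinct, while uppercasing folds them together.

```python
# Pairs of lowercase characters that share a single uppercase form.
# These are illustrative examples only, not an exhaustive table.
PAIRS = [
    ("\u017f", "s"),       # LATIN SMALL LETTER LONG S vs. s
    ("\u03c2", "\u03c3"),  # GREEK SMALL LETTER FINAL SIGMA vs. SIGMA
    ("\u00b5", "\u03bc"),  # MICRO SIGN vs. GREEK SMALL LETTER MU
]


def same_ignoring_case(a: str, b: str, use_upper: bool) -> bool:
    """Compare two strings the 'hacky' way, via one case conversion."""
    if use_upper:
        return a.upper() == b.upper()
    return a.lower() == b.lower()


for a, b in PAIRS:
    via_lower = same_ignoring_case(a, b, use_upper=False)
    via_upper = same_ignoring_case(a, b, use_upper=True)
    print(f"U+{ord(a):04X} vs U+{ord(b):04X}: lower={via_lower} upper={via_upper}")
```

This only shows that uppercase does better for these pairs. It still ignores language, so it does not fix the Turkish case discussed above.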