On Sunday, 1 April 2018 at 01:19:08 UTC, auto wrote:
What is auto decoding and why it is a problem?

Auto-decoding comes from how D represents Unicode strings in UTF. In D, `char[]` and `string` are UTF-8 strings, `wchar[]` and `wstring` are UTF-16 strings, and `dchar[]` and `dstring` are UTF-32 strings. You need to know how UTF works in order to understand auto-decoding. Since in practice most code deals with UTF-8, I'll explain with respect to that. Essentially, the problem comes down to the fact that not every Unicode character fits in a single 8-bit `char`. Only ASCII is represented the "normal" way, one `char` per character. UTF-8 uses the fact that the highest bit of a `char` is never set in ASCII: when it is set, the leading bits of the first `char` tell how many `char`s that character is encoded in. You can watch this video for a better understanding[0]. By default though, if one were to traverse a `char[]` one `char` at a time looking for characters, they would get unexpected results with non-ASCII data.
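To make the encoding concrete, here is a minimal sketch that walks a UTF-8 string by hand. `sequenceLengths` is a hypothetical helper written for this post, not something from Phobos; it only relies on `std.utf.stride`, which reports how many code units the sequence starting at a given index occupies.

```d
import std.utf : stride;

// Hypothetical helper: the length, in chars, of each encoded character in s.
size_t[] sequenceLengths(string s)
{
    size_t[] lengths;
    for (size_t i = 0; i < s.length; )
    {
        immutable size_t n = stride(s, i); // read from the lead byte at index i
        lengths ~= n;
        i += n;                            // jump to the next lead byte
    }
    return lengths;
}

void main()
{
    string s = "a\u00E9\u20AC"; // 'a', U+00E9 (e with acute), U+20AC (euro sign)
    size_t[] expected = [1, 2, 3];

    assert(s.length == 6);                 // .length counts chars, not characters
    assert(sequenceLengths(s) == expected);

    // The lead byte of a two-char sequence has the bit pattern 110xxxxx,
    // and every continuation byte has the pattern 10xxxxxx.
    assert((s[1] & 0b1110_0000) == 0b1100_0000);
    assert((s[2] & 0b1100_0000) == 0b1000_0000);
}
```

Indexing or slicing at `s[2]` would land in the middle of a character, which is the kind of unexpected result mentioned above.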

Auto-decoding tries to solve this by automatically applying the algorithm to decode the characters to Unicode "Code-Points". This is where my knowledge ends though. I'll give you pros and cons of auto-decoding.
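A small sketch of what that looks like in practice, assuming the range primitives from `std.range.primitives` (`codePointCount` is a hypothetical helper name used only here): indexing and `.length` still work on `char`s, while the range interface hands out decoded `dchar`s.

```d
import std.range.primitives : ElementType, front, walkLength;

// Hypothetical helper: number of code points, counted through the range interface.
size_t codePointCount(string s)
{
    return s.walkLength;
}

void main()
{
    string s = "a\u00E9\u20AC"; // 'a', U+00E9, U+20AC

    // As an array, a string is made of chars (UTF-8 code units)...
    static assert(is(typeof(s[0]) == immutable(char)));
    assert(s.length == 6);

    // ...but as a range, its elements are auto-decoded dchars (code points).
    static assert(is(ElementType!string == dchar));
    static assert(is(typeof(s.front) == dchar));
    assert(codePointCount(s) == 3);
}
```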

Pros:
* It makes Unicode string handling much easier for beginners.
* Much less effort in general, it seems to "just work™"

Cons:
* It makes string handling slow by default.
* It may be the wrong thing, since you may not want Unicode code points, but graphemes instead.
* Auto-decoding throws exceptions on reaching invalid UTF sequences, so string handling code in general can throw exceptions.
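The last two cons can be seen in a short sketch. The names `graphemeCount`, `raw` and `bad` are made up for this example; the calls themselves (`std.uni.byGrapheme`, `std.utf.UTFException`, `std.exception.assertThrown`) are from Phobos.

```d
import std.exception : assertThrown;
import std.range.primitives : front, walkLength;
import std.uni : byGrapheme;
import std.utf : UTFException;

// Hypothetical helper: number of user-perceived characters in s.
size_t graphemeCount(string s)
{
    return s.byGrapheme.walkLength;
}

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT renders as one character.
    string s = "e\u0301";
    assert(s.length == 3);          // code units
    assert(s.walkLength == 2);      // code points, which is what auto-decoding gives you
    assert(graphemeCount(s) == 1);  // graphemes, which is what the user sees

    // The byte 0xFF never appears in valid UTF-8, so decoding it throws.
    immutable(ubyte)[] raw = [0xFF];
    string bad = cast(string) raw;
    assertThrown!UTFException(bad.front);
}
```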

If you want to stop auto-decoding, you can use std.string.representation like this:

import std.string : representation;
auto no_decode = some_string.representation;

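Now no_decode is an `immutable(ubyte)[]`, so it won't be auto-decoded, and you can pass it to range algorithms in place of some_string. You can also use `std.utf.byCodeUnit` to iterate over code units without changing the element type, or `std.uni.byGrapheme` to iterate by graphemes instead.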
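As a rough sketch of the opt-outs side by side (the variable names are only for illustration): `representation` gives you raw bytes, `byCodeUnit` from `std.utf` gives you a range of `char`s, and a plain string used as a range still decodes.

```d
import std.range.primitives : front, walkLength;
import std.string : representation;
import std.utf : byCodeUnit;

void main()
{
    string some_string = "a\u00E9\u20AC"; // 'a', U+00E9, U+20AC

    // representation: same data, viewed as raw bytes.
    auto no_decode = some_string.representation;
    static assert(is(typeof(no_decode) == immutable(ubyte)[]));
    assert(no_decode.walkLength == 6);

    // byCodeUnit: a range over the chars themselves, with no decoding.
    auto units = some_string.byCodeUnit;
    assert(units.walkLength == 6);
    assert(units.front == 'a');

    // The plain string, used as a range, is still decoded to code points.
    assert(some_string.walkLength == 3);
}
```

Which one fits depends on whether the code that follows expects bytes or characters; `byCodeUnit` keeps the `char` element type, so it tends to be the less surprising choice when the result is printed or compared against character literals.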

You should also read this blog post: https://jackstouffer.com/blog/d_auto_decoding_and_you.html

And this forum post: https://forum.dlang.org/post/eozguhavggchzzruz...@forum.dlang.org

[0]: https://www.youtube.com/watch?v=MijmeoH9LT4
