Re: The Case Against Autodecode

Jack Stouffer via Digitalmars-d Thu, 26 May 2016 09:36:50 -0700

On Thursday, 26 May 2016 at 16:00:54 UTC, Andrei Alexandrescu wrote:

instead, it should use standard library algorithms for searching, matching etc. When needed, iterating every code unit is trivially done through indexing.
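For readers following along, here is a minimal sketch of the distinction being drawn (not part of the original message; it assumes Phobos' auto-decoding range primitives as they exist today). Indexing a string yields code units, while the range primitives decode to code points:

```d
import std.range : walkLength;
import std.range.primitives : ElementType, front;

void main()
{
    // U+00E9 (e with acute accent) occupies two UTF-8 code units: 0xC3 0xA9.
    string s = "h\u00E9llo";

    // Indexing and .length operate on code units (char).
    assert(s.length == 6);
    assert(s[1] == 0xC3);
    static assert(is(typeof(s[0]) == immutable(char)));

    // The range primitives auto-decode, so the element type is dchar
    // and walking the range counts code points instead.
    static assert(is(ElementType!string == dchar));
    assert(s.front == 'h');
    assert(s.walkLength == 5);
}
```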

For an example where the std.algorithm/range functions don't cut it, my random format date string parser first breaks up the given character range into tokens. Once it has the tokens, it checks several known formats. One piece of that is checking if some of the tokens are in AAs of month and day names for fast tests of presence. Because the AAs are int[string], and the encoding of the input string is unknowable (it's complicated), during tokenization the character range must be forced to UTF-8 with byChar for all inputs where isSomeString!R == true, to avoid the auto-decoding and the subsequent AA key mismatch.
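A rough sketch of that workaround, assuming a hypothetical helper named monthNumber and a three-entry table (neither is taken from the actual parser):

```d
import std.algorithm.iteration : filter;
import std.array : array;
import std.traits : isSomeString;
import std.utf : byChar;

/// Hypothetical helper: looks a token up in an int[string] AA,
/// whatever the width of the incoming string type.
int monthNumber(R)(R token) if (isSomeString!R)
{
    // Illustrative table only; a real parser would hold all month and day names.
    int[string] months = ["jan": 1, "feb": 2, "mar": 3];

    // byChar forces UTF-8 code units for string, wstring and dstring alike.
    auto units = token.byChar.array;
    static assert(is(typeof(units) : const(char)[]));

    if (auto p = units.idup in months)
        return *p;
    return 0; // token is not a known month
}

void main()
{
    // Without byChar, range algorithms auto-decode a string to dchar,
    // so the collected result is dchar[] and is not a valid string key.
    auto decoded = "jan".filter!(c => c != ' ').array;
    static assert(is(typeof(decoded) == dchar[]));

    assert(monthNumber("feb") == 2);
    assert(monthNumber("feb"w) == 2); // UTF-16 input
    assert(monthNumber("feb"d) == 2); // UTF-32 input
    assert(monthNumber("xyz") == 0);
}
```

The .idup is there only so the lookup key is a real string regardless of what element type byChar produced for a given input.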

Agreed. This is probably the most glaring mistake. I think we should open a discussion on fixing this everywhere in the stdlib, even at the cost of breaking code.

See the discussion here: https://issues.dlang.org/show_bug.cgi?id=14519


I think some of the proposals there are interesting.

Overall, I think the one way to make real steps forward in improving string processing in the D language is to give a clear answer of what char, wchar, and dchar mean.

If you agree that iterating over code units and code points isn't what people want/need most of the time, then I will quote something from my article on the subject:

"I really don't see the benefit of the automatic behavior fulfilling this one specific corner case when you're going to make everyone else call a range generating function when they want to iterate over code units or graphemes. Just make everyone call a range generating function to specify the type of iteration and save a lot of people the trouble!"

I think the only clear way forward is to not make strings ranges and force people to make a decision when passing them to range functions. The HUGE problem is the code this will break, which is just about all of it.
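As a sketch of what making that decision explicit looks like with the range-generating functions Phobos already ships (std.utf.byCodeUnit, std.utf.byDchar, std.uni.byGrapheme), here is one string counted three ways:

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit, byDchar;

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT:
    // one user-perceived character, two code points, three UTF-8 code units.
    string s = "e\u0301";

    assert(s.byCodeUnit.walkLength == 3); // explicit: code units
    assert(s.byDchar.walkLength == 2);    // explicit: code points
    assert(s.byGrapheme.walkLength == 1); // explicit: graphemes

    // Implicit: auto-decoding silently picks code points.
    assert(s.walkLength == 2);
}
```

Only the last line relies on the automatic behavior; the other three state the unit of iteration at the call site.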
