On Tuesday, 11 October 2016 at 14:16:54 UTC, Andrei Alexandrescu wrote:
On 10/11/2016 04:57 AM, Stefan Koch wrote:
Yours runs with a best time of 790 µs.
bsr is a real time sink :)
What inputs did you test it on?
https://github.com/minimaxir/big-list-of-naughty-strings/blob/master/blns.txt
Here's what I think would be a good set of requirements:
* The ASCII case should be short and fast: a comparison and a
branch, followed by return. This would improve a very common
case and address the main issue with autodecoding.
Already done.
* For the multibyte case, the main requirement is the code must
be small. This is because it gets inlined all over the place
and seldom used.
* For the multibyte case, the fewer bytes in the encoding the
less work. This is because more frequent multi-byte characters
have generally lower codes.
That is why I had the branches: generally only the first one is
taken.
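
To make the branch ordering concrete, here is a minimal sketch. It is not the code that was benchmarked in this thread: decodeFront is a hypothetical name, and the sketch assumes the input is well-formed UTF-8, so it does no validation at all. The shortest encodings are tested first, so common text rarely gets past the first or second branch.

```d
// Hypothetical helper, not Phobos code. Assumes s is non-empty,
// well-formed UTF-8; continuation bytes are not validated.
dchar decodeFront(const(char)[] s, out size_t len) pure nothrow @nogc @safe
{
    immutable uint c = s[0];
    if (c < 0x80)           // 1 byte: 0xxxxxxx (ASCII)
    {
        len = 1;
        return cast(dchar) c;
    }
    if (c < 0xE0)           // 2 bytes: 110xxxxx 10xxxxxx
    {
        len = 2;
        return cast(dchar) (((c & 0x1F) << 6) | (s[1] & 0x3F));
    }
    if (c < 0xF0)           // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    {
        len = 3;
        return cast(dchar) (((c & 0x0F) << 12)
                          | ((s[1] & 0x3F) << 6)
                          |  (s[2] & 0x3F));
    }
    len = 4;                // 4 bytes: 11110xxx followed by three 10xxxxxx
    return cast(dchar) (((c & 0x07) << 18)
                      | ((s[1] & 0x3F) << 12)
                      | ((s[2] & 0x3F) << 6)
                      |  (s[3] & 0x3F));
}
```

A real implementation would also have to reject stray continuation bytes, overlong forms and surrogates, which is where most of the code size goes.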
Currently front() - the other time spender in autodecoding -
issues a function call on the multibyte case. That makes the
code of front() itself small, at the cost of more expensive
multibyte handling.
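
The shape described above can be sketched roughly as follows. This is not the actual Phobos source: asciiFirstFront and frontSlow are hypothetical names, and std.utf.decode merely stands in for whatever the slow path really does. The point is only that the inlinable part is one comparison and one branch, while everything else sits behind a call.

```d
import std.utf : decode;

// Hypothetical slow path. Kept out of line so that every inlined
// copy of the fast path below stays small.
pragma(inline, false)
dchar frontSlow(const(char)[] s)
{
    size_t i = 0;
    return decode(s, i);    // validating decode; i ends up as the byte length
}

// Hypothetical fast path: a comparison and a branch for ASCII,
// a function call for everything else. Assumes s is non-empty.
dchar asciiFirstFront(const(char)[] s)
{
    immutable c = s[0];
    return c < 0x80 ? c : frontSlow(s);
}
```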
I think at some point we have to cache the length of the last
decoded char;
otherwise we are throwing work away.
However, that will only work within a range wrapper struct.
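
A rough sketch of what I mean, under the assumption that a wrapper type is acceptable at all. CachingDecoder is a hypothetical name, not an existing Phobos type. front stores the decoded code point together with its byte length, so popFront can advance without looking at the first byte again.

```d
import std.utf : decode, stride;

// Hypothetical range wrapper, not a Phobos type.
struct CachingDecoder
{
    private const(char)[] s;
    private dchar cur;
    private size_t curLen;  // 0 means front has not been decoded yet

    this(const(char)[] s) { this.s = s; }

    @property bool empty() const { return s.length == 0; }

    @property dchar front()
    {
        if (curLen == 0)
        {
            size_t i = 0;
            cur = decode(s, i);  // i is advanced past the code point
            curLen = i;          // remember the length for popFront
        }
        return cur;
    }

    void popFront()
    {
        // Reuse the cached length; fall back to stride only if
        // front was never called for this element.
        immutable n = curLen ? curLen : stride(s);
        s = s[n .. $];
        curLen = 0;
    }
}
```

A bare char[] has nowhere to keep cur and curLen, which is why this only works once the string is wrapped.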