On Tuesday, 11 October 2016 at 14:16:54 UTC, Andrei Alexandrescu wrote:
On 10/11/2016 04:57 AM, Stefan Koch wrote:
Yours runs with a best time of 790 us.
bsr is a real time sink :)

What inputs did you test it on?

https://github.com/minimaxir/big-list-of-naughty-strings/blob/master/blns.txt

Here's what I think would be a good set of requirements:

* The ASCII case should be short and fast: a comparison and a branch, followed by return. This would improve a very common case and address the main issue with autodecoding.
Already done
* For the multibyte case, the main requirement is that the code must be small. This is because it gets inlined all over the place and is seldom executed.

* For the multibyte case, the fewer bytes in the encoding, the less work should be done. This is because the more frequent multibyte characters generally have lower code points.
That is why I had the branches; generally only the first one is taken.
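
To make the three requirements above concrete, here is a minimal sketch in C rather than D. It is not the Phobos implementation: the names `front` and `decode_multibyte` are hypothetical, the input is assumed to be valid UTF-8 with the whole sequence in bounds, and no validation is done.

```c
#include <stdint.h>

/* Hypothetical out-of-line helper for 2- to 4-byte sequences.
   Assumes valid UTF-8; continuation bytes are not checked. */
static uint32_t decode_multibyte(const unsigned char *s)
{
    uint32_t c = s[0];
    if (c < 0xE0)   /* 2 bytes: tested first, so the shortest encoding does the least work */
        return ((c & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
    if (c < 0xF0)   /* 3 bytes */
        return ((c & 0x0F) << 12)
             | ((uint32_t)(s[1] & 0x3F) << 6)
             | (uint32_t)(s[2] & 0x3F);
    /* 4 bytes */
    return ((c & 0x07) << 18)
         | ((uint32_t)(s[1] & 0x3F) << 12)
         | ((uint32_t)(s[2] & 0x3F) << 6)
         | (uint32_t)(s[3] & 0x3F);
}

/* Hypothetical front(): the ASCII case is one comparison and a branch,
   followed by return. Everything else goes through the helper. */
static inline uint32_t front(const unsigned char *s)
{
    if (s[0] < 0x80)
        return s[0];
    return decode_multibyte(s);
}
```

Calling `front` on the bytes `C3 A9` yields U+00E9. Whether the helper is inlined or kept as a call is left to the compiler here; that trade-off is exactly what the rest of this message is about.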
Currently front() - the other big time consumer in autodecoding - issues a function call in the multibyte case. That keeps the code of front() itself small, at the cost of more expensive multibyte handling.
I think at some point we have to cache the length of the last decoded char;
otherwise we are throwing work away.

However, that will only work within a RangeWrapper struct.
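
A rough sketch of what such a wrapper could look like, again in C rather than D. The type `Utf8Range` and the `utf8_*` functions are hypothetical names for illustration, and the input is assumed to be valid UTF-8. The decoded code point and its length are stored by front, so popFront can advance without decoding again.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper: a view over valid UTF-8 plus a one-entry cache. */
typedef struct {
    const unsigned char *ptr;  /* current position */
    size_t remaining;          /* bytes left in the view */
    uint32_t cached_char;      /* last decoded code point */
    size_t cached_len;         /* its encoded length; 0 means nothing is cached */
} Utf8Range;

static Utf8Range utf8_range(const unsigned char *s, size_t n)
{
    Utf8Range r = { s, n, 0, 0 };
    return r;
}

static int utf8_empty(const Utf8Range *r) { return r->remaining == 0; }

/* Decodes at most once per element and remembers the result. */
static uint32_t utf8_front(Utf8Range *r)
{
    if (r->cached_len == 0) {
        const unsigned char *s = r->ptr;
        uint32_t c = s[0];
        if (c < 0x80) {
            r->cached_len = 1;
        } else if (c < 0xE0) {
            c = ((c & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
            r->cached_len = 2;
        } else if (c < 0xF0) {
            c = ((c & 0x0F) << 12) | ((uint32_t)(s[1] & 0x3F) << 6)
              | (uint32_t)(s[2] & 0x3F);
            r->cached_len = 3;
        } else {
            c = ((c & 0x07) << 18) | ((uint32_t)(s[1] & 0x3F) << 12)
              | ((uint32_t)(s[2] & 0x3F) << 6) | (uint32_t)(s[3] & 0x3F);
            r->cached_len = 4;
        }
        r->cached_char = c;
    }
    return r->cached_char;
}

/* Reuses the cached length; decodes only if front was never called. */
static void utf8_pop_front(Utf8Range *r)
{
    if (r->cached_len == 0)
        (void)utf8_front(r);
    r->ptr += r->cached_len;
    r->remaining -= r->cached_len;
    r->cached_len = 0;  /* invalidate the cache for the next element */
}
```

The cost is two extra fields per range, which is why this only fits a wrapper type and not a bare string slice.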
