On Monday, 25 August 2014 at 22:40:00 UTC, Ola Fosheim Grøstad
wrote:
On Monday, 25 August 2014 at 21:53:50 UTC, Ola Fosheim Grøstad
wrote:
I presume you can load 16 bytes and do BITWISE-AND on the MSB,
then match against string-end and carefully use this to boost
performance of simultaneous UTF validation, escape-scanning,
and string-end scan. A bit tricky, of course.
I think it is doable and worth it…
https://software.intel.com/sites/landingpage/IntrinsicsGuide/
e.g.:
__mmask16 _mm_cmpeq_epu8_mask (__m128i a, __m128i b)
__mmask32 _mm256_cmpeq_epu8_mask (__m256i a, __m256i b)
__mmask64 _mm512_cmpeq_epu8_mask (__m512i a, __m512i b)
__mmask16 _mm_test_epi8_mask (__m128i a, __m128i b)
etc.
So you can:
1. preload registers with "\\\\\\\\…", "\"\"…" and "\0\0\0…"
2. then compare signed/unsigned/equal, whatever is needed.
3. then load 16, 32 or 64 bytes of data and stream until the
masks trigger
4. test the masks
5. resolve any potential issues, goto 3
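The steps above can be sketched in C roughly as follows. This is only a minimal sketch under some assumptions: it uses plain SSE2 (16 bytes per load, _mm_cmpeq_epi8 plus _mm_movemask_epi8) rather than the AVX-512 mask intrinsics listed above, the function name scan_plain_run is made up for illustration, and step 5 (resolving whatever was found) is left to the caller. It falls back to a byte loop where SSE2 is not available.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Hypothetical helper: index of the first byte in s[0..len) that needs
   attention: '\\', '"', '\0', or any byte with the MSB set (part of a
   non-ASCII UTF-8 sequence). Returns len if there is no such byte. */
size_t scan_plain_run(const unsigned char *s, size_t len)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* Step 1: preload registers with the bytes of interest. */
    const __m128i backslash = _mm_set1_epi8('\\');
    const __m128i quote     = _mm_set1_epi8('"');
    const __m128i zero      = _mm_setzero_si128();

    /* Step 3: stream 16 bytes at a time. */
    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(s + i));

        /* Step 2: compare for equality; a match yields an 0xFF byte. */
        __m128i hit = _mm_or_si128(
            _mm_cmpeq_epi8(chunk, backslash),
            _mm_or_si128(_mm_cmpeq_epi8(chunk, quote),
                         _mm_cmpeq_epi8(chunk, zero)));

        /* movemask gathers the MSB of every byte, so OR-ing in the raw
           chunk flags non-ASCII bytes as well. */
        int mask = _mm_movemask_epi8(_mm_or_si128(hit, chunk));

        /* Step 4: test the mask; the lowest set bit is the first hit. */
        if (mask != 0) {
            size_t k = 0;
            while (((mask >> k) & 1) == 0)
                k++;
            return i + k;
        }
    }
#endif
    /* Scalar tail, and the fallback when SSE2 is unavailable. */
    for (; i < len; i++) {
        unsigned char c = s[i];
        if (c == '\\' || c == '"' || c == 0 || (c & 0x80))
            return i;
    }
    return len;
}
```

A caller would copy or skip the bytes before the returned index in one go, handle the escape, quote, terminator or multi-byte sequence at that index, and then resume scanning after it.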
D:YAML uses a similar approach, but with 8 bytes at a time (a
plain ulong, so it is portable) to count how many ASCII chars
precede the first non-ASCII UTF-8 sequence. It significantly
improves performance. I didn't keep any numbers, unfortunately,
but it cuts decoding overhead to a fraction for most inputs,
since YAML (and JSON) files tend to be mostly ASCII, with
non-ASCII showing up from time to time in strings. If we know
that we have e.g. 100 incoming chars that are plain ASCII, we can
use a fast path for them and only consider decoding after that.
See the countASCII() function in
https://github.com/kiith-sa/D-YAML/blob/master/source/dyaml/reader.d
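For readers who don't want to dig through reader.d, the general idea looks roughly like this in C. This is a sketch of the technique only, not a port of the D:YAML code: the name count_ascii_prefix is made up, and it assumes the input is a byte buffer with a known length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the D:YAML code): number of leading bytes in
   s[0..len) that are plain ASCII, i.e. have the high bit clear. */
size_t count_ascii_prefix(const unsigned char *s, size_t len)
{
    const uint64_t high_bits = UINT64_C(0x8080808080808080);
    size_t i = 0;

    /* Fast path: test 8 bytes per iteration. memcpy sidesteps
       alignment and aliasing problems. */
    for (; i + 8 <= len; i += 8) {
        uint64_t word;
        memcpy(&word, s + i, sizeof word);
        if (word & high_bits)
            break; /* some byte in this word is non-ASCII */
    }

    /* Slow path: pin down the exact position byte by byte. This also
       handles the last few bytes and keeps the result independent of
       endianness. */
    while (i < len && (s[i] & 0x80) == 0)
        i++;
    return i;
}
```

The returned count is the length of the run that can go through the fast path without decoding; real UTF-8 decoding only has to start at that offset.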
However, this approach is useful only if you decode the whole
buffer at once, not if you do something like foreach(dchar ch;
"asdsššdfáľäô") {}, which is the most obvious way to decode in D.
FWIW, decoding _was_ a significant overhead in D:YAML (again, I
didn't keep numbers, but at one point it was around 10% in the
profiler), and I didn't like the fact that it prevented making my
code @nogc. I ended up copying chunks of std.utf and making them
@nogc nothrow. (D:YAML as a whole is not @nogc, but I use @nogc in
some parts basically as "@noalloc" to ensure I don't allocate
anything.)