Re: Challenge: write a really really small front() for UTF8

dnspies Mon, 24 Mar 2014 10:16:42 -0700

It seems like mine wasn't being inlined because I had carelesslyreplaced char[] with const(char)[] in the function signature (Idon't know why that should make a difference, but it does)

Going back to my original version (with Andre's trick to avoidletting the loop unroll), I get

dnspies_orig   159 - 16 labels = 143 instructions

However, MFortin1 can be made to do better by taking successiveslices instead of indexing (and commenting out the check, sinceit no longer helps optimize).

MFortin1_slice 137 - 13 labels = 124 instructions

http://goo.gl/JAgVK1

On Monday, 24 March 2014 at 13:39:45 UTC, Michel Fortin wrote:

On 2014-03-24 06:08:30 +0000, "safety0ff"<[email protected]> said:
Everything seems to work in your corrected versions:
http://dpaste.dzfl.pl/3b32535559ba
Andrei's version doesn't compile on recent compiler versionsdue to goto skipping initialization.
Ok, so let's check what happens in actual usage. In otherwords, what happens when front is inlined and used in a loop.I'm counting the number of assembly lines of this main functionfor looping through each possible input using the variousimplementations:
http://goo.gl/QjnA2k

implementation   line count of main in assembly
andralex         285 -  9 labels = 276 instructions
MFortin1         135 - 13 labels = 122 instructions
MFortin2         150 - 14 labels = 136 instructions
safety0ff         -- not inlined (too big?) --
dnspies          161 - 15 labels = 146 instructions
CWilliams        233 - 25 labels = 208 instructions
For compactness, my first implementation seems to be thewinner, both with and without inlining.
That said, I think front should only assume a non-empty char[]and thus should check the length before accessing anythingafter the first byte to protect against incomplete multi-bytesequences such as [0x1000_0000]. So I added this line to my twoimplementations:
   if (1+tailLength >= s.length) return dchar.init;

Now lets see what happens with those changes:

http://goo.gl/XPCGYE

implementation   line count of main in assembly
MFortin1Check    103 - 11 labels =  92 instructions
MFortin2Check    135 - 13 labels = 122 instructions
Now I'm baffled. Adding a test makes the code shorter? Itactually make the standalone functions longer by 8 instructions(as expected), but the inlined version is shorter. My guess isthat the optimizer creates something entirely different and itturns out that this different version optimises better afterinlining.
That said, my test "main" function isn't very representative ofthe general case because the length is statically known by theoptimizer. Let's see what it does when the length is notstatically known:
http://goo.gl/E2Q0Yu

implementation   line count of main in assembly
andralex         384 -  9 labels = 375 instructions
MFortin1         173 - 19 labels = 154 instructions
MFortin2         182 - 18 labels = 164 instructions
safety0ff         -- not inlined --
dnspies           -- not inlined --
CWilliams        229 - 23 labels = 206 instructions
MFortin1Check    211 - 24 labels = 187 instructions
MFortin2Check    218 - 21 labels = 197 instructions
So again MFortin1 is the winner for compactness. Still, Imaintain that we ought to at least check the length of thearray for multi-byte sequences to protect against incompletesequences that could lay at the end of the string, so I'd favorMFortin1Check.

Re: Challenge: write a really really small front() for UTF8

Reply via email to