On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu wrote:
Here's a baseline: http://goo.gl/91vIGc. Destroy!

Andrei

On a big-endian machine with loose alignment requirements (1 byte), you can do this, which comes down to 13 instructions on x86 (a meaningless figure, of course, since x86 has the wrong endianness):

uint front(char[] s) {
  if(!(s[0] & 0b1000_0000)) return s[0]; //handle ASCII
  assert(s[0] & 0b0100_0000); //a lead byte, not a continuation byte

  if(s[0] & 0b0010_0000)
  {
    if(s[0] & 0b0001_0000)
    {
      //4-byte sequence: load all four bytes in one go
      assert(s.length >= 4 && !(s[0] & 0b1000)
             && s[1] <= 0b1011_1111
             && s[2] <= 0b1011_1111
             && s[3] <= 0b1011_1111);
      return *(cast(dchar*)(s.ptr));
    }
    //3-byte sequence: load 4 bytes and shift the extra one out
    //(this can read one byte past the end of s)
    assert(s.length >= 3 && s[1] <= 0b1011_1111 && s[2] <= 0b1011_1111);
    return *(cast(dchar*)(s.ptr)) >> 8;
  }

  //2-byte sequence: load both bytes in one go
  assert(s.length >= 2 && s[1] <= 0b1011_1111);
  return *(cast(wchar*)(s.ptr));
}

http://goo.gl/Kf6RZJ


There may be architectures that can benefit from this.
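
For comparison, here is a minimal sketch of what the function above computes, written with explicit shifts so it gives the same value regardless of endianness or alignment. Note that, read this way, the result is the raw UTF-8 bytes packed into an integer (most significant byte first), not the decoded code point. The name frontPacked is hypothetical and not part of the original code; this is an illustration, not a tuned implementation.

```d
// Hypothetical helper (not from the original post): packs the bytes of the
// first UTF-8 sequence in s into a uint, most significant byte first.
uint frontPacked(const(char)[] s)
{
    immutable uint b0 = s[0];
    if (!(b0 & 0b1000_0000)) return b0;          // 1 byte (ASCII)
    assert(b0 & 0b0100_0000);                    // must be a lead byte

    if (!(b0 & 0b0010_0000))                     // 2 bytes: 110xxxxx
    {
        assert(s.length >= 2);
        return (b0 << 8) | s[1];
    }
    if (!(b0 & 0b0001_0000))                     // 3 bytes: 1110xxxx
    {
        assert(s.length >= 3);
        return (b0 << 16) | (s[1] << 8) | s[2];
    }
    assert(s.length >= 4 && !(b0 & 0b1000));     // 4 bytes: 11110xxx
    return (b0 << 24) | (s[1] << 16) | (s[2] << 8) | s[3];
}
```

For example, frontPacked("\u00E9") yields 0xC3A9, the two UTF-8 bytes of U+00E9, which is what the wchar load would produce on a big-endian machine.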
