On Sunday, 29 May 2016 at 20:40:52 UTC, qznc wrote:
On Sunday, 29 May 2016 at 18:15:16 UTC, qznc wrote:
On Sunday, 29 May 2016 at 17:38:17 UTC, Jonathan M Davis wrote:
And if you're not simply comparing for equality, what are you looking to figure out? Without more information about what you're trying to do, it's kind of hard to help you.

If I write the comparison naively, the assembly clearly shows a "movzbl" [0]. It loads a single byte! The other single byte load is encoded in the address mode of "cmp". Implementation:

bool stringcmp(string x, string y) {
  foreach(i; 0..x.length) {
    if (x[i] != y[i]) // byte compare
      return false;
  }
  return true;
}

It makes no sense to load single bytes here. Since we only want to check for equality, we could load two full words and compare four or eight bytes in one go.

Ok, to answer my own question, this looks good:

bool string_cmp_opt(immutable(ubyte)[] x, immutable(ubyte)[] y) {
    pragma(inline, false);
    if (x.length != y.length) return false;
    int i=0;
    // word-wise compare is faster than byte-wise
    if (x.length > size_t.sizeof)
        for (; i < x.length - size_t.sizeof; i+=size_t.sizeof) {
            size_t* xw = cast(size_t*) &x[i];
            size_t* yw = cast(size_t*) &x[i];
            if (*xw != *yw) return false;
        }
    // last sub-word part
    for (; i < x.length; i+=1) {
        if (x[i] != y[i]) // byte compare
            return false;
    }
    return true;
}

Any comments or recommendations?

I don't know if this would be faster, but here is my attempt.

It assumes the arrays start at an address multiple of 8.

if (x is y)
    return true;
if (x.length != y.length)
    return false;
size_t l = x.length;
ubyte* a = x.ptr, b = y.ptr;
for (size_t n = l>>3; n != 0; --n, a+=8, b+=8)
    if (*cast(long*)a ^ *cast(long*)b)
        return false;
if (l & 4)
{
    if (*cast(int*)a ^ *cast(int*)b)
        return false;
    a+= 4;
    b+= 4;
}
if (l & 2)
{
    if (*cast(short*)a ^ *cast(short*)b)
        return false;
    a+=2;
    b+=2;
}
return (l & 1) && (*a ^ *b) ? false : true;

If the pointers are not on an address multiple of 8, one has to inverse the trailing tests to consume the bytes in front of the array until the address becomes a multiple of 8.

The trailing tests could eventually be replaced by a simple sequential byte compare. I don't know which is faster.

Reply via email to