On Friday, 19 August 2016 at 20:28:13 UTC, Cauterite wrote:
Regarding the MurmurHash3 implementation in core.internal.hash, it is my understanding that:
    // assuming a and b are uints
    bytesHash([a, b], 0) == bytesHash([b], bytesHash([a], 0))
Is this correct?
I'm just not quite certain of this property when I try to read the code myself, and I don't know much about hash algorithms.

Phobos nightly contains MurmurHash3 in std.digest (I don't know why there are two implementations though), but I think this one will be easier to use ;-)

http://dlang.org/phobos-prerelease/std_digest_murmurhash.html

The API for bytesHash is

size_t bytesHash(const(void)* buf, size_t len, size_t seed)

so shouldn't be?

auto arr = [a,b];
bytesHash(&arr, arr.length, 42);


I'm just not quite certain of this property when I try to read

AFAIK the output of a hash isn't generally equal to the its upcoming seed as pre & postpreprocessing may apply. In the case of Murmurhash3 its post-processing (h1 ^= len):

void main() {
    import std.stdio;
    import core.internal.hash;
    uint a = 4, b = 23;
    auto arr = [a, b];

    auto b1 = bytesHash(&arr, arr.length, 42);

    auto l = arr[0..1], r = arr[1..$];
    auto b2a = bytesHash(&l, 1, 42);
    auto b2b= bytesHash(&r, 1, b2a);
    assert(b1 == b2b);
}

However with std.digest you can "save" the current state of a hash function and continue to give new items to the hasher:

void main()
{

    import std.digest.murmurhash; // required Phobos nightly
    uint a = 4, b = 23;
    auto arr = [a, b];
    auto l = arr[0..1], r = arr[1..$];

    MurmurHash3!32 hasher;
    hasher.put(cast(ubyte[]) l);
    hasher.put(cast(ubyte[]) r);

    auto v1 = hasher.finish();

    hasher = MurmurHash3!32();
    hasher.put(cast(ubyte[]) arr);
    auto v2 = hasher.finish();

    assert(v1 == v2);
}


Reply via email to