Re: MurmurHash3 behaviour

2016-08-20 Thread Ilya Yaroshenko via Digitalmars-d-learn
On Saturday, 20 August 2016 at 09:15:00 UTC, Ilya Yaroshenko 
wrote:

On Friday, 19 August 2016 at 20:28:13 UTC, Cauterite wrote:
Regarding the MurmurHash3 implementation in 
core.internal.hash, it is my understanding that:

// assuming a and b are uints
bytesHash([a, b], 0) == bytesHash([b], bytesHash([a], 0))
Is this correct?
I'm just not quite certain of this property when I try to read 
the code myself, and I don't know much about hash algorithms.


DRuntime has Murmurhash2, not 3.


Oh, maybe I am wrong. Anyway 128bit Murmurhash3 hash from 
std.digest would much faster on 64 bit CPUs.


Re: MurmurHash3 behaviour

2016-08-20 Thread Ilya Yaroshenko via Digitalmars-d-learn

On Friday, 19 August 2016 at 20:28:13 UTC, Cauterite wrote:
Regarding the MurmurHash3 implementation in core.internal.hash, 
it is my understanding that:

// assuming a and b are uints
bytesHash([a, b], 0) == bytesHash([b], bytesHash([a], 0))
Is this correct?
I'm just not quite certain of this property when I try to read 
the code myself, and I don't know much about hash algorithms.


DRuntime has Murmurhash2, not 3.


Re: MurmurHash3 behaviour

2016-08-20 Thread Seb via Digitalmars-d-learn

On Friday, 19 August 2016 at 21:29:43 UTC, Cauterite wrote:

On Friday, 19 August 2016 at 21:03:22 UTC, Seb wrote:

http://dlang.org/phobos-prerelease/std_digest_murmurhash.html


Ah great, I just finished writing my own murmurhash digest 
module ( 
https://github.com/Cauterite/phobos/blob/murmur/std/digest/murmur.d ), and now I discover someone's already done it.

Glad I wasted the last 3 hours of my life ;_;


Well, we all learn for life?
For example you could start to keep your fork up to date with 
upstream. It was merged two months ago ;-)


Re: MurmurHash3 behaviour

2016-08-19 Thread Cauterite via Digitalmars-d-learn

On Friday, 19 August 2016 at 21:03:22 UTC, Seb wrote:

http://dlang.org/phobos-prerelease/std_digest_murmurhash.html


Ah great, I just finished writing my own murmurhash digest module 
( 
https://github.com/Cauterite/phobos/blob/murmur/std/digest/murmur.d ), and now I discover someone's already done it.

Glad I wasted the last 3 hours of my life ;_;


Re: MurmurHash3 behaviour

2016-08-19 Thread Seb via Digitalmars-d-learn

On Friday, 19 August 2016 at 20:28:13 UTC, Cauterite wrote:
Regarding the MurmurHash3 implementation in core.internal.hash, 
it is my understanding that:

// assuming a and b are uints
bytesHash([a, b], 0) == bytesHash([b], bytesHash([a], 0))
Is this correct?
I'm just not quite certain of this property when I try to read 
the code myself, and I don't know much about hash algorithms.


Phobos nightly contains MurmurHash3 in std.digest (I don't know 
why there are two implementations though), but I think this one 
will be easier to use ;-)


http://dlang.org/phobos-prerelease/std_digest_murmurhash.html

The API for bytesHash is

size_t bytesHash(const(void)* buf, size_t len, size_t seed)

so shouldn't be?

auto arr = [a,b];
bytesHash(, arr.length, 42);



I'm just not quite certain of this property when I try to read


AFAIK the output of a hash isn't generally equal to the its 
upcoming seed as pre & postpreprocessing may apply. In the case 
of Murmurhash3 its post-processing (h1 ^= len):


void main() {
import std.stdio;
import core.internal.hash;
uint a = 4, b = 23;
auto arr = [a, b];

auto b1 = bytesHash(, arr.length, 42);

auto l = arr[0..1], r = arr[1..$];
auto b2a = bytesHash(, 1, 42);
auto b2b= bytesHash(, 1, b2a);
assert(b1 == b2b);
}

However with std.digest you can "save" the current state of a 
hash function and continue to give new items to the hasher:


void main()
{

import std.digest.murmurhash; // required Phobos nightly
uint a = 4, b = 23;
auto arr = [a, b];
auto l = arr[0..1], r = arr[1..$];

MurmurHash3!32 hasher;
hasher.put(cast(ubyte[]) l);
hasher.put(cast(ubyte[]) r);

auto v1 = hasher.finish();

hasher = MurmurHash3!32();
hasher.put(cast(ubyte[]) arr);
auto v2 = hasher.finish();

assert(v1 == v2);
}




MurmurHash3 behaviour

2016-08-19 Thread Cauterite via Digitalmars-d-learn
Regarding the MurmurHash3 implementation in core.internal.hash, 
it is my understanding that:

// assuming a and b are uints
bytesHash([a, b], 0) == bytesHash([b], bytesHash([a], 0))
Is this correct?
I'm just not quite certain of this property when I try to read 
the code myself, and I don't know much about hash algorithms.