A while ago, I had the idea of defining a crystallographic test set
with a simple function that produces the same test flag for a given HKL,
independent of how or when the flags are assigned. An integer hash
function can do this.
A 32-bit integer hash is enough for most cases, but I decided that a
64-bit hash is better. This allows for three 16-bit HKL indices, plus one
'seed' value. The test assignment procedure is to pack {seed,H,K,L} as a
64-bit integer, and pass this through a 64-bit hash. If the hash
function is good, the result is a 64-bit number that appears random, but
is precisely defined for a given input value. We convert this to a
floating-point number in the range 0-1, and convert that into a test-set
value depending on the percentage choice.
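
As a rough sketch of the steps above (not necessarily the exact procedure intended here), the packing and the final conversion to a flag could look like the following. The names wang_mix64, pack_hkl and is_free are hypothetical, and treating "hash fraction below the test fraction" as the test-set rule is an assumption; a multi-valued flag could be derived from the same fraction instead.

```c
#include <stdint.h>

/* Thomas Wang's 64-bit mix, the hash named in this post. */
static uint64_t wang_mix64(uint64_t key)
{
    key += ~(key << 32);
    key ^= (key >> 22);
    key += ~(key << 13);
    key ^= (key >> 8);
    key += (key << 3);
    key ^= (key >> 15);
    key += ~(key << 27);
    key ^= (key >> 31);
    return key;
}

/* Pack {seed,H,K,L} into one 64-bit key, 16 bits per field.
 * Each index goes through uint16_t first so that a negative value
 * stays inside its own 16-bit field. */
static uint64_t pack_hkl(uint16_t seed, int16_t h, int16_t k, int16_t l)
{
    return ((uint64_t)seed << 48)
         | ((uint64_t)(uint16_t)h << 32)
         | ((uint64_t)(uint16_t)k << 16)
         |  (uint64_t)(uint16_t)l;
}

/* Hypothetical helper: returns 1 if the reflection belongs to the test
 * set, 0 otherwise. Assumed rule: hash fraction < test fraction. */
static int is_free(uint16_t seed, int16_t h, int16_t k, int16_t l,
                   double fraction)
{
    /* Top 53 bits of the hash divided by 2^53 gives a value in [0,1). */
    double x = (double)(wang_mix64(pack_hkl(seed, h, k, l)) >> 11)
             / 9007199254740992.0;
    return x < fraction;
}
```

With this, is_free(seed, h, k, l, 0.05) would mark roughly 5% of reflections, and the same call gives the same answer for any data set that uses the same seed.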
The result is a well-defined test array, based on the seed value and
test fraction, that is trivial to reproduce, and in fact never has to be
written to a reflection file. This makes it easy to reliably maintain
the same test set for multiple data sets.
In the case of significant NCS-related reflections, the thin-shell
selection method can also be written as an equation rather than as an
array of values. So, in both cases, it should be possible to utilize
Free-R sets by definition rather than writing out values.
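
One way the thin-shell case might be written as a function is to hash a resolution-shell index instead of the individual HKL, so that every reflection in a shell gets the same flag. This is only a sketch under loud assumptions: an orthorhombic cell (a, b, c), shells of equal width in 1/d^2, and the hypothetical names inv_d2 and is_free_shell. None of these choices come from the post itself.

```c
#include <stdint.h>

/* Thomas Wang's 64-bit mix, the hash named in this post. */
static uint64_t wang_mix64(uint64_t key)
{
    key += ~(key << 32);
    key ^= (key >> 22);
    key += ~(key << 13);
    key ^= (key >> 8);
    key += (key << 3);
    key ^= (key >> 15);
    key += ~(key << 27);
    key ^= (key >> 31);
    return key;
}

/* 1/d^2 for an orthorhombic cell (assumption: all cell angles 90 deg). */
static double inv_d2(int h, int k, int l, double a, double b, double c)
{
    return (double)(h * h + 0) / (a * a)
         + (double)(k * k) / (b * b)
         + (double)(l * l) / (c * c);
}

/* Hypothetical helper: 1 if the whole shell containing (h,k,l) is in
 * the test set. shell_width is the shell thickness in 1/d^2 units. */
static int is_free_shell(uint16_t seed, int h, int k, int l,
                         double a, double b, double c,
                         double shell_width, double fraction)
{
    /* Truncation equals floor here because 1/d^2 is never negative. */
    uint64_t shell = (uint64_t)(inv_d2(h, k, l, a, b, c) / shell_width);
    /* Seed in the top 16 bits, shell index in the lower 48 bits. */
    uint64_t key = ((uint64_t)seed << 48) | (shell & 0xFFFFFFFFFFFFULL);
    double x = (double)(wang_mix64(key) >> 11) / 9007199254740992.0;
    return x < fraction;
}
```

Reflections at the same resolution, which includes NCS-related ones, fall in the same shell and therefore share a flag. The shell width and the cell would have to be part of the definition along with the seed and test fraction.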
Here is an example, using "Thomas Wang's 64-bit Mix" hash function. It
appears to work well.
Joe Krahn
------------------------------------------------------------------------
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
// Thomas Wang's 64-bit Mix Hash Function
uint64_t hash64(uint64_t key) {
    key += ~(key << 32);
    key ^= (key >> 22);
    key += ~(key << 13);
    key ^= (key >> 8);
    key += (key << 3);
    key ^= (key >> 15);
    key += ~(key << 27);
    key ^= (key >> 31);
    return key;
}
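double hash_hkl( uint16_t seed, int16_t h, int16_t k, int16_t l){
    uint64_t n;
    // Cast each index to uint16_t before widening: converting a negative
    // int16_t straight to uint64_t sign-extends and overwrites the seed
    // and the other index fields.
    n = hash64( (((uint64_t)seed) << 48) | (((uint64_t)(uint16_t)h) << 32)
              | (((uint64_t)(uint16_t)k) << 16) | ((uint64_t)(uint16_t)l) );
    // Keep the top 53 bits (all a double can hold exactly) and divide by
    // 2^53, so the result is always in [0,1) and never rounds up to 1.0.
    return (double)(n >> 11) / 9007199254740992.0;
}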
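
A quick sanity check one could run on the idea is to count how many reflections in a block of HKL space fall below a chosen test fraction. The sketch below repeats the hash so that it stands alone; hkl_uniform and free_fraction are hypothetical names, and the "fraction below threshold" rule is an assumption about how the flag is derived.

```c
#include <stdint.h>

/* Thomas Wang's 64-bit mix, the hash named in this post. */
static uint64_t wang_mix64(uint64_t key)
{
    key += ~(key << 32);
    key ^= (key >> 22);
    key += ~(key << 13);
    key ^= (key >> 8);
    key += (key << 3);
    key ^= (key >> 15);
    key += ~(key << 27);
    key ^= (key >> 31);
    return key;
}

/* Map {seed,H,K,L} to a reproducible value in [0,1). */
static double hkl_uniform(uint16_t seed, int16_t h, int16_t k, int16_t l)
{
    uint64_t key = ((uint64_t)seed << 48)
                 | ((uint64_t)(uint16_t)h << 32)
                 | ((uint64_t)(uint16_t)k << 16)
                 |  (uint64_t)(uint16_t)l;
    return (double)(wang_mix64(key) >> 11) / 9007199254740992.0;
}

/* Hypothetical check: observed test-set fraction over all indices
 * with -hmax <= h,k,l <= hmax. */
static double free_fraction(uint16_t seed, int hmax, double fraction)
{
    long total = 0, nfree = 0;
    for (int h = -hmax; h <= hmax; h++)
        for (int k = -hmax; k <= hmax; k++)
            for (int l = -hmax; l <= hmax; l++) {
                double x = hkl_uniform(seed, (int16_t)h, (int16_t)k,
                                       (int16_t)l);
                total++;
                if (x < fraction)
                    nfree++;
            }
    return (double)nfree / (double)total;
}
```

If the hash mixes well, free_fraction(seed, 20, 0.05) should come out close to 0.05, and repeating the call with the same seed should give exactly the same number.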