[ccp4bb]: Free-R hash function

Joe Krahn Fri, 18 Nov 2005 08:59:16 -0800

A while ago, I had the idea of defining a crystallographic test set with a simple function that produces the same test flag for a given HKL, independent of how the flags are assigned. An integer hash function can do this.

A 32-bit integer hash is enough for most cases, but I decided that a 64-bit hash is better. This allows for three 16-bit HKL indices plus one 16-bit 'seed' value. The test assignment procedure is to pack {seed,H,K,L} as a 64-bit integer and pass this through a 64-bit hash. If the hash function is good, the result is a 64-bit number that appears random but is precisely defined for a given input value. We convert this to a floating-point number in the range 0-1, and convert that into a test-set value depending on the percentage choice.

The result is a well-defined test array, based on the seed value and test fraction, that is trivial to reproduce, and in fact never has to be written to a reflection file. This makes it easy to reliably maintain the same test set for multiple data sets.

In the case of significant NCS-related reflections, the thin-shell selection method can also be written as an equation rather than as an array of values. So, in both cases, it should be possible to utilize Free-R sets by definition rather than writing out values.
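The post does not spell out the thin-shell equation. As a minimal sketch of what such a rule could look like, assume shells evenly spaced in s = 1/d, with the first part of every shell assigned to the test set. The function name thin_shell_is_free and its parameters are hypothetical, and the caller is assumed to compute s from the cell and HKL.

```c
#include <assert.h>

/* Hypothetical thin-shell rule (an assumption, not taken from the post):
 * shells repeat every `period` in s = 1/d, and the first `fraction` of
 * each shell belongs to the test set.
 * Returns 1 for a test reflection and 0 for a working reflection. */
int thin_shell_is_free(double s, double period, double fraction) {
  assert(s >= 0.0 && period > 0.0);
  assert(fraction >= 0.0 && fraction <= 1.0);

  double t = s / period;                      /* position in units of shells */
  double frac = t - (double)(long long)t;     /* fractional part, valid for t >= 0 */
  return frac < fraction;
}
```

With this form the test set is fixed by two numbers (period and fraction), so nothing needs to be stored per reflection.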

Here is an example, using "Thomas Wang's 64-bit Mix" hash function. It appears to work well.


Joe Krahn
------------------------------------------------------------------------

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>

// Thomas Wang's 64 bit Mix Hash Function
uint64_t hash64(uint64_t key) {
  key += ~(key << 32);
  key ^= (key >> 22);
  key += ~(key << 13);
  key ^= (key >> 8);
  key += (key << 3);
  key ^= (key >> 15);
  key += ~(key << 27);
  key ^= (key >> 31);
  return key;
}

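double hash_hkl( uint16_t seed, int16_t h, int16_t k, int16_t l){
  uint64_t n;
  // Cast each index to uint16_t before widening: a negative int16_t cast
  // straight to uint64_t sign-extends and would overwrite the higher fields.
  n = hash64( (((uint64_t)seed) << 48) | (((uint64_t)(uint16_t)h) << 32)
      | (((uint64_t)(uint16_t)k) << 16) | ((uint64_t)(uint16_t)l) );

  // Divide n by the max-value of a 64-bit unsigned int plus one (2^64).
  // The result lies in 0-1; the very largest hash values can round up
  // to exactly 1.0 when converted to double.
  return (double)n / ( ((double) ~((uint64_t)0)) + 1.0);
}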
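
As a usage sketch, the last step described in the text (turning the 0-1 value into a test flag) might look like the following. The helpers pack_hkl and is_free are hypothetical names, not part of the listing above, and a simple "below the test fraction" rule is assumed. The hash is repeated so that the sketch compiles on its own, and each index is cast to uint16_t before shifting so that negative indices do not spill into neighbouring fields.

```c
#include <assert.h>
#include <stdint.h>

/* Thomas Wang's 64-bit mix, repeated from the listing above. */
uint64_t hash64(uint64_t key) {
  key += ~(key << 32);
  key ^= (key >> 22);
  key += ~(key << 13);
  key ^= (key >> 8);
  key += (key << 3);
  key ^= (key >> 15);
  key += ~(key << 27);
  key ^= (key >> 31);
  return key;
}

/* Hypothetical helper: pack {seed,h,k,l} into one 64-bit key,
 * 16 bits per field, seed in the top bits. */
uint64_t pack_hkl(uint16_t seed, int16_t h, int16_t k, int16_t l) {
  return ((uint64_t)seed << 48)
       | ((uint64_t)(uint16_t)h << 32)
       | ((uint64_t)(uint16_t)k << 16)
       |  (uint64_t)(uint16_t)l;
}

/* Map the hashed key to a floating-point value in the range 0-1. */
double hash_hkl(uint16_t seed, int16_t h, int16_t k, int16_t l) {
  return (double)hash64(pack_hkl(seed, h, k, l)) / 18446744073709551616.0; /* 2^64 */
}

/* Hypothetical helper: 1 if the reflection is in the test set, else 0.
 * Assumes the test set is "hash value below the test fraction". */
int is_free(uint16_t seed, int16_t h, int16_t k, int16_t l, double fraction) {
  assert(fraction >= 0.0 && fraction <= 1.0);
  return hash_hkl(seed, h, k, l) < fraction;
}
```

For a 5% test set one would call, for example, is_free(seed, h, k, l, 0.05) for each reflection. Any program using the same seed and fraction gets the same flags without reading them from a file.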
