I just found this vector distance idea in a technical paper: create a space defined by X random vectors. For each of your data vectors, take the cosine similarity to each random vector and save the sign of the value as a bit. This gives a bit set of X bits per data vector. A different distance measure, or a different rule for picking the bit value, could be used instead.
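Here is a rough sketch of how I understand the idea, in plain Python. It is not taken from the paper. The function names, the Gaussian random vectors, the fixed seed, and packing the bits into an int are all my own assumptions. It uses the sign of the dot product, which is the same as the sign of the cosine because the vector lengths are positive.

```python
import random

def make_random_vectors(num_bits, dim, seed=0):
    """Draw num_bits random vectors of length dim (Gaussian entries, an assumption)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def sign_bits(vector, random_vectors):
    """Pack one bit per random vector into an int: 1 if the dot product is >= 0, else 0."""
    bits = 0
    for r in random_vectors:
        dot = sum(a * b for a, b in zip(vector, r))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits

def hamming(a, b):
    """Number of bit positions where the two bit sets differ."""
    return bin(a ^ b).count("1")

# Hypothetical usage: X = 64 bits for 3-dimensional data vectors.
planes = make_random_vectors(64, 3)
a = sign_bits([1.0, 2.0, 3.0], planes)
b = sign_bits([2.0, 4.0, 6.0], planes)     # same direction as a, so no bits differ
c = sign_bits([-1.0, -2.0, -3.0], planes)  # opposite direction to a
print(hamming(a, b), hamming(a, c))
```

Only the direction of a data vector matters here, not its length, so this stands in for an angular distance rather than a Euclidean one.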
The effect is to stop using numerical vectors as a "carrier signal" for the concept of "positions and distances". This is a different, more focused representation. And Hamming distance is somewhat faster to compute than Euclidean :) Of course, picking enough bits is a problem. However, I lost the paper. Does this ring a bell with anyone?

-- Lance Norskog [email protected]
