From MAHOUT-344, quoting the patch author:

The idea behind keyGroups is to concatenate hashes from multiple hash functions
to reduce the probability of collision between 2 users that agreed on 1 or more
individual hash values. This essentially improves the average similarity of
users in a cluster.
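
To make that concrete, here is a rough sketch of the idea, not the actual MinHash mapper code. The class and method names (KeyGroupsSketch, clusterKeys, collisionProbability) are made up for illustration, and the sliding, wrap-around window over the hash values is my assumption about how the groups are formed. The probability helper relies on the usual idealized min-hash property: two users with Jaccard similarity s agree on one hash with probability s, so they agree on keyGroups independent hashes with probability s^keyGroups.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of keyGroups; not taken from the Mahout source.
final class KeyGroupsSketch {

    // Builds one cluster key per starting position by joining keyGroups
    // consecutive min-hash values with '-'. The wrap-around window is an
    // assumption made for this sketch.
    static List<String> clusterKeys(long[] minHashes, int keyGroups) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < minHashes.length; i++) {
            StringBuilder key = new StringBuilder();
            for (int j = 0; j < keyGroups; j++) {
                if (j > 0) {
                    key.append('-');
                }
                key.append(minHashes[(i + j) % minHashes.length]);
            }
            keys.add(key.toString());
        }
        return keys;
    }

    // Chance that two users with the given Jaccard similarity agree on all
    // keyGroups hashes, assuming ideal, independent hash functions.
    static double collisionProbability(double similarity, int keyGroups) {
        return Math.pow(similarity, keyGroups);
    }

    public static void main(String[] args) {
        // Made-up min-hash values: the two users agree on 3 of 4 hashes.
        long[] userA = {11L, 42L, 7L, 93L};
        long[] userB = {11L, 58L, 7L, 93L};

        List<String> sharedSingle = new ArrayList<>(clusterKeys(userA, 1));
        sharedSingle.retainAll(clusterKeys(userB, 1));
        List<String> sharedPairs = new ArrayList<>(clusterKeys(userA, 2));
        sharedPairs.retainAll(clusterKeys(userB, 2));

        System.out.println("shared keys, keyGroups=1: " + sharedSingle);
        System.out.println("shared keys, keyGroups=2: " + sharedPairs);
        System.out.println("P(collision), s=0.5, keyGroups=2: "
                + collisionProbability(0.5, 2));
    }
}
```

With these made-up values the two users share three keys at keyGroups=1 but only two at keyGroups=2, so a larger keyGroups means two users have to agree on more hashes before they land in the same cluster.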

-Grant

On Nov 7, 2011, at 8:54 PM, Suneel Marthi wrote:

> Do we have an answer for this?
> 
> Sent from my iPhone
> 
> On Nov 2, 2011, at 7:20 AM, Grant Ingersoll <[email protected]> wrote:
> 
>> What's the Minhash key groups value used for in the MinhashDriver?  I mean, 
>> I see it is used for building up the key out of the hashed values, but 
>> what's the significance of different values for it?  The default is 2, what 
>> does it mean practically speaking if I choose, say, 10?  AFAICT, it would 
>> mean that I would have more clusters, assuming that we still meet the 
>> minimum cluster size imposed by the reducer?
>> 
>> Thanks,
>> Grant

