Really? I'd have thought the remainders would be evenly distributed among 0, 1, 2, ..., 35. In fact, when the numbers 1 through 1000 are divided by 36, 28 of them leave each of the remainders 1 through 28, and 27 leave each of the other remainders, including zero. You can demonstrate it in an Excel spreadsheet in about 2 minutes.
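The same check can be sketched in a few lines of Python instead of a spreadsheet. This is only an illustration: the function name `remainder_counts` and its parameters are made up here, and the range 1 through 1000 with divisor 36 is taken from the paragraph above. It tallies how many numbers leave each remainder; the tallies should differ from one another by at most one.

```python
from collections import Counter

def remainder_counts(modulus, first=1, last=1000):
    """Tally how often each remainder occurs when first..last are divided by modulus."""
    return Counter(n % modulus for n in range(first, last + 1))

counts = remainder_counts(36)

# One line per remainder: the remainder, then how many numbers produced it.
for r in range(36):
    print(r, counts[r])

# Difference between the fullest and emptiest remainder class.
print("spread:", max(counts.values()) - min(counts.values()))
```

Changing the argument to `remainder_counts` makes it easy to try other divisors, prime or composite, on the same range.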
David de Jongh

-----Original Message-----
From: IBM Mainframe Assembler List [mailto:[email protected]] On Behalf Of John Gilmore
Sent: Thursday, November 01, 2012 5:15 PM
To: [email protected]
Subject: Re: Curosity Question

If a hashing scheme is working well there is almost no clustering.

Suppose we divide by 17, a prime, i.e., use it, in the jargon, as our hashing modulus. Remainders will have one of the 17 values 0, 1, 2, . . . , 16. If, over some goodly number of hashing operations, the same or about the same number of each of the hash values 0, 1, 2, . . . , 16 is generated, clustering does not occur.

For concreteness, suppose we do 170 divisions. Then if clustering does not occur there are about 10 remainders having the value 0, about 10 having the value 1, about 10 having the value 2, etc., etc.

What happens when the divisor used is composite is that hash values that are prime factors of the divisor occur more frequently than others. For 36 we have 36 = 2 x 2 x 3 x 3, which is usually written 2^2 x 3^2 or 2**2 x 3**2. Its prime factors are 2 and 3; and when it is used as a divisor there are more remainders having the value 2 and the value 3 than there are having other pairs of values.

37, on the other hand, is prime, divisible only by 1 and itself. Its use as a divisor yields no clustering of remainders.

Never hesitate to ask notional gurus such questions. A request for a further explanation is always in order.

--jg
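The 170-division example above can be checked with a short sketch. It assumes the keys being hashed are simply the integers 1 through 170, which the message does not specify, and `bucket_spread` is a hypothetical helper name used only for illustration. It reports the smallest and largest number of keys that land on any one remainder.

```python
from collections import Counter

def bucket_spread(modulus, keys):
    """Return (smallest, largest) count of keys sharing a remainder modulo `modulus`."""
    counts = Counter(k % modulus for k in keys)
    # Include remainders that never occur, so an empty bucket counts as 0.
    sizes = [counts[r] for r in range(modulus)]
    return min(sizes), max(sizes)

# Assumed key set: the integers 1..170, divided by the prime modulus 17.
print(bucket_spread(17, range(1, 171)))  # (10, 10)
```

The same helper can be pointed at 36 or 37 to compare a composite and a prime modulus on whatever set of keys is of interest.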
