Re: [SPARK ML] Minhash integer overflow

2018-07-07 Thread Kazuaki Ishizaki
Of course, the hash value can just be negative. I thought that it would be the result of a computation without overflow. When I checked another implementation, it performs the computation with int: https://github.com/ALShum/MinHashLSH/blob/master/LSH.java#L89 By copy to Xjiayuan, did you compare the hash

Re: [SPARK ML] Minhash integer overflow

2018-07-07 Thread Sean Owen
I think it probably still does its job; the hash value can just be negative. It is likely to be very slightly biased, though. Because the intent doesn't seem to be to allow the overflow, it's worth changing to use longs for the calculation. On Fri, Jul 6, 2018, 8:36 PM jiayuanm wrote: > Hi
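
To make the overflow discussed in this thread concrete, here is a minimal sketch in Java. It assumes a hash of the form ((1 + elem) * a + b) % prime. The class name, method names, and the modulus value are illustrative assumptions, not copied from the Spark source or from the implementation linked above.

```java
// Illustrative sketch only: names and constants are assumptions, not Spark's code.
final class MinHashOverflow {
    // Assumed prime modulus for the hash family (fits in a signed 32-bit int).
    static final int HASH_PRIME = 2038074743;

    // All-int arithmetic: (1 + elem) * a can silently wrap past Integer.MAX_VALUE,
    // so the remainder may come out negative.
    static int hashInt(int elem, int a, int b) {
        return ((1 + elem) * a + b) % HASH_PRIME;
    }

    // Widen to long before multiplying, so the intermediate product cannot overflow.
    static int hashLong(int elem, int a, int b) {
        return (int) (((1L + elem) * a + b) % HASH_PRIME);
    }

    public static void main(String[] args) {
        int elem = 2, a = 1_000_000_000, b = 0;
        System.out.println(hashInt(elem, a, b));   // -1294967296 (wrapped)
        System.out.println(hashLong(elem, a, b));  // 961925257
    }
}
```

With non-negative elem, a, and b, the long version always lands in [0, HASH_PRIME), while the int version can wrap and go negative once the product exceeds the int range. When nothing overflows, the two agree.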