subject:"\[SPARK ML\] Minhash integer overflow"

Re: [SPARK ML] Minhash integer overflow

2018-07-07 Thread Kazuaki Ishizaki

value generated by Spark with that generated by other implementations? Regards, Kazuaki Ishizaki From: Sean Owen To: jiayuanm Cc: dev@spark.apache.org Date: 2018/07/07 15:46 Subject: Re: [SPARK ML] Minhash integer overflow I think it probably still does its job; the hash

Re: [SPARK ML] Minhash integer overflow

2018-07-07 Thread Sean Owen

I think it probably still does its job; the hash value can just be negative. It is likely to be very slightly biased, though. Because the intent doesn't seem to be to allow the overflow, it's worth changing to use longs for the calculation. On Fri, Jul 6, 2018, 8:36 PM jiayuanm wrote: > Hi
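The point above, that an overflowing Int computation can yield a negative hash while a wider type does not, can be illustrated with a small sketch. This is not the Spark code: the function names, the prime, and the coefficients are assumptions chosen for illustration, and the hash is assumed to have the general form ((1 + elem) * a + b) mod prime. Python integers do not overflow, so 32-bit wraparound and the JVM's sign-preserving remainder are simulated explicitly.

```python
# Illustrative prime below 2**31; an assumption, not taken from the thread.
PRIME = 2038074743


def wrap_int32(x):
    """Reduce x to a signed 32-bit value, as JVM Int arithmetic does on overflow."""
    return (x + 2**31) % 2**32 - 2**31


def jvm_rem(x, m):
    """Remainder that keeps the sign of the dividend, like the JVM % operator."""
    r = abs(x) % m
    return -r if x < 0 else r


def hash_int32(elem, a, b):
    """Hypothetical hash evaluated entirely in simulated 32-bit arithmetic."""
    product = wrap_int32((1 + elem) * a)
    total = wrap_int32(product + b)
    return jvm_rem(total, PRIME)


def hash_wide(elem, a, b):
    """Same formula without wraparound, standing in for a Long computation."""
    return ((1 + elem) * a + b) % PRIME


if __name__ == "__main__":
    # (1 + 1) * 2**30 equals 2**31, which does not fit in a signed 32-bit Int.
    print(hash_int32(1, 2**30, 0))  # negative under 32-bit wraparound
    print(hash_wide(1, 2**30, 0))   # non-negative with wide arithmetic
```

With small inputs the two functions agree; they diverge only once an intermediate result exceeds the 32-bit range, which is why the effect is easy to miss.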

Re: [SPARK ML] Minhash integer overflow

2018-07-06 Thread jiayuanm

Sure. JIRA ticket is here: https://issues.apache.org/jira/browse/SPARK-24754. I'll create the PR.

Re: [SPARK ML] Minhash integer overflow

2018-07-06 Thread Kazuaki Ishizaki

@spark.apache.org Date: 2018/07/07 10:36 Subject: [SPARK ML] Minhash integer overflow Hi everyone, I was playing around with the LSH/MinHash module from Spark ML. I noticed that the hash computation is done with Int (see https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache

[SPARK ML] Minhash integer overflow

2018-07-06 Thread jiayuanm

Hi everyone, I was playing around with the LSH/MinHash module from Spark ML. I noticed that the hash computation is done with Int (see https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala#L69). Since "a" and "b" are from a uniform

