[GitHub] [spark] srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large

2019-12-05 Thread GitBox
srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large
URL: https://github.com/apache/spark/pull/26722#issuecomment-562385030
I'm OK with putting it in 2.4, I think. It's a minor behavior change, but, also appears to be

[GitHub] [spark] srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large

2019-12-04 Thread GitBox
srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large
URL: https://github.com/apache/spark/pull/26722#issuecomment-561703811
PS: does this also solve your problem? This change sounds OK to me.

[GitHub] [spark] srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large

2019-12-02 Thread GitBox
srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large
URL: https://github.com/apache/spark/pull/26722#issuecomment-560431648
Ah right, disregard my previous comment. Am I right that the original implementation, being

[GitHub] [spark] srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large

2019-12-01 Thread GitBox
srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large
URL: https://github.com/apache/spark/pull/26722#issuecomment-560191791
Hm, that value isn't negative though, just very small. The next line, perhaps accidentally,

[GitHub] [spark] srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large

2019-12-01 Thread GitBox
srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large
URL: https://github.com/apache/spark/pull/26722#issuecomment-560173675
Hm, that's a crazy result. Something is wrong, to be sure. I can't imagine why just 5

[GitHub] [spark] srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large

2019-12-01 Thread GitBox
srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large
URL: https://github.com/apache/spark/pull/26722#issuecomment-560125533
BTW, what are the exponents on these figures? Can you print more, or print their magnitude?

[GitHub] [spark] srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large

2019-11-30 Thread GitBox
srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large
URL: https://github.com/apache/spark/pull/26722#issuecomment-560018694
I see, so are you saying the weights are effectively N times larger with N partitions than with 1?

[GitHub] [spark] srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large

2019-11-30 Thread GitBox
srowen commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large
URL: https://github.com/apache/spark/pull/26722#issuecomment-560012547
Hm, I don't think it's only cosine similarity that matters; these are often used in general in