ZhongYu created SPARK-24666:
-------------------------------

             Summary: Word2Vec generate infinity vectors when numIterations are large
                 Key: SPARK-24666
                 URL: https://issues.apache.org/jira/browse/SPARK-24666
             Project: Spark
          Issue Type: Bug
          Components: ML, MLlib
    Affects Versions: 2.3.1
         Environment:  2.0.X, 2.1.X, 2.2.X, 2.3.X
            Reporter: ZhongYu


We found that Word2Vec generates vectors with large absolute values when
numIterations is large, and if numIterations is large enough (>20), the vector
values may be infinity (or -infinity), resulting in useless vectors.

In normal situations, vector values are mainly in the range -1.0 to 1.0 when
numIterations = 1.
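
For anyone checking whether a trained model is affected, the symptom described
above can be detected with a small helper. This is only a minimal sketch, not
part of Spark: it assumes the word vectors have already been exported from the
model into a plain Python dict of word -> list of floats, and the function name
and the sample data are hypothetical.

```python
import math


def summarize_vectors(vectors):
    """Inspect a word -> vector mapping (hypothetical export of a model).

    Returns a tuple (non_finite_words, max_abs):
      non_finite_words: sorted words whose vector contains inf, -inf or NaN
      max_abs: largest absolute component among the remaining finite vectors
    """
    non_finite_words = []
    max_abs = 0.0
    for word, vec in vectors.items():
        if any(not math.isfinite(x) for x in vec):
            non_finite_words.append(word)
            continue
        max_abs = max(max_abs, max((abs(x) for x in vec), default=0.0))
    return sorted(non_finite_words), max_abs


# Made-up sample data, for illustration only (not real model output).
sample = {
    "spark": [0.12, -0.53],
    "ml": [float("inf"), 0.2],
    "jira": [3.5e10, float("-inf")],
    "bug": [float("nan"), 0.0],
}

bad_words, max_abs = summarize_vectors(sample)
print(bad_words)  # ['bug', 'jira', 'ml']
print(max_abs)    # 0.53
```

A non-empty list of words, or a max_abs far outside the -1.0 to 1.0 range
mentioned above, would suggest the model shows the reported behavior.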

The bug appears in Spark 2.0.X, 2.1.X, 2.2.X, and 2.3.X.

An existing issue already reports this bug:
https://issues.apache.org/jira/browse/SPARK-5261 , but the fix seems to be
missing.

Other people's reports:

[https://stackoverflow.com/questions/49741956/infinity-vectors-in-spark-mllib-word2vec]

[http://apache-spark-user-list.1001560.n3.nabble.com/word2vec-outputs-Infinity-Infinity-vectors-with-increasing-iterations-td29020.html]

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
