[ https://issues.apache.org/jira/browse/SPARK-24666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16885302#comment-16885302 ]
ZhongYu commented on SPARK-24666: --------------------------------- Not any update for one year? > Word2Vec generate infinity vectors when numIterations are large > --------------------------------------------------------------- > > Key: SPARK-24666 > URL: https://issues.apache.org/jira/browse/SPARK-24666 > Project: Spark > Issue Type: Bug > Components: ML, MLlib > Affects Versions: 2.3.1 > Environment: 2.0.X, 2.1.X, 2.2.X, 2.3.X > Reporter: ZhongYu > Priority: Critical > > We found that Word2Vec generate large absolute value vectors when > numIterations are large, and if numIterations are large enough (>20), the > vector's value many be *infinity(or -**infinity)***, resulting in useless > vectors. > In normal situations, vectors values are mainly around -1.0~1.0 when > numIterations = 1. > The bug is shown on spark 2.0.X, 2.1.X, 2.2.X, 2.3.X. > There are already issues report this bug: > https://issues.apache.org/jira/browse/SPARK-5261 , but the bug fix works > seems missing. > Other people's reports: > [https://stackoverflow.com/questions/49741956/infinity-vectors-in-spark-mllib-word2vec] > [http://apache-spark-user-list.1001560.n3.nabble.com/word2vec-outputs-Infinity-Infinity-vectors-with-increasing-iterations-td29020.html] > > -- This message was sent by Atlassian JIRA (v7.6.14#76016) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org