[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-12685:
------------------------------------

    Assignee: Apache Spark  (was: yuhao yang)

> word2vec trainWordsCount gets overflow
> --------------------------------------
>
>                 Key: SPARK-12685
>                 URL: https://issues.apache.org/jira/browse/SPARK-12685
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 1.6.0
>            Reporter: yuhao yang
>            Assignee: Apache Spark
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> The word2vec log reports
> trainWordsCount = -785727483
> during computation over a large dataset.
> I'll also add vocabsize to the log.
> Updating the priority, as the overflow affects the computation process:
> alpha = learningRate * (1 - numPartitions * wordCount.toDouble / (trainWordsCount + 1))
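To illustrate why a negative trainWordsCount matters for the alpha formula quoted above, here is a minimal Scala sketch. It is not the Spark source or the actual patch. The object name, the "true" total of 3509239813 (chosen on the assumption that a 32-bit counter wrapped exactly once to give the logged value), and the learningRate, numPartitions and wordCount values are all hypothetical.

```scala
object TrainWordsCountOverflow {
  // Hypothetical true total: the Long whose low 32 bits equal the logged value,
  // i.e. assuming the Int counter wrapped exactly once.
  val trueCount: Long = 3509239813L

  // Narrowing to Int keeps only the low 32 bits (two's complement),
  // which is the value an Int counter would hold after overflowing.
  val wrappedCount: Int = trueCount.toInt

  // The formula quoted in the issue. trainWordsCount is taken as Long here so the
  // same function can be evaluated with both the wrapped and the unwrapped total.
  def alpha(learningRate: Double, numPartitions: Int, wordCount: Long, trainWordsCount: Long): Double =
    learningRate * (1 - numPartitions * wordCount.toDouble / (trainWordsCount + 1))

  // Illustrative values only; none of these come from the issue.
  val learningRate: Double = 0.025
  val numPartitions: Int = 1
  val wordCount: Long = 1000000000L

  // With a negative denominator the fraction is negative, so alpha rises above
  // learningRate instead of decaying as training progresses.
  val alphaWrapped: Double = alpha(learningRate, numPartitions, wordCount, wrappedCount.toLong)

  // With the total kept in a Long, alpha stays between 0 and learningRate.
  val alphaTrue: Double = alpha(learningRate, numPartitions, wordCount, trueCount)

  def main(args: Array[String]): Unit = {
    println(s"wrappedCount = $wrappedCount") // prints wrappedCount = -785727483
    println(s"alpha with wrapped count = $alphaWrapped")
    println(s"alpha with Long count    = $alphaTrue")
  }
}
```

Accumulating the total in a Long is one common way to avoid the wrap-around. Whether that matches the change made for this issue is not stated in the message.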