[ https://issues.apache.org/jira/browse/SPARK-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136113#comment-15136113 ]
holdenk commented on SPARK-13226:
---------------------------------

Also update the links since the PDF linked to in the model has a new home: http://www.cs.cmu.edu/~wcohen/postscript/icml2010-pic-final.pdf

> MLLib PowerIteration Clustering depends on deprecated KMeans setRuns API
> ------------------------------------------------------------------------
>
>                 Key: SPARK-13226
>                 URL: https://issues.apache.org/jira/browse/SPARK-13226
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: holdenk
>            Priority: Trivial
>
> The current MLLib PowerIteration clustering implementation sets the number of
> runs inside of the kmeans call to 5 (apparently arbitrary). This should
> likely be replaced with a specific tolerance.
> The reference implementation also appears to use a tolerance, so this would
> also be moving closer to the reference implementation (
> http://www.cs.cmu.edu/~wcohen/ )

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)