This shouldn't be happening. Do you have an example that reproduces it?

On Thu, May 7, 2015 at 4:17 PM, rbolkey <rbol...@gmail.com> wrote:
> Hi,
>
> I have a question regarding one of the oddities we encountered while running
> mllib's column similarities operation. When we examine the output, we find
> duplicate matrix entries (the same i,j). Sometimes the entries have the same
> value/similarity score, but they're frequently different too.
>
> Is this a known issue? An artifact of the probabilistic nature of the
> output? Which output score should we trust (lower vs higher one when
> different)? We're using a threshold of 0.3, and running Spark 1.3.1 on a 10
> node cluster.
>
> Thanks
> Rick
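
If it helps in putting together a reproduction, here is a minimal sketch of one way to list the duplicated (i, j) pairs once the similarity entries have been collected to the driver. It is plain Python and does not call Spark; it assumes the entries are already available as (i, j, value) tuples (for example, taken from the i, j and value fields of each MatrixEntry in the result). The function name find_duplicate_entries and the sample values are hypothetical and only for illustration, not output from a real run.

```python
from collections import defaultdict


def find_duplicate_entries(entries):
    """Return {(i, j): [values...]} for every index pair that occurs more than once.

    `entries` is any iterable of (i, j, value) tuples.
    """
    by_index = defaultdict(list)
    for i, j, value in entries:
        by_index[(i, j)].append(value)
    # Keep only the index pairs that were reported more than once.
    return {idx: vals for idx, vals in by_index.items() if len(vals) > 1}


# Hypothetical sample data, made up for illustration only.
sample = [(0, 1, 0.42), (0, 2, 0.35), (0, 1, 0.47), (1, 2, 0.31)]

duplicates = find_duplicate_entries(sample)
for (i, j), values in sorted(duplicates.items()):
    print(f"({i}, {j}): {values}")
```

Posting the duplicated pairs together with their differing scores, plus a small input matrix that triggers them, would make the problem much easier to track down.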