Hi Jamal, I don't believe PySpark ships pre-written implementations of cosine similarity or Pearson correlation that you can reuse. If you end up writing your own, though, the project would definitely appreciate it if you contributed that code back for future users to leverage!
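For anyone starting on their own implementation, here is a minimal sketch of the two measures in plain Python. It is only an illustration under stated assumptions, not an existing PySpark API: the function names `cosine_similarity` and `pearson_correlation` are hypothetical, and the vectors are assumed to be equal-length lists of floats like the ones in the question.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Also covers empty vectors, whose norms are zero.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def pearson_correlation(u, v):
    """Pearson correlation, computed as the cosine similarity of the
    mean-centered vectors."""
    if len(u) != len(v) or not u:
        raise ValueError("vectors must be non-empty and of the same length")
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    centered_u = [a - mean_u for a in u]
    centered_v = [b - mean_v for b in v]
    # Raises ValueError if either vector is constant (zero variance).
    return cosine_similarity(centered_u, centered_v)
```

Because both functions are pure Python, they could be applied to an RDD of vector pairs with something like `pairs.map(lambda p: cosine_similarity(p[0], p[1]))`, where `pairs` is a hypothetical RDD of `(vector, vector)` tuples.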
Andrew

On Thu, May 22, 2014 at 10:49 AM, jamal sasha <jamalsha...@gmail.com> wrote:
> Hi,
> I have bunch of vectors like
> [0.1234,-0.231,0.23131]
> .... and so on.
>
> and I want to compute cosine similarity and pearson correlation using
> pyspark..
> How do I do this?
> Any ideas?
> Thanks