GitHub user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20520#discussion_r167082120
  
    --- Diff: python/pyspark/ml/tests.py ---
    @@ -1620,6 +1621,23 @@ def test_kmeans_summary(self):
             self.assertEqual(s.k, 2)
     
     
    +class KMeansTests(SparkSessionTestCase):
    +
    +    def test_kmeans_cosine_distance(self):
    +        data = [(Vectors.dense([1.0, 1.0]),), (Vectors.dense([10.0, 10.0]),),
    +                (Vectors.dense([1.0, 0.5]),), (Vectors.dense([10.0, 4.4]),),
    +                (Vectors.dense([-1.0, 1.0]),), (Vectors.dense([-100.0, 90.0]),)]
    +        df = self.spark.createDataFrame(data, ["features"])
    +        kmeans = KMeans(k=3, seed=1)
    +        kmeans.setDistanceMeasure("cosine")
    --- End diff --
    
    It was just to test that this method works. Do you think it would be better to switch to what you suggested?
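
    For context on the test data in the diff, here is a standalone sketch (plain Python, not part of the PR and not Spark code) of why these six vectors might be expected to form three clusters under cosine distance. It assumes cosine distance is defined as 1 minus cosine similarity; the helper name `cosine_distance` and the index groupings are illustrative only.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors (assumed definition)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# The six feature vectors from the test in the diff.
points = [(1.0, 1.0), (10.0, 10.0),
          (1.0, 0.5), (10.0, 4.4),
          (-1.0, 1.0), (-100.0, 90.0)]

# Illustrative grouping by direction: each row of the test data is one pair.
groups = [(0, 1), (2, 3), (4, 5)]
group_of = {i: g for g, members in enumerate(groups) for i in members}

# Distances between points in the same pair.
within = [cosine_distance(points[i], points[j]) for i, j in groups]

# Distances between points in different pairs.
across = [cosine_distance(points[i], points[j])
          for i, j in combinations(range(len(points)), 2)
          if group_of[i] != group_of[j]]

print("largest within-pair distance:", max(within))
print("smallest cross-pair distance:", min(across))
```

    The within-pair distances are much smaller than any cross-pair distance even though the magnitudes differ by a factor of 10 to 100, which is the property a cosine-distance test with k=3 would rely on.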

