GitHub user srowen commented on the issue:

    https://github.com/apache/spark/pull/19340
  
    I think some of the 'objections' in that link won't matter here. For 
example, some point out that k-means inherently implies Euclidean distance; 
fine, strictly we should call this an instance of Lloyd's algorithm, but the 
naming doesn't matter much. Cosine distance also isn't a true distance metric 
(it violates the triangle inequality), and it's not obvious that Lloyd's 
algorithm converges when you treat it as one. I'm not actually sure it does, 
though my impression is that it satisfies enough properties to converge in 
practice.
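    
    To make the metric point concrete, here is a minimal sketch in plain 
Python (not Spark code, and not taken from the pull request; all function 
names are hypothetical). It checks that cosine distance can violate the 
triangle inequality, and that on unit-normalized vectors it equals half the 
squared Euclidean distance, which may be one reason a Lloyd's-style iteration 
tends to behave in practice. It is not a convergence proof.

```python
import math


def cosine_distance(a, b):
    """1 - cos(angle between a and b); assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def unit(a):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in a))
    return [x / norm for x in a]


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Three points chosen for illustration: b lies halfway (by angle) between a and c.
a, b, c = [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]

# Triangle inequality would require d(a, c) <= d(a, b) + d(b, c).
direct = cosine_distance(a, c)
via_b = cosine_distance(a, b) + cosine_distance(b, c)
violates_triangle_inequality = direct > via_b

# On unit vectors, ||u - v||^2 = 2 * (1 - cos(u, v)).
matches_euclidean = math.isclose(
    squared_euclidean(unit(a), unit(b)), 2.0 * cosine_distance(a, b)
)

print(violates_triangle_inequality, matches_euclidean)
```

    Because of the second identity, assigning unit-length points to the 
nearest centroid by cosine distance gives the same assignment as using 
Euclidean distance to the normalized centroid.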
    
    That link also mentions that MATLAB's kmeans allows cosine distance: 
http://www.mathworks.com/help/stats/kmeans.html?s_tid=gn_loc_drop
    
    This aspect doesn't worry me so much.


---
