GitHub user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/19340
@srowen honestly, I don't think we should change the current
implementation. RapidMiner, ELKI and NLTK work like this. MATLAB, instead, works
differently and does what was suggested by @Kevin-Ferret and @zhengruifeng.
Anyway, it looks like a majority (@viirya, @Kevin-Ferret, @zhengruifeng)
thinks that the other solution is better. So I think that, if we change it, we
should make basically the change suggested by @zhengruifeng and the normalization
of the centers; otherwise we would end up with a hybrid and unclear solution.
I can submit a follow-up PR with this second solution, and maybe we can
continue the discussion there. What do you think?